In the October 2011 Patch Tuesday, Microsoft released update MS11-077 to fix a null pointer de-reference vulnerability (CVE-2011-1985). In this paper, we will reverse engineer the patch for MS11-077 (CVE-2011-1985) to get a better understanding of the vulnerability fixed by this patch.
Using binary diff, we can see the changes that were made to the vulnerable file win32k.sys. Figure 1 below shows the TurboDiff results.
Figure 1: TurboDiff Results
As you can see in Figure 1 above, while most of the functions are identical, there are a couple of functions that look ‘suspicious’ and some others that are ‘changed’. The large number of changes is not a surprise because Microsoft has fixed four different vulnerabilities with this patch.
Taking a closer look at the functions that were changed, you will see that the changes made to the functions ‘NtUserfnINLBOXSTRING’, ‘NtUserfnSENTDDEMSG’ and ‘NtUserfnINCBOXSTRING’ are all the same. Figure 2 below shows the changes made.
Figure 2: Binary Diff for function NtUserfnINLBOXSTRING(x,x,x,x,x).
Looking at the binary difference, it is clear that the patch checks whether arg_0 (the first argument passed to the function) is 0xFFFFFFFF. If it is, the patched function calls _UserSetLastError() with 0x578 and returns.
This gives us two hints for exploiting the vulnerability. The first is that arg_0 has to be 0xFFFFFFFF. The second is that the patched function bails out after setting the system error code to 0x578. This is the system error code for ERROR_INVALID_WINDOW_HANDLE, which suggests that the argument is of type HWND.
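The added check can be modeled in a few lines. The sketch below is a Python model, not the actual win32k.sys code: the function name patched_entry_check and its return convention are hypothetical, and only the comparison against 0xFFFFFFFF and the error code 0x578 come from the diff.

```python
ERROR_INVALID_WINDOW_HANDLE = 0x578  # 1400 decimal
INVALID_HWND = 0xFFFFFFFF            # value the patch compares arg_0 against


def patched_entry_check(arg_0):
    """Hypothetical model of the check added by the patch.

    Returns the error code the patched function would set before bailing
    out, or None when the call is allowed to proceed.
    """
    # Treat arg_0 as a 32-bit value, as the x86 compare does.
    if (arg_0 & 0xFFFFFFFF) == INVALID_HWND:
        return ERROR_INVALID_WINDOW_HANDLE
    return None


print(patched_entry_check(0xFFFFFFFF))  # bail-out path
print(patched_entry_check(0x00010203))  # normal path (arbitrary handle value)
```

Note that a handle value of -1 has the same 32-bit pattern as 0xFFFFFFFF, so the model treats both the same way.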
Everything until now is pretty simple, and it looks easy to exploit this vulnerability. However, the real challenge here is finding a user mode function that will call the vulnerable function. It turns out this isn’t very straightforward, and we will need to understand the Windows GUI subsystem.
Win32 GDI Subsystem:
Figure 3: Win32 interfaces and their relation to the kernel components
The GDI (Graphics Device Interface) APIs are implemented in GDI32.DLL and include all the low-level graphics services such as drawing lines and displaying bitmaps. Most GDI APIs are implemented by making system calls into win32k.sys. The User APIs are implemented in the USER32.DLL module and include all higher-level GUI-related services such as window management, menus, dialog boxes and user controls. USER relies heavily on GDI to do its work.
One of the most important means of communication in Windows is Messages. Windows-based applications are event-driven and act upon messages sent to them. The way you program in Windows is by responding to events. These events are called Messages. Messages can signal many events, caused by the user, the operating system, or another program. Each window, owned by a thread, has a window procedure (function) for processing input messages and dispatching them to the operating system. If a thread accesses any of the user interface or GDI system calls (handled by win32k.sys), the kernel creates a THREADINFO structure which holds three message queues used to process input. These are the input queue, the post queue, and the send queue. The input queue is primarily used for mouse and keyboard messages, while the send and post queues are used for synchronous (send) and asynchronous (post) window messages respectively.
Asynchronous messages are used in one-way communication between window threads and are typically used to notify a window to perform a specific task. Asynchronous messages are handled by the PostMessage APIs and are sent to the post queue of the receiving thread. The sender does not wait for the processing to complete in the receiving thread and thus returns immediately.
Synchronous messages differ from asynchronous messages in that the sender typically waits for a response to be provided or a timeout to occur before continuing execution. Thus, they require mechanisms to ensure that the threads are properly synchronized and in the expected state. Synchronous messages use the SendMessage APIs, which in turn direct execution to the NtUserMessageCall system call in win32k.sys.
This information is enough for us to take our analysis further.
Hitting the vulnerable function:
As described above, the message mechanism plays an integral role in the user interface component of the Windows operating system. There are many different message codes, and those less than 0x400 are reserved by the operating system. Depending on the message code, NtUserMessageCall() calls a particular function to handle the message. Let’s take a closer look at how NtUserMessageCall() calls the appropriate function for each message type.
Figure 4: Assembly code for NtUserMessageCall()
As seen in the above figure, the function first checks whether the Msg code (held in EAX) is less than 0x400, that is, whether it is a system message code. Each message code is an index into the win32k!MessageTable byte array. The byte read from the table is then ANDed with 0x3F, because the low 6 bits of that byte select the function that will handle the message code. _gapfnMessageCall is a function table that stores the addresses of all the functions that handle the different messages. Figure 5 shows what the _gapfnMessageCall table looks like.
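The lookup can be sketched as a small Python model. This is an illustration under stated assumptions, not the real kernel code: the function names are hypothetical, the handler table holds only the three entries discussed in this analysis, the table byte 0x9D is a made-up value, and the path taken for message codes of 0x400 and above is not modeled.

```python
# Handler-table slots named in this analysis; all other slots are omitted.
GAPFN_MESSAGE_CALL = {
    0x1D: "NtUserfnINLBOXSTRING",
    0x1B: "NtUserfnINCBOXSTRING",
    0x2B: "NtUserfnSENTDDEMSG",
}


def handler_index(table_byte):
    """Keep only the low 6 bits of a MessageTable byte."""
    return table_byte & 0x3F


def dispatch(msg, message_table):
    """Return the handler name for a system message, or None if not modeled."""
    if msg >= 0x400:
        return None  # non-system message codes are outside this sketch
    index = handler_index(message_table[msg])
    return GAPFN_MESSAGE_CALL.get(index, "handler #%d" % index)


# Made-up table: message 0x42 carries the byte 0x9D, whose upper two bits
# are discarded by the mask, leaving index 0x1D.
table = bytearray(0x400)
table[0x42] = 0x9D
print(dispatch(0x42, table))
```

The upper two bits of each table byte do not take part in handler selection, which is why two different byte values can route to the same handler.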
Figure 5: _gapfnMessageCall function table
Thus, if we know the index of each vulnerable function in _gapfnMessageCall, we can compute which message codes reach it. The indices of our vulnerable functions are 29 (0x1D), 27 (0x1B) and 43 (0x2B) for NtUserfnINLBOXSTRING(), NtUserfnINCBOXSTRING() and NtUserfnSENTDDEMSG() respectively.
Following is the pseudo code to compute Msg codes for hitting the vulnerable function:
for i in range [0x00 to 0x400]
    if MessageTable[i] & 0x3F == 0x1D    // NtUserfnINLBOXSTRING()
        Hit!
    if MessageTable[i] & 0x3F == 0x1B    // NtUserfnINCBOXSTRING()
        Hit!
    if MessageTable[i] & 0x3F == 0x2B    // NtUserfnSENTDDEMSG()
        Hit!
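A runnable version of this pseudo code might look like the Python sketch below. The real win32k!MessageTable contents are not reproduced here, so the table is a placeholder with three made-up entries; to get real results you would have to substitute the bytes dumped from win32k.sys. The function name find_messages is hypothetical.

```python
# Handler-table indices of the three patched functions.
VULNERABLE_INDEX = {
    0x1D: "NtUserfnINLBOXSTRING",
    0x1B: "NtUserfnINCBOXSTRING",
    0x2B: "NtUserfnSENTDDEMSG",
}


def find_messages(message_table):
    """Map each vulnerable handler to the system message codes routed to it."""
    hits = {name: [] for name in VULNERABLE_INDEX.values()}
    for msg in range(min(len(message_table), 0x400)):
        name = VULNERABLE_INDEX.get(message_table[msg] & 0x3F)
        if name is not None:
            hits[name].append(msg)
    return hits


# Placeholder data only: these three entries are invented for the demo.
table = bytearray(0x400)
table[0x10] = 0x1D
table[0x20] = 0x5B  # 0x5B & 0x3F == 0x1B
table[0x30] = 0x2B

for name, msgs in find_messages(table).items():
    print(name, [hex(m) for m in msgs])
```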
Proof of Concept:
Other Possible Msg codes for hitting vulnerable functions are:
In the Patch Tuesday for August 2011, Microsoft released Security Bulletin MS11-058 (CVE-2011-1966) to fix an unauthenticated remote code execution vulnerability in DNS servers. According to the security advisory, a remote code execution vulnerability exists because the Windows DNS Server improperly handles a specially crafted NAPTR query string in memory. An attacker who successfully exploited this vulnerability could run arbitrary code in the context of the system.
We reverse engineered the patch to get a better understanding of the mechanism of the vulnerability and found this vulnerability can be triggered with a few easy steps. While the proof of concept described below demonstrates a denial of service, attackers with malicious intent may be able to get reliable code execution.
QualysGuard detects this vulnerability with QID: 90726 – Microsoft Windows DNS Server Remote Code Execution Vulnerability (MS11-058). Because of the possibility of a code execution attack, Qualys recommends that all customers scan their environment for QID 90726 and apply this security update as soon as possible.
We start the analysis by binary-diffing the unpatched and patched versions of the files shipped with the MS11-058 security update. This helps us understand the changes that were made to fix the vulnerabilities addressed by this patch. To perform binary diffing we use TurboDiff, which is a plugin for IDA Pro. TurboDiff shows us a list of all the functions that are identical, changed, unmatched and those that look suspicious. Suspicious functions have unchanged function graphs but changed checksums, which indicates that a small code change was made. While most of the functions look identical, TurboDiff lists some of them as suspicious (Fig. 1).
Figure 1: Diffing results by TurboDiff.
As seen in figure 1, TurboDiff lists four functions as suspicious. The vulnerability we are investigating is CVE-2011-1966, which is related to the Name Authority Pointer (NAPTR) DNS resource record. From the names of the four functions marked as suspicious, it is pretty clear that ‘NaptrWireRead(x,x,x,x)’ has something to do with the NAPTR DNS record, so this should be the first function to analyze further.
Taking a closer look at the diffing results for the function NaptrWireRead(x,x,x,x) reveals there is only one change made to the entire function (Figure 2, indicated with green box).
The sign-extending instruction “movsx edi, byte ptr[ebx]” is replaced with the zero-extending instruction “movzx edi, byte ptr[ebx]”. The resulting value is then used as the number of bytes that memcpy() copies from the source buffer to the destination buffer.
The sign-extending move instruction is the troublemaker here. If the byte at “byte ptr[ebx]” is greater than 127 (0x7F), its top bit is set, so sign extension fills the upper 24 bits of edi with ones and the resulting value is a very large unsigned number. For example, if the byte at [ebx] is 128, the resulting value in register edi will be 0xFFFFFF80. The next instruction, “LEA EAX, DWORD PTR DS:[EDI+1]”, will load EAX with 0xFFFFFF81, which is used as the count for memcpy(). memcpy() then tries to copy almost 4 GB of memory, which causes the DNS service to crash.
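The arithmetic can be checked with a short Python model of the two instructions. The helper names movsx_byte, movzx_byte and copy_count are hypothetical; they only mimic 32-bit register behaviour and are not taken from the DNS server code.

```python
MASK32 = 0xFFFFFFFF


def movsx_byte(value):
    """Model of movsx r32, byte: copy bit 7 into the upper 24 bits."""
    value &= 0xFF
    return value | 0xFFFFFF00 if value & 0x80 else value


def movzx_byte(value):
    """Model of movzx r32, byte: clear the upper 24 bits."""
    return value & 0xFF


def copy_count(edi):
    """Model of lea eax, [edi+1] with 32-bit wraparound."""
    return (edi + 1) & MASK32


length_byte = 128  # 0x80, the smallest value that triggers the problem

unpatched = copy_count(movsx_byte(length_byte))
patched = copy_count(movzx_byte(length_byte))

print("unpatched count: 0x%08X" % unpatched)
print("patched count:   0x%08X" % patched)
```

With a length byte of 127 or less both models return the same count, which is why the bug stays hidden for short strings.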
Figure 2: Binary Diff for function NaptrWireRead(x,x,x,x).
For the proof of concept, you need two DNS servers. Register the domain crasher.test.com on the first server and configure a NAPTR DNS record as shown in figure 3 below. The second DNS server will act as a forwarder DNS server. Of all the fields shown, the “Service String” and “Regular Expression” fields are the ones that can take input greater than 127 characters with no restrictions.
To exploit this vulnerability, we make either of the above-mentioned fields 128 characters or longer. In this case we set the "regular expression" field to 128 characters.
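Why 128 characters is the threshold becomes clearer from the wire format. RFC 3403 encodes the NAPTR flags, services and regexp fields as character-strings, each prefixed by a single length byte, so a 128-character regular expression is sent with a length byte of 0x80. The Python sketch below builds such a record body. The helper names and the sample order, preference, flags and services values are made up for illustration, and the link between this length byte and the byte read by NaptrWireRead() is our assumption rather than something the diff states.

```python
import struct


def character_string(data):
    """DNS <character-string>: one length byte followed by the data."""
    if len(data) > 255:
        raise ValueError("character-string is limited to 255 bytes")
    return bytes([len(data)]) + data


def encode_name(name):
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        if label:
            raw = label.encode("ascii")
            out += bytes([len(raw)]) + raw
    return out + b"\x00"


def naptr_rdata(order, preference, flags, services, regexp, replacement):
    """Build NAPTR RDATA in the field order given by RFC 3403."""
    return (struct.pack("!HH", order, preference)
            + character_string(flags)
            + character_string(services)
            + character_string(regexp)
            + encode_name(replacement))


# Sample values are placeholders; only the 128-byte regexp matters here.
rdata = naptr_rdata(100, 10, b"u", b"E2U+sip", b"A" * 128, ".")

# order/preference (4 bytes) + flags field (2 bytes) + services field (8 bytes)
regexp_length_offset = 4 + 2 + 8
print("regexp length byte: 0x%02X" % rdata[regexp_length_offset])
```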
From the forwarder DNS, type the command “nslookup -type=all crasher.test.com. 127.0.0.1”. This command will crash the DNS server working as the forwarder.
Figure 3: DNS NAPTR form.
To see the vulnerability in action, attach your debugger to the DNS executable and set one breakpoint at the NaptrWireRead(x,x,x,x) function and another at the memcpy() call inside that function.
Figure 4: BreakPoint at memcpy() in NaptrWireRead().
From Figure 4 (see the values passed on the stack when calling memcpy()), it is clear that a field length of 128 or more has caused the count parameter of memcpy() to be a very large value, causing an access violation and crashing the DNS server.
The call stack trace for the above vulnerability can be seen in Figure 5 below.
Figure 5: Call Stack Trace.
To analyse the crash in WinDbg, start WinDbg with the command “windbg -I” to register it as the default postmortem debugger. When you run “nslookup -type=all crasher.test.com. 127.0.0.1” again, the DNS server crashes and WinDbg starts for analysis. Figure 6 shows the output of the !exploitable crash analyzer.
Figure 6: !exploitable plugin output.
As shown in the analysis above, this vulnerability can be triggered with a few easy steps. While this PoC demonstrates a denial of service, attackers with malicious intent may be able to get reliable code execution. Hence we recommend that all our customers scan their environment for QID 90726 and apply this security update as soon as possible.