
Chapter 6: Memory Corruption Part II - Heaps

Posted by Addison Wesley Free Book | C# Language November 16, 2009
This chapter discusses a myriad of stability issues that can surface in an application when the heap is used in a nonconventional fashion. Although the stack and the heap are managed very differently in Windows, the process by which we analyze stack- and heap-related problems is the same.

Heap Overruns and Underruns
 
In the introduction to this chapter, we looked at the internal workings of the heap manager and how all heap blocks are laid out. Figure 6.8 illustrated how a heap block is broken down and what auxiliary metadata is kept on a per-block basis so that the heap manager can manage the block. If a faulty piece of code overwrites any of the metadata, the integrity of the heap is compromised and the application will fault. The most common form of metadata overwriting occurs when the owner of a heap block does not respect the boundaries of the block. Writing past the end of the block is known as a heap overrun; conversely, writing before the start of the block is known as a heap underrun.
 
Let's take a look at an example. The application shown in Listing 6.6 simply makes a copy of the string passed in on the command line and prints out the copy.
 
 Listing 6.6 Heap-based string copy application
 
 #include <windows.h>
 #include <stdio.h>
 #include <conio.h>

 #define SZ_MAX_LEN 10

 WCHAR* pszCopy = NULL;
 BOOL DupString(WCHAR* psz);

 int __cdecl wmain (int argc, wchar_t* pArgs[])
 {
     int iRet=0;

     if(argc==2)
     {
         /* Pause so that a debugger can be attached before the copy runs. */
         printf("Press any key to start\n");
         _getch();
         DupString(pArgs[1]);
     }
     else
     {
         iRet=1;
     }

     return iRet;
 }

 BOOL DupString(WCHAR* psz)
 {
     BOOL bRet=FALSE;

     if(psz!=NULL)
     {
         /* Room for SZ_MAX_LEN wide characters on the default process heap. */
         pszCopy=(WCHAR*) HeapAlloc(GetProcessHeap(),
                                    0,
                                    SZ_MAX_LEN*sizeof(WCHAR));

         if(pszCopy)
         {
             /* Copies the input as is; this is the call examined in the rest of this section. */
             wcscpy(pszCopy, psz);
             wprintf(L"Copy of string: %s", pszCopy);
             HeapFree(GetProcessHeap(), 0, pszCopy);
             bRet=TRUE;
         }
     }

     return bRet;
 }
 
 
The source code and binary for Listing 6.6 can be found in the following folders:
 
 Source code: C:\AWD\Chapter6\Overrun
 Binary: C:\AWDBIN\WinXP.x86.chk\06Overrun.exe
 
When you run this application with various input strings, you will quickly notice that input strings of 10 characters or fewer seem to work fine. As soon as you breach the 10-character limit, the application crashes. Let's pick the following string to use in our debug session:
 
  C:\AWDBIN\WinXP.x86.chk\06Overrun.exe ThisStringShouldReproTheCrash
 
Run the application and attach the debugger when you see the Press any key to start prompt. Once attached, press any key to resume execution and watch how the debugger breaks execution with an access violation.

...
...
...
 0:001> g
 (1b8.334): Access violation - code c0000005 (first chance)
 
 First chance exceptions are reported before any exception handling.
 
 This exception may be expected and handled.
 
 eax=00650052 ebx=00080000 ecx=00720070 edx=00083188 esi=00083180 edi=0000000f
 eip=7c91142e esp=0006f77c ebp=0006f99c iopl=0 nv up ei ng nz na po cy
 cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00010283
 ntdll!RtlAllocateHeap+0x653:
 7c91142e 8b39 mov edi,dword ptr [ecx] ds:0023:00720070=????????
 0:000> k
 ChildEBP RetAddr
 0007f70c 7c919f5d ntdll!RtlpInsertFreeBlock+0xf3
 0007f73c 7c918839 ntdll!RtlpInitializeHeapSegment+0x186
 0007f780 7c911c76 ntdll!RtlpExtendHeap+0x1ca
 0007f9b0 7c80e323 ntdll!RtlAllocateHeap+0x623
 0007fa24 7c80e00d kernel32!BasepComputeProcessPath+0xb3
 0007fa64 7c80e655 kernel32!BaseComputeProcessDllPath+0xe3
 0007faac 7c80e5ab kernel32!GetModuleHandleForUnicodeString+0x28
 0007ff30 7c80e45c kernel32!BasepGetModuleHandleExW+0x18e
 0007ff48 7c80b6c0 kernel32!GetModuleHandleW+0x29
 0007ff54 77c39d23 kernel32!GetModuleHandleA+0x2d
 0007ff60 77c39e78 msvcrt!__crtExitProcess+0x10
 0007ff70 77c39e90 msvcrt!_cinit+0xee
 0007ff84 010014c2 msvcrt!exit+0x12
 0007ffc0 7c816fd7 06overrun!__wmainCRTStartup+0x118
 0007fff0 00000000 kernel32!BaseProcessStart+0x23
 
Glancing at the stack, it looks like the application was in the process of shutting down when the access violation occurred. As per our previous discussion, whenever you encounter an access violation in the heap manager code, chances are you are experiencing a heap corruption. The only problem is that our code is nowhere on the stack. Once again, the biggest problem with heap corruptions is that the faulting code is not easily trapped at the point of corruption; rather, the corruption typically shows up later in the execution. This behavior alone makes it really hard to track down the source of a heap corruption.

However, with an understanding of how the heap manager works, we can do some preliminary investigation of the heap and look for clues that point to potential culprits. Without knowing which part of the heap is corrupted, a good starting point is to see if the segments are intact. Instead of manually walking the segments, we use the !heap extension command, which saves us a great deal of tedious manual work. A shortened version of the output for the default process heap is shown in Listing 6.7.
 
 Listing 6.7 Heap corruption analysis using the heap debugger command
 
 0:000> !heap -s
 Heap Flags Reserv Commit Virt Free List UCR Virt Lock Fast
 (k) (k) (k) (k) length blocks cont. heap
 ---------------------------------------
 00080000 00000002 1024 16 16 3 1 1 0 0 L
 00180000 00001002 64 24 24 15 1 1 0 0 L
 00190000 00008000 64 12 12 10 1 1 0 0
 00260000 00001002 64 28 28 7 1 1 0 0 L
 ---------------------------------------
 0:000> !heap -a 00080000
 Index Address Name Debugging options enabled
 1: 00080000
 Segment at 00080000 to 00180000 (00004000 bytes committed)
 Flags: 00000002
 ForceFlags: 00000000
 Granularity: 8 bytes
 Segment Reserve: 00100000
 Segment Commit: 00002000
 DeCommit Block Thres: 00000200
 DeCommit Total Thres: 00002000
 Total Free Size: 000001d0
 Max. Allocation Size: 7ffdefff
 Lock Variable at: 00080608
 Next TagIndex: 0000
 Maximum TagIndex: 0000
 Tag Entries: 00000000
 PsuedoTag Entries: 00000000
 Virtual Alloc List: 00080050
 UCR FreeList: 00080598
 FreeList Usage: 00000000 00000000 00000000 00000000
 FreeList[ 00 ] at 00080178: 00083188 . 00083188
 00083180: 003a8 . 00378 [00] - free
 Unable to read nt!_HEAP_FREE_ENTRY structure at 0065004a
 Segment00 at 00080640:
 Flags: 00000000
 Base: 00080000
 First Entry: 00080680
 Last Entry: 00180000
 Total Pages: 00000100
 Total UnCommit: 000000fc
 Largest UnCommit:000fc000
 UnCommitted Ranges: (1)
 00084000: 000fc000
 Heap entries for Segment00 in Heap 00080000
 00080000: 00000 . 00640 [01] - busy (640)
 00080640: 00640 . 00040 [01] - busy (40)
 00080680: 00040 . 01808 [01] - busy (1800)
 00081e88: 01808 . 00210 [01] - busy (208)
 00082098: 00210 . 00228 [01] - busy (21a)
 000822c0: 00228 . 00090 [01] - busy (84)
 00082350: 00090 . 00030 [01] - busy (22)
 00082380: 00030 . 00018 [01] - busy (10)
 00082398: 00018 . 00068 [01] - busy (5b)
 00082400: 00068 . 00230 [01] - busy (224)
 00082630: 00230 . 002e0 [01] - busy (2d8)
 00082910: 002e0 . 00320 [01] - busy (314)
 00082c30: 00320 . 00320 [01] - busy (314)
 00082f50: 00320 . 00030 [01] - busy (24)
 00082f80: 00030 . 00030 [01] - busy (24)
 00082fb0: 00030 . 00050 [01] - busy (40)
 00083000: 00050 . 00048 [01] - busy (40)
 00083048: 00048 . 00038 [01] - busy (2a)
 00083080: 00038 . 00010 [01] - busy (1)
 00083090: 00010 . 00050 [01] - busy (44)
 000830e0: 00050 . 00018 [01] - busy (10)
 000830f8: 00018 . 00068 [01] - busy (5b)
 00083160: 00068 . 00020 [01] - busy (14)
 00083180: 003a8 . 00378 [00]
 000834f8: 00000 . 00000 [00]
 
The last heap entry in a segment is typically a free block. In Listing 6.7, however, we have a couple of odd entries at the end. The flags of these heap blocks ([00]) seem to indicate that both blocks are free; however, the sizes of the blocks do not match up. Let's look at the first free block:
 
 00083180: 003a8 . 00378 [00]
 
The heap block states that the size of the previous block is 003a8 and the size of the current block is 00378. Interestingly enough, the prior block (at 00083160) reports its own size as 0x20 bytes, which does not match the 003a8 recorded here. Even worse, the last free block in the segment states that both the previous and current sizes are 0. If we go further back in the heap segment, we can see that all the heap entries prior to 00083160 make sense (at least in the sense that the heap entry metadata seems intact). A theory should now start to take shape. The usage of the heap block at location 00083160 seems suspect, and it's possible that the usage of that heap block caused the metadata of the following block to become corrupt. Who allocated the heap block at 00083160? If we take a closer look at the block, we can see if we recognize the content:
 
 0:000> dd 00083160
 00083160 000d0004 000c0199 00000000 00730069
 00083170 00740053 00690072 0067006e 00680053
 00083180 0075006f 0064006c 00650052 00720070
 00083190 0054006f 00650068 00720043 00730061
 000831a0 00000068 00000000 00000000 00000000
 000831b0 00000000 00000000 00000000 00000000
 000831c0 00000000 00000000 00000000 00000000
 000831d0 00000000 00000000 00000000 00000000
 
Parts of the block seem to resemble a string. If we use the du command on the block starting at address 00083160+0xc, we see the following:
 
 0:000> du 00083160+c
 0008316c "isStringShouldReproTheCrash"
 
The string definitely looks familiar. It is the same string (or part of it) that we passed in on the command line. Furthermore, the string seems to stretch all the way to address 000831a0, which crosses the boundary to the next reported free block at address 00083180. If we dump out the heap entry at address 00083180, we can see the following:
 
 0:000> dt _HEAP_ENTRY 00083180
 +0x000 Size : 0x6f
 +0x002 PreviousSize : 0x75
 +0x000 SubSegmentCode : 0x0075006f
 +0x004 SmallTagIndex : 0x6c 'l'
 +0x005 Flags : 0 ''
 +0x006 UnusedBytes : 0x64 'd'
 +0x007 SegmentIndex : 0 ''
 
The current and previous size fields correspond to part of the string that crossed the boundary of the previous block. Armed with the knowledge of which string caused the heap block overwrite, we can turn to code review and figure out relatively easily that the string copy function wrote more than the maximum number of characters allowed in the destination string, causing an overwrite of the next heap block. While the heap manager was unable to detect the overwrite at the exact point it occurred, it tripped over the overwritten metadata later in the execution, which resulted in an access violation because the heap was in an inconsistent state.

In the previous simplistic application, analyzing the heap at the point of the access violation yielded a very clear picture of what overwrote the heap block and subsequently, via code review, who the culprit was. Needless to say, it is not always possible to arrive at these conclusions merely by inspecting the contents of the heap blocks. The complexity of the system can dramatically reduce your success when using this approach. Furthermore, even if you do get some clues as to what is overwriting the heap blocks, it might be really difficult to find the culprit by merely reviewing code.

Ultimately, the easiest way to figure out a heap corruption would be to break execution when the memory is being overwritten rather than after. Fortunately, the Application Verifier tool provides a powerful facility that enables this behavior. The Application Verifier test setting commonly used when tracking down heap corruptions is called the Heaps test setting (also referred to as pageheap). Pageheap works by surrounding the heap blocks with a protection layer that serves to isolate the heap blocks from one another. If a heap block is overwritten, the protection layer detects the overwrite as close to the source as possible and breaks execution, giving the developer the ability to investigate why the overwrite occurred.

Pageheap runs in two different modes: normal pageheap and full pageheap. The primary difference between the two modes is the strength of the protection layer. Normal pageheap uses fill patterns in an attempt to detect heap block corruptions. The use of fill patterns requires that another call be made to the heap manager after the corruption so that the heap manager has the chance to validate the integrity (check the fill patterns) of the heap block and report any inconsistencies. Additionally, normal pageheap keeps the stack trace for all allocations, making it easier to understand who allocated the memory. Figure 6.10 illustrates what a heap block looks like when normal pageheap is turned on.
 
 
 
 Figure 6.10 Normal page heap block layout
 
