Tuesday, March 20, 2007

A PE trick, the Thread Local Storage

In our training, when discussing packer's tricks and the PE file format, Pedram and I talk about different ways of executing code in a PE executable before the entry point (the one pointed to by the AddressOfEntryPoint field of the Optional header) is given control by the Windows loader.

One possible way of achieving it is to use the TLS directory entry of the PE file format's headers. TLS stands for Thread Local Storage and it's meant to be used to allocate storage for thread-specific data. The TLS structure, IMAGE_TLS_DIRECTORY, pointed to by the TLS directory entry has a small number of fields. The one of special interest is the one pointing to a list of callbacks, AddressOfCallBacks.
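To make the "TLS directory entry" a bit more concrete, here is a minimal sketch in C of how one might look it up in a raw PE32 image held in memory. The function and helper names (`find_tls_directory`, `read_u16le`, `read_u32le`) are made up for this illustration. The sketch only handles 32-bit PE files (optional header magic 0x10B) and relies on the TLS entry being data directory number 9.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TLS_DIRECTORY_INDEX   9u   /* IMAGE_DIRECTORY_ENTRY_TLS */
#define PE_SIG_AND_COFF_SIZE  24u  /* "PE\0\0" (4) + COFF file header (20) */
#define PE32_DATA_DIR_OFFSET  96u  /* data directories inside a PE32 optional header */
#define PE32_NUM_DIRS_OFFSET  92u  /* NumberOfRvaAndSizes inside a PE32 optional header */

static uint16_t read_u16le(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_u32le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: returns 0 and fills *rva / *size with the TLS data
 * directory entry, or -1 if the buffer is not a PE32 image with a TLS slot. */
int find_tls_directory(const uint8_t *image, size_t len,
                       uint32_t *rva, uint32_t *size)
{
    assert(image != NULL && rva != NULL && size != NULL);

    /* DOS header: "MZ" magic, e_lfanew at offset 0x3C */
    if (len < 0x40 || image[0] != 'M' || image[1] != 'Z')
        return -1;

    size_t pe = read_u32le(image + 0x3C);
    if (pe > len || len - pe < PE_SIG_AND_COFF_SIZE + PE32_DATA_DIR_OFFSET)
        return -1;
    if (memcmp(image + pe, "PE\0\0", 4) != 0)
        return -1;

    const uint8_t *opt = image + pe + PE_SIG_AND_COFF_SIZE;
    if (read_u16le(opt) != 0x10B)           /* PE32 only in this sketch */
        return -1;

    /* The TLS slot must exist and lie inside the buffer */
    size_t needed = PE_SIG_AND_COFF_SIZE + PE32_DATA_DIR_OFFSET +
                    8u * (TLS_DIRECTORY_INDEX + 1u);
    if (read_u32le(opt + PE32_NUM_DIRS_OFFSET) <= TLS_DIRECTORY_INDEX ||
        len - pe < needed)
        return -1;

    const uint8_t *entry = opt + PE32_DATA_DIR_OFFSET + 8u * TLS_DIRECTORY_INDEX;
    *rva  = read_u32le(entry);      /* RVA of the IMAGE_TLS_DIRECTORY */
    *size = read_u32le(entry + 4);  /* its declared size */
    return 0;
}
```

The RVA returned here still has to be translated to a file offset through the section table before the IMAGE_TLS_DIRECTORY itself can be read from a file on disk; that step is left out of the sketch.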

During the class I got asked how one would implement the functionality required to get code to run, called from the TLS callbacks. I had a rough idea of how the implementation would go but had never tried implementing it myself. So before the training was over I started to look into it and finally, a few days later, got it to work.

So, in order to put this together the first thing I did was to dig up the definition of the IMAGE_TLS_DIRECTORY structure.


typedef struct _IMAGE_TLS_DIRECTORY {
    UINT32 StartAddressOfRawData;
    UINT32 EndAddressOfRawData;
    PUINT32 AddressOfIndex;
    PIMAGE_TLS_CALLBACK *AddressOfCallBacks;
    UINT32 SizeOfZeroFill;
    UINT32 Characteristics;
} IMAGE_TLS_DIRECTORY, *PIMAGE_TLS_DIRECTORY;



Then I started hacking it together with a hex editor, editing a harmless test PE file so that its TLS directory entry would point to my manually hex-crafted structure.

PE File, TLS construction A

I then added some placeholder code


90          nop
90          nop
90          nop
C2 0C 00    retn 0x0c



which would clean up the stack as a TLS callback is expected to (three 4-byte arguments removed by the callee, hence the retn 0x0c)

typedef void (MODENTRY *PIMAGE_TLS_CALLBACK) ( PTR DllHandle, UINT32 Reason, PTR Reserved );


and pointed to it from the callback list I created. I then pointed the callback field, AddressOfCallBacks, in the TLS structure to my callback list.
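The hand-crafted pieces described above can be sketched as a small C helper that lays out a 32-bit TLS directory, a zero-terminated callback list, a slot for the TLS index and the placeholder stub in one contiguous blob. The layout, the name `build_tls_blob` and its parameters are assumptions made for this illustration, not the bytes that were actually typed into the hex editor. The sketch relies on two properties of the 32-bit format: the address fields hold virtual addresses (image base plus RVA) rather than RVAs, and the callback list ends with a null entry.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout of the blob, as offsets from its start */
enum {
    OFF_TLS_DIRECTORY = 0,   /* IMAGE_TLS_DIRECTORY, 6 x 4 bytes */
    OFF_CALLBACK_LIST = 24,  /* one callback VA + null terminator */
    OFF_INDEX_SLOT    = 32,  /* the loader stores the TLS index here */
    OFF_STUB_CODE     = 36,  /* nop; nop; nop; retn 0x0c */
    TLS_BLOB_SIZE     = 42
};

static void put_u32le(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* Fills `out` with a TLS directory whose single callback is the stub.
 * `blob_rva` is where the blob is assumed to live inside the image; that
 * section needs to be writable because of the index slot. */
void build_tls_blob(uint8_t out[TLS_BLOB_SIZE],
                    uint32_t image_base, uint32_t blob_rva)
{
    static const uint8_t stub[6] = { 0x90, 0x90, 0x90, 0xC2, 0x0C, 0x00 };
    const uint32_t base_va = image_base + blob_rva;

    assert(out != NULL);
    memset(out, 0, TLS_BLOB_SIZE);

    /* StartAddressOfRawData / EndAddressOfRawData (offsets 0 and 4) are
     * left at zero as placeholders; this sketch carries no TLS template. */
    put_u32le(out + 8,  base_va + OFF_INDEX_SLOT);     /* AddressOfIndex */
    put_u32le(out + 12, base_va + OFF_CALLBACK_LIST);  /* AddressOfCallBacks */
    /* SizeOfZeroFill and Characteristics (offsets 16 and 20) stay zero */

    /* Callback list: one entry pointing at the stub, then the terminator */
    put_u32le(out + OFF_CALLBACK_LIST,     base_va + OFF_STUB_CODE);
    put_u32le(out + OFF_CALLBACK_LIST + 4, 0);

    /* The stub pops its three 4-byte arguments on return (retn 0x0c) */
    memcpy(out + OFF_STUB_CODE, stub, sizeof stub);
}
```

With an image base of 0x00400000 and the blob placed at RVA 0x3000, AddressOfCallBacks comes out as 0x00403018 and the single list entry as 0x00403024. The TLS data directory entry then has to point at RVA 0x3000.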

PE File, TLS construction B

And everything should be fine according to the plans... but nope!! And this is where I was stuck for a few days. The best I could do was to get my TLS callback code to run on program unload, but never before the entry point was given control. Puzzling...




I dug out a file with a working TLS. Ilfak wrote a while ago about TLS here and had a nice, small example. (And, by the way, IDA does read and handle TLS just fine and marks them as entry points as well. Very convenient!)

I was set on getting mine working, so I started looking at what his was doing differently from mine (besides him just being sane and not doing it with a hex editor ;-) )

I took a look at all the PE headers but none of the differences seemed to have anything to do with my sample not working.

I started to grow slightly uncomfortable and decided to bring in the artillery. Looking at how the Windows loader (residing in NTDLL.DLL) handles both files, and seeing what keeps my TLS callback from being called, should help. So I brought up BinNavi and traced the execution path of both binaries being loaded by Windows.

The first thing was to trace the execution of Ilfak's example. I wanted to see all the functions visited in the Windows loader as his executable was being loaded. The TLS callbacks would have to be called by one of these.

Ilfak's callgraph trace

I then recorded the execution path of my test executable and took a look at what functions were being visited in both traces. (All the nodes in the following graph are visited by the working example, the green ones are the ones visited by mine, so there's a lot of superfluous code I can skip looking at)

Ilfak's callgraph trace with mine

Eventually I spotted a function called when processing both binaries, _LdrpRunInitializeRoutines, that looked like a good candidate to be the one calling the TLS callbacks, and took a look at the execution traces within that specific function.

In the following graph each node represents a basic block, the red one is where the TLS callbacks are called from. That's the node reached in the working example but not in mine. The green nodes are all the ones visited in the case the execution flow reaches the red basic block. The darker ones are the execution trace of my test. Hence I need to figure out which conditions are diverting the execution flow and how they are related to things I could change in my test program.

Ilfak's code and my trace

Now I could see the common parts of the execution path and a couple of branches that were taken differently. Given the visual output, it's extremely easy to see what branches were different and I could now check what affected the flow.

The TLS callbacks were run immediately after the following condition

The critical branch

Which, tracing it back, comes from an initial check at the beginning of the function

The initial condition

The following article helped me when I was trying to figure out what was going on. According to it, the function _LdrpClearLoadInProgress returns the number of DLLs currently loaded. That's the value that gets assigned to the variable that gets compared to zero and makes the flow of my test program diverge from Ilfak's working example. Therefore TLS callbacks only get run when a certain number of DLLs have already been loaded, and that was the reason my test's callback didn't run on load... I only needed to add one more DLL to the import table for it to work. Fortunately it was easy to spot with BinNavi.
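Since the fix came down to how many DLLs the import table pulls in, a quick way to compare two samples is to count the entries in their import descriptor tables. The sketch below is only an illustration: the name `count_imported_dlls` is made up, and it assumes the caller has already located the raw bytes of the import descriptor table. It relies on each descriptor being 20 bytes long and on the table ending with an all-zero descriptor. It says nothing about which count the loader actually requires.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define IMPORT_DESCRIPTOR_SIZE 20u  /* five 32-bit fields per imported DLL */

/* Hypothetical helper: counts the import descriptors (one per imported DLL)
 * in `table`, stopping at the all-zero terminator or at the end of the
 * buffer, whichever comes first. */
size_t count_imported_dlls(const uint8_t *table, size_t len)
{
    static const uint8_t zero[IMPORT_DESCRIPTOR_SIZE] = { 0 };
    size_t count = 0;

    assert(table != NULL);

    while ((count + 1) * IMPORT_DESCRIPTOR_SIZE <= len) {
        const uint8_t *desc = table + count * IMPORT_DESCRIPTOR_SIZE;
        if (memcmp(desc, zero, IMPORT_DESCRIPTOR_SIZE) == 0)
            break;  /* terminator reached */
        count++;
    }
    return count;
}
```

Running this over the import tables of the working and non-working samples would show whether they differ in the number of imported DLLs, which is the difference that mattered here.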

Thank you to cailin for the proofreading.

8 comments:

Anonymous said...

FYI: virii use TLS' entry points too.

"comparing execution paths..." respect.

Ero Carrera said...

Could you elaborate a bit more on that please?

Ero Carrera said...

According to Symantec, the virus W32.Shrug was the first known to use TLS.

Alex said...

I don't know if you have read this but here's an interesting article discussing the Shrug aka Chifton Virus LINK

Anonymous said...

http://www.amazon.com/Computer-Virus-Research-Defense-Symantec/dp/0321304543/ref=sr_1_1/002-1556081-5220863?ie=UTF8&s=books&qid=1179727152&sr=8-1

using TLS as entry point. just a quick mention - not sure if it is worth to buy the book just to read about it :)

cdman83 said...

Also, FYI, the Windows loader (which IMHO is as braindead as it can be and tries to load anything that even looks like PE without performing validation - probably in the name of "backwards compatibility") executes TLS even if in the Data Directory the size is specified as zero, but IDA doesn't show it in this case (of course you can always patch it with a hex editor ;-)).

Ero Carrera said...

Thanks for the info. I didn't know that the Windows loader was so forgiving in that case.

Anonymous said...

Thanks for sharing this information. I was searching for the explanation of this behavior for 2 days. I thought that my program had the bug.