Monday, August 5, 2013

EBCTF - Binary400 writeup

The challenge was this:
Please upload 5 Windows console executable files with the same MD5 but with different printed outputs (file type: MS Windows, PE32 executable, console)

The output for the files should be:
File1: All Eindbazen are wearing wooden shoes
File2: All Eindbazen live in a windmill
File3: All Eindbazen grow their own tulips
File4: All Eindbazen smoke weed all day
File5: All Eindbazen are cheap bastards
I know of two ways of achieving this but the first one is probably considered cheating:

$ ./file1
All Eindbazen are wearing wooden shoes
$ ./file2
All Eindbazen live in a windmill
$ ./file3
All Eindbazen grow their own tulips
$ ./file4
All Eindbazen smoke weed all day
$ ./file5
All Eindbazen are cheap bastards
$ md5sum file*
291741d9344f70d5b33c52c6e7b041c5  file1
291741d9344f70d5b33c52c6e7b041c5  file2
291741d9344f70d5b33c52c6e7b041c5  file3
291741d9344f70d5b33c52c6e7b041c5  file4
291741d9344f70d5b33c52c6e7b041c5  file5
$ shasum file*
8f69fb6689c44f09422f5c033c97e8978b87c1a2  file1
8f69fb6689c44f09422f5c033c97e8978b87c1a2  file2
8f69fb6689c44f09422f5c033c97e8978b87c1a2  file3
8f69fb6689c44f09422f5c033c97e8978b87c1a2  file4
8f69fb6689c44f09422f5c033c97e8978b87c1a2  file5
Nice huh. But take a look at the code...you'll probably see why this is cheating.

#include <string.h>
#include <stdio.h>
#include <libgen.h>

int main(int argc, char const *argv[]) {
    printf("%s\n",
        strcmp(basename((char*)argv[0]), "file1") == 0 ? "All Eindbazen are wearing wooden shoes" :
        strcmp(basename((char*)argv[0]), "file2") == 0 ? "All Eindbazen live in a windmill" :
        strcmp(basename((char*)argv[0]), "file3") == 0 ? "All Eindbazen grow their own tulips" :
        strcmp(basename((char*)argv[0]), "file4") == 0 ? "All Eindbazen smoke weed all day" :
        strcmp(basename((char*)argv[0]), "file5") == 0 ? "All Eindbazen are cheap bastards" :
                                        "Nope"
    );
    return 0;
}
But fortunately I also have a more correct solution. I am not a cryptologist so the don't expect any details on how to generate md5 collisions, but I will tell you how I got to a solution.

First, MD5 is a state machine. It has an initial state, and for each block of data this state changes. Each time MD5 has a specific state and is given the same block of data, its state will change to the same state.

If two different files have the same MD5 sum you can append the same data to each of then and they will still share the same MD5 sum.

A collision occurs when two different blocks will yield the same state and apparently finding collisions is not that hard. But what can we use that for?

We can make a program something like this:


A1 = "AAAAAAAAAAA"
A2 = "AAAAAAAAAAA"
B1 = "BBBBBBBBBBB"
B2 = "BBBBBBBBBBB"
C1 = "CCCCCCCCCCC"
C2 = "CCCCCCCCCCC"

if (A1 == A2 && B1 == B2 && C1 == C2) {
    print "All Eindbazen are wearing wooden shoes"
} else if (A1 != A2 && B1 == B2 && C1 == C2) {
    print "All Eindbazen live in a windmill"
} else if (A1 == A2 && B1 != B2 && C1 == C2) {
    print "All Eindbazen grow their own tulips"
} else if (A1 != A2 && B1 != B2 && C1 == C2) {
    print "All Eindbazen smoke weed all day"
} else if (A1 == A2 && B1 == B2 && C1 != C2) {
    print "All Eindbazen are cheap bastards"
}
Now, copy this program into five different files and generate three pairs of collisions. In the first program substitute both A1 and A2 with the same block in collision one, B1 and B2 with the same block in collision two, and C1 and C2 with the same block in collision three.
In the second program substitute A1 with the one block of collision one and A2 with the second block of collision one. For B1, B2, C1 and C2 do as in the first program. Now the first and second program will both have the same MD5 sum but exhibit different behaviors. You get the point.
Now, the trick is that the collisions only work for a specific state of the MD5 state machine, so you cannot simply generate three collisions or even use one collision for all three blocks. You have to generate the 'A' collisions first and substitute them in. Then you can generate the 'B' collisions and only then you can generate 'C' collisions.
To do this you can use the Peter Selingers program which can be downloaded from here.
The collisions can be generated faster by using Vlastimil Klimas attack. His code can be gotten from here. It will require some changes however, but you figure it out...that's my challenge to you.
The end result can be seen here:

$ file *
file1.exe: PE32 executable (console) Intel 80386, for MS Windows
file2.exe: PE32 executable (console) Intel 80386, for MS Windows
file3.exe: PE32 executable (console) Intel 80386, for MS Windows
file4.exe: PE32 executable (console) Intel 80386, for MS Windows
file5.exe: PE32 executable (console) Intel 80386, for MS Windows
$ md5sum *
d47eafe3f5ae42e0498ed7769900004c  file1.exe
d47eafe3f5ae42e0498ed7769900004c  file2.exe
d47eafe3f5ae42e0498ed7769900004c  file3.exe
d47eafe3f5ae42e0498ed7769900004c  file4.exe
d47eafe3f5ae42e0498ed7769900004c  file5.exe
$ shasum *
6cd2d31da99b59bc8b94b04d85d0868dc76e7bd0  file1.exe
245a24f647eb5d8fa6e5ff360e42b6f8f3692cfa  file2.exe
1ae718a4a1b8f9906068dd315473aa41927bfe99  file3.exe
0a1c867cf62c35d4283675599d445e29e584c94f  file4.exe
143f0838c8ed0e3800551ad66e01cb1855afa2f3  file5.exe
$ for file in *; do echo "$file: "$(wine $file); done
file1.exe: All Eindbazen are wearing wooden shoes
file2.exe: All Eindbazen live in a windmill
file3.exe: All Eindbazen grow their own tulips
file4.exe: All Eindbazen smoke weed all day
file5.exe: All Eindbazen are cheap bastards
$
You can download my files here.

Monday, July 1, 2013

Understanding Windows shellcode

I learned something new today.

For some time I have tried to understand Skapes paper on Windows shellcode with some success. I had trouble with the finding kernel32.dll using PEB technique and here is why.

The code is this (without Windows 9x version):

push esi
xor  eax, eax
mov  eax, fs:[eax + 0x30] ;Get address of TIB
mov  eax, [eax + 0x0c]    ;Get address of LDR
mov  esi, [eax + 0x1c]    ;Get address of first loaded module descriptor
lodsd                     ;Get address of next loaded module descriptor
mov  eax, [eax + 0x8]     ;Get base address of module (kernel32.dll)
pop  esi
ret

Easy, right ?
Wrong...not if you follow MSDN documentation.

If you google 'PEB_LDR_DATA' the first link you get is http://msdn.microsoft.com/en-us/library/windows/desktop/aa813708(v=vs.85).aspx

So according to Skapes shellcode the addres of the first module descriptor is 0x1c (28) bytes into that structure. But according to MSDN the structure is exactly 28 bytes long...which is a damn lie!

I asked on a SecurtyFocus mailing list called security-basics and got a reply that the MSDN information was wrong!

An accurate definition can be found at undocumented.ntinternals.net.

I drew an illustration (I like visualizing data structures) which helps when following the code:


The FS register points to the Thread Information Block and from there on you can follow the pointer at offset 0x30 to the PEB. Then at offset 0x0c you find a pointer to the PEB loader data.
At offset 0x1c you find the linked list of loaded modules in initialization order. Note that the forward link points INSIDE the LDR module structure and NOT to the base address of the structure. This means that at address zero relative to that forward link you find the next forward link to the kernel32 module structure...still an address inside the structure.
Eight bytes past that you find a pointer to the base address of kernel32.dll.

Tadaaa!!

Monday, April 8, 2013

ASLR and mapped libraries

I learned something new recently.

While working on an exploit for a challenge ('postit-hardened' at http://treasure.pwnies.dk/), I had to find out how libc was mapped in an ASLR environment and I were pleasantly surprised (from an attackers viewpoint).

The service is running in a 32 bit environment, so I started by gathering statistics on how well libc was randomized. I did that with the following C program:

#include <stdio.h>
#include <string.h>

long libc_mapping() {
    long address = 0;
    char buffer[1024] = {0};
    FILE * f = fopen("/proc/self/maps", "r");
    if (f) {
        while (fgets(buffer, sizeof(buffer) - 1, f)) {
            if (strstr(buffer, "libc") && strstr(buffer, "r-xp")) {
                sscanf(buffer, "%lx", &address);
                break;
            }
        }
        fclose(f);
    }
    return address;
}

int main(int argc, char const *argv[]) {
    printf("%ld", libc_mapping());
    return 0;
}
It simply looks up the address where libc is loaded for itself by looking into '/proc/self/maps'. We are interested in the executable mapping.

Together with this shell script we can get some statistics:
#!/bin/bash

comp_addr=0
aslr_bitmap=0

while true; do
    addr=$(./libc_addr)
    if [ "${comp_addr}" == "0" ]; then
        comp_addr=${addr}
    else
        diff=$((comp_addr^addr))
        aslr_bitmap=$((aslr_bitmap|diff))
        printf "%x\n" $aslr_bitmap
    fi
done

It runs the program repeatedly and finds a bitwise difference between the current mapping and the first mapping. Then it writes out all changed bits.

Runing this for a short while we get this mask on my Ubuntu machine: 3ff000

This means that only 10 bits are randomized giving us 1024 possible locations for libc. That is very brute forcable!

Just for fun I tried the same with 64 bits and got this mask: 1fffffff000
That is 29 bits which is more than 500 million possible locations. Much better (from a defender viewpoint).

BUT! Another thing...the service used fork to spawn a new process to handle our connection. How is libc randomized when the process is forked?

Lets see.

I modified the program to fork out and write the address of libc mapped into the child:

#include <stdio.h>
#include <string.h>
#include <unistd.h>

long libc_mapping() {
    long address = 0;
    char buffer[1024] = {0};
    FILE * f = fopen("/proc/self/maps", "r");
    if (f) {
        while (fgets(buffer, sizeof(buffer) - 1, f)) {
            if (strstr(buffer, "libc") && strstr(buffer, "r-xp")) {
                sscanf(buffer, "%lx", &address);
                break;
            }
        }
        fclose(f);
    }
    return address;
}

int main(int argc, char const *argv[]) {
    pid_t pid;
    int i, status;

    printf("Parent: %lx\n", libc_mapping());

    for (i = 1; i <= 10; i++) {
        pid = fork();
        if (pid) {
            waitpid(pid, &status, 0);
        } else {
            printf("Child %d: %lx\n", i, libc_mapping());
            return;
        }
    }

    return 0;
}


This is a sample run:
$ ./forked_libc_addr
Parent: f758a000
Child 1: f758a000
Child 2: f758a000
Child 3: f758a000
Child 4: f758a000
Child 5: f758a000
Child 6: f758a000
Child 7: f758a000
Child 8: f758a000
Child 9: f758a000
Child 10: f758a000

Again I tried the same on a 64 bit machine with the same result:
$ ./forked_libc_addr
Parent: 7f1953193000
Child 1: 7f1953193000
Child 2: 7f1953193000
Child 3: 7f1953193000
Child 4: 7f1953193000
Child 5: 7f1953193000
Child 6: 7f1953193000
Child 7: 7f1953193000
Child 8: 7f1953193000
Child 9: 7f1953193000
Child 10: 7f1953193000
So...when forking libc (and all other mappings) stay.

This makes perfect sense when you think about it. The process is cloned, all memory must be intact. How should the operating system know, if your program have pointers into libc which needs updating?

It cannot, so libraries must stay. This is very interesting especially for my exploit since the program has an info leak vulnerability (as well as some others) but when I have received the info the connection is cut. So I have to make a new connection.

But since the info I want is the location of libc this apparently is not a problem because the info is still valid.

WIN!!!