0%

CSAPP——BombLabⅠ

因为对汇编很感兴趣,加上之后想搞ctf的pwn和reverse方向,我开始读《深入理解计算机系统》(CSAPP),并做配套的lab。读完第三章的大部分内容后,今天完成了Bomb Lab(这是下载地址)的前两个Phase。该文章记录了我的思考过程,并将其重新梳理写成解答。

Phase 1

在开始做Bomb Lab之前,我完全想不到正确的字符串会以怎样的方式呈现。于是我花了一段时间看ELF文件(用readelf命令),直接阅读bomb的十六进制表示(用hexdump),没有找到很多有用的信息,但这让我回忆起了一些Section Header与各个Section的作用。(回过头来想想,在该实验中汇编代码比ELF文件重要。)

之后看了bomb的源码bomb.c:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
/***************************************************************************
* Dr. Evil's Insidious Bomb, Version 1.1
* Copyright 2011, Dr. Evil Incorporated. All rights reserved.
*
* LICENSE:
*
* Dr. Evil Incorporated (the PERPETRATOR) hereby grants you (the
* VICTIM) explicit permission to use this bomb (the BOMB). This is a
* time limited license, which expires on the death of the VICTIM.
* The PERPETRATOR takes no responsibility for damage, frustration,
* insanity, bug-eyes, carpal-tunnel syndrome, loss of sleep, or other
* harm to the VICTIM. Unless the PERPETRATOR wants to take credit,
* that is. The VICTIM may not distribute this bomb source code to
* any enemies of the PERPETRATOR. No VICTIM may debug,
* reverse-engineer, run "strings" on, decompile, decrypt, or use any
* other technique to gain knowledge of and defuse the BOMB. BOMB
* proof clothing may not be worn when handling this program. The
* PERPETRATOR will not apologize for the PERPETRATOR's poor sense of
* humor. This license is null and void where the BOMB is prohibited
* by law.
***************************************************************************/

#include <stdio.h>
#include <stdlib.h>
#include "support.h"
#include "phases.h"

/*
* Note to self: Remember to erase this file so my victims will have no
* idea what is going on, and so they will all blow up in a
* spectaculary fiendish explosion. -- Dr. Evil
*/

FILE *infile;

int main(int argc, char *argv[])
{
char *input;

/* Note to self: remember to port this bomb to Windows and put a
* fantastic GUI on it. */

/* When run with no arguments, the bomb reads its input lines
* from standard input. */
if (argc == 1) {
infile = stdin;
}

/* When run with one argument <file>, the bomb reads from <file>
* until EOF, and then switches to standard input. Thus, as you
* defuse each phase, you can add its defusing string to <file> and
* avoid having to retype it. */
else if (argc == 2) {
if (!(infile = fopen(argv[1], "r"))) {
printf("%s: Error: Couldn't open %s\n", argv[0], argv[1]);
exit(8);
}
}

/* You can't call the bomb with more than 1 command line argument. */
else {
printf("Usage: %s [<input_file>]\n", argv[0]);
exit(8);
}


initialize_bomb();

printf("Welcome to my fiendish little bomb. You have 6 phases with\n");
printf("which to blow yourself up. Have a nice day!\n");

/* Hmm... Six phases must be more secure than one phase! */
input = read_line(); /* Get input */
phase_1(input); /* Run the phase */
phase_defused(); /* Drat! They figured it out!
* Let me know how they did it. */
printf("Phase 1 defused. How about the next one?\n");

/* The second phase is harder. No one will ever figure out
* how to defuse this... */
input = read_line();
phase_2(input);
phase_defused();
printf("That's number 2. Keep going!\n");

/* I guess this is too easy so far. Some more complex code will
* confuse people. */
input = read_line();
phase_3(input);
phase_defused();
printf("Halfway there!\n");

/* Oh yeah? Well, how good is your math? Try on this saucy problem! */
input = read_line();
phase_4(input);
phase_defused();
printf("So you got that one. Try this one.\n");

/* Round and 'round in memory we go, where we stop, the bomb blows! */
input = read_line();
phase_5(input);
phase_defused();
printf("Good work! On to the next...\n");

/* This phase will never be used, since no one will get past the
* earlier ones. But just in case, make this one extra hard. */
input = read_line();
phase_6(input);
phase_defused();

/* Wow, they got it! But isn't something... missing? Perhaps
* something they overlooked? Mua ha ha ha ha! */

return 0;
}

流程大概是读取用户输入,phase函数们判断是否正确,如果正确进入下一个阶段,否则炸弹爆炸,程序退出。不过从最后一段注释看来,还有一些隐藏的东西。

重点在于phase函数们是如何判断的,输入下面的命令查看bomb的反汇编代码:

1
jkilopu@ubuntu:~/Workspace/csapp/Assembly/bomb$ objdump -d bomb

下面是phase_1的反汇编代码:

1
2
3
4
5
6
7
8
9
0000000000400ee0 <phase_1>:
400ee0: 48 83 ec 08 sub $0x8,%rsp
400ee4: be 00 24 40 00 mov $0x402400,%esi
400ee9: e8 4a 04 00 00 callq 401338 <strings_not_equal>
400eee: 85 c0 test %eax,%eax
400ef0: 74 05 je 400ef7 <phase_1+0x17>
400ef2: e8 43 05 00 00 callq 40143a <explode_bomb>
400ef7: 48 83 c4 08 add $0x8,%rsp
400efb: c3 retq

第三行调用了strings_not_equal函数,第四、五行的意思是,若该函数的返回值不等于0,调用explode_bomb函数,这意味着拆除失败。

在调用strings_not_equal函数前,值0x402400被传送至作为第二个参数的寄存器esi,于是根据常识可以推测第一个参数的值为输入的字符串的地址。

不妨假设strings_not_equal人如其名,如果传入的两个字符串相等,则返回0。那接下来的任务就是找到0x402400地址中的字符串了。

下面开始gdb调试,随便输入一串字符后,直接进入phase_1函数:

1
2
3
4
5
6
7
8
9
10
11
(gdb) disassemble 
Dump of assembler code for function phase_1:
=> 0x0000000000400ee0 <+0>: sub $0x8,%rsp
0x0000000000400ee4 <+4>: mov $0x402400,%esi
0x0000000000400ee9 <+9>: callq 0x401338 <strings_not_equal>
0x0000000000400eee <+14>: test %eax,%eax
0x0000000000400ef0 <+16>: je 0x400ef7 <phase_1+23>
0x0000000000400ef2 <+18>: callq 0x40143a <explode_bomb>
0x0000000000400ef7 <+23>: add $0x8,%rsp
0x0000000000400efb <+27>: retq
End of assembler dump.

以字符形式打印0x402400处的值:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
(gdb) p *(char *) 0x402400
$1 = 66 'B'
(gdb) p *(char *) 0x402401
$2 = 111 'o'
(gdb) p *(char *) 0x402402
$3 = 114 'r'
(gdb) p *(char *) 0x402403
$4 = 100 'd'
(gdb) p *(char *) 0x402404
$5 = 101 'e'
(gdb) p *(char *) 0x402405
$6 = 114 'r'
(gdb) p *(char *) 0x402406
$7 = 32 ' '

看起来像是一个单词或一句话。由于字符串常量都在.rodata段,先找到.rodata的位置:

1
2
3
4
5
6
7
8
jkilopu@ubuntu:~/Workspace/csapp/Assembly/bomb$ readelf -S bomb
Section Headers:
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align
···
[15] .rodata PROGBITS 00000000004022b0 000022b0
00000000000004e5 0000000000000000 A 0 0 16
···

看起来.rodata段从文件地址0x22b0开始。打印bomb的全部字节:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
jkilopu@ubuntu:~/Workspace/csapp/Assembly/bomb$ hexdump -C bomb
···
000022b0 01 00 02 00 72 00 25 73 3a 20 45 72 72 6f 72 3a |....r.%s: Error:|
000022c0 20 43 6f 75 6c 64 6e 27 74 20 6f 70 65 6e 20 25 | Couldn't open %|
000022d0 73 0a 00 55 73 61 67 65 3a 20 25 73 20 5b 3c 69 |s..Usage: %s [<i|
000022e0 6e 70 75 74 5f 66 69 6c 65 3e 5d 0a 00 54 68 61 |nput_file>]..Tha|
000022f0 74 27 73 20 6e 75 6d 62 65 72 20 32 2e 20 20 4b |t's number 2. K|
00002300 65 65 70 20 67 6f 69 6e 67 21 00 48 61 6c 66 77 |eep going!.Halfw|
00002310 61 79 20 74 68 65 72 65 21 00 47 6f 6f 64 20 77 |ay there!.Good w|
00002320 6f 72 6b 21 20 20 4f 6e 20 74 6f 20 74 68 65 20 |ork! On to the |
00002330 6e 65 78 74 2e 2e 2e 00 57 65 6c 63 6f 6d 65 20 |next....Welcome |
00002340 74 6f 20 6d 79 20 66 69 65 6e 64 69 73 68 20 6c |to my fiendish l|
00002350 69 74 74 6c 65 20 62 6f 6d 62 2e 20 59 6f 75 20 |ittle bomb. You |
00002360 68 61 76 65 20 36 20 70 68 61 73 65 73 20 77 69 |have 6 phases wi|
00002370 74 68 00 00 00 00 00 00 77 68 69 63 68 20 74 6f |th......which to|
00002380 20 62 6c 6f 77 20 79 6f 75 72 73 65 6c 66 20 75 | blow yourself u|
00002390 70 2e 20 48 61 76 65 20 61 20 6e 69 63 65 20 64 |p. Have a nice d|
000023a0 61 79 21 00 00 00 00 00 50 68 61 73 65 20 31 20 |ay!.....Phase 1 |
000023b0 64 65 66 75 73 65 64 2e 20 48 6f 77 20 61 62 6f |defused. How abo|
000023c0 75 74 20 74 68 65 20 6e 65 78 74 20 6f 6e 65 3f |ut the next one?|
000023d0 00 00 00 00 00 00 00 00 53 6f 20 79 6f 75 20 67 |........So you g|
000023e0 6f 74 20 74 68 61 74 20 6f 6e 65 2e 20 20 54 72 |ot that one. Tr|
000023f0 79 20 74 68 69 73 20 6f 6e 65 2e 00 00 00 00 00 |y this one......|
00002400 42 6f 72 64 65 72 20 72 65 6c 61 74 69 6f 6e 73 |Border relations|
00002410 20 77 69 74 68 20 43 61 6e 61 64 61 20 68 61 76 | with Canada hav|
00002420 65 20 6e 65 76 65 72 20 62 65 65 6e 20 62 65 74 |e never been bet|
00002430 74 65 72 2e 00 00 00 00 57 6f 77 21 20 59 6f 75 |ter.....Wow! You|
00002440 27 76 65 20 64 65 66 75 73 65 64 20 74 68 65 20 |'ve defused the |
00002450 73 65 63 72 65 74 20 73 74 61 67 65 21 00 66 6c |secret stage!.fl|
00002460 79 65 72 73 00 00 00 00 00 00 00 00 00 00 00 00 |yers............|
···

在这一段的倒数第七行,地址为0x2400的位置,找到了目标字符串:”Border relations with Canada have never been better.”。注意字符串的结尾以’\0’为标志。

输入后通过Phase 1:

1
2
3
4
5
jkilopu@ubuntu:~/Workspace/csapp/Assembly/bomb$ ./bomb 
Welcome to my fiendish little bomb. You have 6 phases with
which to blow yourself up. Have a nice day!
Border relations with Canada have never been better.
Phase 1 defused. How about the next one?

Phase 2

phase_2函数的反汇编代码:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
0000000000400efc <phase_2>:
400efc: 55 push %rbp
400efd: 53 push %rbx
400efe: 48 83 ec 28 sub $0x28,%rsp
400f02: 48 89 e6 mov %rsp,%rsi
400f05: e8 52 05 00 00 callq 40145c <read_six_numbers>
400f0a: 83 3c 24 01 cmpl $0x1,(%rsp)
400f0e: 74 20 je 400f30 <phase_2+0x34>
400f10: e8 25 05 00 00 callq 40143a <explode_bomb>
400f15: eb 19 jmp 400f30 <phase_2+0x34>
400f17: 8b 43 fc mov -0x4(%rbx),%eax
400f1a: 01 c0 add %eax,%eax
400f1c: 39 03 cmp %eax,(%rbx)
400f1e: 74 05 je 400f25 <phase_2+0x29>
400f20: e8 15 05 00 00 callq 40143a <explode_bomb>
400f25: 48 83 c3 04 add $0x4,%rbx
400f29: 48 39 eb cmp %rbp,%rbx
400f2c: 75 e9 jne 400f17 <phase_2+0x1b>
400f2e: eb 0c jmp 400f3c <phase_2+0x40>
400f30: 48 8d 5c 24 04 lea 0x4(%rsp),%rbx
400f35: 48 8d 6c 24 18 lea 0x18(%rsp),%rbp
400f3a: eb db jmp 400f17 <phase_2+0x1b>
400f3c: 48 83 c4 28 add $0x28,%rsp
400f40: 5b pop %rbx
400f41: 5d pop %rbp
400f42: c3 retq

在第四行,空出了一片很大的栈空间。再看第五、六行,推测创建了一个长度为6的数组,它的首地址被作为参数传入 read_six_numbers函数中。

再来看read_six_numbers函数的反汇编代码:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
000000000040145c <read_six_numbers>:
40145c: 48 83 ec 18 sub $0x18,%rsp
401460: 48 89 f2 mov %rsi,%rdx
401463: 48 8d 4e 04 lea 0x4(%rsi),%rcx
401467: 48 8d 46 14 lea 0x14(%rsi),%rax
40146b: 48 89 44 24 08 mov %rax,0x8(%rsp)
401470: 48 8d 46 10 lea 0x10(%rsi),%rax
401474: 48 89 04 24 mov %rax,(%rsp)
401478: 4c 8d 4e 0c lea 0xc(%rsi),%r9
40147c: 4c 8d 46 08 lea 0x8(%rsi),%r8
401480: be c3 25 40 00 mov $0x4025c3,%esi
401485: b8 00 00 00 00 mov $0x0,%eax
40148a: e8 61 f7 ff ff callq 400bf0 <__isoc99_sscanf@plt>
40148f: 83 f8 05 cmp $0x5,%eax
401492: 7f 05 jg 401499 <read_six_numbers+0x3d>
401494: e8 a1 ff ff ff callq 40143a <explode_bomb>
401499: 48 83 c4 18 add $0x18,%rsp
40149d: c3 retq

可以看到,在x86-64平台上用于传递参数的6个寄存器:%rdi、%rsi、%rdx、%rcx、%r8、%r9都被使用,因此有两个参数只能存放在该函数的栈空间中(对应的操作为三至七行)。结合之前的分析strings_not_equal函数的经验,猜测rdi寄存器保存输入字符串的首地址,rsi寄存器保存phase_2传入数组的首地址,其余6个储存位置保存数组中每个元素的地址。

然后调用sscanf函数,sscanf读取格式化的字符串中的数据,下面是其函数声明:

1
int sscanf(const char *str, const char *format, ...)

重点在于format参数,根据第十行的地址0x4025c3找出字符串:”%d %d %d %d %d %d”,验证了之前的猜测。还要注意的是sscanf的返回值(成功匹配和赋值的个数)必须大于5。

了解read_six_numbers函数如何解析输入的字符串后,开始调试,输入字符串”1 2 3 4 5 6”。

1
2
Phase 1 defused. How about the next one?
1 2 3 4 5 6

退出read_six_numbers函数后,检查一下6个局部变量的值:

1
2
3
(gdb) x /6x $rsp
0x7fffffffdf10: 0x00000001 0x00000002 0x00000003 0x00000004
0x7fffffffdf20: 0x00000005 0x00000006

嗯,看起来没问题。接下来是一堆比较和跳转指令,大概浏览后推测数组中的元素必须满足某种关系,否则炸弹爆炸。

稍加思索后得出规律如下(假设数组名为a):

  1. a[0] = 1
  2. a[i - 1] = 2 * a[i] (i >= 1)

得出答案,通过Phase 2:

1
2
1 2 4 8 16 32 awewa
That's number 2. Keep going!

因为这里的sscanf函数只会匹配前六个整数,所以后面怎么乱搞也没问题。


事后总结

感受

  1. 刚开始时很茫然,没信心,一步步做下去后感觉就来了。
  2. 看到乱糟糟的ELF文件和反汇编代码后,继续做的意愿就减弱了…
  3. 缺乏经验。

知识点

  1. 看了这篇文章我才知道,在gdb中输入下面的命令就能打印首地址为0x402400的字符串:
1
2
(gdb) x /s 0x402400
0x402400: "Border relations with Canada have never been better."