The buffer overflow has long been one of the most commonly exploited vulnerabilities for gaining access to or control of a system. The standard form, explained very well in destrius' writeup at that node (read it first), has the attacker submit as input a string consisting of machine code at the beginning and, at the end, the address at which that machine code is stored, such that this address overwrites the return address stored at the top of the stack frame. This way, when the function returns, it actually jumps to the beginning of the buffer, executing the (arbitrary) machine code that the attacker placed there. Usually this is code to execute a shell. Pictorially, where each line represents four bytes, and the function declares char buffer[6] somewhere:
   pre attack:                                  post attack:
|   argument 2   |                            |   argument 2   |
|   argument 1   |                            |   argument 1   |
| return address |                            | buffer address |
|   saved ebp    | <-- ebp points here    --> |     filler     |
|  random stuff  |                            |     filler     |
|  more random   |                            |     filler     |
|       b        |                            |     filler     |
|       u        |                            |     filler     |
|       f        |                            |    machine     |
|       f        |                            |      code      |
|       e        |                            |       to       |
|       r        | <-- buffer points here --> |    exploit     |
|   more junk    | <-- esp points here    --> |   more junk    |
Note that addresses increase toward the top of the picture, while the stack grows downward. Now, when the function returns, it returns to the address of buffer, which contains machine code that starts a shell or whatever else the attacker chose.
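
To make the picture concrete, here is a minimal sketch in Python that models the frame as a list of four-byte words, lowest address first. It is a toy model, not an exploit: the names build_frame, unchecked_copy and classic_payload are invented for illustration, and the word counts are simply read off the diagram.

```python
# Toy model of the stack frame in the diagram. Index 0 is the lowest
# address (the start of the buffer). All names here are hypothetical.
WORDS_IN_BUFFER = 6   # the b/u/f/f/e/r rows
WORDS_OF_LOCALS = 2   # "more random" and "random stuff"


def build_frame():
    """Return the pre-attack frame as a list of words, lowest address first."""
    return (["buffer"] * WORDS_IN_BUFFER
            + ["local"] * WORDS_OF_LOCALS
            + ["saved ebp", "return address", "argument 1", "argument 2"])


def unchecked_copy(frame, data):
    """Copy data into the frame from the start of the buffer, with no length check."""
    for i, word in enumerate(data):
        frame[i] = word
    return frame


def classic_payload(buffer_address, code_words=4):
    """Machine code first, then filler, then the word that replaces the return address."""
    words_before_return = WORDS_IN_BUFFER + WORDS_OF_LOCALS + 1  # +1 for saved ebp
    filler = ["filler"] * (words_before_return - code_words)
    return ["machine code"] * code_words + filler + [buffer_address]


frame = unchecked_copy(build_frame(), classic_payload("buffer address"))
for word in reversed(frame):  # print highest address first, like the diagram
    print(word)
```

The copy stops exactly one word past the saved ebp, so the arguments above the return address are left untouched, as in the post-attack column.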
The thing is, in most programs, code and data (including the stack) are kept in entirely different places. Recognizing this, CPU designers added the ability to mark regions of memory as NX, or non-executable. The idea is to mark the memory that holds the stack as NX, so that when someone tries a buffer-overflow attack, the system refuses to execute the injected machine code.
And this is where a return-to-libc attack comes in. libc is the standard C library, which contains functions like printf() or, more to the point here, system(). Obviously, libc can't be marked NX, since it contains functions that are meant to be executed. The system() function takes one argument, a command string, which it hands to the shell (/bin/sh -c) to run in a new process; it returns when that process completes. So now we just change our attack to overwrite the return address with the address of system(), and somehow pass it a string that looks like "/bin/bash"! Pictorially, again:
   pre attack:                                     post attack:
|   argument 2   |                            |  pointer to string  |
|   argument 1   |                            |       filler2       |
| return address |                            | address of system() |
|   saved ebp    | <-- ebp points here    --> |       filler1       |
|  random stuff  |                            |       filler        |
|  more random   |                            |       filler        |
|       b        |                            |       filler        |
|       u        |                            |       filler        |
|       f        |                            |       filler        |
|       f        |                            |       filler        |
|       e        |                            |       filler        |
|       r        | <-- buffer points here --> |       filler        |
|   more junk    | <-- esp points here    --> |      more junk      |
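
The difference between the two pictures comes down to where the overwritten return address points. A rough sketch of the NX check in Python, assuming a made-up table of memory regions, shows why one target is refused and the other is not. Every address and name below is hypothetical, and a real system enforces this in hardware page tables rather than in a lookup like this.

```python
# Toy model of NX: each region is (start, end, executable).
# All addresses are hypothetical and chosen only for illustration.
REGIONS = {
    "stack": (0xBFFF0000, 0xC0000000, False),  # marked NX
    "libc":  (0x40000000, 0x40100000, True),   # code, must stay executable
}


def can_execute(address, regions=REGIONS):
    """Return True if address lies in a region that is allowed to execute."""
    for start, end, executable in regions.values():
        if start <= address < end:
            return executable
    return False  # unmapped memory is not executable either


BUFFER_ADDRESS = 0xBFFFF000  # hypothetical: somewhere on the stack
SYSTEM_ADDRESS = 0x40058000  # hypothetical: somewhere inside libc

print("return into buffer allowed:  ", can_execute(BUFFER_ADDRESS))
print("return into system() allowed:", can_execute(SYSTEM_ADDRESS))
```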
Let's delve into a little detail of what happens here. The last few instructions of a function that uses a frame pointer are nearly always the same. First, we move ebp into esp:
| pointer to string |
| filler2 |
| address of system() |
| filler1 | <-- ebp and esp both point here
Then, we pop the "saved ebp" back into ebp:
| pointer to string |
| filler2 |
| address of system() | <-- esp points here
Now ebp holds filler1, which points off into neverland. Then we execute ret, which essentially pops the address off the top of the stack into the instruction pointer:
| pointer to string |
| filler2 | <-- esp points here
Now we're executing at the top of system(), which expects the value at the top of the stack to be the return address of its caller, and the value above that to be its first argument, i.e. the address of the string we want it to execute. It pushes ebp, moves esp into ebp, and starts executing. Perfect.
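
The three steps above can be traced with a small Python simulation. This is a sketch only: memory is a dictionary from word addresses to labels, the frame address 0x1000 is an arbitrary assumption, and simulate_epilogue is a name invented here.

```python
WORD = 4  # bytes per stack slot on 32-bit x86


def simulate_epilogue(memory, ebp):
    """Run `mov esp, ebp; pop ebp; ret` on a toy memory.

    memory maps word addresses to values; ebp is the frame pointer on entry.
    Returns (eip, esp, ebp) after the three instructions.
    """
    esp = ebp            # mov esp, ebp
    ebp = memory[esp]    # pop ebp: load the saved value...
    esp += WORD          # ...and move esp up one word
    eip = memory[esp]    # ret: load the return target...
    esp += WORD          # ...and move esp up one word
    return eip, esp, ebp


FRAME = 0x1000  # hypothetical address of the "saved ebp" slot
memory = {
    FRAME:            "filler1",
    FRAME + WORD:     "address of system()",
    FRAME + 2 * WORD: "filler2",
    FRAME + 3 * WORD: "pointer to string",
}

eip, esp, ebp = simulate_epilogue(memory, FRAME)
print("eip:", eip)
print("ebp:", ebp)
print("system() sees return address:", memory[esp])
print("system() sees first argument:", memory[esp + WORD])
```

After the simulated ret, esp points at filler2, so that word plays the role of the return address for system() and the word above it plays the role of its argument.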
There are a few additional complications. First, where do we find a convenient string to execute? Fortunately, on UNIX, a program normally inherits a SHELL environment variable, whose value is the path of the user's shell, and the environment is stored in the program's own memory. So we can use a pointer to this string. Second, when we exit our shell, system() will return, and it will return to the non-address filler2, probably resulting in a segfault, which may tip off the admin that something is amiss. A better idea is to substitute the address of exit() for filler2, so the program will at least exit cleanly.
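
One detail worth spelling out about the SHELL trick: environment entries are stored as NUL-terminated "NAME=value" strings, so the pointer has to land just past the "SHELL=" prefix rather than at the start of the entry. A minimal sketch of that offset arithmetic in Python follows; value_offset and command_at are names invented for illustration, and the sample entry is an assumed value.

```python
def value_offset(entry, name=b"SHELL"):
    """Return the offset of the value inside a NAME=value environment entry."""
    prefix = name + b"="
    if not entry.startswith(prefix):
        raise ValueError("entry does not start with " + prefix.decode() + "")
    return len(prefix)


def command_at(entry, offset):
    """Return the C string that starts at offset (everything up to the NUL)."""
    end = entry.index(b"\x00", offset)
    return entry[offset:end]


entry = b"SHELL=/bin/bash\x00"  # assumed example value
offset = value_offset(entry)
print("offset of value:", offset)
print("string system() would receive:", command_at(entry, offset))
```

Pointing at the start of the entry instead would hand system() the whole "SHELL=/bin/bash" text, which the shell would treat as a variable assignment rather than as a command to run.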