Malware and Viruses
How does malware actually get into a system? Often by exploiting a software vulnerability. One of the oldest and best-known vulnerability classes is the buffer overflow.
1. Anatomy of a Buffer Overflow (Stack Smashing)
Programs store local variables and the Return Address (where to go after a function finishes) on the Stack.
If a program blindly copies user input into a small buffer without checking the size:
- The input fills the buffer.
- The input overflows into the neighboring memory.
- The input overwrites the Return Address.
- When the function returns, the CPU jumps to the address the attacker supplied (usually one that points to malicious shellcode).
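2. Stack Smashing Visualizer
Consider what happens when 12 bytes of data are written into an 8-byte buffer. Before the write, the stack looks like this:
Stack Memory
(High Addr)  Ret Addr: 0x4005
(Low Addr)   Buffer (8 bytes): [EMPTY]
In this simplified layout, the first 8 bytes of input fill the buffer and the remaining 4 bytes spill upward into the Return Address.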
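The 12-bytes-into-8 scenario can be reproduced as a small simulation. This is only a sketch under stated assumptions: the stack frame is modeled as a flat byte array, the saved return address is taken to be 4 bytes wide and to sit directly above the buffer, and the names frame, newFrame, retAddr, and unsafeWrite are hypothetical. Real stack layouts vary by compiler and architecture.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

const (
	bufSize = 8 // size of the local buffer, as in the text
	retSize = 4 // assumed width of the saved return address in this toy model
)

// frame models one stack frame as a flat byte array:
// bytes [0, 8) are the buffer, bytes [8, 12) hold the return address.
type frame [bufSize + retSize]byte

// newFrame returns a frame with an empty buffer and the given return address.
func newFrame(ret uint32) frame {
	var f frame
	binary.LittleEndian.PutUint32(f[bufSize:], ret)
	return f
}

// retAddr reads the saved return address back out of the frame.
func (f *frame) retAddr() uint32 {
	return binary.LittleEndian.Uint32(f[bufSize:])
}

// unsafeWrite imitates strcpy: it writes from the start of the buffer and
// ignores where the buffer ends. The write is still confined to the frame,
// so the simulation itself stays memory-safe.
func (f *frame) unsafeWrite(input []byte) {
	copy(f[:], input)
}

func main() {
	f := newFrame(0x4005)
	fmt.Printf("before: ret = %#x\n", f.retAddr())

	// 8 filler bytes, then 4 attacker-chosen bytes (0xdeadbeef, little-endian).
	input := []byte("AAAAAAAA\xef\xbe\xad\xde")
	f.unsafeWrite(input)
	fmt.Printf("after:  ret = %#x\n", f.retAddr())
}
```

With these assumptions, the return address starts as 0x4005 and ends as 0xdeadbeef: the last 4 input bytes landed exactly where the return address was stored.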
3. Malware Types
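- Virus: Attaches to a host file and spreads when a user runs that file.
- Worm: Self-replicating program that spreads across networks with no human action.
- Ransomware: Encrypts the victim's files and demands payment, often in a cryptocurrency such as Bitcoin, for the decryption key.
- Rootkit: Hides its own presence by subverting the operating system (kernel-mode rootkits modify the kernel itself), so it does not appear in tools such as Task Manager.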
4. Code Example: Unsafe vs Safe Memory
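C does not check array bounds, so an oversized copy silently corrupts memory. Memory-safe languages such as Go and Java check bounds at run time, so the same mistake cannot overwrite the stack.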
C
#include <stdio.h>
#include <string.h>

void vulnerable_function(char *input) {
    char buffer[8];
    // DANGER: strcpy does not check size!
    // It copies the string plus its NUL terminator, so any input of
    // 8 or more characters writes past the end of buffer.
    strcpy(buffer, input);
}

int main() {
    vulnerable_function("ThisStringIsTooLong");
    return 0;
}

Go
package main

import "fmt"

func main() {
    // Arrays have a fixed size; slices track their length.
    var buffer [8]byte
    input := []byte("ThisStringIsTooLong")

    // copy writes min(len(dst), len(src)) bytes,
    // so it can never run past the end of buffer.
    copied := copy(buffer[:], input)
    fmt.Printf("Copied %d bytes: %s\n", copied, buffer[:copied])
    // Output: Copied 8 bytes: ThisStri
    // No crash. No memory corruption.
    // Note that the input is silently truncated; compare len(input)
    // with len(buffer) first if truncation is not acceptable.
}
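The copy builtin is not the only safeguard. Go also bounds-checks direct indexing, so a write one past the end of the buffer stops the operation with a runtime panic instead of touching neighboring memory. The sketch below illustrates this; writeAt is a hypothetical helper that converts the panic into an error so both cases can be shown in one run.

```go
package main

import "fmt"

// writeAt is a hypothetical helper that stores b at buf[i] and reports
// any runtime panic as an error instead of crashing the program.
func writeAt(buf []byte, i int, b byte) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("write rejected: %v", r)
		}
	}()
	buf[i] = b // bounds-checked by the Go runtime
	return nil
}

func main() {
	var buffer [8]byte
	fmt.Println(writeAt(buffer[:], 7, 'A')) // last valid index: succeeds
	fmt.Println(writeAt(buffer[:], 8, 'B')) // one past the end: rejected
}
```

The first call returns nil; the second returns an error describing an out-of-range index, and the buffer's neighbors are never written. The equivalent C statement, buffer[8] = 'B', would compile and write into adjacent stack memory.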
5. Defenses
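- NX Bit (No-Execute): Marks the Stack as “Data Only”. The CPU refuses to execute instructions stored there, so shellcode injected onto the stack cannot run.
- ASLR (Address Space Layout Randomization): Randomizes where the stack, heap, and libraries are placed in memory, so attackers cannot reliably predict the address of their shellcode.
- Canaries: The compiler places a random secret value between the local buffers and the Return Address. An overflow that reaches the Return Address must overwrite the canary first. The function checks the canary before it returns, and if the canary is dead (changed), the program aborts.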
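Of the three defenses, the canary check is the easiest to illustrate in a few lines. The sketch below is a toy model, not how a compiler implements it: the frame is assumed to be laid out as buffer, then canary, then return address, each field width is an arbitrary choice, and guardedFrame, newGuardedFrame, unsafeWrite, ret, and errStackSmashed are hypothetical names.

```go
package main

import (
	"bytes"
	"crypto/rand"
	"errors"
	"fmt"
)

const (
	bufSize    = 8 // local buffer
	canarySize = 4 // assumed canary width in this toy model
	retSize    = 4 // assumed return address width in this toy model
)

// guardedFrame is a toy stack frame laid out low-to-high as
// buffer | canary | return address.
type guardedFrame struct {
	mem    [bufSize + canarySize + retSize]byte
	canary [canarySize]byte // reference copy kept outside the frame
}

var errStackSmashed = errors.New("stack smashing detected")

// newGuardedFrame picks a random canary and places it above the buffer.
func newGuardedFrame() (*guardedFrame, error) {
	f := &guardedFrame{}
	if _, err := rand.Read(f.canary[:]); err != nil {
		return nil, err
	}
	copy(f.mem[bufSize:], f.canary[:])
	return f, nil
}

// unsafeWrite imitates an unchecked copy that starts at the buffer
// and ignores where the buffer ends.
func (f *guardedFrame) unsafeWrite(input []byte) {
	copy(f.mem[:], input)
}

// ret mimics the function epilogue: verify the canary before the
// return address would be used.
func (f *guardedFrame) ret() error {
	if !bytes.Equal(f.mem[bufSize:bufSize+canarySize], f.canary[:]) {
		return errStackSmashed
	}
	return nil
}

func main() {
	f, err := newGuardedFrame()
	if err != nil {
		panic(err)
	}
	// 16 bytes overrun the buffer, the canary, and the return address.
	f.unsafeWrite([]byte("AAAAAAAAAAAAAAAA"))
	fmt.Println(f.ret())
}
```

Because the attacker does not know the random canary, the overflow replaces it with the wrong bytes and ret reports the corruption before the overwritten return address is ever used. An input of 8 bytes or fewer leaves the canary untouched and ret returns nil.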