- Name: El Mehdi Hamte - MrFall
- Date: 26-11-2026
This project implements file compression and decompression using the Huffman algorithm in the C programming language.
The goal is to minimize the number of bits needed to represent characters by assigning shorter codes to frequent characters and longer codes to rare characters.
We count how many times each character appears in the input file.
- Leaves = characters and their frequencies
- Internal nodes = sum of children
- Always merge the two smallest nodes first using a Min-Heap
Each path from root to leaf generates a code:
- Left → 0
- Right → 1
- Replace each character with its Huffman code
- Bits are grouped into bytes
- Last byte may contain padding → stored in a
.metafile
- Rebuild the Huffman tree from
codes.map - Decode the compressed bit stream
- Remove padding
- Restore the original text exactly
| File | Description |
|---|---|
huffman.h |
Node structure + Huffman function prototypes |
huffman.c |
Huffman tree creation, code generation, encode/decode logic |
minheap.h |
Heap structure and prototypes |
minheap.c |
Min-Heap implementation (insert, extract-min, build-heap) |
main.c |
CLI interface: encode and decode commands |
Makefile |
Automates compilation (make) |
echo "ABRACADABRA" > input.txtgcc main.c huffman.c minheap.c -o huff(or simply)
make./huff encode input.txt compressed.huf codes.mapThis generates:
compressed.huf→ compressed binary filecodes.map→ Huffman codes usedcompressed.huf.meta→ padding info
./huff decode compressed.huf compressed.huf.meta codes.map mrfall.txtcat mrfall.txtExpected:
ABRACADABRA
| File | Original Size | Compressed Size | Compression Ratio |
|---|---|---|---|
input.txt (ABRACADABRA) |
11 bytes | 3 bytes | 72% saved |
Compression becomes more effective with larger and more repetitive input data.
Learn Huffman Coding visually:
