- January 16, 2026
Jonas Merhej
This blog post is inspired by one of Daniel Cumming’s lessons at Rektoff. Specifically, we will delve into one of the buffer overflow examples from that lesson and broaden our examination of how different compiler targets can influence Rust's behavior.
Rust is renowned for its compile-time checks that largely eliminate common memory errors. However, the unsafe keyword offers an escape hatch, allowing developers to perform low-level operations that are not covered by Rust’s usual safety guarantees. Unsafe Rust allows five actions that safe Rust does not:
- Dereference a raw pointer
- Call an unsafe function or method
- Access or modify a mutable static variable
- Implement an unsafe trait
- Access fields of a union
This power comes with significant responsibility: by using unsafe, the developer acknowledges that the compiler cannot guarantee memory safety for the operations inside, and takes on the responsibility of ensuring that the code cannot cause undefined behavior. The classic "buffer overflow" is a good example of what can go wrong.
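As a minimal sketch of the five actions listed above, the following snippet exercises each one in a way that is intended to be sound. Every name in it (CALLS, IntOrBytes, AllZeroIsValid, read_byte, five_unsafe_actions) is hypothetical and exists only for illustration.

```rust
// Hypothetical names throughout; each unsafe use is written to be sound.

// A mutable static: any access to it requires `unsafe`.
static mut CALLS: u32 = 0;

// A union: reading a field requires `unsafe`.
union IntOrBytes {
    int: u32,
    bytes: [u8; 4],
}

// An unsafe trait: implementors promise something the compiler cannot check.
unsafe trait AllZeroIsValid {}
unsafe impl AllZeroIsValid for u32 {} // action 4: implement an unsafe trait

/// Caller must pass a pointer that is valid for reading one byte.
unsafe fn read_byte(ptr: *const u8) -> u8 {
    unsafe { *ptr }
}

fn five_unsafe_actions() -> (u8, u8, u32, [u8; 4]) {
    let data = [10u8, 20, 30];
    let ptr = data.as_ptr();
    let value = IntOrBytes { int: 1 };
    unsafe {
        let first = *ptr; // action 1: dereference a raw pointer
        let second = read_byte(ptr.add(1)); // action 2: call unsafe functions
        CALLS = CALLS + 1; // action 3: modify a mutable static
        let calls = CALLS; // action 3: read a mutable static
        let bytes = value.bytes; // action 5: access a union field
        (first, second, calls, bytes)
    }
}
```

Calling five_unsafe_actions() once returns the first two bytes of the array, the call counter, and the bytes of the integer 1 in the target's native byte order. Every pointer stays inside the 3-byte array, which is exactly the property the overflow example later in this post gives up.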
Buffer Overflow and Undefined Behavior
A buffer overflow occurs when a program attempts to write data beyond the allocated boundaries of a fixed-size memory buffer. This can lead to overwriting adjacent memory locations, potentially corrupting other data or even program instructions. The consequences are often unpredictable, ranging from program crashes to subtle, hard-to-trace bugs, or even enabling malicious code execution.
Consider this simplified example demonstrating a buffer overflow on the stack:
fn main() {
    let mut buffer = [0u8; 5]; // 5-byte buffer on the stack
    let not_in_buffer = 56789; // Another variable on the stack
    unsafe {
        let ptr = buffer.as_mut_ptr();
        // 🚨 UB: Writing 6 bytes into a 5-byte buffer.
        for i in 0..6 {
            *ptr.add(i) = i as u8;
        }
    }
    println!("buffer: {:?}", buffer);
    // Will `not_in_buffer` still be 56789?
    println!("not_in_buffer: {}", not_in_buffer);
}
Figure 1: Example Rust program that intentionally overruns a 5-byte stack buffer to illustrate undefined behavior (writes 6 bytes). The following figures walk through what the writes may corrupt.
In this code, a 5-byte buffer and an i32 variable not_in_buffer are declared. The for loop attempts to write 6 bytes (from 0 to 5) into the 5-byte buffer.
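For contrast, here is a minimal sketch of the same loop written in safe Rust. The function name fill_checked is hypothetical; it relies on the standard slice method get_mut, which returns None instead of touching memory outside the slice.

```rust
/// Writes `i as u8` into `buffer[i]` for `i` in `0..count`.
/// Returns `Err(i)` with the first index that falls outside the buffer,
/// leaving everything beyond the buffer untouched.
fn fill_checked(buffer: &mut [u8], count: usize) -> Result<(), usize> {
    for i in 0..count {
        match buffer.get_mut(i) {
            Some(slot) => *slot = i as u8,
            None => return Err(i), // out of bounds: stop instead of overflowing
        }
    }
    Ok(())
}
```

With a 5-byte buffer and count = 6, this version fills the five valid slots and then reports index 5 as the first out-of-bounds write. Plain indexing (buffer[i]) would panic at the same point rather than return an error.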
First, let’s get more information about our host architecture:
# rustc -vV
rustc 1.89.0 (29483883e 2025-08-04)
binary: rustc
commit-hash: 29483883eed69d5fb4db01964cdf2af4d86e9cb2
commit-date: 2025-08-04
host: aarch64-unknown-linux-gnu
release: 1.89.0
LLVM version: 20.1.7
Figure 2: Host architecture output used to show the current target (machine/arch). This affects ABI, alignment and endianness.
We’re on a 64-bit Linux system running on the moby/buildkit Docker Image. Due to memory alignment and padding on this specific target, the not_in_buffer variable might not be immediately adjacent to the buffer on the stack. This means the overflow might write into padding bytes or other seemingly “empty” spaces, leading to the program appearing to function correctly even though an overflow has occurred. The printed value of not_in_buffer could remain unchanged.
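One way to probe this on your own machine is to print the size, alignment, and addresses of the two locals. The sketch below is only an illustration: describe_locals is a hypothetical helper, and the address distance it reports is not guaranteed by Rust and can change between targets, optimization levels, and compiler versions. Taking the addresses may itself influence where the compiler places the variables.

```rust
use std::mem::{align_of, size_of};

/// Returns (buffer size, buffer alignment, i32 size, i32 alignment,
/// address of `not_in_buffer` minus address of `buffer`).
/// The last value is an observation about one particular build, not a guarantee.
fn describe_locals() -> (usize, usize, usize, usize, isize) {
    let buffer = [0u8; 5];
    let not_in_buffer: i32 = 56789;

    // Observing addresses is safe; only dereferencing raw pointers is not.
    let buffer_addr = buffer.as_ptr() as usize;
    let var_addr = &not_in_buffer as *const i32 as usize;

    (
        size_of::<[u8; 5]>(),
        align_of::<[u8; 5]>(),
        size_of::<i32>(),
        align_of::<i32>(),
        var_addr as isize - buffer_addr as isize,
    )
}
```

The sizes are fixed by the types, while the distance between the two variables is whatever the compiler chose for this build. A negative distance would mean not_in_buffer sits below buffer on the stack, in which case a forward overflow would never reach it at all.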
Currently, this is how the stack looks before executing the loop:
+--------+--------+--------+--------+--------+--------+--------+--------+
| 1 byte | 1 byte | 1 byte | 1 byte | 1 byte |--------|--------|--------|
+--------+--------+--------+--------+--------+--------+--------+--------+
|              4 bytes              |--------|--------|--------|--------|
+--------+--------+--------+--------+--------+--------+--------+--------+
Figure 3: Visual representation of the stack layout before the loop runs — a 5-byte buffer followed by not_in_buffer (4 bytes) separated by padding on this 64-bit target.
Let’s compile and run the program using rustc and inspect the results:
# rustc main.rs
# ./main
buffer: [0, 1, 2, 3, 4]
not_in_buffer: 56789
Figure 4: Program output showing that, on this layout, the initial overflow did not visibly alter not_in_buffer (value remains 56789).
As expected, due to the memory layout, we didn’t see any sign of the overflow in the output. However, this does not mean that an overflow didn’t happen. The memory layout now looks as follows:
+--------+--------+--------+--------+--------+--------+--------+--------+
| 1 byte | 1 byte | 1 byte | 1 byte | 1 byte | 1 byte |--------|--------|
+--------+--------+--------+--------+--------+--------+--------+--------+
|              4 bytes              |--------|--------|--------|--------|
+--------+--------+--------+--------+--------+--------+--------+--------+
Figure 5: Memory layout after the first overflow writes (6 bytes); the sixth byte landed in the padding region on this 64-bit target.
However, if the loop’s upper bound is increased (e.g., to 9 or higher), the impact becomes visible. The overflow can then reach and corrupt the not_in_buffer variable, changing its value in an unpredictable way. Let’s write 9 bytes, i.e., 4 bytes past the end of the buffer:
-        for i in 0..6 {
+        for i in 0..9 {
             *ptr.add(i) = i as u8;
         }
Figure 6: Diff showing the loop bound increased from 0..6 to 0..9 so the writes reach into adjacent memory and will eventually overwrite not_in_buffer.
If we compile and run the program again, we get:
# rustc main.rs
# ./main
buffer: [0, 1, 2, 3, 4]
not_in_buffer: 56584
Figure 7: Program output after increasing the loop bound; not_in_buffer has been corrupted and now prints 56584.
Our value not_in_buffer changed from 56789 to 56584. What happened here?
| Loop i | Memory Offset from buffer Start | Action | Value Written (Hex) | Memory Contents of not_in_buffer |
|---|---|---|---|---|
| – | – | – | – | [ D5, DD, 00, 00 ] |
| 0–4 | +0 to +4 | Writes to buffer | 0x00 to 0x04 | [ D5, DD, 00, 00 ] |
| 5–7 | +5 to +7 | Writes to padding | 0x05 to 0x07 | [ D5, DD, 00, 00 ] |
| 8 | +8 | Overflows into not_in_buffer | 0x08 | [ 08, DD, 00, 00 ] |
Figure 8: Table summarizing where each write lands (buffer, padding, then not_in_buffer) for the example in figure 7.
In other words, our memory now looks like this:
+=======================================================================================+
| BUFFER + PADDING (8 Bytes Total) |
+----------+----------+----------+----------+----------+----------+----------+----------+
| buffer[0]...buffer[4] | <-- PADDING (3 bytes) --> |
+----------+----------+----------+----------+----------+----------+----------+----------+
| 0x00 | 0x01 | 0x02 | 0x03 | 0x04 | 0x05 | 0x06 | 0x07 |
+----------+----------+----------+----------+----------+----------+----------+----------+
| |
V |
+=======================================+===============================================+
| not_in_buffer (4 bytes) |
+---------------------------------------+-----------------------------------------------+
| not_in_buffer[0] | not_in_buffer[1] | not_in_buffer[2] | not_in_buffer[3] |
+--------------------+------------------+-----------------------+-----------------------+
| 0x08 <------ | 0xDD | 0x00 | 0x00 |
+--------------------+------------------+-----------------------+-----------------------+
Step-by-step explanation:
- Initial state: not_in_buffer is stored little-endian as the byte sequence [D5, DD, 00, 00]. Interpreted as a 32-bit little-endian integer this is 0x0000DDD5 = 56789 decimal.
- Loop iterations 0–4 write the bytes 0x00 through 0x04 into the five buffer slots. These writes stay inside the buffer and do not touch not_in_buffer.
- Iterations 5–7 write into the padding area between buffer and not_in_buffer. Because of the platform’s alignment/padding the first few overflowed bytes land in padding, so not_in_buffer is still untouched.
- Iteration 8 writes the single byte 0x08 into the first byte of not_in_buffer because this architecture stores the least-significant byte at the lowest address. The low-order byte changed from 0xD5 to 0x08, while the other three bytes remained [DD, 00, 00].
Byte-level change (hex):
- Before: [D5, DD, 00, 00] → 0x0000DDD5 = 56789
- After: [08, DD, 00, 00] → 0x0000DD08 = 56584
Note: on a big-endian system the first byte written after the buffer would be the most significant byte of the integer, producing a very different numeric result.
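The byte arithmetic above, including the big-endian case, can be checked in safe Rust with the standard to_le_bytes / from_le_bytes and to_be_bytes / from_be_bytes conversions. The two helper names below are hypothetical, and the sketch only models the effect of overwriting the byte at the lowest address; it does not perform any out-of-bounds write.

```rust
/// Replaces the byte at the lowest address of a little-endian i32,
/// which is its least-significant byte.
fn overwrite_first_byte_le(value: i32, byte: u8) -> i32 {
    let mut bytes = value.to_le_bytes(); // 56789 -> [D5, DD, 00, 00]
    bytes[0] = byte;
    i32::from_le_bytes(bytes)
}

/// Replaces the byte at the lowest address of a big-endian i32,
/// which is its most-significant byte.
fn overwrite_first_byte_be(value: i32, byte: u8) -> i32 {
    let mut bytes = value.to_be_bytes(); // 56789 -> [00, 00, DD, D5]
    bytes[0] = byte;
    i32::from_be_bytes(bytes)
}
```

For the value 56789 and the written byte 0x08, the little-endian helper yields 56584 (0x0000DD08), matching the program output, while the big-endian helper yields 134274517 (0x0800DDD5).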
Why Is This Considered Undefined?
At this point, one could argue that this looks less like undefined behavior and more like deterministic, if unwanted, behavior. On this one target and build, the output is indeed repeatable. So let’s try the example once more on a different target.
For that, we’ll use the i386/debian:bullseye Docker image and repeat the steps above.
Again, we start by checking the host target that rustc reports:
# rustc -vV
rustc 1.89.0 (29483883e 2025-08-04)
binary: rustc
commit-hash: 29483883eed69d5fb4db01964cdf2af4d86e9cb2
commit-date: 2025-08-04
host: i686-unknown-linux-gnu
release: 1.89.0
LLVM version: 20.1.7
Figure 9: Same version and commit hash as before, but DIFFERENT HOST ARCHITECTURE!
We’re again running a Linux distribution, but this time on a 32-bit architecture. Pause and think: what should happen? What does the memory layout look like now?
A good guess would be:
+————————+————————+————————+————————+
| 1 byte | 1 byte | 1 byte | 1 byte |
+————————+————————+————————+————————+
| 1 byte |--------|--------|--------| <--- expected padding
+————————+————————+————————+————————+
| not_in_buffer |
+————————+————————+————————+————————+
Figure 10: Expected stack layout before the loop runs on a 32-bit target with padding.
If we compile and run the program again, overflowing the buffer by 1 byte, we’d expect nothing to happen. However, this is the result we get:
# rustc main.rs
# ./main
buffer: [0, 1, 2, 3, 4]
not_in_buffer: 56581
Figure 11: Program output for the 32-bit target after a 6-byte write, showing that not_in_buffer was corrupted and now prints 56581.
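Why did we overflow now? On this target and build, the compiler placed not_in_buffer directly after buffer, with no padding in between. Rust makes no promise about how local variables are arranged on the stack, so the gap we relied on in the 64-bit build is simply not there.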
+—————————————+—————————————+—————————————+—————————————+
| 1 byte | 1 byte | 1 byte | 1 byte |
+—————————————+—————————————+—————————————+—————————————+
| 1 byte | first 3 bytes of not_in_buffer | <--- actually no padding
+—————————————+—————————————+—————————————+—————————————+
| last byte of not_in_buffer|-------------|-------------|
+—————————————+—————————————+—————————————+—————————————+
Figure 12: A visualization of the stack layout on a 32-bit architecture without padding between the 5-byte buffer and the not_in_buffer variable.
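The difference between a padded and an unpadded layout can be reproduced deterministically with struct fields, where the layout is declared rather than left to the compiler. This is only an analogy: the two locals in the example are not a struct, and their placement is not guaranteed. The type names Padded and Packed and the function field_offsets are hypothetical, the sketch assumes a target where i32 has 4-byte alignment, and std::mem::offset_of! requires Rust 1.77 or newer.

```rust
use std::mem::{offset_of, size_of};

// repr(C): fields in declaration order, padded to each field's alignment.
#[repr(C)]
struct Padded {
    buffer: [u8; 5],
    not_in_buffer: i32,
}

// repr(C, packed): same order, but no padding between fields.
#[repr(C, packed)]
struct Packed {
    buffer: [u8; 5],
    not_in_buffer: i32,
}

/// Returns (offset of `not_in_buffer`, total size) for each layout.
fn field_offsets() -> [(usize, usize); 2] {
    [
        (offset_of!(Padded, not_in_buffer), size_of::<Padded>()),
        (offset_of!(Packed, not_in_buffer), size_of::<Packed>()),
    ]
}
```

Under the stated assumption, the padded layout puts not_in_buffer at offset 8 (three bytes of padding after the buffer), while the packed layout puts it at offset 5, immediately after the buffer. Those are the two situations shown in the 64-bit and 32-bit diagrams.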
The loop writes 6 bytes (i = 0 through 5). Because there’s no padding, the 6th write (i = 5) immediately overwrites the first (least-significant) byte of not_in_buffer.
| Loop i | Stack Offset from buffer Start | Action | Value Written (Hex) | Stack Contents of not_in_buffer |
|---|---|---|---|---|
| – | – | – | – | *[ D5, DD, 00, 00 ]* |
| 0–4 | +0 to +4 | Writes to buffer | 0x00 to 0x04 | *[ D5, DD, 00, 00 ]* |
| 5 | +5 | Overflows into not_in_buffer | 0x05 | *[ 05, DD, 00, 00 ]* |
Figure 13: Table for the 32-bit, no-padding example (6-byte write) showing the write that immediately overwrites the LSB of not_in_buffer.
Step-by-step explanation (32-bit, no padding):
- Initial state: not_in_buffer is little-endian [D5, DD, 00, 00] (0x0000DDD5 = 56789 decimal).
- Loop iterations 0–4 write 0x00..0x04 into the five buffer slots, still inside the buffer.
- The 6th write lands immediately on the first byte of not_in_buffer because there is no padding: it writes 0x05 into the least-significant byte.
Byte-level change (hex):
- Before: [D5, DD, 00, 00] → 0x0000DDD5 = 56789
- After: [05, DD, 00, 00] → 0x0000DD05 = 56581
This concrete example demonstrates why even a single out-of-bounds byte written in an unsafe block can silently corrupt program state when no padding happens to separate the variables.
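Both walkthroughs can be replayed without any undefined behavior by modeling the two variables as one flat byte array. The sketch below is a simplified model, not a description of what the compiler does: simulate_overflow is a hypothetical function, the amount of padding is an explicit parameter, and the integer is assumed to be stored little-endian as on both targets used here.

```rust
/// Models `buffer` (5 bytes), `padding` filler bytes, and `not_in_buffer`
/// (4 bytes, little-endian, initially 56789) as one contiguous byte array,
/// then performs `writes` byte writes starting at the beginning of `buffer`.
/// Returns the resulting buffer contents and the value of `not_in_buffer`.
fn simulate_overflow(padding: usize, writes: usize) -> ([u8; 5], i32) {
    const BUF_LEN: usize = 5;
    let var_start = BUF_LEN + padding;

    // The modeled memory region: buffer, padding, then the 4-byte integer.
    let mut frame = vec![0u8; var_start + 4];
    frame[var_start..].copy_from_slice(&56789i32.to_le_bytes());

    // Same loop as in the example, but bounds-checked: it panics if
    // `writes` exceeds the modeled region instead of corrupting memory.
    for i in 0..writes {
        frame[i] = i as u8;
    }

    let mut buffer = [0u8; BUF_LEN];
    buffer.copy_from_slice(&frame[..BUF_LEN]);
    let mut var = [0u8; 4];
    var.copy_from_slice(&frame[var_start..]);
    (buffer, i32::from_le_bytes(var))
}
```

With 3 bytes of padding, 6 writes leave not_in_buffer at 56789 and 9 writes turn it into 56584. With no padding, 6 writes already produce 56581. These are the same three results observed on the two targets.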
In Rust, undefined behavior does not mean random in a cryptographic sense, but rather denotes the absence of any guarantee about what a program will do when certain rules are violated. When undefined behavior is invoked, the language specification provides no requirements or constraints on compiler behavior or outcomes. The actual outcome depends on the specific compiler, operating system, and hardware architecture. As we saw, running the same program multiple times on the same machine might yield consistent results, leading to a false sense of security, but deploying it on a different environment could expose the hidden bug.
Conclusion
In this article, we demonstrated one example of undefined behavior in unsafe Rust and showed that its effects are not always immediately visible and can depend on the target architecture and compiler settings.
While Rust’s robust type system and ownership rules generally prevent undefined behavior in “safe” Rust, understanding unsafe is vital for low-level interoperability and performance optimization.
At OpenZeppelin, understanding unsafe Rust and undefined behavior is paramount for auditing blockchain infrastructure, including client implementations, compilers, and SDKs. A significant number of high-performance blockchain components utilize unsafe Rust, rendering low-level vulnerabilities a critical area of concern. For example, a buffer overflow within a blockchain client could instigate network instability or lead to consensus failures. This reinforces the importance of using unsafe judiciously, with a thorough understanding and rigorous auditing and testing, ensuring that any assumptions made about memory layout and behavior hold across all target environments.