How Compiler Targets Affect Unsafe Rust's Behavior

January 16, 2026

Jonas Merhej

This blog post is inspired by one of Daniel Cumming’s lessons at Rektoff. Specifically, we will delve into one of the buffer overflow examples from that lesson and broaden our examination of how different compiler targets can influence Rust's behavior.

Rust is renowned for its compile-time checks that largely eliminate common memory errors. However, the unsafe keyword offers an escape hatch, allowing developers to perform low-level operations that are not covered by Rust’s usual safety guarantees. Unsafe Rust allows five actions that safe Rust does not:

Dereference a raw pointer
Call an unsafe function or method
Access or modify a mutable static variable
Implement an unsafe trait
Access fields of a union

This power comes with significant responsibility: by using unsafe, the developer acknowledges that the compiler cannot guarantee memory safety for the operations inside, and that it is their responsibility to ensure the code cannot cause undefined behavior. The classic example of getting this wrong is the buffer overflow.
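To make the first item in the list concrete, here is a minimal sketch of a raw-pointer dereference whose safety argument is written down next to it. The function name bump is hypothetical and chosen only for illustration; the point is that creating the raw pointer is safe, while dereferencing it requires an unsafe block and a justification from the developer.

```rust
/// Increments the integer behind `value` through a raw pointer and
/// returns the new value.
fn bump(value: &mut i32) -> i32 {
    // Safe: coercing a reference to a raw pointer needs no `unsafe`.
    let ptr: *mut i32 = value;

    // SAFETY: `ptr` was just derived from a live `&mut i32`, so it is
    // non-null, aligned, points to an initialized i32, and nothing else
    // accesses that i32 while this function runs.
    unsafe {
        *ptr += 1; // dereferencing the raw pointer requires `unsafe`
        *ptr
    }
}

fn main() {
    let mut counter = 41;
    let updated = bump(&mut counter);
    println!("updated: {}, counter: {}", updated, counter);
}
```

The SAFETY comment is a common convention rather than something the compiler checks; it records the assumptions the unsafe block relies on.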

Buffer Overflow and Undefined Behavior

A buffer overflow occurs when a program attempts to write data beyond the allocated boundaries of a fixed-size memory buffer. This can lead to overwriting adjacent memory locations, potentially corrupting other data or even program instructions. The consequences are often unpredictable, ranging from program crashes to subtle, hard-to-trace bugs, or even enabling malicious code execution.

Consider this simplified example demonstrating a buffer overflow on the stack:

fn main() {
    let mut buffer = [0u8; 5]; // 5-byte buffer on the stack
    let not_in_buffer = 56789; // Another variable on the stack

    unsafe {
        let ptr = buffer.as_mut_ptr();

        // 🚨 UB: Writing 6 bytes into a 5-byte buffer.
        for i in 0..6 {
            *ptr.add(i) = i as u8;
        }
    }

    println!("buffer: {:?}", buffer);
    // Will `not_in_buffer` still be 56789?
    println!("not_in_buffer: {}", not_in_buffer);
}

Figure 1: Example Rust program that intentionally overruns a 5-byte stack buffer to illustrate undefined behavior (writes 6 bytes). The following figures walk through what the writes may corrupt.

In this code, a 5-byte buffer and an i32 variable not_in_buffer are declared. The for loop writes 6 bytes (indices 0 through 5) into the 5-byte buffer, so the final write lands one byte past its end.

First, let’s get more information about our host architecture:

# rustc -vV
rustc 1.89.0 (29483883e 2025-08-04)
binary: rustc
commit-hash: 29483883eed69d5fb4db01964cdf2af4d86e9cb2
commit-date: 2025-08-04
host: aarch64-unknown-linux-gnu
release: 1.89.0
LLVM version: 20.1.7

Figure 2: Host architecture output used to show the current target (machine/arch). This affects ABI, alignment and endianness.

We’re on a 64-bit Linux system running on the moby/buildkit Docker Image. Due to memory alignment and padding on this specific target, the not_in_buffer variable might not be immediately adjacent to the buffer on the stack. This means the overflow might write into padding bytes or other seemingly “empty” spaces, leading to the program appearing to function correctly even though an overflow has occurred. The printed value of not_in_buffer could remain unchanged.
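The padding mentioned above can be estimated with a little arithmetic. The sketch below is a simplified model, not a description of what the compiler is required to do: it assumes the buffer starts at an address that is already suitably aligned and that the integer follows it at the next address satisfying its alignment. The helper padding_needed is a hypothetical name introduced for this illustration.

```rust
use std::mem::{align_of, size_of};

/// Number of padding bytes needed so that `offset` becomes a multiple
/// of `align`. `align` must be non-zero.
fn padding_needed(offset: usize, align: usize) -> usize {
    (align - (offset % align)) % align
}

fn main() {
    // Size of the buffer and alignment of the integer, as reported by
    // the compiler for the current target.
    let buffer_size = size_of::<[u8; 5]>();
    let int_align = align_of::<i32>();

    // Assumption: the buffer sits at offset 0 of an aligned region and
    // the integer is placed at the next suitably aligned offset.
    let padding = padding_needed(buffer_size, int_align);

    println!("size of [u8; 5]:  {}", buffer_size);
    println!("alignment of i32: {}", int_align);
    println!("padding after the buffer in this model: {}", padding);
}
```

With a 5-byte buffer and a 4-byte alignment, the model yields 3 bytes of padding, which matches the layout drawn for this target. The real stack layout is still entirely up to the compiler.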

Currently, this is how the stack looks before executing the loop:

+———————————————————————————————————+————————+————————+————————+————————+
| 1 byte | 1 byte | 1 byte | 1 byte | 1 byte |--------|--------|--------|
+———————————————————————————————————+————————+————————+————————+————————+
|              4 bytes              |--------|--------|--------|--------|
+———————————————————————————————————+————————+————————+————————+————————+

Figure 3: Visual representation of the stack layout before the loop runs — a 5-byte buffer followed by not_in_buffer (4 bytes) separated by padding on this 64-bit target.

Let’s compile and run the program using rustc and inspect the results:

# rustc main.rs
# ./main
buffer: [0, 1, 2, 3, 4]
not_in_buffer: 56789

Figure 4: Program output showing that, on this layout, the initial overflow did not visibly alter not_in_buffer (value remains 56789).

As expected, due to the memory layout, we didn’t see any effect of the overflow in the output. However, this does not mean that an overflow didn’t happen. The memory layout now looks as follows:

+———————————————————————————————————+————————+————————+————————+————————+
| 1 byte | 1 byte | 1 byte | 1 byte | 1 byte | 1 byte |--------|--------|
+———————————————————————————————————+————————+————————+————————+————————+
|              4 bytes              |--------|--------|--------|--------|
+———————————————————————————————————+————————+————————+————————+————————+

Figure 5: Memory layout after the first overflow writes (6 bytes); the sixth byte landed in the padding region on this 64-bit target.

However, if the loop’s upper bound is increased (e.g., to 9 or higher), the impact becomes visible. The overflow can then reach and corrupt the not_in_buffer variable, changing its value. Let’s write 9 bytes, i.e., 4 bytes past the end of the buffer:

-    for i in 0..6 {
+    for i in 0..9 {
        *ptr.add(i) = i as u8;
    }

Figure 6: Diff showing the loop bound increased from 0..6 to 0..9 so the writes reach into adjacent memory and will eventually overwrite not_in_buffer.

If we compile and run the program again, we get:

# rustc main.rs
# ./main
buffer: [0, 1, 2, 3, 4]
not_in_buffer: 56584

Figure 7: Program output after increasing the loop bound; not_in_buffer has been corrupted and now prints 56584.

Our value not_in_buffer changed from 56789 to 56584. What happened here?

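| Loop i | Memory Offset from buffer Start | Action                         | Value Written (Hex) | Memory Contents of not_in_buffer |
|--------|---------------------------------|--------------------------------|---------------------|----------------------------------|
| –      | –                               | –                              | –                   | `[ D5, DD, 00, 00 ]`             |
| 0–4    | +0 to +4                        | Writes to `buffer`             | `0x00` to `0x04`    | `[ D5, DD, 00, 00 ]`             |
| 5–7    | +5 to +7                        | Writes to padding              | `0x05` to `0x07`    | `[ D5, DD, 00, 00 ]`             |
| 8      | +8                              | Overflows into `not_in_buffer` | `0x08`              | `[ 08, DD, 00, 00 ]`             |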

Figure 8: Table summarizing where each write lands (buffer, padding, then not_in_buffer) for the example in figure 7.

In other words, our memory now looks like this:

+=======================================================================================+
|                        BUFFER + PADDING (8 Bytes Total)                               |
+----------+----------+----------+----------+----------+----------+----------+----------+
|                buffer[0]...buffer[4]                 |   <-- PADDING (3 bytes) -->    |
+----------+----------+----------+----------+----------+----------+----------+----------+
|   0x00   |   0x01   |   0x02   |   0x03   |   0x04   |   0x05   |   0x06   |   0x07   |
+----------+----------+----------+----------+----------+----------+----------+----------+
                                        |                                               |
                                        V                                               |
+=======================================+===============================================+
|                             not_in_buffer (4 bytes)                                   |
+---------------------------------------+-----------------------------------------------+
|  not_in_buffer[0]  | not_in_buffer[1] |   not_in_buffer[2]    |   not_in_buffer[3]    |
+--------------------+------------------+-----------------------+-----------------------+
|     0x08  <------  |     0xDD         |          0x00         |        0x00           |
+--------------------+------------------+-----------------------+-----------------------+

Step-by-step explanation:

Initial state: not_in_buffer is stored little-endian as the byte sequence [D5, DD, 00, 00]. Interpreted as a 32-bit little-endian integer this is 0x0000DDD5 = 56789 decimal.
Loop iterations 0–4 write the bytes 0x00 through 0x04 into the five buffer slots. These writes stay inside the buffer and do not touch not_in_buffer.
Iterations 5–7 write into the padding area between buffer and not_in_buffer. Because of the platform’s alignment/padding the first few overflowed bytes land in padding, so not_in_buffer is still untouched.
Iteration 8 writes the single byte 0x08 into the first (lowest-address) byte of not_in_buffer. Because this architecture is little-endian, that byte is the least-significant one: the low-order byte changed from 0xD5 to 0x08, while the other three bytes remained [DD, 00, 00].

Byte-level change (hex):

Before: [D5, DD, 00, 00] → 0x0000DDD5 = 56789
After: [08, DD, 00, 00] → 0x0000DD08 = 56584

Note: on a big-endian system the first byte written after the buffer would be the most significant byte of the integer, producing a very different numeric result.
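The byte-level arithmetic and the endianness note can be checked in safe Rust using the standard to_le_bytes / from_le_bytes and to_be_bytes / from_be_bytes conversions. The two helper functions below are hypothetical names introduced for this sketch; they only model what happens when the byte at the lowest address of the integer is replaced.

```rust
/// Replaces the lowest-address byte of `value` as stored on a
/// little-endian target and returns the resulting integer.
fn clobber_first_byte_le(value: i32, byte: u8) -> i32 {
    let mut bytes = value.to_le_bytes(); // [D5, DD, 00, 00] for 56789
    bytes[0] = byte;
    i32::from_le_bytes(bytes)
}

/// Same operation, but for the byte order used by a big-endian target.
fn clobber_first_byte_be(value: i32, byte: u8) -> i32 {
    let mut bytes = value.to_be_bytes(); // [00, 00, DD, D5] for 56789
    bytes[0] = byte;
    i32::from_be_bytes(bytes)
}

fn main() {
    let original: i32 = 56789;
    println!("little-endian bytes: {:02X?}", original.to_le_bytes());
    println!("big-endian bytes:    {:02X?}", original.to_be_bytes());

    // Little-endian: only the least-significant byte changes.
    println!("LE after write: {}", clobber_first_byte_le(original, 0x08));
    // Big-endian: the most-significant byte changes instead.
    println!("BE after write: {}", clobber_first_byte_be(original, 0x08));
}
```

On the little-endian model the result is 56584 (0x0000DD08), as observed above. On the big-endian model the same single-byte write yields 0x0800DDD5, which is 134274517.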

Why Is This Considered Undefined?

At this point, one could argue that what we observed is not undefined behavior, but rather deterministic unwanted behavior. On this particular target and build, the outcome does look repeatable. So let’s try the same example on a different target.

For that, we’re going to use the i386/debian:bullseye Docker image and repeat the steps above.

Again, we start by checking the host target:

# rustc -vV
rustc 1.89.0 (29483883e 2025-08-04)
binary: rustc
commit-hash: 29483883eed69d5fb4db01964cdf2af4d86e9cb2
commit-date: 2025-08-04
host: i686-unknown-linux-gnu
release: 1.89.0
LLVM version: 20.1.7

Figure 9: Same version and commit hash as before, but DIFFERENT HOST ARCHITECTURE!

We’re again running a Linux distribution, but this time on a 32-bit architecture. Pause and think: what should happen? What does the memory layout look like now?

A good guess would be:

+————————+————————+————————+————————+
| 1 byte | 1 byte | 1 byte | 1 byte |
+————————+————————+————————+————————+
| 1 byte |--------|--------|--------| <--- expected padding
+————————+————————+————————+————————+
|           not_in_buffer           |
+————————+————————+————————+————————+

Figure 10: Expected stack layout before the loop runs on a 32-bit target with padding.

If we compile and run the program again, overflowing the buffer by 1 byte, we’d expect nothing to happen. However, this is the result we get:

# rustc main.rs
# ./main
buffer: [0, 1, 2, 3, 4]
not_in_buffer: 56581

Figure 11: Program output for the 32-bit target after a 6-byte write, showing that not_in_buffer was corrupted and now prints 56581.

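Why did the overflow become visible now? Rust makes no guarantee about how local variables are arranged on the stack. The layout is chosen by the compiler backend and can differ between targets. On this 32-bit target, not_in_buffer ended up directly after buffer, with no padding in between: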

+—————————————+—————————————+—————————————+—————————————+
|    1 byte   |    1 byte   |    1 byte   |    1 byte   |
+—————————————+—————————————+—————————————+—————————————+
|    1 byte   |     first 3 bytes of not_in_buffer      | <--- actually no padding
+—————————————+—————————————+—————————————+—————————————+
| last byte of not_in_buffer|-------------|-------------|
+—————————————+—————————————+—————————————+—————————————+

Figure 12: A visualization of the stack layout on a 32-bit architecture without padding between the 5-byte buffer and the not_in_buffer variable.
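Instead of guessing, one way to inspect a layout like this without triggering the overflow is to print the addresses of the two locals. The sketch below uses only safe code; the function name local_addresses is hypothetical. Keep in mind that taking addresses and changing the surrounding code can itself change where the compiler places the variables, so the numbers describe only this particular build.

```rust
/// Returns the addresses of two locals declared like the ones in the
/// example program: (address of buffer, address of not_in_buffer).
fn local_addresses() -> (usize, usize) {
    let buffer = [0u8; 5];
    let not_in_buffer: i32 = 56789;

    // Converting pointers to integers is safe; nothing is dereferenced.
    let buffer_addr = buffer.as_ptr() as usize;
    let int_addr = &not_in_buffer as *const i32 as usize;
    (buffer_addr, int_addr)
}

fn main() {
    let (buffer_addr, int_addr) = local_addresses();

    // Signed distance in bytes from the start of `buffer` to
    // `not_in_buffer`. It can be negative if the integer is placed
    // below the buffer.
    let distance = int_addr.wrapping_sub(buffer_addr) as isize;

    println!("buffer        at {:#x}", buffer_addr);
    println!("not_in_buffer at {:#x}", int_addr);
    println!("distance: {} bytes", distance);
}
```

A distance of 5 would mean the integer starts right after the buffer, while a distance of 8 would indicate 3 bytes of padding. Any other value is equally legitimate, since the compiler is free to order and space locals as it sees fit.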

The loop writes 6 bytes (i = 0 through 5). Because there’s no padding, the 6th write (i = 5) immediately overwrites the first (least-significant) byte of not_in_buffer.

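| Loop i | Stack Offset from buffer Start | Action                         | Value Written (Hex) | Stack Contents of not_in_buffer |
|--------|--------------------------------|--------------------------------|---------------------|---------------------------------|
| –      | –                              | –                              | –                   | `[ D5, DD, 00, 00 ]`            |
| 0–4    | +0 to +4                       | Writes to `buffer`             | `0x00` to `0x04`    | `[ D5, DD, 00, 00 ]`            |
| 5      | +5                             | Overflows into `not_in_buffer` | `0x05`              | `[ 05, DD, 00, 00 ]`            |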

Figure 13: Table for the 32-bit, no-padding example (6-byte write) showing the write that immediately overwrites the LSB of not_in_buffer.

Step-by-step explanation (32-bit, no padding):

Initial state: not_in_buffer is little-endian [D5, DD, 00, 00] (0x0000DDD5 = 56789 decimal).
Loop iterations 0–4 write 0x00..0x04 into the five buffer slots — still inside the buffer.
The 6th write lands immediately on the first byte of not_in_buffer because there is no padding: it writes 0x05 into the least-significant byte.

Byte-level change (hex):

Before: [D5, DD, 00, 00] → 0x0000DDD5 = 56789
After: [05, DD, 00, 00] → 0x0000DD05 = 56581
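Both observed results can be reproduced without any undefined behavior by modeling the stack region as a plain byte vector. This is only a model under stated assumptions: a 5-byte buffer, followed by gap bytes of padding, followed by a little-endian i32 holding 56789. The function simulate and its parameters are hypothetical names for this sketch.

```rust
/// Models the region as: 5-byte buffer, `gap` padding bytes, then a
/// little-endian i32 initialized to 56789. Writes up to `writes` bytes
/// starting at the buffer and returns the integer afterwards.
fn simulate(gap: usize, writes: usize) -> i32 {
    const BUFFER_LEN: usize = 5;
    let int_start = BUFFER_LEN + gap;

    // One contiguous allocation standing in for the stack region.
    let mut memory = vec![0u8; int_start + 4];
    memory[int_start..].copy_from_slice(&56789i32.to_le_bytes());

    // Same writes as the example loop, but confined to the model:
    // `take` stops at the end of the vector instead of running past it.
    for (i, slot) in memory.iter_mut().take(writes).enumerate() {
        *slot = i as u8;
    }

    let mut int_bytes = [0u8; 4];
    int_bytes.copy_from_slice(&memory[int_start..]);
    i32::from_le_bytes(int_bytes)
}

fn main() {
    println!("gap 3, 6 writes: {}", simulate(3, 6));
    println!("gap 3, 9 writes: {}", simulate(3, 9));
    println!("gap 0, 6 writes: {}", simulate(0, 6));
}
```

With a gap of 3 the integer survives 6 writes (56789) and is corrupted by 9 writes (56584). With a gap of 0, 6 writes are already enough to produce 56581.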

This concrete example demonstrates why even a single overwritten byte in an unsafe block can silently corrupt program state when no padding happens to separate the variables.

In Rust, undefined behavior does not mean random in a cryptographic sense, but rather denotes the absence of any guarantee about what a program will do when certain rules are violated. When undefined behavior is invoked, the language specification places no requirements or constraints on compiler behavior or outcomes. The actual outcome depends on the specific compiler, operating system, and hardware architecture. As we saw, running the same program multiple times on the same machine might yield consistent results, leading to a false sense of security, but deploying it to a different environment could expose the hidden bug.
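For contrast, here is a sketch of the same loop written in safe Rust. It uses the standard slice method get_mut, which returns None for an out-of-bounds index instead of touching neighboring memory, so the result is the same on every target. The function name fill_checked is hypothetical.

```rust
/// Safe counterpart of the example loop: tries to write `count` bytes
/// into `buffer` and returns how many writes were rejected because
/// their index was out of bounds.
fn fill_checked(buffer: &mut [u8], count: usize) -> usize {
    let mut rejected = 0;
    for i in 0..count {
        match buffer.get_mut(i) {
            Some(slot) => *slot = i as u8, // in bounds: perform the write
            None => rejected += 1,         // out of bounds: nothing is written
        }
    }
    rejected
}

fn main() {
    let mut buffer = [0u8; 5];
    let not_in_buffer = 56789;

    let rejected = fill_checked(&mut buffer, 6);

    println!("buffer: {:?}", buffer);
    println!("rejected writes: {}", rejected);
    println!("not_in_buffer: {}", not_in_buffer);
}
```

Plain indexing with buffer[i] would also be defined on every target: it panics on the out-of-bounds index rather than corrupting not_in_buffer.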

Conclusion

In this article, we demonstrated one example of undefined behavior in unsafe Rust, showed that it is not always immediately visible, and saw that its effects can depend on the target architecture and compiler settings.

While Rust’s robust type system and ownership rules generally prevent undefined behavior in “safe” Rust, understanding unsafe is vital for low-level interoperability and performance optimization.

At OpenZeppelin, understanding unsafe Rust and undefined behavior is paramount for auditing blockchain infrastructure, including client implementations, compilers, and SDKs. A significant number of high-performance blockchain components utilize unsafe Rust, rendering low-level vulnerabilities a critical area of concern. For example, a buffer overflow within a blockchain client could instigate network instability or lead to consensus failures. This reinforces the importance of using unsafe judiciously, with a thorough understanding and rigorous auditing and testing, ensuring that any assumptions made about memory layout and behavior hold across all target environments.