Disclaimer: The technique described in this post applies only to SBPF (Solana's bytecode format) v0, v1, and v2 programs. SBPF v3+ programs do not use relocation processing, making this approach obsolete. The proof of concept was built for SBPF v0 and targets the anza-xyz/sbpf crate at release v0.14.2 (a crate release of the SBPF VM, not related to instruction set versions).
Can you deploy a Solana program that performs arbitrary operations yet contains no code? Can a program behave completely differently than its bytecode suggests? Surprisingly, the answer to both questions is "yes"—and this post explains how.
Before diving into the details, we need to understand how Solana represents programs internally and what operations the Solana runtime performs before execution. Let's begin by exploring these foundations.
ELF (Executable and Linkable Format) is a standard binary file format used across Unix-like operating systems for executables, object code, shared libraries, and core dumps. An ELF file contains multiple sections including headers that describe the file's organization, program segments for runtime loading, and section headers that define code (.text), data (.data), and symbol tables. It is a portable format that allows the operating system's loader to understand how to load and execute the binary in memory.
User-space programs on Solana are internally represented as ELF binaries containing code adhering to the SBPF instruction set, which has been derived from the BPF (Berkeley Packet Filter) instruction set, specifically its extended (eBPF) variant. These ELF files contain the program's executable code along with metadata sections, and they're stored in accounts onchain.
Before a program is executed, the Solana runtime uses a custom SBPF loader that parses the ELF structure, loads the program, and validates its bytecode. We will refer to the former as loading and the latter as verification. Let's describe them in reverse order.
The execution of user-space programs may be performed either through the Interpreter or by directly executing native code generated from the program's SBPF bytecode via JIT-compilation, referred to as JIT mode. Since JIT mode executes native code not known in advance, it is critical to ensure this code can only perform operations permitted by the VM. This guarantee is partially established during the verification step (the remaining checks, which cannot be performed statically, happen at runtime and are out of scope for this post).
Among many other checks, the verifier ensures that:
No program that fails the verification step can be deployed or executed on Solana.
Prior to verification, the ELF file containing the program is parsed to ensure structural validity and extract critical metadata, such as the entry point address. This parsing phase also includes processing relocations within the program's bytecode. Before examining the specifics of this process, let's first understand what relocations are and why they are essential.
When a program is compiled, the compiler doesn't always know the final memory addresses where code and data will end up. For example, a function call needs to encode the address of the target function — but if that address isn't known yet, the compiler emits a placeholder value instead, and records a relocation entry alongside it. A relocation entry says: "at this offset in the binary, fix up the placeholder using this formula."
When the loader later prepares the program for execution, it processes all relocation entries and patches the placeholders with the correct runtime values. Common examples include: the address of a string constant loaded into a register, a reference to a global variable, or the target of a function call.
Solana programs support three relocation types. They differ in what kind of reference they fix up and how they compute the patched value:
R_BPF_64_RELATIVE: Used for anonymous read-only data, that is, strings or constants that the compiler emitted without an explicit symbol name. The loader patches the address where this data was placed in memory. If the data's address falls below the base of the SBPF VM's data region (0x100000000), the base is added to bring it into the correct address space.
R_BPF_64_64: Used for named symbols (global variables or labelled data). Similar to R_BPF_64_RELATIVE, but the address is computed relative to the symbol's value rather than to a raw offset. The same base adjustment is applied if needed. Because a 64-bit address doesn't fit in a single 32-bit immediate field, the patched value is split across two consecutive instruction slots: the low 32 bits go into the first slot's immediate field, and the high 32 bits go into the second slot's immediate field 8 bytes later. This two-slot encoding is used by the LDDW (Load Double Word) SBPF instruction, which loads a full 64-bit immediate value into a register.
R_BPF_64_32: Used for calls that reference a function by symbol name. The SBPF VM dispatches function calls not by raw address but through a registry keyed on 32-bit Murmur3 hashes (Murmur3 is a fast, non-cryptographic hash function). Rather than computing these hashes at compile time, the toolchain emits -1 as a placeholder in the CALL instruction's immediate field and records a relocation entry; the loader then resolves each symbol to the correct hash.
The hash is computed differently depending on the target:
- For syscalls (such as sol_log_ for logging or sol_sha256 for hashing): Murmur3("SYSCALL_NAME"), the hash of the syscall's name string.

At first glance, it appears that relocations can only modify specific parts of a program's bytecode, namely the arguments for load and call operations. This interpretation is reinforced by comments in the SBPF VM source code. However, the actual implementation of relocation processing permits writes to any offset within the ELF file's boundaries, without verifying whether the relocation targets a legitimate instruction operand.
This opens the possibility of modifying the program's bytecode arbitrarily. With the background we've established so far, we can now explore how such modifications could be achieved. The following sections detail this technique.
Our goal is to create a program whose bytecode will be entirely overwritten during relocation processing. For simplicity, we'll assume the code is initially zeroed out — that is, the .text section of the ELF file contains only 0x00 bytes. The final program will simply log the message "Hello, OZ!" and exit.
Before presenting our approach, let's evaluate which relocation type best suits our goal.
R_BPF_64_RELATIVE — Unsuitable. This type is unsuitable because it only permits minor adjustments to data. Moreover, these adjustments are applied exclusively when the referenced data represents values below the 0x100000000 offset of the SBPF data region.
R_BPF_64_64 — Possible but complex. This type could work for our purpose, but it has a complication: processing this relocation performs two 4-byte writes into the bytecode — first at the specified offset (incremented by 4 to reach the instruction argument), and second 8 bytes later. This reflects how constant data addresses are encoded in the bytecode (specifically, how 8-byte offsets are encoded in the LDDW assembly instruction). While this would allow writing custom bytecode, accounting for the cascading modification 8 bytes later introduces unnecessary complexity.
R_BPF_64_32 — Our choice. This relocation type initially appears problematic since it doesn't provide a straightforward way to write arbitrary data. Each write spans 4 bytes, and while we can control the input to the Murmur3 hash (either a syscall name or function offset), ensuring the output matches a specific 4-byte word is challenging. The constraints are even tighter: only several dozen syscall names exist, and each referenced internal function must lie within the bytecode, preventing us from using arbitrary offsets without producing oversized ELF files. However, although each modification spans 4 bytes, nothing prevents us from applying relocations to overlapping regions. We can apply relocations sequentially, advancing by 1 byte each time and selecting only offsets where the first byte of the Murmur3 hash matches our target.
This technique is illustrated in the diagram below, where the target is to write the bytes of "OZ" at offset y.
This is the approach we will use to achieve our goal.
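The overlapping-write idea can be sketched in a few lines of Python. This is a minimal illustration, not the loader's code: `demo_word` is a hypothetical stand-in for a Murmur3 hash whose low byte happens to match the wanted byte, and the buffer and offsets are made up for the example.

```python
import struct

def write_bytes_via_overlapping_words(buf, start, target, word_for_byte):
    """Write `target` into `buf` at `start` using only 4-byte little-endian stores.

    Each store is shifted by one byte, so only the lowest byte of every word
    survives; the upper three bytes are overwritten by the following stores.
    """
    for i, wanted in enumerate(target):
        word = word_for_byte(wanted)
        assert word & 0xFF == wanted, "low byte of the word must match the target byte"
        struct.pack_into("<I", buf, start + i, word)
    return buf

def demo_word(b):
    # Hypothetical stand-in for a hash value: the low byte is the one we want,
    # the upper three bytes are arbitrary "garbage" we do not control.
    return 0xDEADBE00 | b

buf = bytearray(8)
write_bytes_via_overlapping_words(buf, 1, b"OZ", demo_word)
print(buf.hex())  # 004f5abeadde0000
```

Note that the last store leaves three uncontrolled bytes after the target ("be ad de" here), so whatever follows the written region must either tolerate them or be overwritten by later stores.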
To build our proof-of-concept, we'll proceed in five steps:
To accomplish this step, we'll use the sbpf tool, which allows writing programs directly in SBPF assembly. This approach yields a minimal program with the desired behavior. The program itself is straightforward:
This produces the following bytes in the .text and .rodata sections (separated by " "s for readability):
Let's examine this more carefully:
- 18010000180100000000000000000000: The first LDDW instruction. Bytes 4-7 represent the four least significant bytes of the target address, while bytes 12-15 represent the four most significant bytes. Both values use little-endian encoding, meaning bytes should be interpreted in reverse order; for example, the bytes 18010000 represent the value 0x00000118.
- 180200000a0000000000000000000000: The second LDDW instruction, which loads the value 0xa (10 in decimal), the length of the "Hello, OZ!" string.
- 85100000ffffffff: The syscall instruction. The last 4 bytes should contain the Murmur3 hash of the target syscall name. Currently, this value is -1, indicating the correct hash will be inserted via relocation (details below).
- 9500000000000000: The EXIT instruction.
- 48656c6c6f2c204f5a21: The ASCII representation of "Hello, OZ!".

The file contains two relocations. The first, of type R_BPF_64_RELATIVE, adjusts the first LDDW instruction because the string offset it references must be mapped to the data region offset. The second relocation populates the sol_log_ syscall identifier (its Murmur3 hash) for the CALL instruction. We can verify this by examining the file with readelf (a standard Linux utility for inspecting ELF files):
While we could retain these relocations in our target program, doing so would require manually appending them to the end of the relocations list, introducing unnecessary complexity. Since we already know exactly which bytes these two relocations will write, we can apply them directly and not add them to the final ELF file. Specifically:
- The first relocation writes 0x00000001 (in little-endian format) to the upper half of the first LDDW instruction, since the original address (0x118) is lower than the SBPF data region base address (0x100000000).
- The second relocation writes the Murmur3 hash of sol_log_ (0x207559bd) in little-endian format to the last 4 bytes of the CALL instruction, replacing 0xffffffff.

After applying these relocations manually, we obtain the following bytes:
This is nearly the final byte sequence we need to write, with one remaining adjustment: the offset of the "Hello, OZ!" string. In our minimal file, its offset from the beginning of the .text section is 0x118, but in our final file, this offset will increase. We'll address this in Step 3.
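The manual application of the two relocations can be reproduced with a short Python sketch. It starts from the Step 1 bytes quoted above and uses the data region base and the sol_log_ hash given in the text; the helper names `apply_relative_to_lddw` and `apply_call_hash` are illustrative and only mimic the effect described here, not the VM's actual implementation.

```python
import struct

RODATA_BASE = 0x1_0000_0000  # base of the SBPF data region, per the text

# .text bytes from Step 1, before any relocation is applied
text = bytearray.fromhex(
    "18010000180100000000000000000000"  # first LDDW: address 0x118
    "180200000a0000000000000000000000"  # second LDDW: length 10
    "85100000ffffffff"                  # CALL with placeholder -1
    "9500000000000000"                  # EXIT
)

def apply_relative_to_lddw(code, insn_off):
    """Mimic R_BPF_64_RELATIVE on an LDDW: rebase the 64-bit immediate if needed."""
    lo, = struct.unpack_from("<I", code, insn_off + 4)
    hi, = struct.unpack_from("<I", code, insn_off + 12)
    addr = (hi << 32) | lo
    if addr < RODATA_BASE:
        addr += RODATA_BASE
    struct.pack_into("<I", code, insn_off + 4, addr & 0xFFFFFFFF)
    struct.pack_into("<I", code, insn_off + 12, addr >> 32)

def apply_call_hash(code, insn_off, hash32):
    """Mimic R_BPF_64_32: store a 32-bit hash in the CALL immediate field."""
    struct.pack_into("<I", code, insn_off + 4, hash32)

apply_relative_to_lddw(text, 0)        # first LDDW starts at byte 0
apply_call_hash(text, 32, 0x207559BD)  # CALL starts after two 16-byte LDDWs
print(text.hex())
```

The only bytes that change are the upper immediate of the first LDDW (now 01000000) and the CALL immediate (now bd597520).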
In this step, we need to find a Murmur3 hash input for each byte value such that the hash's least significant byte (which gets written first due to little-endian encoding) matches our target. Since only several dozen syscalls exist, their hashes alone cannot cover all 256 possible byte values, so we'll focus on hashes generated from internal function addresses. While we could limit ourselves to finding inputs only for the bytes we'll actually use, let's calculate inputs for all possible bytes to demonstrate a more general approach applicable to any program.
To find the smallest suitable input for each of the 256 byte values, we can expect to finish our search in approximately 256 × ln(256) ≈ 1420 steps according to the Coupon Collector's Problem. Running a simple brute-force program confirms this: the maximum offset needed is 1229 (for byte 0x30). A partial table of inputs is shown below:
| | x0 | x1 | x2 | x3 | x4 | x5 | x6 | x7 | x8 | x9 | xA | xB | xC | xD | xE | xF |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0x | 0185 | 01DB | 006E | 003A | 00AF | 0037 | 0095 | 01F9 | 0089 | 0063 | 00AE | 027B | 006C | 0038 | 005A | 00A2 |
| 1x | 003D | 001C | 00BD | 0067 | 0042 | 00E6 | 004C | 00C8 | 020C | 0109 | 0084 | 01A6 | 0034 | 00A1 | 02D2 | 03A8 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
Using this table, we can determine that to write byte 0x00 at a given offset, we need Murmur3 input 0x00000185, while for byte 0x18, we need input 0x0000020C.
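A brute-force search of this kind could look roughly like the sketch below. The `murmur3_32` function is a plain implementation of 32-bit Murmur3 with seed 0. The function `key_for_offset` is an assumption: it encodes the function offset as 8 little-endian bytes before hashing, and whether this reproduces the table above depends on how the VM actually derives the hash input, so the sketch should be checked against the SBPF VM source before relying on it.

```python
import struct

def murmur3_32(data: bytes, seed: int = 0) -> int:
    """32-bit Murmur3 hash of `data`."""
    c1, c2 = 0xCC9E2D51, 0x1B873593
    rotl = lambda x, r: ((x << r) | (x >> (32 - r))) & 0xFFFFFFFF
    h = seed & 0xFFFFFFFF
    nblocks = len(data) // 4
    for i in range(nblocks):
        k, = struct.unpack_from("<I", data, 4 * i)
        k = (k * c1) & 0xFFFFFFFF
        k = rotl(k, 15)
        k = (k * c2) & 0xFFFFFFFF
        h ^= k
        h = rotl(h, 13)
        h = (h * 5 + 0xE6546B64) & 0xFFFFFFFF
    tail = data[4 * nblocks:]
    if tail:
        k = int.from_bytes(tail, "little")  # remaining 1-3 bytes
        k = (k * c1) & 0xFFFFFFFF
        k = rotl(k, 15)
        k = (k * c2) & 0xFFFFFFFF
        h ^= k
    h ^= len(data)
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & 0xFFFFFFFF
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & 0xFFFFFFFF
    h ^= h >> 16
    return h

def key_for_offset(pc: int) -> bytes:
    # ASSUMPTION: the hash input is the function offset as 8 little-endian bytes.
    return struct.pack("<Q", pc)

def smallest_offsets(max_pc: int = 1 << 16) -> dict:
    """Map each byte value to the smallest offset whose hash has that low byte."""
    table = {}
    for pc in range(max_pc):
        low = murmur3_32(key_for_offset(pc)) & 0xFF
        table.setdefault(low, pc)  # keep the first (smallest) offset only
        if len(table) == 256:
            break
    return table

byte_to_offset = smallest_offsets()
print(len(byte_to_offset), max(byte_to_offset.values()))
```

The same `murmur3_32` function can be applied to a syscall name, for example `murmur3_32(b"sol_log_")`, to compare against the identifier quoted earlier in the post.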
At first glance, it appears sufficient to simply zero out the .text and .rodata sections of the file created in Step 1, remove the two original relocations, and apply our new relocations. However, the situation is more complex: to use a specific function offset as a Murmur3 input, that function must reside within the .text section. This requirement stems from the code handling the R_BPF_64_32 relocation type in the SBPF VM, which explicitly requires all offsets to belong to the .text section—otherwise, an ElfError::ValueOutOfBounds error is reported. Since our maximum offset is 1229 (determined in Step 2), we need at least 1230 instructions in the program (we'll use 1240 for good measure).
This introduces two challenges:
- The verifier requires every instruction slot in .text to be a valid SBPF instruction (as discussed in the Verification section). Zeroed bytes don't correspond to any valid opcode, so the remaining space must be filled with real instructions.

The solution to the first challenge is straightforward: we'll use the bytes determined in Step 1 but insert additional code between that chunk and the "Hello, OZ!" string using relocations.
The solution to the second challenge is equally simple: placing an EXIT instruction at the end of each function satisfies the verifier requirement mentioned above.
In summary, the structure of the .text section and the .rodata section immediately following in the resulting ELF file will be:
Rather than programmatically creating the base ELF file from scratch, we can leverage the sbpf tool as in Step 1. This time, however, we won't write the target code — instead, we'll populate it with 1240 placeholder instructions. One of these will be CALL nonexistent_function, whose sole purpose is to ensure the .rel.dyn section (the ELF section that stores relocation entries) exists, since that is where we'll eventually add our relocations, eliminating the need to manually create it.
Our program will look as follows:
After compiling this program, we zero out all data in the .text section. Note that the base ELF file contains the "Goodbye, OZ!" string, which is intentionally different from our target string — while we could zero out this data as well, we'll keep it intact to demonstrate an interesting behavior later. This detail won't affect the final program's behavior, which will correctly log the "Hello, OZ!" message.
With the base ELF file prepared, we can finally apply relocations. Before we can do that, though, we need to determine the offset of the string we want to print. Offsets are represented as absolute byte values from the beginning of the ELF file. The .rodata section starts at 0x27a8, which we can verify by combining the .text section offset (0xe8) with its size (1240 × 8 = 9920 = 0x26c0, since it contains 1240 instructions of 8 bytes each). This is the final modification to our bytecode from Step 1, resulting in the following target:
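The offset arithmetic can be double-checked with a small Python sketch. It uses only the numbers quoted above (the .text offset 0xe8, 1240 instructions of 8 bytes, and the data region base 0x100000000) and shows how the resulting address would be split into the two little-endian LDDW immediates; the variable names are illustrative.

```python
import struct

TEXT_OFFSET = 0xE8            # file offset of .text, per the text
INSN_SIZE = 8                 # bytes per SBPF instruction slot
NUM_INSNS = 1240              # placeholder instructions in the base file
RODATA_BASE = 0x1_0000_0000   # base of the SBPF data region

text_size = NUM_INSNS * INSN_SIZE
rodata_offset = TEXT_OFFSET + text_size     # .rodata follows .text directly
string_addr = RODATA_BASE + rodata_offset   # address the first LDDW must load

# The 64-bit address is split across the two LDDW immediate fields.
imm_lo = struct.pack("<I", string_addr & 0xFFFFFFFF)
imm_hi = struct.pack("<I", string_addr >> 32)

print(hex(text_size), hex(rodata_offset), imm_lo.hex(), imm_hi.hex())
# 0x26c0 0x27a8 a8270000 01000000
```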
This is the bytecode we'll write through relocations. To accomplish this, we'll use a Python script with the lief library to add relocations to the base ELF file. Since we need to write 1240 × 8 bytes for instructions and 10 bytes for the string, we require 9930 relocations total. The relevant part of the script is shown below:
Two aspects of this script require explanation:
Symbol Value Calculation: In SBPF assembly, the CALL instruction encodes a relative offset to the target function. Since sym.value must be an absolute address from the ELF file's beginning, we cannot simply use byte_to_offset[target_byte]. Instead, we take the address of the .text section (base) and add the target offset in that section multiplied by 8 (since each SBPF instruction spans 8 bytes):
As shown in the source code, the SBPF VM performs the inverse operation to calculate the offset of the function in the .text section.
Relocation Address Calculation: While relocation offsets (places where relocations should be applied) must also be absolute values from the ELF file's beginning, we cannot use the offsets directly. Instead, we must subtract 4 because the R_BPF_64_32 relocation type is designed to modify a CALL instruction's argument, and the address in the relocation should point to the CALL instruction itself. In a normal CALL instruction, the opcode, register, and offset fields occupy the first 4 bytes and the immediate argument occupies the next 4, so the SBPF VM adds 4 to the provided offset to reach the argument. Since we're repurposing relocations to write to arbitrary bytecode locations, we subtract 4 to compensate:
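Both calculations can be combined into a small planning function. The sketch below is not the lief-based script from this post; it is a library-free illustration that only computes, for each payload byte, the relocation offset and the symbol value described above. The function name `plan_relocations` is hypothetical, and the two table entries are the ones quoted in Step 2.

```python
INSN_SIZE = 8  # bytes per SBPF instruction slot

def plan_relocations(payload, write_start, text_base, byte_to_offset):
    """Return one (r_offset, sym_value) pair per payload byte.

    r_offset  : file offset stored in the relocation entry (target minus 4,
                because the VM adds 4 to reach the CALL immediate)
    sym_value : absolute address of the dummy function whose hash has the
                wanted low byte (text_base + instruction index * 8)
    """
    plan = []
    for i, wanted in enumerate(payload):
        sym_value = text_base + byte_to_offset[wanted] * INSN_SIZE
        r_offset = (write_start + i) - 4
        plan.append((r_offset, sym_value))
    return plan

# Two illustrative bytes, using the inputs quoted from the Step 2 table.
byte_to_offset = {0x00: 0x185, 0x18: 0x20C}
text_base = 0xE8  # .text file offset, per the text

plan = plan_relocations(bytes([0x18, 0x00]), text_base, text_base, byte_to_offset)
for r_offset, sym_value in plan:
    # The VM's inverse step: recover the instruction index from the symbol value.
    recovered = (sym_value - text_base) // INSN_SIZE
    print(hex(r_offset), hex(sym_value), hex(recovered))
```

Each pair would then be turned into a symbol and an R_BPF_64_32 relocation entry by whatever ELF editing tool is used.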
After running the script, we will end up with the following ELF file.
Before running programmatic tests, let's examine the final file's structure to confirm that its .text section is filled with 0x00 bytes and that the "Hello, OZ!" string is absent.
Running readelf -x .text on our file returns:
The same command for .rodata produces:
This shows the placeholder string we intentionally embedded instead of the target string.
The relocations appear as follows:
The first relocation is the original nonexistent_function call from our placeholder program. The next 48 relocations encode our bytecode payload, the numerous rel_exit relocations encode EXIT instructions, and the last 10 relocations encode the "Hello, OZ!" string.
Before running our program, we can use sol-azy (at commit 362327a), a disassembler for Solana SBPF programs, to disassemble it and verify whether changes applied through relocations are detected. Running sol-azy reverse --mode both --out-dir ./out/ --bytecodes-file on our target file produces the following disassembly:
Despite the empty bytecode, sol-azy correctly disassembled the code structure. Notably, it reports the incorrect string "Goodbye, OZ!" as being referenced, indicating that it only processed relocations applied to the .text section and did not apply modifications to the .rodata section. This is a meaningful gap for any security tool relying on disassembly alone to assess program behavior.
With this analysis complete, we can finally test our program using the Mollusk library, a lightweight harness for testing Solana programs locally against the SVM. We'll use the test automatically generated by the sbpf tool when we created our initial file:
Running the test produces the following result, confirming that relocation processing alone is sufficient to inject and execute arbitrary bytecode:
This post presented the Relocation-Oriented Programming (ROP) technique, which enables dynamic modification of Solana program bytecode and data solely through relocation processing.
While this technique can be applied to various Solana programs, it comes with significant space overhead. In our example, the original program's ELF file was approximately 1 KB, while the final version ballooned to approximately 411 KB—resulting in roughly 330 bytes per instruction instead of the standard 8 bytes. Although this overhead could be substantially reduced by using the R_BPF_64_64 relocation type (which allows writing 4-byte words at a time and eliminates the need for numerous dummy functions), this technique should be viewed as an interesting curiosity rather than a serious programming approach.
It's worth noting that there are plans to completely remove relocations in newer SBPF versions (starting with v3), which would render this technique obsolete.
While the technique may be impractical and potentially short-lived, its security implications deserve serious attention, as this capability enables programs to behave in unexpected ways. While dynamic code modifications may be visible in disassemblers, data modifications need not be, as we demonstrated. Consider the following scenario:
.rodata section equals 1.

This is precisely the kind of attack vector that requires security assessments to go beyond high-level program logic and examine low-level execution details in full. Comprehensive analysis, including relocation processing, bytecode verification, and runtime behavior, is the standard for auditing onchain programs where real value is at stake.
What is Relocation-Oriented Programming (ROP) on Solana?
Relocation-Oriented Programming (ROP) on Solana is a technique that exploits the ELF relocation processing step — which occurs before program execution — to overwrite a program's bytecode and data with arbitrary content. Because the SBPF VM's relocation handler writes to any offset within the ELF file without verifying whether the target is a legitimate instruction operand, an attacker can craft relocation entries that inject an entirely different program at load time. The technique applies to SBPF v0, v1, and v2, and is not possible in SBPF v3.
How does ELF relocation processing work in Solana programs?
When a Solana program is loaded, the runtime processes relocation entries embedded in the ELF binary before execution begins. Each entry specifies an offset in the file and a formula for computing the value to write there — typically to patch placeholder values left by the compiler for function addresses or syscall identifiers. Solana supports three relocation types: R_BPF_64_RELATIVE for anonymous data, R_BPF_64_64 for named symbols, and R_BPF_64_32 for function call targets resolved via Murmur3 hashing. The security issue arises because the runtime does not validate whether a relocation target is a legitimate instruction operand, allowing writes to arbitrary bytecode locations.
Can a Solana program's behavior differ from what its bytecode shows?
Yes — in SBPF v0, v1, and v2. Because relocation processing occurs after the ELF file is stored onchain but before execution, a program's effective bytecode at runtime can differ entirely from what is stored in the .text section. Disassemblers that process only pre-relocation bytecode — or that apply relocations selectively — may report a program's behavior inaccurately. As demonstrated, sol-azy reported the wrong string reference because it did not apply .rodata modifications from relocations. This gap has direct implications for the reliability of static analysis tools used in security assessments.
What are the security risks of Relocation-Oriented Programming (ROP) in Solana smart contract audits?
Relocation-Oriented Programming represents a meaningful obfuscation and code-injection risk for Solana programs running SBPF v0/v1/v2. A malicious program could appear benign under static analysis while using crafted relocation entries to inject entirely different logic at load time — including logic that modifies program data in ways invisible to standard disassemblers. Security assessments of Solana programs must account for this by analyzing the full relocation table, verifying post-relocation bytecode, and not relying solely on disassembly output or source-level review to determine runtime behavior.
Has this vulnerability been fixed in newer versions of Solana's runtime?
Yes. Starting with SBPF v3, relocation processing has been removed entirely from Solana's program loader. This eliminates the Relocation-Oriented Programming attack vector described here. However, programs compiled for SBPF v0, v1, or v2 remain subject to this behavior for as long as they are deployed and those versions are supported by the runtime. Security teams should confirm which SBPF version a program targets as part of any assessment, and treat SBPF v0/v1/v2 programs as requiring relocation-aware analysis.