Disclaimer: The technique described in this post applies only to SBPF (Solana's bytecode format) v0, v1, and v2 programs. SBPF v3+ programs do not use relocation processing, making this approach obsolete. The proof of concept was built for SBPF v0 and targets the anza-xyz/sbpf crate at release v0.14.2 (a crate release of the SBPF VM, not related to instruction set versions).

Can you deploy a Solana program that performs arbitrary operations yet contains no code? Can a program behave completely differently than its bytecode suggests? Surprisingly, the answer to both questions is "yes"—and this post explains how.

Before diving into the details, we need to understand how Solana represents programs internally and what operations the Solana runtime performs before execution. Let's begin by exploring these foundations.

Solana Programs and ELF Files

ELF (Executable and Linkable Format) is a standard binary file format used across Unix-like operating systems for executables, object code, shared libraries, and core dumps. An ELF file contains multiple sections including headers that describe the file's organization, program segments for runtime loading, and section headers that define code (.text), data (.data), and symbol tables. It is a portable format that allows the operating system's loader to understand how to load and execute the binary in memory.

User-space programs on Solana are internally represented as ELF binaries containing code adhering to the SBPF instruction set, which has been derived from the BPF (Berkeley Packet Filter) instruction set, specifically its extended (eBPF) variant. These ELF files contain the program's executable code along with metadata sections, and they're stored in accounts onchain.

Before a program is executed, the Solana runtime uses a custom SBPF loader that parses the ELF structure, loads the program, and validates its bytecode. We will refer to the former as loading and the latter as verification. Let's describe them in reverse order.

Verification

User-space programs may be executed either through the Interpreter or by directly executing native code generated from the program's SBPF bytecode via JIT compilation, referred to as JIT mode. Since JIT mode executes native code not known in advance, it is critical to ensure this code can only perform operations permitted by the VM. This guarantee is partially established during the verification step (the remaining checks, which cannot be performed statically, happen at runtime and are out of scope for this post).

Among many other checks, the verifier ensures that:

  • all operations with statically-known arguments cannot cause division by zero
  • each opcode represents a valid instruction

No program that fails the verification step can be deployed or executed on Solana.

Loading

Prior to verification, the ELF file containing the program is parsed to ensure structural validity and extract critical metadata, such as the entry point address. This parsing phase also includes processing relocations within the program's bytecode. Before examining the specifics of this process, let's first understand what relocations are and why they are essential.

Relocations in Binary Executables

When a program is compiled, the compiler doesn't always know the final memory addresses where code and data will end up. For example, a function call needs to encode the address of the target function — but if that address isn't known yet, the compiler emits a placeholder value instead, and records a relocation entry alongside it. A relocation entry says: "at this offset in the binary, fix up the placeholder using this formula."

When the loader later prepares the program for execution, it processes all relocation entries and patches the placeholders with the correct runtime values. Common examples include: the address of a string constant loaded into a register, a reference to a global variable, or the target of a function call.

Relocations on Solana

Solana programs support three relocation types. They differ in what kind of reference they fix up and how they compute the patched value:

R_BPF_64_RELATIVE

Used for anonymous read-only data — strings or constants that the compiler emitted without an explicit symbol name. The loader patches the address where this data was placed in memory. If the data's address falls below the base of the SBPF VM's data region (0x100000000), the base is added to bring it into the correct address space.

R_BPF_64_64

Used for named symbols (global variables or labelled data). Similar to R_BPF_64_RELATIVE, but the address is computed relative to the symbol's value rather than to a raw offset. The same base adjustment is applied if needed. Because a 64-bit address doesn't fit in a single 32-bit immediate field, the patched value is split across two consecutive instruction slots: the low 32 bits go into the first slot's immediate field, and the high 32 bits go into the second slot's immediate field 8 bytes later. This two-slot encoding is used by the LDDW (Load Double Word) SBPF instruction, which loads a full 64-bit immediate value into a register.
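
To make the two-slot encoding concrete, here is a minimal Python sketch of how a 64-bit address could be split across the two immediate fields of an LDDW instruction, including the base adjustment described above. The helper name patch_lddw_imm and the constant name DATA_REGION_BASE are illustrative assumptions, not the loader's actual code.

```python
import struct

# Base of the SBPF VM's data region, as quoted in the text.
# The constant name is an assumption made for this sketch.
DATA_REGION_BASE = 0x1_0000_0000


def patch_lddw_imm(buf: bytearray, insn_offset: int, addr: int) -> int:
    """Write a 64-bit address into the two imm fields of the LDDW at insn_offset."""
    if addr < DATA_REGION_BASE:
        addr += DATA_REGION_BASE  # bring the address into the data region
    # Low 32 bits go into the first slot's immediate (4 bytes into the slot).
    struct.pack_into("<I", buf, insn_offset + 4, addr & 0xFFFFFFFF)
    # High 32 bits go into the second slot's immediate, 8 bytes later.
    struct.pack_into("<I", buf, insn_offset + 12, addr >> 32)
    return addr


# An LDDW whose low immediate initially holds the file offset 0x118.
lddw = bytearray.fromhex("18010000" "18010000" "00000000" "00000000")
patched_addr = patch_lddw_imm(lddw, 0, 0x118)
print(hex(patched_addr), lddw.hex())
```

The second write lands 8 bytes after the first, which is why this relocation type always touches two instruction slots.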

R_BPF_64_32

Used for calls that reference a function by symbol name. The SBPF VM dispatches function calls not by raw address but through a registry keyed on 32-bit Murmur3 hashes (Murmur3 is a fast, non-cryptographic hash function). Rather than computing these hashes at compile time, the toolchain emits -1 as a placeholder in the CALL instruction’s immediate field and records a relocation entry; the loader then resolves each symbol to the correct hash.

The hash is computed differently depending on the target:

  • Syscall (a runtime function provided by the VM, such as sol_log_ for logging or sol_sha256 for hashing): Murmur3("SYSCALL_NAME") — the hash of the syscall’s name string.
  • Internal function: hash derived from the target function’s SBPF bytecode offset.
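
The two hash flavors can be sketched in Python as follows. The murmur3_32 function is the standard 32-bit MurmurHash3 with seed 0. The helper names are hypothetical, and the key used for internal functions (the instruction index encoded as 8 little-endian bytes) is an assumption about the loader that should be checked against the sbpf crate source.

```python
import struct

MASK32 = 0xFFFFFFFF


def _rotl32(x: int, r: int) -> int:
    return ((x << r) | (x >> (32 - r))) & MASK32


def murmur3_32(data: bytes, seed: int = 0) -> int:
    """Standard MurmurHash3 (x86, 32-bit variant)."""
    c1, c2 = 0xCC9E2D51, 0x1B873593
    h = seed & MASK32
    full = len(data) - len(data) % 4
    for i in range(0, full, 4):
        k = int.from_bytes(data[i:i + 4], "little")
        k = _rotl32(k * c1 & MASK32, 15) * c2 & MASK32
        h = _rotl32(h ^ k, 13)
        h = (h * 5 + 0xE6546B64) & MASK32
    tail = data[full:]
    if tail:  # 1 to 3 leftover bytes
        k = int.from_bytes(tail, "little")
        k = _rotl32(k * c1 & MASK32, 15) * c2 & MASK32
        h ^= k
    # Final avalanche
    h ^= len(data)
    h ^= h >> 16
    h = h * 0x85EBCA6B & MASK32
    h ^= h >> 13
    h = h * 0xC2B2AE35 & MASK32
    h ^= h >> 16
    return h


def hash_syscall(name: str) -> int:
    """Hash of a syscall: Murmur3 over the name string."""
    return murmur3_32(name.encode())


def hash_internal_function(insn_index: int) -> int:
    """ASSUMPTION: the key is the instruction index as 8 little-endian bytes."""
    return murmur3_32(struct.pack("<Q", insn_index))


print(hex(hash_syscall("sol_log_")))
```

The printed value can be compared with the sol_log_ hash quoted in Step 1 of this post.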

Using Relocations to Write Arbitrary Code

At first glance, it appears that relocations can only modify specific parts of a program's bytecode — namely, the arguments for load and call operations. This interpretation is reinforced by comments in the SBPF VM source code. However, the actual implementation of relocation processing permits writes to any offset within the ELF file's boundaries, without verifying whether the relocation targets a legitimate instruction operand.

This opens the possibility of modifying the program's bytecode arbitrarily. With the background we've established so far, we can now explore how such modifications could be achieved. The following sections detail this technique.

Goal

Our goal is to create a program whose bytecode will be entirely overwritten during relocation processing. For simplicity, we'll assume the code is initially zeroed out — that is, the .text section of the ELF file contains only 0x00 bytes. The final program will simply log the message "Hello, OZ!" and exit.

Choosing the Right Relocation Type

Before presenting our approach, let's evaluate which relocation type best suits our goal.

R_BPF_64_RELATIVE — Unsuitable. This type is unsuitable because it only permits minor adjustments to data. Moreover, these adjustments are applied exclusively when the referenced data represents values below the 0x100000000 offset of the SBPF data region.

R_BPF_64_64 — Possible but complex. This type could work for our purpose, but it has a complication: processing this relocation performs two 4-byte writes into the bytecode — first at the specified offset (incremented by 4 to reach the instruction argument), and second 8 bytes later. This reflects how constant data addresses are encoded in the bytecode (specifically, how 8-byte offsets are encoded in the LDDW assembly instruction). While this would allow writing custom bytecode, accounting for the cascading modification 8 bytes later introduces unnecessary complexity.

R_BPF_64_32 — Our choice. This relocation type initially appears problematic since it doesn't provide a straightforward way to write arbitrary data. Each write spans 4 bytes, and while we can control the input to the Murmur3 hash (either a syscall name or function offset), ensuring the output matches a specific 4-byte word is challenging. The constraints are even tighter: only several dozen syscall names exist, and each referenced internal function must lie within the bytecode, preventing us from using arbitrary offsets without producing oversized ELF files. However, although each modification spans 4 bytes, nothing prevents us from applying relocations to overlapping regions. We can apply relocations sequentially, advancing by 1 byte each time and selecting only offsets where the first byte of the Murmur3 hash matches our target.

This technique is demonstrated in the diagram below, where the goal is to write the bytes "OZ" at offset y.

Step 1: find `x_1` and `x_2` such that `Murmur3(x_1)[0] == 'O'` and `Murmur3(x_2)[0] == 'Z'`. Assume that `Murmur3(x_1) == "OPQR"` and `Murmur3(x_2) == "ZYXW"`

Step 2: write `Murmur3(x_1)` at offset y
             +------+------+------+------+------+
Offsets:     |  y   | y+1  | y+2  | y+3  | y+4  |
             +------+------+------+------+------+
Bytes:       |  O   |  P   |  Q   |  R   |  -   |
             +------+------+------+------+------+

Step 3: write `Murmur3(x_2)` at offset y+1
             +------+------+------+------+------+
Offsets:     |  y   | y+1  | y+2  | y+3  | y+4  |
             +------+------+------+------+------+
Bytes:       |  O   |  Z   |  Y   |  X   |  W   |
             +------+------+------+------+------+

This is the approach we will use to achieve our goal.
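
The diagram can be reproduced with a few lines of Python. This is only a simulation of the overlapping writes: the 4-byte words "OPQR" and "ZYXW" stand in for the hypothetical hash outputs from Step 1, and the function name is made up for this sketch.

```python
def apply_overlapping_writes(buf: bytearray, start: int, words: list[bytes]) -> None:
    """Write each 4-byte word one byte further than the previous one.

    Only the first byte of every word survives, except for the last word,
    whose three trailing bytes remain in the buffer.
    """
    for i, word in enumerate(words):
        assert len(word) == 4, "each relocation writes exactly 4 bytes"
        buf[start + i:start + i + 4] = word


buf = bytearray(b"-----")
apply_overlapping_writes(buf, 0, [b"OPQR", b"ZYXW"])
print(buf.decode())  # OZYXW
```

Note the three leftover bytes ("YXW") after the payload: the final write always spills past the last byte we care about, which matters when the payload ends close to a section boundary.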

General Plan

To build our proof-of-concept, we'll proceed in five steps:

  1. Write a simple "Hello, OZ!" program in SBPF assembly to determine the exact bytes we need to produce through relocations.
  2. Calculate Murmur3 hash inputs that will be used to write the payload byte by byte.
  3. Determine the exact structure of the target ELF file.
  4. Create a base ELF file without the target code and without the target string.
  5. Add relocations to the base ELF file so that after processing them, the bytecode will log the "Hello, OZ!" message.

Step 1: Writing the Target Program in SBPF Assembly

To accomplish this step, we'll use the sbpf tool, which allows writing programs directly in SBPF assembly. This approach yields a minimal program with the desired behavior. The program itself is straightforward:

.globl entrypoint
entrypoint:
  lddw r1, message  # load message offset to the r1 register
  lddw r2, 10       # load message size to r2
  call sol_log_     # invoke sol_log_ syscall
  exit              # exit program
.rodata
  message: .ascii "Hello, OZ!"

This produces the following bytes in the .text and .rodata sections (separated by spaces for readability):

18010000180100000000000000000000 180200000a0000000000000000000000 85100000ffffffff 9500000000000000 48656c6c6f2c204f5a21

Let's examine this more carefully:

  • Bytes 0-15 (18010000180100000000000000000000): The first LDDW instruction. Bytes 4-7 represent the four least significant bytes of the target address, while bytes 12-15 represent the four most significant bytes. Both values use little-endian encoding, meaning bytes should be interpreted in reverse order — for example, 0x18010000 becomes 0x00000118.
  • Bytes 16-31 (180200000a0000000000000000000000): The second LDDW instruction, which loads the value 0xa (10 in decimal) — the length of the "Hello, OZ!" string.
  • Bytes 32-39 (85100000ffffffff): The CALL instruction that invokes the syscall. The last 4 bytes should contain the Murmur3 hash of the target syscall name. Currently, this value is -1, indicating the correct hash will be inserted via relocation (details below).
  • Bytes 40-47 (9500000000000000): The EXIT instruction.
  • Bytes 48-57 (48656c6c6f2c204f5a21): The ASCII representation of "Hello, OZ!".
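
The byte breakdown above can be checked with a small decoder. The sketch below assumes the usual 8-byte slot layout (1-byte opcode, 1 byte holding the destination register in the low nibble and the source register in the high nibble, a 16-bit offset, and a 32-bit immediate, all little-endian). The function name decode_slot is made up for this example.

```python
import struct


def decode_slot(slot: bytes) -> dict:
    """Decode one 8-byte SBPF instruction slot into its fields."""
    opcode, regs, offset, imm = struct.unpack("<BBhi", slot)
    return {
        "opcode": opcode,
        "dst": regs & 0x0F,   # low nibble: destination register
        "src": regs >> 4,     # high nibble: source register
        "offset": offset,
        "imm": imm,           # signed 32-bit immediate
    }


text = bytes.fromhex(
    "18010000" "18010000" "00000000" "00000000"  # lddw r1, message
    "18020000" "0a000000" "00000000" "00000000"  # lddw r2, 10
    "85100000" "ffffffff"                        # call sol_log_
    "95000000" "00000000"                        # exit
)
slots = [decode_slot(text[i:i + 8]) for i in range(0, len(text), 8)]

# An LDDW spans two slots: low half in the first imm, high half in the second.
message_addr = (slots[1]["imm"] << 32) | (slots[0]["imm"] & 0xFFFFFFFF)
for s in slots:
    print(s)
print(hex(message_addr))
```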

The file contains two relocations. The first, of type R_BPF_64_RELATIVE (type number 8, which readelf does not know and therefore prints as "unrecognized: 8"), adjusts the first LDDW instruction because the string offset it references must be mapped into the data region. The second populates the sol_log_ syscall identifier (its Murmur3 hash) in the CALL instruction. We can verify this by examining the file with readelf (a standard Linux utility for inspecting ELF files):

Relocation section '.rel.dyn' at offset 0x238 contains 2 entries:
Offset          Info           Type           Sym. Value    Sym. Name
0000000000e8  000000000008 unrecognized: 8
000000000108  00020000000a R_BPF_64_32       0000000000000000 sol_log_

While we could retain these relocations in our target program, doing so would require manually appending them to the end of the relocations list, introducing unnecessary complexity. Since we already know exactly which bytes these two relocations will write, we can apply them directly and not add them to the final ELF file. Specifically:

  • The first relocation writes 0x00000001 (in little-endian format) to the upper half of the first LDDW instruction, since the original address (0x118) is lower than the SBPF data region base address (0x100000000).
  • The second relocation writes the Murmur3 hash of "sol_log_" (0x207559bd) in little-endian format to the last 4 bytes of the CALL instruction, replacing 0xffffffff.

After applying these relocations manually, we obtain the following bytes:

18010000180100000000000001000000 180200000a0000000000000000000000 85100000bd597520 9500000000000000 48656c6c6f2c204f5a21

This is nearly the final byte sequence we need to write, with one remaining adjustment: the offset of the "Hello, OZ!" string. In our minimal file, the string sits at offset 0x118 from the beginning of the ELF file (the .text section starts at 0xe8 and the code occupies 0x30 bytes), but in our final file this offset will increase. We'll address this in Step 3.
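
The manual patching described in this step can be replayed in Python. The sketch works on the 48 code bytes only and hardcodes the two values quoted above. It is a check of the arithmetic, not a reimplementation of the loader.

```python
import struct

code = bytearray.fromhex(
    "18010000" "18010000" "00000000" "00000000"  # lddw r1, message
    "18020000" "0a000000" "00000000" "00000000"  # lddw r2, 10
    "85100000" "ffffffff"                        # call sol_log_ (placeholder -1)
    "95000000" "00000000"                        # exit
)

# R_BPF_64_RELATIVE: the upper half of the first LDDW becomes 0x00000001,
# i.e. the address 0x118 is moved into the data region at 0x100000000.
struct.pack_into("<I", code, 12, 0x00000001)

# R_BPF_64_32: the CALL immediate becomes the Murmur3 hash of "sol_log_".
struct.pack_into("<I", code, 36, 0x207559BD)

print(code.hex())
```

The printed hex string should match the code part of the byte sequence listed after the two relocations were applied.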

Step 2: Finding Murmur3 Hash Inputs

In this step, we need to find a Murmur3 hash input for each byte value such that the hash's least significant byte (which gets written first due to little-endian encoding) matches our target. Since only several dozen syscalls exist, their hashes alone cannot cover all 256 possible byte values, so we'll focus on hashes generated from internal function addresses. While we could limit ourselves to finding inputs only for the bytes we'll actually use, let's calculate inputs for all possible bytes to demonstrate a more general approach applicable to any program.

To find the smallest suitable input for each of the 256 byte values, we can expect the search to take roughly 256 × ln(256) ≈ 1420 steps, the leading term of the Coupon Collector's Problem estimate. Running a simple brute-force program confirms the order of magnitude: the maximum offset needed is 1229 (for byte 0x30). A partial table of inputs is shown below:

  x0 x1 x2 x3 x4 x5 x6 x7 x8 x9 xA xB xC xD xE xF
0x 0185 01DB 006E 003A 00AF 0037 0095 01F9 0089 0063 00AE 027B 006C 0038 005A 00A2
1x 003D 001C 00BD 0067 0042 00E6 004C 00C8 020C 0109 0084 01A6 0034 00A1 02D2 03A8
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...

Using this table, we can determine that to write byte 0x00 at a given offset, we need Murmur3 input 0x00000185, while for byte 0x18, we need input 0x0000020C.
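
A brute-force search of this kind might look like the sketch below. It repeats a compact Murmur3 implementation so that it runs on its own. As before, hashing the instruction index as 8 little-endian bytes is an assumption about the loader, so the resulting table is only expected to match the one above if that assumption holds. The function name find_inputs and the search limit are made up for this example.

```python
import struct

MASK32 = 0xFFFFFFFF


def _rotl32(x: int, r: int) -> int:
    return ((x << r) | (x >> (32 - r))) & MASK32


def murmur3_32(data: bytes, seed: int = 0) -> int:
    """Standard MurmurHash3 (x86, 32-bit variant)."""
    c1, c2 = 0xCC9E2D51, 0x1B873593
    h = seed & MASK32
    full = len(data) - len(data) % 4
    for i in range(0, full, 4):
        k = int.from_bytes(data[i:i + 4], "little")
        k = _rotl32(k * c1 & MASK32, 15) * c2 & MASK32
        h = _rotl32(h ^ k, 13)
        h = (h * 5 + 0xE6546B64) & MASK32
    tail = data[full:]
    if tail:
        k = int.from_bytes(tail, "little")
        h ^= _rotl32(k * c1 & MASK32, 15) * c2 & MASK32
    h ^= len(data)
    h ^= h >> 16
    h = h * 0x85EBCA6B & MASK32
    h ^= h >> 13
    h = h * 0xC2B2AE35 & MASK32
    h ^= h >> 16
    return h


def find_inputs(limit: int = 100_000) -> dict[int, int]:
    """Map each byte value to the smallest index whose hash has that low byte."""
    table: dict[int, int] = {}
    for insn_index in range(limit):
        # ASSUMPTION: the hash key is the index as 8 little-endian bytes.
        low_byte = murmur3_32(struct.pack("<Q", insn_index)) & 0xFF
        table.setdefault(low_byte, insn_index)  # keep the first (smallest) hit
        if len(table) == 256:
            break
    return table


table = find_inputs()
print(len(table), max(table.values()))
```

The largest value in the table determines how many instructions the .text section must contain, which is the constraint discussed in the next step.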

Step 3: Determining Base ELF File Structure

At first glance, it appears sufficient to simply zero out the .text and .rodata sections of the file created in Step 1, remove the two original relocations, and apply our new relocations. However, the situation is more complex: to use a specific function offset as a Murmur3 input, that function must reside within the .text section. This requirement stems from the code handling the R_BPF_64_32 relocation type in the SBPF VM, which explicitly requires all offsets to belong to the .text section—otherwise, an ElfError::ValueOutOfBounds error is reported. Since our maximum offset is 1229 (determined in Step 2), we need at least 1230 instructions in the program (we'll use 1240 for good measure).

This introduces two challenges:

  • We cannot place the "Hello, OZ!" string immediately after the code chunk from Step 1.
  • The verifier requires every opcode in .text to be a valid SBPF instruction (as discussed in the Verification section). Zeroed bytes don't correspond to any valid opcode, so the remaining space must be filled with real instructions.

The solution to the first challenge is straightforward: we'll use the bytes determined in Step 1 but insert additional code between that chunk and the "Hello, OZ!" string using relocations.

The solution to the second challenge is equally simple: we fill the remaining space with EXIT instructions. Each one is a valid opcode and ends a tiny one-instruction function, which satisfies the verifier requirement mentioned above.

In summary, the structure of the .text section and the .rodata section immediately following in the resulting ELF file will be:

.text[Code chunk from Step 1, excluding the "Hello, OZ!" string | Sequence of `EXIT` instructions] | .rodata["Hello, OZ!"]

Step 4: Creating the Base ELF File

Rather than programmatically creating the base ELF file from scratch, we can leverage the sbpf tool as in Step 1. This time, however, we won't write the target code — instead, we'll populate it with 1240 placeholder instructions. One of these will be CALL nonexistent_function, whose sole purpose is to ensure the .rel.dyn section (the ELF section that stores relocation entries) exists, since that is where we'll eventually add our relocations, eliminating the need to manually create it.

Our program will look as follows:

.globl entrypoint

entrypoint:
  call nonexistent_function
  mov64 r0, r0
  mov64 r0, r0
  [...]                           # 1236 placeholder `mov64 r0, r0` instructions
  mov64 r0, r0

.rodata
  message: .ascii "Goodbye, OZ!"  # different string than our target

  padding: .ascii "_"             # padding, to prevent overwriting the first
                                  # byte of the subsequent section when writing
                                  # the last byte of the "Hello, OZ!" string

After compiling this program, we zero out all data in the .text section. Note that the base ELF file contains the "Goodbye, OZ!" string, which is intentionally different from our target string — while we could zero out this data as well, we'll keep it intact to demonstrate an interesting behavior later. This detail won't affect the final program's behavior, which will correctly log the "Hello, OZ!" message.

Step 5: Adding Relocations to Inject the Payload

With the base ELF file prepared, we can finally apply relocations. Before we can do that, though, we need to determine the offset of the string we want to print. Offsets are represented as absolute byte values from the beginning of the ELF file. The .rodata section starts at 0x27a8, which we can verify by combining the .text section offset (0xe8) with its size (1240 × 8 = 9920 = 0x26c0, since it contains 1240 instructions of 8 bytes each). This is the final modification to our bytecode from Step 1, resulting in the following target:

18010000a82700000000000001000000 180200000a0000000000000000000000 85100000bd597520 9500000000000000 [1234 `EXIT` instructions] 48656c6c6f2c204f5a21
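
The offset arithmetic and the target bytes can be double-checked with a short Python sketch. It follows the layout listed above. The helper insn and the constant names are introduced only for this example.

```python
import struct

TEXT_OFFSET = 0xE8          # file offset of .text in the base ELF
INSN_SIZE = 8               # every SBPF instruction slot is 8 bytes
INSN_COUNT = 1240           # number of slots in .text
DATA_REGION_BASE = 0x1_0000_0000

rodata_offset = TEXT_OFFSET + INSN_COUNT * INSN_SIZE
message_addr = DATA_REGION_BASE + rodata_offset


def insn(opcode: int, regs: int = 0, offset: int = 0, imm: int = 0) -> bytes:
    """Encode one 8-byte slot: opcode, registers, 16-bit offset, 32-bit imm."""
    return struct.pack("<BBhI", opcode, regs, offset, imm & 0xFFFFFFFF)


payload = (
    insn(0x18, 0x01, imm=message_addr) + insn(0x00, imm=message_addr >> 32)  # lddw r1
    + insn(0x18, 0x02, imm=10) + insn(0x00)                                  # lddw r2, 10
    + insn(0x85, 0x10, imm=0x207559BD)                                       # call sol_log_
    + insn(0x95)                                                             # exit
)
filler = insn(0x95) * (INSN_COUNT - len(payload) // INSN_SIZE)
target = payload + filler + b"Hello, OZ!"

print(hex(rodata_offset), len(filler) // INSN_SIZE, len(target))
```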

This is the bytecode we'll write through relocations. To accomplish this, we'll use a Python script with the lief library to add relocations to the base ELF file. Since we need to write 1240 × 8 bytes for instructions and 10 bytes for the string, we require 9930 relocations total. The relevant part of the script is shown below:

[...]
# Constants

# Dict mapping bytes to their Murmur3 inputs
byte_to_offset = {
    0x00: 0x00000185,
    0x01: 0x000001DB,
    [...]
    0xFC: 0x00000069,
    0xFD: 0x0000013E,
    0xFE: 0x000000B7,
    0xFF: 0x00000054,
}

target_code = bytearray.fromhex('18010000a82700000000000001000000180200000a000000000000000000000085100000bd5975209500000000000000')
target_string = bytearray.fromhex('48656c6c6f2c204f5a21')

STT_FUNC = 2
STB_LOCAL = 0
STV_DEFAULT = 0
EM_BPF = 247

[...]

def add_relocations(path, count, output):
    binary = lief.parse(path)
    if not binary:
        raise RuntimeError("Failed to parse ELF.")

    text = binary.get_section(".text")

    [...]

    base = text.virtual_address
    text_size = text.size

    for i in range(count):
        sym = lief.ELF.Symbol()
        if i < len(target_code):
            sym.name = "rel" + str(i)
            sym.value = base + 8 * byte_to_offset[target_code[i]]
        elif i < text_size:
            sym.name = "rel_exit"
            sym.value = base + 8 * byte_to_offset[0x95]
        else: # i >= text_size
            sym.name = "rel" + str(i)
            sym.value = base + 8 * byte_to_offset[target_string[i - text_size]]

        sym.size = 0
        sym.type = STT_FUNC
        sym.binding = STB_LOCAL
        sym.visibility = STV_DEFAULT
        sym.shndx = binary.get_section_idx(text)
        sym.information = 0x02
        binary.add_dynamic_symbol(sym)

        reloc = lief.ELF.Relocation(
            address=base+i-4,
            type=lief.ELF.Relocation.TYPE.BPF_64_32,
            encoding=lief.ELF.Relocation.ENCODING.REL,
        )
        reloc.symbol = sym
        binary.add_dynamic_relocation(reloc)

    binary.write(output)

    [...]

Two aspects of this script require explanation:

  • How sym.value is calculated
  • How relocation addresses are calculated

Symbol Value Calculation: In SBPF assembly, the CALL instruction encodes a relative offset to the target function. Since sym.value must be an absolute address from the ELF file's beginning, we cannot simply use byte_to_offset[target_byte]. Instead, we take the address of the .text section (base) and add the target offset in that section multiplied by 8 (since each SBPF instruction spans 8 bytes):

sym.value = base + 8 * byte_to_offset[target_byte]

As shown in the source code, the SBPF VM performs the inverse operation to calculate the offset of the function in the .text section.

Relocation Address Calculation: While relocation offsets (places where relocations should be applied) must also be absolute values from the ELF file's beginning, we cannot use the offsets directly. Instead, we must subtract 4 because the R_BPF_64_32 relocation type is designed to modify a CALL instruction's argument, and the address in the relocation should point to the CALL instruction itself. In a normal CALL instruction, the opcode, register, and offset fields occupy the first 4 bytes and the immediate argument occupies the next 4, so the SBPF VM adds 4 to the provided offset to reach the argument. Since we're repurposing relocations to write to arbitrary bytecode locations, we subtract 4 to compensate:

address = base + i - 4
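
Both calculations can be illustrated with a simplified model of what the loader does for an R_BPF_64_32 entry. The function names are hypothetical and 0xAABBCC18 is a made-up hash value whose low byte is 0x18. The real relocation handler performs additional checks that are omitted here.

```python
import struct

INSN_SIZE = 8
IMM_OFFSET = 4  # the immediate field starts 4 bytes into an instruction slot


def symbol_value_for(text_addr: int, insn_index: int) -> int:
    """Absolute symbol value for a function at the given instruction index."""
    return text_addr + INSN_SIZE * insn_index


def insn_index_of(text_addr: int, sym_value: int) -> int:
    """Inverse operation: recover the instruction index from a symbol value."""
    return (sym_value - text_addr) // INSN_SIZE


def apply_r_bpf_64_32(image: bytearray, r_offset: int, hash_value: int) -> None:
    """Simplified model: write the 32-bit hash at r_offset + 4, little-endian."""
    struct.pack_into("<I", image, r_offset + IMM_OFFSET, hash_value)


base = 0xE8
image = bytearray(base + 16)
i = 3  # we want the low byte of the hash to land at base + i

apply_r_bpf_64_32(image, base + i - 4, 0xAABBCC18)
print(image[base:base + 8].hex())
```

Because the relocation address is base + i - 4, the first written byte ends up exactly at base + i, and the three following bytes are free to be overwritten by later relocations.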

After running the script, we end up with the final ELF file.

Testing the Result

Before running programmatic tests, let's examine the final file's structure to confirm that its .text section is filled with 0x00 bytes and that the "Hello, OZ!" string is absent.

Running readelf -x .text on our file returns:

Hex dump of section '.text':
  0x000000e8 00000000 00000000 00000000 00000000 ................
  0x000000f8 00000000 00000000 00000000 00000000 ................
  [...]
  0x00002788 00000000 00000000 00000000 00000000 ................
  0x00002798 00000000 00000000 00000000 00000000 ................

The same command for .rodata produces:

Hex dump of section '.rodata':
  0x000027a8 476f6f64 6279652c 204f5a21 5f       Goodbye, OZ!_

This shows the placeholder string we intentionally embedded instead of the target string.

The relocations appear as follows:

Relocation section '.rel.dyn' at offset 0x3d7a8 contains 9931 entries:
  Offset          Info           Type           Sym. Value    Sym. Name
0000000000e8  26cb0000000a R_BPF_64_32       0000000000000000 nonexistent_function
0000000000e4  00010000000a R_BPF_64_32       0000000000001148 rel0
0000000000e5  00020000000a R_BPF_64_32       0000000000000fc0 rel1
0000000000e6  00030000000a R_BPF_64_32       0000000000000d10 rel2
0000000000e7  00040000000a R_BPF_64_32       0000000000000d10 rel3
0000000000e8  00050000000a R_BPF_64_32       0000000000000878 rel4
0000000000e9  00060000000a R_BPF_64_32       0000000000000a10 rel5
0000000000ea  00070000000a R_BPF_64_32       0000000000000d10 rel6
0000000000eb  00080000000a R_BPF_64_32       0000000000000d10 rel7
0000000000ec  00090000000a R_BPF_64_32       0000000000000d10 rel8
0000000000ed  000a0000000a R_BPF_64_32       0000000000000d10 rel9
0000000000ee  000b0000000a R_BPF_64_32       0000000000000d10 rel10
0000000000ef  000c0000000a R_BPF_64_32       0000000000000d10 rel11
0000000000f0  000d0000000a R_BPF_64_32       0000000000000fc0 rel12
0000000000f1  000e0000000a R_BPF_64_32       0000000000000d10 rel13
0000000000f2  000f0000000a R_BPF_64_32       0000000000000d10 rel14
0000000000f3  00100000000a R_BPF_64_32       0000000000000d10 rel15
0000000000f4  00110000000a R_BPF_64_32       0000000000001148 rel16
0000000000f5  00120000000a R_BPF_64_32       0000000000000458 rel17
0000000000f6  00130000000a R_BPF_64_32       0000000000000d10 rel18
0000000000f7  00140000000a R_BPF_64_32       0000000000000d10 rel19
0000000000f8  00150000000a R_BPF_64_32       0000000000000658 rel20
0000000000f9  00160000000a R_BPF_64_32       0000000000000d10 rel21
0000000000fa  00170000000a R_BPF_64_32       0000000000000d10 rel22
0000000000fb  00180000000a R_BPF_64_32       0000000000000d10 rel23
0000000000fc  00190000000a R_BPF_64_32       0000000000000d10 rel24
0000000000fd  001a0000000a R_BPF_64_32       0000000000000d10 rel25
0000000000fe  001b0000000a R_BPF_64_32       0000000000000d10 rel26
0000000000ff  001c0000000a R_BPF_64_32       0000000000000d10 rel27
000000000100  001d0000000a R_BPF_64_32       0000000000000d10 rel28
000000000101  001e0000000a R_BPF_64_32       0000000000000d10 rel29
000000000102  001f0000000a R_BPF_64_32       0000000000000d10 rel30
000000000103  00200000000a R_BPF_64_32       0000000000000d10 rel31
000000000104  00210000000a R_BPF_64_32       0000000000000220 rel32
000000000105  00220000000a R_BPF_64_32       00000000000002d0 rel33
000000000106  00230000000a R_BPF_64_32       0000000000000d10 rel34
000000000107  00240000000a R_BPF_64_32       0000000000000d10 rel35
000000000108  00250000000a R_BPF_64_32       0000000000000a28 rel36
000000000109  00260000000a R_BPF_64_32       00000000000016f0 rel37
00000000010a  00270000000a R_BPF_64_32       0000000000001b70 rel38
00000000010b  00280000000a R_BPF_64_32       0000000000001258 rel39
00000000010c  00290000000a R_BPF_64_32       0000000000000260 rel40
00000000010d  002a0000000a R_BPF_64_32       0000000000000d10 rel41
00000000010e  002b0000000a R_BPF_64_32       0000000000000d10 rel42
00000000010f  002c0000000a R_BPF_64_32       0000000000000d10 rel43
000000000110  002d0000000a R_BPF_64_32       0000000000000d10 rel44
000000000111  002e0000000a R_BPF_64_32       0000000000000d10 rel45
000000000112  002f0000000a R_BPF_64_32       0000000000000d10 rel46
000000000113  00300000000a R_BPF_64_32       0000000000000d10 rel47
000000000114  00310000000a R_BPF_64_32       0000000000000260 rel_exit
000000000115  00310000000a R_BPF_64_32       0000000000000260 rel_exit
[...]
0000000027a3  00310000000a R_BPF_64_32       0000000000000260 rel_exit
0000000027a4  26c10000000a R_BPF_64_32       0000000000000278 rel9920
0000000027a5  26c20000000a R_BPF_64_32       00000000000007f0 rel9921
0000000027a6  26c30000000a R_BPF_64_32       0000000000001180 rel9922
0000000027a7  26c40000000a R_BPF_64_32       0000000000001180 rel9923
0000000027a8  26c50000000a R_BPF_64_32       00000000000006a8 rel9924
0000000027a9  26c60000000a R_BPF_64_32       0000000000000a88 rel9925
0000000027aa  26c70000000a R_BPF_64_32       0000000000001258 rel9926
0000000027ab  26c80000000a R_BPF_64_32       0000000000000598 rel9927
0000000027ac  26c90000000a R_BPF_64_32       0000000000000370 rel9928
0000000027ad  26ca0000000a R_BPF_64_32       0000000000000970 rel9929

The first relocation is the original nonexistent_function call from our placeholder program. The next 48 relocations encode our bytecode payload, the numerous rel_exit relocations encode EXIT instructions, and the last 10 relocations encode the "Hello, OZ!" string.
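
The readelf columns can be decoded by hand. In a 64-bit ELF relocation entry, the upper 32 bits of the Info field hold the symbol index and the lower 32 bits hold the relocation type. The sketch below applies this to the first three payload entries and recovers the Murmur3 input from each symbol value, assuming (as in the script) that .text starts at 0xe8.

```python
TEXT_OFFSET = 0xE8
INSN_SIZE = 8
R_BPF_64_32 = 10  # type number seen in the low bits of the Info column


def split_r_info(info: int) -> tuple[int, int]:
    """Split an ELF64 r_info value into (symbol index, relocation type)."""
    return info >> 32, info & 0xFFFFFFFF


# (offset, info, symbol value) copied from the rel0, rel1 and rel2 rows above
entries = [
    (0xE4, 0x0001_0000_000A, 0x1148),
    (0xE5, 0x0002_0000_000A, 0x0FC0),
    (0xE6, 0x0003_0000_000A, 0x0D10),
]

decoded = []
for r_offset, info, sym_value in entries:
    sym_index, r_type = split_r_info(info)
    written_at = r_offset + 4                             # where the hash lands
    hash_input = (sym_value - TEXT_OFFSET) // INSN_SIZE   # instruction index
    decoded.append((sym_index, r_type, written_at, hash_input))
    print(sym_index, r_type, hex(written_at), hex(hash_input))
```

The recovered inputs 0x20C, 0x1DB and 0x185 are the table entries for bytes 0x18, 0x01 and 0x00, which are the first three bytes of the target code.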

Analyzing with sol-azy

Before running our program, we can use sol-azy (at commit 362327a), a disassembler for Solana SBPF programs, to disassemble it and verify whether changes applied through relocations are detected. Running sol-azy reverse --mode both --out-dir ./out/ --bytecodes-file on our target file produces the following disassembly:

entrypoint:
    lddw r1, 0x1000027a8 --> b"Goodbye, OZ!_\\x00\\x00\\x00\\x1e\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x04\\x00\\x00\\x00…        r1 load str located at 4294977448
    lddw r2, 0xa                                    r2 load str located at 10
    syscall sol_log_                                r0 = sol_log_(r1, r2)
    exit

function_6:
    exit

[...]

function_1238:
    exit

function_1239:
    exit

Despite the empty bytecode, sol-azy correctly disassembled the code structure. Notably, it reports the incorrect string "Goodbye, OZ!" as being referenced, indicating that it only processed relocations applied to the .text section and did not apply modifications to the .rodata section. This is a meaningful gap for any security tool relying on disassembly alone to assess program behavior.

Running the Program

With this analysis complete, we can finally test our program using the Mollusk library, a lightweight harness for testing Solana programs locally against the SVM. We'll use the test automatically generated by the sbpf tool when we created our initial file:

#[cfg(test)]
mod tests {
    use mollusk_svm::{result::Check, Mollusk};
    use solana_sdk::pubkey::Pubkey;
    use solana_sdk::instruction::Instruction;

    #[test]
    fn test_rop() {
        let program_id_keypair_bytes = std::fs::read("deploy/rop-keypair.json").unwrap()
            [..32]
            .try_into()
            .expect("slice with incorrect length");
        let program_id = Pubkey::new_from_array(program_id_keypair_bytes);

        let instruction = Instruction::new_with_bytes(
            program_id,
            &[],
            vec![]
        );

        let mollusk = Mollusk::new(&program_id, "deploy/rop");

        let result = mollusk.process_and_validate_instruction(
            &instruction,
            &[],
            &[Check::success()]
        );
        assert!(!result.program_result.is_err());
    }
}

Running the test produces the following result, confirming that relocation processing alone is sufficient to inject and execute arbitrary bytecode:

running 1 test
[2026-02-05T12:53:25.557367815Z DEBUG solana_runtime::message_processor::stable_log] Program 78ydmiDP62f4WKFVCrQjuTFg4dKCvA9YVBFDsLtfhJoh invoke [1]
[2026-02-05T12:53:25.567241986Z DEBUG solana_runtime::message_processor::stable_log] Program log: Hello, OZ!
[2026-02-05T12:53:25.575017416Z DEBUG solana_runtime::message_processor::stable_log] Program 78ydmiDP62f4WKFVCrQjuTFg4dKCvA9YVBFDsLtfhJoh consumed 104 of 1400000 compute units
[2026-02-05T12:53:25.575093290Z DEBUG solana_runtime::message_processor::stable_log] Program 78ydmiDP62f4WKFVCrQjuTFg4dKCvA9YVBFDsLtfhJoh success
test tests::test_rop ... ok

Conclusion

This post presented the Relocation-Oriented Programming (ROP) technique, which enables dynamic modification of Solana program bytecode and data solely through relocation processing.

Practical Limitations

While this technique can be applied to various Solana programs, it comes with significant space overhead. In our example, the original program's ELF file was approximately 1 KB, while the final version ballooned to approximately 411 KB—resulting in roughly 330 bytes per instruction instead of the standard 8 bytes. Although this overhead could be substantially reduced by using the R_BPF_64_64 relocation type (which allows writing 4-byte words at a time and eliminates the need for numerous dummy functions), this technique should be viewed as an interesting curiosity rather than a serious programming approach.

It's also worth noting that SBPF v3 and later do not use relocation processing, so the technique applies only to programs targeting SBPF v0, v1, or v2.

Security Implications

The technique may be impractical and potentially short-lived, but its security implications deserve serious attention, because it lets a program behave differently from what its stored bytecode and data suggest. Code modifications may be visible in disassemblers, but data modifications need not be, as we demonstrated. Consider the following scenario:

  1. A malicious program transfers a given amount of SOL from the caller to itself.
  2. It then transfers twice that amount back — but only if a certain byte in its .rodata section equals 1.
  3. An auditor examines the disassembly, reads the data at the referenced offset, and concludes the program is safe.
  4. A single malicious relocation changes that byte, completely subverting the program's behavior.

This is precisely the kind of attack vector that requires security assessments to go beyond high-level program logic and examine low-level execution details in full. Comprehensive analysis — including relocation processing, bytecode verification, and runtime behavior — is the standard for auditing onchain programs where real value is at stake.
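One way to catch the scenario above is to compare the bytes stored onchain with the image after relocations have been applied. The following is a minimal sketch, assuming you have already obtained both buffers (for example, by running a loader in a test harness); `changed_offsets` is a hypothetical helper, not part of any Solana tooling.

```rust
/// Hypothetical helper: returns every offset at which the post-relocation
/// image differs from the bytes stored onchain.
/// If the buffers differ in length, only the common prefix is compared.
pub fn changed_offsets(on_chain: &[u8], loaded: &[u8]) -> Vec<usize> {
    on_chain
        .iter()
        .zip(loaded)
        .enumerate()
        .filter(|(_, (before, after))| before != after)
        .map(|(offset, _)| offset)
        .collect()
}
```

If the offset of the flag byte from step 2 shows up in the returned list, the auditor's conclusion from step 3 no longer holds, because it was based on a value the program never sees at runtime.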

FAQs

What is Relocation-Oriented Programming (ROP) on Solana?

Relocation-Oriented Programming (ROP) on Solana is a technique that exploits the ELF relocation processing step, which occurs before program execution, to overwrite a program's bytecode and data with arbitrary content. Because the SBPF VM's relocation handler writes to any offset within the ELF file without verifying whether the target is a legitimate instruction operand, an attacker can craft relocation entries that inject an entirely different program at load time. The technique applies to SBPF v0, v1, and v2, and is not possible in SBPF v3 and later.

How does ELF relocation processing work in Solana programs?

When a Solana program is loaded, the runtime processes relocation entries embedded in the ELF binary before execution begins. Each entry specifies an offset in the file and a formula for computing the value to write there — typically to patch placeholder values left by the compiler for function addresses or syscall identifiers. Solana supports three relocation types: R_BPF_64_RELATIVE for anonymous data, R_BPF_64_64 for named symbols, and R_BPF_64_32 for function call targets resolved via Murmur3 hashing. The security issue arises because the runtime does not validate whether a relocation target is a legitimate instruction operand, allowing writes to arbitrary bytecode locations.
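To make the missing validation concrete, here is a deliberately simplified model of an unchecked relocation write. It is a sketch and not the sbpf crate's implementation: the `Relocation` struct and `apply_relocations` function are hypothetical, and every entry is reduced to "write this 32-bit little-endian value at this file offset".

```rust
/// Hypothetical, simplified relocation entry: write `value` at byte `offset`.
#[derive(Debug, Clone, Copy)]
pub struct Relocation {
    pub offset: usize,
    pub value: u32,
}

/// Applies each entry as a 4-byte little-endian write into `image`.
/// The only check is that the write stays inside the buffer; nothing
/// verifies that the target is an instruction operand.
pub fn apply_relocations(image: &mut [u8], relocs: &[Relocation]) -> Result<(), String> {
    for r in relocs {
        let end = r
            .offset
            .checked_add(4)
            .ok_or_else(|| String::from("relocation offset overflows"))?;
        let slot = image
            .get_mut(r.offset..end)
            .ok_or_else(|| format!("relocation at offset {} is out of bounds", r.offset))?;
        slot.copy_from_slice(&r.value.to_le_bytes());
    }
    Ok(())
}
```

In this model a bounds check is the only gate, so an entry pointing into the middle of an instruction or into read-only data is applied just like one that patches a call target.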

Can a Solana program's behavior differ from what its bytecode shows?

Yes — in SBPF v0, v1, and v2. Because relocation processing occurs after the ELF file is stored onchain but before execution, a program's effective bytecode at runtime can differ entirely from what is stored in the .text section. Disassemblers that process only pre-relocation bytecode — or that apply relocations selectively — may report a program's behavior inaccurately. As demonstrated, sol-azy reported the wrong string reference because it did not apply .rodata modifications from relocations. This gap has direct implications for the reliability of static analysis tools used in security assessments.

What are the security risks of Relocation-Oriented Programming (ROP) in Solana smart contract audits?

Relocation-Oriented Programming represents a meaningful obfuscation and code-injection risk for Solana programs running SBPF v0/v1/v2. A malicious program could appear benign under static analysis while using crafted relocation entries to inject entirely different logic at load time — including logic that modifies program data in ways invisible to standard disassemblers. Security assessments of Solana programs must account for this by analyzing the full relocation table, verifying post-relocation bytecode, and not relying solely on disassembly output or source-level review to determine runtime behavior.
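As a starting point for relocation-aware review, the sketch below triages relocation targets. It is a heuristic under stated assumptions, not a complete detector: `classify` and `Finding` are hypothetical names, offsets are assumed to be already translated to file offsets that point at the start of an instruction, and we assume benign relocations inside .text only target `lddw` (opcode 0x18) or `call` (opcode 0x85) instructions.

```rust
use std::ops::Range;

const INSN_SIZE: usize = 8; // every SBPF instruction slot is 8 bytes
const OP_LDDW: u8 = 0x18; // load 64-bit immediate
const OP_CALL: u8 = 0x85; // call immediate

#[derive(Debug, PartialEq, Eq)]
pub enum Finding {
    /// Target is an aligned `lddw` or `call` slot inside .text.
    ExpectedOperand,
    /// Target is inside .text but misaligned or on another opcode.
    SuspiciousTextWrite,
    /// Target is outside .text (for example .rodata); review manually.
    OutsideText,
}

/// Classifies one relocation target against the pre-relocation ELF bytes.
/// ASSUMPTION: `text` is the file-offset range of the .text section.
pub fn classify(elf: &[u8], text: Range<usize>, r_offset: usize) -> Finding {
    if !text.contains(&r_offset) {
        return Finding::OutsideText;
    }
    let aligned = (r_offset - text.start) % INSN_SIZE == 0;
    match elf.get(r_offset) {
        Some(&op) if aligned && (op == OP_LDDW || op == OP_CALL) => Finding::ExpectedOperand,
        _ => Finding::SuspiciousTextWrite,
    }
}
```

Because earlier relocations can rewrite the opcode that a later one is checked against, a triage like this only narrows down where to look. It does not replace inspecting the fully relocated image.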

Has this vulnerability been fixed in newer versions of Solana's runtime?

Yes, for newer programs. Programs targeting SBPF v3 or later do not use relocation processing, which eliminates the Relocation-Oriented Programming attack vector described here. However, programs compiled for SBPF v0, v1, or v2 remain subject to this behavior for as long as they are deployed and those versions are supported by the runtime. Security teams should confirm which SBPF version a program targets as part of any assessment, and treat SBPF v0/v1/v2 programs as requiring relocation-aware analysis.
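A quick version check could look like the sketch below. Reading `e_flags` at byte offset 48 of a little-endian ELF64 header is standard ELF; that the SBPF version number is stored directly in that field (0 for v0, 1 for v1, and so on) is an assumption you should confirm against the loader you are targeting. Both function names are hypothetical.

```rust
/// Reads the 32-bit little-endian `e_flags` field of an ELF64 header.
/// Returns `None` if the buffer is not a little-endian ELF64 file
/// or is too short to contain the field.
pub fn elf64_e_flags(elf: &[u8]) -> Option<u32> {
    // Magic bytes, then EI_CLASS == 2 (64-bit) and EI_DATA == 1 (little-endian).
    if !elf.starts_with(b"\x7fELF") || *elf.get(4)? != 2 || *elf.get(5)? != 1 {
        return None;
    }
    // e_flags follows e_ident(16), e_type(2), e_machine(2), e_version(4),
    // e_entry(8), e_phoff(8), and e_shoff(8), so it starts at offset 48.
    let b = elf.get(48..52)?;
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

/// ASSUMPTION: `e_flags` holds the SBPF version number directly, so any
/// value below 3 means relocations are still processed at load time.
pub fn needs_relocation_aware_analysis(elf: &[u8]) -> Option<bool> {
    elf64_e_flags(elf).map(|version| version < 3)
}
```

A result of `Some(true)` would mean the program belongs in the relocation-aware bucket, while `None` signals that the input could not be parsed and should be inspected by hand.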