Summary

Type: zkVM/Circuit
Timeline: From 2025-10-13 → 2025-11-14
Languages: PIL2

Findings
Total issues: 13 (0 resolved)
Critical: 1 (0 resolved) · High: 2 (0 resolved) · Medium: 1 (0 resolved) · Low: 3 (0 resolved)

Notes & Additional Information
4 notes raised (0 resolved)

 

Focus

OpenZeppelin performed a review of the 0xPolygonHermez/zisk repository at commit 5104c56.

Our review focused on the following files:

 state-machines
├── binary/pil
│   ├── binary.pil
│   ├── binary_add.pil
│   ├── binary_extension.pil
│   ├── binary_extension_table.pil
│   └── binary_table.pil
└── main/pil
    └── main.pil

System Overview

Zisk is a general-purpose, open-source zkVM that is designed for computing and proving programs. Zisk uses a custom ISA called Zisk ISA. While this ISA is custom, the Zisk team has created a transpiler that turns RISC-V instructions into Zisk instructions. Therefore, any source code that can be compiled to RISC-V can be proven by Zisk. For example, the Zisk team has recently shown promising results in proving Ethereum blocks.

Similar to other zkVMs, Zisk has an emulation step and a proving step. The emulation step in Zisk is implemented in Rust for compatibility and efficiency. The proving step requires taking the created execution trace from the emulation and satisfying the circuit constraints that define the Zisk ISA. While these constraints could have been written in various different circuit languages, Zisk has chosen to do so using the PIL2 language. The focus of our review was on these PIL2 constraints.

This review was divided into two scopes. The first scope contained all the PIL2 files in the binary folder. These have to do with operations such as MIN, MAX, LT, GT, EQ, ADD, SUB, LE, AND, OR, XOR, SLL, SRL, SRA, SE, etc. To ensure the correctness of these instructions, they are validated against a lookup table containing all the 1-byte (8-bit) combinations for each of these operations. Since Zisk works with 8-byte (64-bit) inputs, each operation between two inputs is looked up against this predetermined table byte-by-byte in order to determine the correct output. This portion of the review focused on the correctness of the predetermined table values, as well as the correctness of the constraints stitching these bytes together.

The second scope was centered around the main.pil file. This file takes each instruction from the program's execution trace and directs it to the different constraints for the miscellaneous instructions. The set of instructions constituting the program is trusted and taken from the ROM (read-only memory). The ROM is trusted as it is a public input and committed by the prover. main.pil also handles the reading and writing of registers where the inputs and outputs of the operations are stored. Similarly, this file is also responsible for directing memory writes and reads to the memory constraints, as well as maintaining the control flow of the program by updating the program counter.

One innovation by Zisk is to avoid creating a monolithic proof for the execution by splitting the execution into smaller segments and generating a proof for each segment. The main.pil file provides constraints in order to guarantee consistency between the end of one segment and the beginning of another, as well as between the different steps belonging to a segment. main.pil also has the option to prove execution traces that use a stack. However, since this part of the file is not currently intended for production, it was excluded from the review scope following discussions with the Zisk team. As such, main.pil was fully reviewed except for the parts related to the stack.

Open-Source Reviews

This review is part of an effort from OpenZeppelin to help secure young and promising projects with a strong open-source DNA, aligning with our mission to help protect the open economy.

 

Critical Severity

Incorrect Binary Opcode Evaluations

The has_initial_carry witness is unconstrained. This allows all relevant binary opcodes to be executed with the opposite carry bit, creating an off-by-one error. In particular, when processing the least-significant byte, the carry bit is set to has_initial_carry * INITIAL_CARRY_LT_ABS. For the LT_ABS family of opcodes, the carry bit is used when negating the negative input and should be set to 1. If it is set to 0, the negation will compute an incorrectly decremented value, possibly resulting in an incorrect comparison.

On the other hand, all other opcodes should start with an initial carry bit of 0. If has_initial_carry is set to the inverse of INITIAL_CARRY_LT_ABS, it will cause the initial carry bit to be set to 1, which could result in an incorrect computation.

Consider constraining has_initial_carry so it is 1 for the LT_ABS family of opcodes and 0 otherwise.

High Severity

Incorrect Binary Table for LE[U] Opcodes

The LE[U] section of the binary table is used to confirm that a <= b, where a and b are two 8-byte numbers. Like all records in this table, it operates on exactly one byte of a and b. Thus, the overall result should be externally determined by combining the various 1-byte results.

When the most significant bytes of a and b are equal, the table asserts that the condition succeeds, which means that a is not greater than b. However, the result actually depends on whether the previous lookup (for the second-most-significant byte) concluded that a was not greater than b. For example, this table would conclude that a <= b is true when a = 0x0002000000000000 and b = 0x0001000000000000.

Consider introducing a carry bit to account for the aforementioned case.

Invalid Last-Carry Correction in 32-bit Mode

The binary comparison operations output a boolean, which requires setting the least-significant byte. Since the result is not known until the most-significant byte is processed, the result is saved in the most-significant byte using the binary table, and the output is then corrected in the binary processor.

However, this technique has an unnecessary step and is implemented incorrectly for 32-bit opcodes. In particular, the result is written to byte 3 but the correction is applied to byte 7. As the last carry already encodes the output boolean, there is no need to save and then clear the result.

For operations that use the last carry as the result, consider simply outputting 0 at every step of the binary table and then correcting this by adding the final cout to the least significant bit.

Medium Severity

Asymmetric Read and Store: Store Ignores High 32 Bits

main.pil computes the indirect load address addr1 as b_offset_imm0 + b_src_ind * (a[0] + 2**32 * a[1]). However, it computes the indirect store address addr2 as store_offset + store_ind * a[0], ignoring a[1].

Using different address computations for loads and stores allows pointers to be truncated on stores while loads use the full 64-bit address. This means that reading and writing when a exceeds 32 bits can occur at two different addresses.

Consider computing addr2 for indirect stores as store_offset + store_ind * (a[0] + 2**32 * a[1]) to match the load path, including any stack offset logic. Alternatively, consider limiting addresses to 32 bits when reading.

Low Severity

Missing ; Causes Compilation Error

In binary_extension_table.pil, multiple lines are missing a ; at the end, such as line 144, line 156, and line 168. This results in a compilation error since the PIL2 language requires that all statements conclude with a ;.

Consider adding ; at the end of each affected line to ensure successful code compilation.

Missing and Misleading Documentation

Throughout the codebase, multiple instances of missing and/or misleading documentation were identified:

  • The binary_add.pil file does not have any documentation specifying the expected size of the witnesses. However, it expects a, b, and c_chunk to be of a certain size, otherwise causing this constraint to fail.
  • The GT operation in the prepopulated binary_table.pil only works for signed comparisons. Consider documenting that this should not be used for unsigned comparisons.
  • Line 71 of binary_table.pil could use a comment // EXT_32 to specify which opcode it is used for.
  • This comment says "...and c[input_chunks-1] = cout", whereas it should say "...and c[input_chunks-1] = cout * factor."
  • In line 75 of binary_extension.pil, the comment suggests that in2_low is either 5 or 6 bits. However, what is meant is that only the least-significant 5 or 6 bits are relevant to the table.
  • In line 318 of main.pil, the comment says "greater than", whereas it should say "before."
  • Consider using numeric separators (_) to improve visual clarity throughout the codebase in large values, such as these.
  • The comment in line 4 of binary_extension.pil uses the word "standard", whereas it should use "extended".
  • The comments in lines 137-138 of main.pil should read "[is_external_op == 0 && op == 0] ==> c = 0, flag = 1" and "[is_external_op == 0 && op == 1] ==> b = c, flag = 0."
  • The comments in lines 159-160 of main.pil are flipped.

Consider adding the missing comments and revising the aforementioned ones to improve consistency and more accurately reflect the implemented logic. Doing so will make it easier for auditors and other parties examining the code to understand what each section of code is designed to do.

Duplicate Rows in Precomputed Lookup Table

The EXT_32 opcode portion of the binary lookup table is divided into four parts, differentiated by index_offset values of 0, 1, 2, 3.

  • Index offset 0 results in op_is_min_max = 0 and result_is_a = 0, generating a FLAGS value of 0.
  • Index offset 1 results in op_is_min_max = 1 and result_is_a = 0, resulting in a FLAGS value of 2.
  • Index offset 3 results in op_is_min_max = 1 and result_is_a = 1, resulting in a FLAGS value of 6.

However, index offset 2 results in op_is_min_max = 0 and result_is_a = 0, which is exactly the same as index offset 0. The lookup values are thus identical and unnecessary.

Consider removing the aforementioned extra rows to improve storage and computing performance.

Notes & Additional Information

Issues With Constants

Throughout the codebase, multiple instances were identified where constants could be used, better defined, or removed.

For instance, the following constants are unused:

Consider removing the above-listed constants for improved code clarity.

In the following instances, constants could be used instead of unexplained literals ("magic numbers"):

Consider defining and using the constants to avoid the use of magic numbers, which can cause issues and confusion when modifying code in the future.

In the following instance, a constant could be defined in all uppercase for improved code clarity:

Consider defining all constants in uppercase for improved code clarity.

Code Simplifications

Throughout the codebase, multiple opportunities for code simplification were identified:

  • Line 21 of binary_add.pil could be used to simplify the previous line.
  • Lines 118-122 of binary_table.pil could be simplified using the ternary operator.
  • This loop in binary.pil is broken into four mutually exclusive and sequentially executed blocks. Instead, it could be refactored into two different loops (where bytes 0, 3, and 7 are handled as individual statements).
  • There are multiple places, such as in line 98 of binary_extension_table.pil, where non-binary values are treated as true. While anything greater than 0 is true, it could be clearer to explicitly state this.
  • There are instances in main.pil where stack_enabled is tested as a boolean, whereas in other parts, it is explicitly compared with 1. This means that if a value is set to a value greater than 1, it will be treated as true and false in different parts of the same main instance. Consider defining the if statements consistently.
  • The SE_W opcode does not explicitly handle the case when offset > 3. Consider explicitly handling this case for consistency with the logic for other opcodes such as this and this.
  • The OP_IS_SHIFT column is defined in binary_extension_table.pil, but it is unused until it is used by binary_extension.pil. Consider removing this column from the table and handling this logic in binary_extension.pil instead.

Consider revising the aforementioned code to improve consistency and simplify the implemented logic. Doing so will make it easier for auditors and other parties examining the code to understand what each section of code is designed to do.

Consistency of Ternary Operator

In the GT opcode, the ternary operator used for signed comparison reads c = (b & 0x80) ? 1 : 0;. However, for all other signed comparisons in the file, such as the one on this line, the condition is based on a & 0x80.

Considering updating this line of code to read c = (a & 0x80) ? 0 : 1; for consistency and ease of reasoning. This would be particularly beneficial for future-proofing the system if refactoring is needed.

Typographical Errors

Throughout the codebase, multiple instances of typographical errors were identified:

  • In line 33 of main.pil, "inicial_pc" should be "initial_pc".
  • In line 113 and line 128 of main.pil, "a immediate" should be "an immediate".
  • In line 131 of main.pil, "form" should be "from".
  • In line 198 of main.pil, "de" should be "the".
  • In line 225 of main.pil, "sizeo_of_instance" should be "size_of_instance".
  • In line 230 of main.pil, "máximum" should be "maximum".
  • In line 423 of main.pil, "isn't send to closed the cycle" should be "is not sent to close the cycle".
  • In line 432 of main.pil, "it will the previous access" should be "it will be the previous access".
  • In line 433 of main.pil, "must not be send" should be "must not be sent".
  • In line 510 of main.pil, "these constraints that defines" should be "these constraints define".

Consider correcting the identified errors to improve the clarity and maintainability of the codebase.

Client Reported

Inconsistent Two-Bit Carry

The LT_ABS family of opcodes uses two carry bits to process each byte. However, it assumes a single carry bit when constructing the flags bitmap. This results in the operation carry being incorrectly interpreted as the op_is_min_max flag.

Consider extending the flags bitmap so that there is a dedicated position for the "operation carry" flag.

Program Counter Not Constrained at Start of Segment

The segmentation-continuation logic ensures (1, 2) that the segment_initial_pc value is consistent with the previous segment. However, the first pc value in the segment is not constrained to match this value. This implies that the prover can introduce an invalid jump between segments.

Consider constraining the first pc value in each segment to equal segment_initial_pc.

Conclusion

Zisk is a novel, open-source zkVM with the ability to prove any program that can compile to RISC-V. As the constraints were written in a new programming language, PIL2, particular care was given to learning this language and ensuring that the appropriate constraints were in place.

The review identified multiple issues, with severities ranging from notes to critical. Most notably, there was an underconstrained subcircuit and some incorrect logic for the binary table.

The Zisk team is appreciated for their efforts in maintaining an open-source codebase, for answering our questions in a prompt and clear manner, and for taking the time to walk us through their system. Such efforts result in a more resilient codebase, and the team is glad to have collaborated with them on this milestone.

Appendix

Issue Classification

OpenZeppelin classifies smart contract vulnerabilities on a 5-level scale:

  • Critical
  • High
  • Medium
  • Low
  • Note/Information

Critical Severity

This classification is applied when the issue’s impact is catastrophic, threatening extensive damage to the client's reputation and/or causing severe financial loss to the client or users. The likelihood of exploitation can be high, warranting a swift response. Critical issues typically involve significant risks such as the permanent loss or locking of a large volume of users' sensitive assets or the failure of core system functionalities without viable mitigations. These issues demand immediate attention due to their potential to compromise system integrity or user trust significantly.

High Severity

These issues are characterized by the potential to substantially impact the client’s reputation and/or result in considerable financial losses. The likelihood of exploitation is significant, warranting a swift response. Such issues might include temporary loss or locking of a significant number of users' sensitive assets or disruptions to critical system functionalities, albeit with potential, yet limited, mitigations available. The emphasis is on the significant but not always catastrophic effects on system operation or asset security, necessitating prompt and effective remediation.

Medium Severity

Issues classified as being of medium severity can lead to a noticeable negative impact on the client's reputation and/or moderate financial losses. Such issues, if left unattended, have a moderate likelihood of being exploited or may cause unwanted side effects in the system. These issues are typically confined to a smaller subset of users' sensitive assets or might involve deviations from the specified system design that, while not directly financial in nature, compromise system integrity or user experience. The focus here is on issues that pose a real but contained risk, warranting timely attention to prevent escalation.

Low Severity

Low-severity issues are those that have a low impact on the client's operations and/or reputation. These issues may represent minor risks or inefficiencies to the client's specific business model. They are identified as areas for improvement that, while not urgent, could enhance the security and quality of the codebase if addressed.

Notes & Additional Information Severity

This category is reserved for issues that, despite having a minimal impact, are still important to resolve. Addressing these issues contributes to the overall security posture and code quality improvement but does not require immediate action. It reflects a commitment to maintaining high standards and continuous improvement, even in areas that do not pose immediate risks.