Summary
Type: zkVM/Circuit
Timeline: 2025-10-13 → 2025-11-14
Languages: PIL2
Findings
Total issues: 13 (0 resolved)
Critical: 1 (0 resolved) · High: 2 (0 resolved) · Medium: 1 (0 resolved) · Low: 3 (0 resolved)
Notes & Additional Information
4 notes raised (0 resolved)
OpenZeppelin performed a review of the 0xPolygonHermez/zisk repository at commit 5104c56.
Our review focused on the following files:
state-machines
├── binary/pil
│ ├── binary.pil
│ ├── binary_add.pil
│ ├── binary_extension.pil
│ ├── binary_extension_table.pil
│ └── binary_table.pil
└── main/pil
└── main.pil
Zisk is a general-purpose, open-source zkVM designed for computing and proving programs. Zisk uses a custom ISA, the Zisk ISA, and the Zisk team has created a transpiler that converts RISC-V instructions into Zisk instructions. Therefore, any source code that can be compiled to RISC-V can be proven by Zisk. For example, the Zisk team has recently shown promising results in proving Ethereum blocks.
Similar to other zkVMs, Zisk has an emulation step and a proving step. The emulation step in Zisk is implemented in Rust for compatibility and efficiency. The proving step requires taking the created execution trace from the emulation and satisfying the circuit constraints that define the Zisk ISA. While these constraints could have been written in various different circuit languages, Zisk has chosen to do so using the PIL2 language. The focus of our review was on these PIL2 constraints.
This review was divided into two scopes. The first scope contained all the PIL2 files in the binary folder, which implement operations such as MIN, MAX, LT, GT, EQ, ADD, SUB, LE, AND, OR, XOR, SLL, SRL, SRA, SE, etc. To ensure the correctness of these instructions, they are validated against a lookup table containing all the 1-byte (8-bit) combinations for each operation. Since Zisk works with 8-byte (64-bit) inputs, each operation between two inputs is looked up against this predetermined table byte-by-byte to determine the correct output. This portion of the review focused on the correctness of the predetermined table values, as well as the correctness of the constraints stitching these bytes together.
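As an illustration of this byte-by-byte lookup scheme, the sketch below shows how a 64-bit bitwise operation can be verified against a table of all 1-byte input pairs. The table layout and function names are hypothetical simplifications, not Zisk's actual structures:

```python
def build_and_table():
    # table[(a_byte, b_byte)] = a_byte & b_byte for every 8-bit input pair,
    # analogous to a prepopulated lookup table for one binary opcode
    return {(a, b): a & b for a in range(256) for b in range(256)}

def and64_via_table(a, b, table):
    # Split the 64-bit inputs into 8 bytes, look each byte pair up in the
    # 1-byte table, and stitch the results back into a 64-bit output.
    out = 0
    for i in range(8):
        a_byte = (a >> (8 * i)) & 0xFF
        b_byte = (b >> (8 * i)) & 0xFF
        out |= table[(a_byte, b_byte)] << (8 * i)
    return out
```

Operations with cross-byte dependencies (such as ADD or LT) additionally thread a carry between consecutive byte lookups, which is where several of the findings below arise.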
The second scope was centered around the main.pil file. This file takes each instruction from the program's execution trace and directs it to the different constraints for the miscellaneous instructions. The set of instructions constituting the program is trusted and taken from the ROM (read-only memory). The ROM is trusted as it is a public input and committed by the prover. main.pil also handles the reading and writing of registers where the inputs and outputs of the operations are stored. Similarly, this file is also responsible for directing memory writes and reads to the memory constraints, as well as maintaining the control flow of the program by updating the program counter.
One innovation by Zisk is to avoid creating a monolithic proof for the execution by splitting the execution into smaller segments and generating a proof for each segment. The main.pil file provides constraints in order to guarantee consistency between the end of one segment and the beginning of another, as well as between the different steps belonging to a segment. main.pil also has the option to prove execution traces that use a stack. However, since this part of the file is not currently intended for production, it was excluded from the review scope following discussions with the Zisk team. As such, main.pil was fully reviewed except for the parts related to the stack.
This review is part of an effort from OpenZeppelin to help secure young and promising projects with a strong open-source DNA, aligning with our mission to help protect the open economy.
The has_initial_carry witness is unconstrained. This allows all relevant binary opcodes to be executed with the opposite carry bit, creating an off-by-one error. In particular, when processing the least-significant byte, the carry bit is set to has_initial_carry * INITIAL_CARRY_LT_ABS. For the LT_ABS family of opcodes, the carry bit is used when negating the negative input and should be set to 1. If it is set to 0, the negation will compute an incorrectly decremented value, possibly resulting in an incorrect comparison.
On the other hand, all other opcodes should start with an initial carry bit of 0. If has_initial_carry is set to the inverse of INITIAL_CARRY_LT_ABS, it will cause the initial carry bit to be set to 1, which could result in an incorrect computation.
Consider constraining has_initial_carry so it is 1 for the LT_ABS family of opcodes and 0 otherwise.
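To see why the initial carry matters, consider negating a negative input byte-by-byte. In two's complement, -x = ~x + 1, and the "+1" enters as the carry into the least-significant byte; if that carry is forced to 0, the negation yields ~x, an off-by-one result. A minimal Python sketch (illustrative only, not the actual constraints):

```python
def negate_bytewise(x, initial_carry):
    # Negate a 64-bit value byte-by-byte: complement each byte and add the
    # carry, propagating it upward. A correct negation needs initial_carry = 1.
    carry = initial_carry
    out = 0
    for i in range(8):
        byte = (x >> (8 * i)) & 0xFF
        s = (byte ^ 0xFF) + carry      # complement the byte, add carry in
        out |= (s & 0xFF) << (8 * i)
        carry = s >> 8                 # propagate carry to the next byte
    return out
```

With the carry left unconstrained, a prover could choose initial_carry = 0 and obtain a value one less than the true absolute value, corrupting the subsequent comparison.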
LE[U] Opcodes
The LE[U] section of the binary table is used to confirm that a <= b, where a and b are two 8-byte numbers. Like all records in this table, it operates on exactly one byte of a and b. Thus, the overall result should be externally determined by combining the various 1-byte results.
When the most significant bytes of a and b are equal, the table asserts that the condition succeeds, which means that a is not greater than b. However, the result actually depends on whether the previous lookup (for the second-most-significant byte) concluded that a was not greater than b. For example, this table would conclude that a <= b is true when a = 0x0002000000000000 and b = 0x0001000000000000.
Consider introducing a carry bit to account for the aforementioned case.
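The following sketch models the rule described above (a hypothetical simplification, not the actual table): deciding a <= b from the most significant byte alone, with equal bytes treated as success, discards the verdict from the lower bytes:

```python
def le_msb_only(a, b):
    # Buggy rule sketched from the finding: compare only the top bytes of the
    # 64-bit inputs and treat equality as "a is not greater than b".
    return ((a >> 56) & 0xFF) <= ((b >> 56) & 0xFF)
```

For the example in the finding, both most-significant bytes are 0x00, so the rule reports a <= b even though a > b; a carry from the lower-byte lookups would disambiguate the two cases.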
The binary comparison operations output a boolean, which requires setting the least-significant byte. Since the result is not known until the most-significant byte is processed, the result is saved in the most-significant byte using the binary table, and the output is then corrected in the binary processor.
However, this technique has an unnecessary step and is implemented incorrectly for 32-bit opcodes. In particular, the result is written to byte 3 but the correction is applied to byte 7. As the last carry already encodes the output boolean, there is no need to save and then clear the result.
For operations that use the last carry as the result, consider simply outputting 0 at every step of the binary table and then correcting this by adding the final cout to the least significant bit.
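The mismatch can be modeled with a simplified, hypothetical sketch of the save-and-correct technique (not the actual constraints): the 32-bit path saves the boolean in byte 3, but the correction clears byte 7, leaving a stale copy behind:

```python
def correct_result_64bit(last_carry):
    # 64-bit path: result saved in byte 7 (the MSB), then byte 7 is cleared
    # and the final cout is added back at bit 0 -- net effect is correct
    c = last_carry << (8 * 7)
    c &= ~(0xFF << (8 * 7))
    return c + last_carry

def correct_result_32bit(last_carry):
    # 32-bit path: result saved in byte 3 (the 32-bit MSB), but the
    # correction still clears byte 7, so the stale copy in byte 3 survives
    c = last_carry << (8 * 3)
    c &= ~(0xFF << (8 * 7))
    return c + last_carry
```

As the finding notes, the save-and-clear round trip is unnecessary in the first place: outputting 0 at every step and adding the final cout at bit 0 yields the boolean directly.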
main.pil computes the indirect load address addr1 as b_offset_imm0 + b_src_ind * (a[0] + 2**32 * a[1]). However, it computes the indirect store address addr2 as store_offset + store_ind * a[0], ignoring a[1].
Using different address computations for loads and stores allows pointers to be truncated on stores while loads use the full 64-bit address. This means that reading and writing when a exceeds 32 bits can occur at two different addresses.
Consider computing addr2 for indirect stores as store_offset + store_ind * (a[0] + 2**32 * a[1]) to match the load path, including any stack offset logic. Alternatively, consider limiting addresses to 32 bits when reading.
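A minimal sketch of the divergence, using the limb layout from the finding (a is split into 32-bit limbs a[0] and a[1]; the function names are illustrative, not main.pil identifiers):

```python
def load_addr(b_offset_imm0, b_src_ind, a):
    # Indirect load: the full 64-bit pointer is reconstructed from both limbs.
    return b_offset_imm0 + b_src_ind * (a[0] + 2**32 * a[1])

def store_addr(store_offset, store_ind, a):
    # Indirect store: the upper limb a[1] is ignored, truncating the pointer.
    return store_offset + store_ind * a[0]
```

Whenever a[1] is nonzero, a load and a store through the same pointer resolve to two different addresses, so a round trip through memory is no longer guaranteed to observe the written value.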
Missing ; Causes Compilation Error
In binary_extension_table.pil, multiple lines are missing a ; at the end, such as line 144, line 156, and line 168. This results in a compilation error, since the PIL2 language requires that all statements conclude with a ;.
Consider adding ; at the end of each affected line to ensure successful code compilation.
Throughout the codebase, multiple instances of missing and/or misleading documentation were identified:
- The binary_add.pil file does not have any documentation specifying the expected size of the witnesses. However, it expects a, b, and c_chunk to be of a certain size, otherwise causing this constraint to fail.
- The GT operation in the prepopulated binary_table.pil only works for signed comparisons. Consider documenting that this should not be used for unsigned comparisons.
- In binary_table.pil, a comment // EXT_32 could be added to specify which opcode it is used for.
- In binary_extension.pil, the comment suggests that in2_low is either 5 or 6 bits. However, what is meant is that only the least-significant 5 or 6 bits are relevant to the table.
- In main.pil, the comment says "greater than", whereas it should say "before".
- Consider using underscore separators (_) to improve visual clarity in large values, such as these.
- A comment in binary_extension.pil uses the word "standard", whereas it should use "extended".
- Comments in main.pil should read "[is_external_op == 0 && op == 0] ==> c = 0, flag = 1" and "[is_external_op == 0 && op == 1] ==> b = c, flag = 0".
- Some comments in main.pil are flipped.

Consider adding the missing comments and revising the aforementioned ones to improve consistency and more accurately reflect the implemented logic. Doing so will make it easier for auditors and other parties examining the code to understand what each section of code is designed to do.
The EXT_32 opcode portion of the binary lookup table is divided into four parts, differentiated by index_offset values of 0, 1, 2, 3.
- Index offset 0 results in op_is_min_max = 0 and result_is_a = 0, generating a FLAGS value of 0.
- Index offset 1 results in op_is_min_max = 1 and result_is_a = 0, resulting in a FLAGS value of 2.
- Index offset 3 results in op_is_min_max = 1 and result_is_a = 1, resulting in a FLAGS value of 6.

However, index offset 2 results in op_is_min_max = 0 and result_is_a = 0, which is exactly the same as index offset 0. The lookup values are thus identical and unnecessary.
Consider removing the aforementioned extra rows to improve storage and computing performance.
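The duplication can be reproduced with a small sketch. The bit positions below are inferred from the FLAGS values quoted above (op_is_min_max in bit 1, result_is_a in bit 2) and are an assumption, not the confirmed table layout:

```python
def flags(op_is_min_max, result_is_a):
    # Pack the two flags into the FLAGS bitmap; bit layout inferred from the
    # values 0, 2, and 6 quoted in the finding.
    return (op_is_min_max << 1) | (result_is_a << 2)

# Flag assignments per index_offset, as described in the finding.
rows = {offset: flags(m, r)
        for offset, (m, r) in {0: (0, 0), 1: (1, 0),
                               2: (0, 0), 3: (1, 1)}.items()}
```

Since offsets 0 and 2 map to the same flag pair, every lookup row generated for offset 2 is a byte-for-byte duplicate of the corresponding offset-0 row.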
Throughout the codebase, multiple instances were identified where constants could be used, better defined, or removed.
For instance, the following constants are unused:
Consider removing the above-listed constants for improved code clarity.
In the following instances, constants could be used instead of unexplained literals ("magic numbers"):
- In binary_extension_table.pil, the constant bits could be used instead of 64.
- In binary_extension_table.pil, a constant could be defined instead of 32.
- binary_extension.pil could use constants instead of 2**8, 2**16, and 2**24.
- binary_extension.pil could use a constant instead of 2**24 - 1.
- main.pil could use RC instead of 2.

Consider defining and using the constants to avoid the use of magic numbers, which can cause issues and confusion when modifying code in the future.
In the following instance, a constant could be defined in all uppercase for improved code clarity:
- In binary_extension_table.pil, bits and bytes could be written in uppercase.

Consider defining all constants in uppercase for improved code clarity.
Throughout the codebase, multiple opportunities for code simplification were identified:
- In binary_add.pil, one line could be used to simplify the previous line.
- A statement in binary_table.pil could be simplified using the ternary operator.
- A loop in binary.pil is broken into four mutually exclusive and sequentially executed blocks. Instead, it could be refactored into two different loops (where bytes 0, 3, and 7 are handled as individual statements).
- There are if statements in binary_extension_table.pil where non-binary values are treated as true. While anything greater than 0 is true, it could be clearer to explicitly state this.
- There is an instance in main.pil where stack_enabled is tested as a boolean, whereas in other parts, it is explicitly compared with 1. This means that a value greater than 1 will be treated as true and false in different parts of the same main instance. Consider defining the if statements consistently.
- The SE_W opcode does not explicitly handle the case when offset > 3. Consider explicitly handling this case for consistency with the logic for other opcodes.
- The OP_IS_SHIFT column is defined in binary_extension_table.pil but is only used by binary_extension.pil. Consider removing this column from the table and handling this logic in binary_extension.pil instead.

Consider revising the aforementioned code to improve consistency and simplify the implemented logic. Doing so will make it easier for auditors and other parties examining the code to understand what each section of code is designed to do.
In the GT opcode, the ternary operator used for signed comparison reads c = (b & 0x80) ? 1 : 0;. However, for all other signed comparisons in the file, such as the one on this line, the condition is based on a & 0x80.
Consider updating this line of code to read c = (a & 0x80) ? 0 : 1; for consistency and ease of reasoning. This would be particularly beneficial for future-proofing the system if refactoring is needed.
Throughout the codebase, multiple instances of typographical errors were identified:
- In main.pil, "inicial_pc" should be "initial_pc".
- In main.pil, "a immediate" should be "an immediate".
- In main.pil, "form" should be "from".
- In main.pil, "de" should be "the".
- In main.pil, "sizeo_of_instance" should be "size_of_instance".
- In main.pil, "máximum" should be "maximum".
- In main.pil, "isn't send to closed the cycle" should be "is not sent to close the cycle".
- In main.pil, "it will the previous access" should be "it will be the previous access".
- In main.pil, "must not be send" should be "must not be sent".
- In main.pil, "these constraints that defines" should be "these constraints define".

Consider correcting the identified errors to improve the clarity and maintainability of the codebase.
The LT_ABS family of opcodes uses two carry bits to process each byte. However, it assumes a single carry bit when constructing the flags bitmap. This results in the operation carry being incorrectly interpreted as the op_is_min_max flag.
Consider extending the flags bitmap so that there is a dedicated position for the "operation carry" flag.
The segmentation-continuation logic ensures (1, 2) that the segment_initial_pc value is consistent with the previous segment. However, the first pc value in the segment is not constrained to match this value. This implies that the prover can introduce an invalid jump between segments.
Consider constraining the first pc value in each segment to equal segment_initial_pc.
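The missing check can be expressed as a simple invariant over the execution trace. This is a conceptual sketch, not PIL2 constraints:

```python
def check_segment_boundaries(segments):
    # segments: list of (segment_initial_pc, [pc values executed in segment]).
    # segment_initial_pc is assumed to be already chained to the previous
    # segment's final pc; this function adds the constraint the finding says
    # is absent: the first executed pc must equal segment_initial_pc.
    for init_pc, pcs in segments:
        if pcs and pcs[0] != init_pc:
            return False
    return True
```

Without this invariant, a prover can carry a consistent segment_initial_pc across segments while actually resuming execution at an arbitrary pc.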
Zisk is a novel, open-source zkVM with the ability to prove any program that can compile to RISC-V. As the constraints were written in a new programming language, PIL2, particular care was given to learning this language and ensuring that the appropriate constraints were in place.
The review identified multiple issues, with severities ranging from notes to critical. Most notably, there was an underconstrained subcircuit and some incorrect logic for the binary table.
We appreciate the Zisk team for their efforts in maintaining an open-source codebase, for answering our questions in a prompt and clear manner, and for taking the time to walk us through their system. Such efforts result in a more resilient codebase, and we are glad to have collaborated with them on this milestone.
OpenZeppelin classifies smart contract vulnerabilities on a 5-level scale:
This classification is applied when the issue’s impact is catastrophic, threatening extensive damage to the client's reputation and/or causing severe financial loss to the client or users. The likelihood of exploitation can be high, warranting a swift response. Critical issues typically involve significant risks such as the permanent loss or locking of a large volume of users' sensitive assets or the failure of core system functionalities without viable mitigations. These issues demand immediate attention due to their potential to compromise system integrity or user trust significantly.
These issues are characterized by the potential to substantially impact the client’s reputation and/or result in considerable financial losses. The likelihood of exploitation is significant, warranting a swift response. Such issues might include temporary loss or locking of a significant number of users' sensitive assets or disruptions to critical system functionalities, albeit with potential, yet limited, mitigations available. The emphasis is on the significant but not always catastrophic effects on system operation or asset security, necessitating prompt and effective remediation.
Issues classified as being of medium severity can lead to a noticeable negative impact on the client's reputation and/or moderate financial losses. Such issues, if left unattended, have a moderate likelihood of being exploited or may cause unwanted side effects in the system. These issues are typically confined to a smaller subset of users' sensitive assets or might involve deviations from the specified system design that, while not directly financial in nature, compromise system integrity or user experience. The focus here is on issues that pose a real but contained risk, warranting timely attention to prevent escalation.
Low-severity issues are those that have a low impact on the client's operations and/or reputation. These issues may represent minor risks or inefficiencies to the client's specific business model. They are identified as areas for improvement that, while not urgent, could enhance the security and quality of the codebase if addressed.
This category is reserved for issues that, despite having a minimal impact, are still important to resolve. Addressing these issues contributes to the overall security posture and code quality improvement but does not require immediate action. It reflects a commitment to maintaining high standards and continuous improvement, even in areas that do not pose immediate risks.