Author: Zoe | Puzzle Ventures (Email: email@example.com | Twitter: @zoezts)
From the competition among public chains, to Danksharding in Ethereum’s roadmap, to layer 2 solutions like Optimistic Rollups and zk-rollups, we have been continuously discussing blockchain scalability: what should we do when a large number of users and funds arrive? In the series of articles to follow, I want to show you a future vision that consists of data access, off-chain computation, and on-chain verification.
Trustless Data Access + Off-chain Computation + On-chain Verification
“Proving Consensus” is an important part of this blueprint. This article explores the significance of using zero-knowledge proof of consensus on top of Ethereum’s PoS, including:
1. The importance of decentralization for EVM.
2. The importance of decentralized data access for web3 scalability.
Proving full consensus on the Ethereum mainnet is a complex task, but if we can achieve zkization of the consensus layer, it will help Ethereum’s scalability while ensuring security and trust, enhance the robustness of the entire Ethereum ecosystem, reduce participation costs, and allow more people to join.
I. Why does proving consensus matter?
1. Perspective from Ethereum
2. Perspective of Protocol Stacks on Ethereum
II. Where is Blockchain Data? Trust Assumptions for Different Data Sources
III. The Path to Prove Consensus Using ZK
1. Key Steps in Consensus Formation in Ethereum 2.0
2. Tech Stacks to Prove Consensus
3. The End Game: Diversified Level 1 zkEVM
IV. What is the Future?
V. Reference
I. Why does proving consensus matter?
Using zk to verify the consensus layer of Ethereum L1 makes sense in two main directions. First, it can compensate for the shortcomings of current node diversity and enhance the decentralization and security of Ethereum itself. Second, it provides a foundation of availability and security for various layers of protocols in the Ethereum ecosystem to face more users, including cross-chain security, trustless data access, decentralized oracles, and scalability.
1. Perspective from Ethereum
To achieve decentralization and robustness, Ethereum needs a diverse client environment. This means that more people, especially ordinary users, need to participate and run clients built on different codebases. However, it is not realistic to require every user to run a full node, because a full node needs resources that few people can afford, such as at least 16 GB of RAM and a fast SSD of 2+ TB, and these requirements are still growing.
The current goal is to achieve a light node that can provide the same level of trust as a full node (minimized trust) but with lower costs in terms of memory, storage, and bandwidth. However, currently, light nodes do not participate in the consensus process or are only partially protected by the consensus mechanism (Sync Committee).
This goal is referred to as “The Verge” in Ethereum’s roadmap.
Goal: verifying blocks should be super easy – download N bytes of data, perform a few basic computations, verify a SNARK and you’re done— The Verge on Ethereum’s Roadmap
“The Verge” aims to bridge the “client gap”. A key step is a trustless light node with the same level of security as today’s full nodes, which would allow more people to contribute actively to the decentralization and robustness of the network.
2. Perspective of Protocol Stacks on Ethereum
Starting from first principles, we need to solve the problem of combining on-chain data access with off-chain computation verification.
Currently, the use of on-chain data is relatively primitive and insufficient. In many cases, the data required for protocol adjustments is too complex to be computed on-chain, and the cost of obtaining data in a trustless manner is too high, because it requires access to a large amount of historical data and frequent numerical computation.
For individual users and projects, our ideal situation is to achieve decentralized, end-to-end trustless data transmission and read/write. Based on this, we should aim for as low computational costs as possible while considering security, usability, and economics to cater to future users.
Specifically, it includes the following aspects:
1. Decentralized and trustless oracles: Current protocols use centralized oracles to avoid direct access to a large amount of historical data on-chain, which adds unnecessary trust costs and reduces composability.
2. Data reading and writing for data-sensitive protocols and assets: For example, DeFi protocols need to adjust some parameters during operation. The open question is whether they can access historical data and perform more complex calculations trustlessly, such as adjusting AMM fees based on recent market volatility, designing on-chain derivative pricing models with dynamic adjustment, introducing machine learning methods for asset management, and adjusting lending interest rates based on market conditions.
3. Cross-chain security: Light node solutions based on zk technology are currently superior in terms of security, capital efficiency, statefulness, and diversity of information transmission. Current cross-chain solutions, such as Telepathy by Succinct and Polyhedra on LayerZero, rely on zk verification of block headers signed by the Sync Committee. However, the Sync Committee is not the Ethereum PoS consensus layer itself and relies on certain trust assumptions, leaving room for further improvement.
Currently, due to considerations such as economic costs, technical limitations, and user experience, developers often rely on centralized RPC servers such as Alchemy, Infura, and Ankr when using on-chain data.
II. Where is Blockchain Data? Trust Assumptions for Different Data Sources
Computation in the blockchain draws on two sources of data: on-chain data and off-chain data. The adjustment of DeFi protocol parameters mentioned earlier is one example of a calculation based on both.
Data Access, computation, proof and verification
There are two notable characteristics of reading, writing, and computing on-chain and off-chain data:
1. In order to achieve decentralization and security, it is best to verify the data we obtain, that is, “Don’t Trust, Verify”.
2. It often involves many complex and expensive computational processes.
If suitable technical solutions are not found, the above two points will affect the usability of the blockchain.
We can illustrate different ways of obtaining data through a simple example. Suppose you want to check your account balance, what would you do?
One of the safest ways is to run a full node yourself, check the Ethereum state stored locally, and retrieve the account balance from it.
Full Node Benchmark. Sync mode and client selection affect the disk space required. (Reference: https://ethereum.org/en/developers/docs/nodes-and-clients/run-a-node/; https://docs.google.com/presentation/d/1ZxEp6Go5XqTZxQFYTYYnzyd97JKbcXlA6O2s4RI9jr4/mobilepresent?pli=1&slide=id.g252bbdac496_0_109)
However, running a full node yourself is costly and requires maintenance. To save trouble, many people request data directly from centralized node operators. There is nothing wrong with doing so: it resembles how Web2 services work, and we have never seen these providers engage in malicious behavior. But it also means that we have to trust a centralized service provider, which adds a trust assumption to the overall security model.
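To make the trust assumption concrete, here is a minimal sketch of the balance lookup as a JSON-RPC exchange. `eth_getBalance` is the standard Ethereum JSON-RPC method; the helper names and the hard-coded sample response are assumptions made for illustration, and no real endpoint is contacted.

```python
import json

WEI_PER_ETHER = 10**18


def build_balance_request(address: str, block: str = "latest", request_id: int = 1) -> str:
    """Serialize an eth_getBalance JSON-RPC request body."""
    return json.dumps({
        "jsonrpc": "2.0",
        "method": "eth_getBalance",
        "params": [address, block],
        "id": request_id,
    })


def parse_balance_response(body: str) -> int:
    """Return the balance in wei from a JSON-RPC response body."""
    response = json.loads(body)
    if "error" in response:
        raise ValueError(f"RPC error: {response['error']}")
    # Quantities are hex-encoded strings in the JSON-RPC interface.
    return int(response["result"], 16)


# Hypothetical response body, standing in for what a provider would return.
sample_response = '{"jsonrpc": "2.0", "id": 1, "result": "0xde0b6b3a7640000"}'
balance_wei = parse_balance_response(sample_response)
print(balance_wei / WEI_PER_ETHER)  # 1.0
```

Nothing in this exchange proves that the returned number matches the chain state: the caller simply accepts whatever the provider sends back.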
To solve this problem, we can consider two solutions: reducing the cost of running nodes and finding a method to verify the trustworthiness of third-party data.
The main characteristics of light nodes include:
- Ideally, light nodes can run on mobile phones or embedded devices.
- Ideally, they can have the same functionality and security guarantees as full nodes.
- However, light nodes do not participate in the consensus process, or are only protected by a partial consensus mechanism, namely the Sync Committee.
The Sync Committee is the trust assumption of light nodes.
The Beacon Chain launched in December 2020. Before The Merge, in October 2021, it underwent a hard fork called Altair, one core purpose of which was to provide consensus support for light nodes. Unlike full PoS consensus, the Sync Committee is a much smaller group of validators (512), and it is randomly resampled at longer intervals (every 256 epochs, about 27 hours).
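The "about 27 hours" figure can be checked with a few lines of arithmetic. This sketch assumes the mainnet values of 12 seconds per slot and 32 slots per epoch; the constant and function names are chosen for illustration, and the validator count in the usage line is a made-up round number, not a measured one.

```python
SECONDS_PER_SLOT = 12                    # assumed mainnet slot time
SLOTS_PER_EPOCH = 32
EPOCHS_PER_SYNC_COMMITTEE_PERIOD = 256
SYNC_COMMITTEE_SIZE = 512


def sync_committee_period_hours() -> float:
    """Length of one sync committee period in hours."""
    seconds = SECONDS_PER_SLOT * SLOTS_PER_EPOCH * EPOCHS_PER_SYNC_COMMITTEE_PERIOD
    return seconds / 3600


def sync_committee_fraction(active_validators: int) -> float:
    """Share of the validator set that signs for light clients in one period."""
    return SYNC_COMMITTEE_SIZE / active_validators


print(round(sync_committee_period_hours(), 1))  # 27.3
print(sync_committee_fraction(700_000))         # illustrative validator count
```

The second function shows why the Sync Committee is only a partial trust anchor: a light client relies on a small sample of the validator set rather than on the full consensus.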
Light clients such as Helios and Succinct are taking steps toward solving the problem, but a light client is far from a fully verifying node: a light client merely verifies the signatures of a random subset of validators called the sync committee, and does not verify that the chain actually follows the protocol rules. To bring us to a world where users can actually verify that the chain follows the rules, we would have to do something different.
How will Ethereum’s multi-client philosophy interact with ZK-EVMs?, by Vitalik Buterin
This is why we need to verify the entire consensus layer of Ethereum, in order to welcome a future that is more secure, with stronger usability, diverse protocols, and widespread adoption. Currently, the best solution appears to be zero-knowledge (ZK) technology.
III. The Path to Prove Consensus Using ZK
In order to build a trustless environment, it is necessary to address the issues of light node credibility, decentralized data access, and off-chain computation verification. Zero-knowledge proofs are currently the most recognized core technology in these areas, which involve but are not limited to underlying solutions such as zkEVM, zkWASM, other zkVMs, zk Co-processors, etc.
Proving the consensus layer is an important part of this.
The PoS algorithms are very complex, and implementing them in a ZK way requires a lot of engineering work and architectural consideration. Let’s first break down the components.
1. Key Steps in Consensus Formation in Ethereum 2.0
(1) Validator-related algorithms
This includes the following steps:
- Becoming a validator: Validator candidates need to send 32 ETH to the deposit contract and wait for at least 16 hours to several days or weeks for the Beacon Chain to process and activate them as official validators. (Refer to the FAQ – Why does it take so long for a validator to be activated)
- Exercising validator duties: Involves random number and block proof algorithms.
- Exiting the validator role: A validator's exit can be voluntary or forced as a penalty (slashing) for misconduct. Validators can initiate an “exit” at any time, but the number of validators exiting per epoch is limited. If too many validators attempt to exit simultaneously, they are placed in a queue and must keep fulfilling their validation duties until their turn comes. After a successful exit, validators can withdraw their staked funds after 1/8 of an eek (an “eek” is 2048 epochs, about 9.1 days, so the delay is 256 epochs, about 27 hours).
(2) Random number-related algorithms
- Each epoch contains 32 slots. Two epochs in advance, all validators are randomly grouped into 32 committees. In the current epoch, they perform their duties and are responsible for consensus on each block.
- Each slot involves two roles: a proposer, who is randomly selected to propose the block, and the remaining committee members, who attest to it. Block construction itself can be handed to specialized builders, which separates transaction ordering and block construction from block proposal (see proposer/builder separation, PBS).
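The grouping described above can be sketched as a seeded shuffle that deals validators into one committee per slot. This is a deliberate simplification that follows the article's description: the real protocol derives its seed from RANDAO and uses a different shuffling and proposer-selection procedure, and every name here is hypothetical.

```python
import random

SLOTS_PER_EPOCH = 32


def assign_committees(validators: list[int], seed: int) -> list[list[int]]:
    """Shuffle validator indices with a seed and deal them into one committee per slot."""
    rng = random.Random(seed)
    shuffled = validators[:]
    rng.shuffle(shuffled)
    # Slot k receives every 32nd validator starting at position k.
    return [shuffled[slot::SLOTS_PER_EPOCH] for slot in range(SLOTS_PER_EPOCH)]


def pick_proposer(committee: list[int], seed: int, slot: int) -> int:
    """Pick one member of the committee as proposer, deterministically per (seed, slot)."""
    rng = random.Random(seed * SLOTS_PER_EPOCH + slot)
    return rng.choice(committee)


committees = assign_committees(list(range(320)), seed=7)
proposer = pick_proposer(committees[0], seed=7, slot=0)
print(len(committees), len(committees[0]))  # 32 10
```

Because the assignment is a pure function of the seed, anyone who knows the seed can recompute it, which is what makes duties verifiable after the fact.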
(3) Block attestation and BLS signature-related algorithms
- The signature part is the core of the consensus layer.
- The validation committee for each slot votes (using BLS signatures) and requires a 2/3 approval rate to construct a block.
- In the Ethereum PoS consensus layer, BLS signatures use the BLS12-381 elliptic curve, which is pairing-friendly and suitable for aggregating all signatures to reduce proof time and size.
- In proof-of-work, blocks can be reorganized (re-org). After the merge, the concept of “finalized blocks and safe head” is introduced on the execution layer. To create a conflicting block, an attacker needs to destroy at least 1/3 of the total staked Ether. In many ways, PoS is more reliable than PoW.
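The 2/3 and 1/3 thresholds in this list are connected by a counting argument: if two conflicting blocks each gather at least 2/3 of the stake, the two voter sets must overlap in at least 1/3 of the stake, and that overlapping stake is what gets slashed. A minimal sketch, with stake as plain integers and hypothetical function names:

```python
def has_supermajority(attesting_stake: int, total_stake: int) -> bool:
    """True when the attesting stake is at least 2/3 of the total (integer math)."""
    if total_stake <= 0:
        raise ValueError("total_stake must be positive")
    return 3 * attesting_stake >= 2 * total_stake


def min_overlap_for_conflict(total_stake: int) -> int:
    """Smallest stake that must vote on both sides of two conflicting 2/3 majorities."""
    if total_stake <= 0:
        raise ValueError("total_stake must be positive")
    min_side = (2 * total_stake + 2) // 3  # ceil(2/3 * total_stake)
    # Two sets of size min_side inside a set of size total_stake overlap by this much.
    return 2 * min_side - total_stake


print(has_supermajority(200, 300))    # True
print(has_supermajority(199, 300))    # False
print(min_overlap_for_conflict(300))  # 100
```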
At the end of June 2023, a “Puzzle Ventures Evening Study” session introduced Hyper Oracle's zkPoS, which uses zk methods to validate the Ethereum consensus layer. For details, please see zkPoS: End-to-End Trustless.
(4) Others: Weak subjectivity checkpoints
One of the challenges faced by trustless PoS consensus proofs is the choice of weak subjectivity checkpoints, which involves social consensus based on social information. These checkpoints serve as revert limits, as blocks before the weak subjectivity checkpoints cannot be changed. See: https://ethereum.org/en/developers/docs/consensus-mechanisms/pos/weak-subjectivity/
Checkpoints are also a point to consider in the zkification of the consensus layer.
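As a rough illustration of a checkpoint acting as a revert limit, the sketch below accepts a candidate chain only if it contains the checkpoint block. The `Checkpoint` type, the slot-to-root mapping, and the root strings are all hypothetical; how the checkpoint is obtained in the first place is the social step that the code cannot capture.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Checkpoint:
    slot: int
    block_root: str


def chain_respects_checkpoint(chain: dict[int, str], checkpoint: Checkpoint) -> bool:
    """Accept a chain (slot -> block root) only if it contains the checkpoint block."""
    return chain.get(checkpoint.slot) == checkpoint.block_root


trusted = Checkpoint(slot=100, block_root="0xaaa")
honest_chain = {99: "0x111", 100: "0xaaa", 101: "0x222"}
forked_chain = {99: "0x111", 100: "0xbbb", 101: "0x333"}

print(chain_respects_checkpoint(honest_chain, trusted))  # True
print(chain_respects_checkpoint(forked_chain, trusted))  # False
```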
2. Tech Stacks to Prove Consensus
When proving the consensus layer, generating proofs for signatures and other computations is very expensive, but verifying a zero-knowledge proof is relatively cheap.
When choosing how to prove the consensus layer with zero-knowledge proofs, a protocol needs to consider the following factors:
- What do you want to prove?
- What are the application scenarios after the proof?
- How to improve the efficiency of the proof?
Take Hyper Oracle as an example. For proving BLS signatures, they chose Halo2 instead of the Circom used by Succinct Labs, for the following reasons:
- Both Circom and Halo2 can generate zero-knowledge proofs for BLS signatures (BLS12-381 elliptic curve).
- Hyper Oracle does more than zkPoS: its core product is a programmable on-chain zero-knowledge oracle. The user-facing components are zkGraph, zkIndexing, and zkAutomation, and it also uses a zkWASM virtual machine to verify off-chain computation. Although Circom is easier for engineers to use, it has poor compatibility and cannot guarantee that the logic of every function can be expressed.
- Circom-Pairing compiles to R1CS, which is incompatible with the Plonkish constraint systems of zkWASM and other circuits, while the Halo2-Pairing circuit can be easily integrated into the zkWASM circuit. In addition, R1CS is not ideal for batch proving.
- In terms of efficiency, Halo2-Pairing produces smaller BLS circuits, shorter proving times, lower hardware requirements, and lower gas fees.
Another key point in using zero-knowledge to prove the consensus layer is recursive proof, which means proof of proofs, packaging previous events into one proof.
Without recursive proof, the output will eventually be a proof of size O(block height), that is, each block attestation and its corresponding zkp. Through recursive proof, except for the initial state and the final state, for any number of blocks, we only need a proof of size O(1).
Verify Proof N together with Step N+1 to obtain Proof N+1, which attests to all N+1 steps, instead of verifying each of the N steps separately.
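The size argument can be illustrated with a toy accumulator. Note the loud assumption: the SHA-256 hash chain below is only a stand-in for the shape of recursion (Proof N plus Step N+1 gives Proof N+1). It is not a zero-knowledge proof, and unlike a recursive SNARK it cannot be checked without replaying every step. All names are hypothetical.

```python
import hashlib

GENESIS = b"\x00" * 32


def fold(previous_proof: bytes, step: bytes) -> bytes:
    """Combine the previous accumulator with the next step into a fixed-size digest."""
    return hashlib.sha256(previous_proof + step).digest()


def per_block_proofs(steps: list[bytes]) -> list[bytes]:
    """Non-recursive variant: one independent 32-byte artifact per block."""
    return [hashlib.sha256(step).digest() for step in steps]


def folded_proof(steps: list[bytes]) -> bytes:
    """Recursive-style variant: a single artifact carried forward block by block."""
    proof = GENESIS
    for step in steps:
        proof = fold(proof, step)
    return proof


steps = [f"block-{i}".encode() for i in range(1000)]
print(sum(len(p) for p in per_block_proofs(steps)))  # 32000, grows with block height
print(len(folded_proof(steps)))                      # 32, constant
```

In a real system, the folding step would be a circuit that verifies Proof N and the transition for Step N+1, so the final artifact stays small and also remains cheap to verify.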
Going back to the initial goal, our solution should target “light clients” with computational and memory constraints. Even if each proof can be verified in a fixed time, if the number of blocks and proofs accumulates, the verification time will become very long.
3. The End Game: Diversified Level 1 zkEVM
The goal of Ethereum is not only to prove the consensus layer but also to achieve the zero-knowledgeification of the entire Layer 1 virtual machine through zkEVM, ultimately realizing diversified zkEVM to enhance Ethereum’s decentralization and robustness.
In response to these issues, Ethereum’s current solutions and roadmap are as follows:
“Lightweight” – smaller memory, storage, and bandwidth requirements
- Currently achieved by light nodes that only store and verify block headers.
- Future developments require further efforts in verkle trees and stateless clients, involving improvements to the mainnet data structure.
“Secure and trustless” – achieving minimal trust similar to full nodes
- Basic light node consensus layer has already been implemented, namely Sync Committees, but this is only a transitional solution.
- Use SNARK to verify Ethereum Layer 1, including verifying the Verkle Proof of the execution layer, verifying the consensus layer, and SNARKifying the entire virtual machine.
- Level 1 zkEVM is used to zero-knowledgeify the entire Ethereum Layer 1 virtual machine and achieve diversified zkEVM.
In an ideal scenario, when entering the zk era, we need multiple open-source zkEVMs – different clients with different implementations of zkEVM, and each client will wait for a proof compatible with its own implementation before accepting a block.
However, multiple proof systems may face some issues as each proof system requires a peer-to-peer network, and a client that only supports a certain proof system can only wait for the corresponding type of proof to be recognized by its verifier. Two major challenges that may arise include “latency challenge” and “data inefficiency.” The former is mainly caused by slow proof generation, allowing malicious actors to create temporary forks during the time difference in generating proofs for different proof systems. The latter is due to the need to generate multiple types of zk proofs, requiring the preservation of original signatures. Although theoretically the advantage of zkSNARK itself is the ability to remove original signatures and other data, some contradictions need to be optimized and resolved.
IV. What is the Future?
In order to bring more users to web3, provide a smoother experience, create higher usability, and ensure the security of applications, we must build infrastructure for decentralized data access, off-chain computation, and on-chain verification.
Proving the consensus layer is one important component. In addition to Ethereum PSE and the aforementioned zkEVM layer2, other protocols are using zero-knowledge proofs of consensus to achieve their application goals. These include Hyper Oracle (Programmable zkOracle Network), which plans to use zero-knowledge proofs of the entire Ethereum PoS consensus layer to obtain data. Succinct Labs’ Telepathy is a light node bridge that achieves cross-chain communication by verifying Sync Committee consensus and submitting state validity proofs. Polyhedra, which was originally a light node bridge, has now announced the use of deVirgo to achieve zk proofs of full consensus for full nodes.
In addition to cross-chain security and decentralized oracle, this off-chain computation + on-chain verification method may also participate in the fraud proof of optimistic rollup and integrate with OP L2; or in the intent-based architecture, provide on-chain proofs for more complex intent structures, and so on.
Here we are talking not only about the off-chain ecosystem surrounding Ethereum, but also the broader market outside Ethereum.
There are still many aspects of this topic that are worth further research. For example, a16z published an article on August 24th last week, which argues that a “stateless blockchain” cannot be achieved. Other issues include weak subjectivity checkpoints, whether Sync Committee security is mathematically sufficient, etc. Interested peers are welcome to contact the author (firstname.lastname@example.org) to continue the discussion on this topic.
Thanks again to colleagues for their guidance and feedback, Alex @ IOBC (@looksrare_eth), Fan Zhang @ Yale University (@0xFanZhang), Roy @ Aki Protocol (@aki_protocol), Zhixiong Pan @ LianGuai (@nake13), Suning Yao @ Hyper Oracle (@msfew_eth), Qi Zhou @ EthStorage (@qc_qizhou), Sinka @ Delphinus (@DelphinusLab), Shumo @ Manta (@shumochu).
V. Reference
Annotated Ethereum Roadmap
Altair Hard Fork – The Beacon Chain
How will Ethereum’s multi-client philosophy interact with ZK-EVMs?, Vitalik Buterin
State of research: increasing censorship resistance of transactions under proposer/builder separation (PBS), Francesco (Ethereum Foundation)
How The Merge Impacts Ethereum’s Application Layer, by Tim Beiko
Ethereum Developer Docs – Nodes and Clients, Ethereum Foundation
Building Helios: Fully trustless access to Ethereum, a16z
How I Learned to Stop Worrying and Love the Sync Committee, Uma Roy, Succinct Labs
zkPoS: End-to-End Trustless, msfew & Shuyang, Hyper Oracle
Proof of Consensus for Ethereum, Succinct Labs
zkLightClient on LayerZero, Polyhedra
Intent-Based Architectures and Their Risks, Quintus Kilbourn, Georgios Konstantopoulos, Paradigm
RFP: OP Stack Zero Knowledge Proof, Optimism
Disclaimer: This research report represents the author’s independent viewpoint based on publicly available information, provided for reference and discussion purposes only. It does not constitute financial, investment, or any other advice.