The biggest trust risk of Rollup: the unavoidable issue of human governance

Author: Link, “Geek Web3”

Introduction: Since Solana’s gradual decline and Optimism’s token launch, Layer2 and Rollup seem to have become the new haven for countless Web3 practitioners. As the bear market dragged on, with FTX’s bankruptcy and Multicoin’s heavy losses, Ethereum’s competitors gradually faded from the Web3 stage and lost the confidence to compete with ETH. More and more people now see Rollup as the new narrative core, and new projects are springing up on L2 like mushrooms after rain.

But is all of this a “false prosperity,” a “bubble that could be punctured at any time”? Are Rollups and L2 really as good as most people claim? Are they really as secure as people perceive them to be? Setting aside the fact that many OP Rollups have no fraud proofs, what other security risks do Rollups carry?

This article is inspired by L2BEAT’s recently released report “Upgradeability of Ethereum L2s.” It examines the multisig and committee trust risks behind Rollup upgrades (a Rollup contract can be upgraded immediately and user assets taken away), revisits long-standing debates about Rollup, and, in light of the recent Multichain incident, explains why L2 is not as “good” as many people think.

Summary of Rollup Principles

A brief overview of Rollup operation principles:

Ethereum Rollup = a set of contracts on Layer1 + Layer2 network’s own nodes.

The community of Layer2 network nodes can be divided into several roles, with the most important being the sequencer. It receives transaction requests that occur on Layer2, determines their execution order, and then packs the transaction sequence into batches and sends them to the Rollup contract on Layer1 (referred to as Rollup contract in the following text).

Layer2 full nodes can directly obtain transaction sequences from the sequencer, or read transaction batches sent by the sequencer to Layer1. However, the latter has a higher level of finality (immutability) than the former. Usually, once a batch of transactions is sent by the sequencer to Layer1, the order of these transactions cannot be changed (as long as Ethereum does not experience block rollbacks, the transaction sequence of Rollup will not change).

Since transaction execution changes the state of the blockchain ledger, in addition to transaction order, Layer2 full nodes also need to synchronize the state of the ledger with the sequencer to ensure consistency.

Therefore, the sequencer not only sends transaction batches to the Rollup contract on Layer1, but also sends the updated results of transaction execution (State root/State diff) to Layer1.

It is not difficult to see that L1 (Ethereum) effectively serves as a bulletin board for L2 nodes, one that is far more decentralized, trustless, and secure than L2’s own network. As long as an L2 full node obtains the Rollup transaction sequence on L1 plus the initial Stateroot, it can reconstruct the L2 blockchain ledger and calculate the latest Stateroot. If the Stateroot calculated by the L2 full node is inconsistent with the Stateroot published by the sequencer on L1, the sequencer is engaged in fraudulent behavior.
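The check described above can be sketched in a few lines of Python. This is a toy illustration only: the ledger is a plain dictionary of balances, the "state root" is a SHA-256 hash of the sorted balances rather than a real Merkle root, and all names (`state_root`, `apply_batch`, `sequencer_is_honest`) are hypothetical.

```python
import hashlib
import json

def state_root(balances):
    """Toy stand-in for a Merkle state root: hash of the sorted balances."""
    encoded = json.dumps(sorted(balances.items())).encode()
    return hashlib.sha256(encoded).hexdigest()

def apply_batch(balances, batch):
    """Replay a batch of (sender, receiver, amount) transfers in order."""
    new_state = dict(balances)
    for sender, receiver, amount in batch:
        if new_state.get(sender, 0) < amount:
            continue  # skip transfers the sender cannot afford
        new_state[sender] -= amount
        new_state[receiver] = new_state.get(receiver, 0) + amount
    return new_state

def sequencer_is_honest(initial_balances, batches_on_l1, published_root):
    """Rebuild the L2 state from L1 data and compare with the sequencer's claim."""
    state = dict(initial_balances)
    for batch in batches_on_l1:
        state = apply_batch(state, batch)
    return state_root(state) == published_root

genesis = {"alice": 100, "bob": 50}
batches = [[("alice", "bob", 30)], [("bob", "carol", 20)]]
honest_root = state_root({"alice": 70, "bob": 60, "carol": 20})
forged_root = state_root({"alice": 0, "bob": 60, "carol": 20, "sequencer": 70})

print(sequencer_is_honest(genesis, batches, honest_root))  # True
print(sequencer_is_honest(genesis, batches, forged_root))  # False
```

The forged root corresponds to a state in which the sequencer has moved Alice's tokens to itself without any matching transaction on L1, so an honest replay cannot reproduce it.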

The most intuitive hypothetical scenario is: Can the sequencer of L2 steal user assets? For example, can it forge transactions that should not happen (e.g., transfer some L2 users’ tokens to the address of the sequencer operator, and then transfer these tokens to L1)? Such issues can be summarized as: What should be done when the sequencer publishes incorrect transaction data or an incorrect Stateroot?

There are different countermeasures for sequencer fraud risks in different types of Rollups. Optimistic Rollup allows L2 full nodes to provide fraud proofs, proving that the data published by the sequencer on L1 is incorrect. For example, Arbitrum sets up a whitelist for nodes, allowing L2 nodes on the whitelist to publish fraud proofs.

In addition, considering that most exchanges and private cross-chain bridge projects run L2 full nodes, errors can be discovered immediately, and the success rate of most Rollup sequencers stealing coins is basically 0 (because they still need to cash out in the end, either by completing the transaction on the exchange or by transferring the stolen coins to L1 and finding another way out).

(The Aggregator in the figure is actually the sequencer)

However, for Optimism without fraud proofs, the sequencer can steal coins through the Rollup’s own cross-chain bridge contract. For example, the sequencer operator can forge transaction instructions to transfer other people’s assets on L2 to their own address, and then transfer the stolen coins to L1 through the Rollup’s built-in bridge contract. Because there are no fraud proofs, OP full nodes cannot challenge erroneous transactions, so theoretically, OP sequencers can steal users’ assets on L2 (if they really want to).

The solution to this problem is “social consensus” (relying on community members and social media for public supervision) or relying on the official endorsement of OP.
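The difference between a Rollup with working fraud proofs and one without can be illustrated with a toy bridge model. This is a rough sketch under loud assumptions: the class `OptimisticBridge`, the 7-day window, and the root strings are all hypothetical, and the bridge simply trusts the challenger's recomputed root, whereas a real system would have to verify the dispute on L1.

```python
class OptimisticBridge:
    """Toy bridge: a posted state root becomes usable after a challenge window."""

    def __init__(self, challenge_window, fraud_proofs_enabled):
        self.challenge_window = challenge_window
        self.fraud_proofs_enabled = fraud_proofs_enabled
        self.claims = {}  # root -> {"posted_at": day, "rejected": bool}

    def post_root(self, root, now):
        self.claims[root] = {"posted_at": now, "rejected": False}

    def challenge(self, root, recomputed_root):
        """A full node disputes a root because its own replay disagrees."""
        if not self.fraud_proofs_enabled:
            return False  # nobody can challenge, whatever the sequencer posted
        if recomputed_root != root:
            self.claims[root]["rejected"] = True
            return True
        return False

    def can_withdraw(self, root, now):
        claim = self.claims[root]
        window_over = now - claim["posted_at"] >= self.challenge_window
        return window_over and not claim["rejected"]

forged_root = "0xbad"      # hypothetical root posted by a dishonest sequencer
honest_replay = "0xgood"   # hypothetical root an honest full node computes

with_proofs = OptimisticBridge(challenge_window=7, fraud_proofs_enabled=True)
with_proofs.post_root(forged_root, now=0)
with_proofs.challenge(forged_root, honest_replay)
print(with_proofs.can_withdraw(forged_root, now=7))     # False

without_proofs = OptimisticBridge(challenge_window=7, fraud_proofs_enabled=False)
without_proofs.post_root(forged_root, now=0)
without_proofs.challenge(forged_root, honest_replay)
print(without_proofs.can_withdraw(forged_root, now=7))  # True
```

In the second case nothing on-chain stops a withdrawal against the forged root, which is why the remaining safeguard is social rather than technical.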

Interestingly, a certain exchange recently reduced the delay for users transferring coins from Arbitrum and Optimism to the exchange (from 100 L2 blocks to 1 L2 block). This effectively means trusting that the ARB and OP sequencers will not act maliciously (on the assumption that they run on centralized servers vouched for by the official teams).

Unlike Optimistic Rollup, in addition to relying on L2 full nodes, ZK Rollup solves the problem of sequencer fraud through Validity Proof (often confused with ZK Proof). In the ZK Rollup network, there is a type of node called Prover, which is dedicated to generating Validity Proof for transaction batches published by the sequencer. At the same time, there is a contract on L1 specifically for verifying the Validity Proof (usually called Verifier). As long as the proof corresponding to the transaction batch and Stateroot/State diff passes the verification of the Verifier contract, it will be finalized. The official bridge of ZK Rollup will only allow withdrawal transactions that have been verified by Validity Proof, which is obviously much more reliable than Optimism.
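The control flow of such a bridge can be sketched as follows. This is only an illustration of the gating logic: `ZkBridge` and `placeholder_verify` are hypothetical names, and the "verifier" is a string comparison standing in for an on-chain Verifier contract, not real cryptography.

```python
class ZkBridge:
    """Toy bridge: a new state root is finalized only if its proof verifies."""

    def __init__(self, verify, genesis_root):
        self.verify = verify                # stand-in for the L1 Verifier contract
        self.finalized_root = genesis_root

    def submit_batch(self, new_root, proof):
        if not self.verify(self.finalized_root, new_root, proof):
            return False                    # rejected: state does not advance
        self.finalized_root = new_root
        return True

    def can_withdraw(self, claimed_root):
        # Withdrawals are honored only against a root that passed verification.
        return claimed_root == self.finalized_root

def placeholder_verify(old_root, new_root, proof):
    # NOT cryptography: a real Verifier checks a validity proof of the transition.
    return proof == f"valid:{old_root}->{new_root}"

bridge = ZkBridge(placeholder_verify, genesis_root="r0")
print(bridge.submit_batch("r_forged", proof="garbage"))  # False
print(bridge.can_withdraw("r_forged"))                   # False
print(bridge.submit_batch("r1", proof="valid:r0->r1"))   # True
print(bridge.can_withdraw("r1"))                         # True
```

Unlike the optimistic design, there is no waiting period for challengers here: a state root either arrives with a proof the verifier accepts or it never becomes final.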

In theory, the security of OP Rollup is guaranteed by L2 full nodes (at least one honest node that can publish fraud proofs), while the security of ZK Rollup is guaranteed by the Verifier contract on L1 (transaction finalization is completed by L1 nodes). On the surface, both can “inherit the security of L1” (relying on L1 for transaction finalization/settlement). Ethereum maximalists even describe them as “equivalent to the security of L1” (consistent with the finality of L1’s transaction results). However, the reality falls far short of that.

The “old and recurring” points

Firstly, the generation of Validity Proofs in ZK Rollup is extremely slow. The sequencer can execute thousands of transactions within a second, but generating proofs for those transactions may take up to several hours. However, this problem is relatively easy to address: mainstream ZKRs generally speed up proof generation by splitting the work and assigning it to different Prover nodes for parallel processing.

Secondly, we need to consider the delay of L2 nodes in publishing data on L1. Every time the sequencer or Prover sends data to L1, it incurs a fixed cost (similar to using up a container every time goods are shipped). Publishing data to L1 frequently is not cost-effective and can even be loss-making, so the sequencer and Prover try to minimize how often they publish, waiting until a large amount of data has accumulated before packaging and publishing it.

In other words, when the number of users is small and the number of transactions initiated is not large enough, the sequencer will delay publishing data to L1. For example, when there were fewer users last year, Optimism sent transaction batches to L1 every half an hour. Now, because there are more users, this problem has been effectively solved. Unlike OP, Starknet reduces the frequency of publishing State diffs to reduce data costs, which has resulted in a delay of 7-8 hours for transaction finalization in Starknet.

In addition, in order to further reduce costs, many ZK Rollups often “aggregate many proofs and send them to L1 at once”. In other words, the Prover does not immediately send a proof to L1 after generating it, but waits until multiple proofs are generated, aggregates them, and then sends them to the Verifier contract on L1. (In fact, the process of aggregating proofs is to use one proof to contain the computation steps generated by multiple proofs)

The consequence of doing so is that the frequency of Proof issuance is further reduced, and the delay from transaction initiation to final confirmation is further extended.
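The trade-off behind batching and proof aggregation can be made concrete with a small calculation. The numbers below are made up for illustration (abstract "cost units", an assumed rate of 20 transactions per minute); only the shape of the trade-off is the point.

```python
def cost_per_tx(fixed_cost, data_cost_per_tx, txs_in_batch):
    """Each transaction's share of one L1 publication."""
    return fixed_cost / txs_in_batch + data_cost_per_tx

def minutes_to_fill(txs_in_batch, txs_per_minute):
    """How long the sequencer waits before a batch of this size is full."""
    return txs_in_batch / txs_per_minute

FIXED_COST = 1000      # hypothetical cost units per L1 publication
DATA_COST_PER_TX = 1   # hypothetical cost units per transaction

for batch_size in (10, 100, 1000):
    cost = cost_per_tx(FIXED_COST, DATA_COST_PER_TX, batch_size)
    wait = minutes_to_fill(batch_size, txs_per_minute=20)
    print(f"batch={batch_size} cost_per_tx={cost} wait_minutes={wait}")
```

With these assumed numbers, going from 10 to 1000 transactions per batch cuts the per-transaction cost from 101 to 2 units but stretches the wait from half a minute to 50 minutes. Lower fees and faster finality pull in opposite directions.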

According to their block explorers, the transaction confirmation delay of Polygon ZKEVM is about 30 to 50 minutes, while Starknet and zkSync Era exceed 7 hours. Obviously, this is only “partial inheritance of L1 security”, far from the “equivalent to L1 security” claimed by Ethereum supporters.

Of course, all of the above problems can be solved through technological progress in the near future. For example, many projects are developing high-performance hardware to reduce the generation time of validity proofs; Optimism also promises to soon release a fraud proof system; Ethereum’s Danksharding solution will significantly reduce the data cost of Rollup, effectively solving the problems listed above.

The difficult problem of “human governance”

Like DeFi and other application projects, the operation of the Rollup network relies on relevant contracts on L1, and these contracts are “upgradable”, which means that part of the code can be replaced (most Rollups use proxy contracts), and changes can be made immediately under the authorization of multisig or security committees. To put it simply: Rollup can steal user assets by quickly changing the contract code on L1 through a multisig or security committee controlled by a minority of people.

First, let’s talk about why “Rollup contracts need to be upgraded” and “how they are upgraded”. Contract code on Ethereum is immutable after deployment, but Rollup inevitably has various bugs during development, which may lead to incorrect results; at the same time, Rollup also needs to frequently add new features due to frequent product iterations; in more extreme cases, the Rollup contract may be attacked by hackers, so Rollup contracts need to be upgradable, which is often achieved through proxy contracts.

Proxy contracts are actually a commonly used method in Ethereum contract development, which separates the data and business logic of the contract and stores them in different contracts. The data (state variables) is stored in the proxy contract, and the business logic (functions) is saved in the logic contract. The proxy contract delegates the execution process of the function to the logic contract through delegatecall, and then returns the final result to the caller.

In the contract upgrade under the proxy pattern, it only requires redirecting the proxy contract to the new logic contract (rewriting the address of the logic contract stored in the proxy contract). Most Rollup projects have adopted this method for contract upgrades, which can be described as simple and straightforward.

It is not difficult to imagine that the upgradability of Rollup contracts is actually a huge risk. If the upgraded contract contains malicious code, such as modifying the withdrawal conditions of the Rollup’s built-in Bridge contract or changing the conditions for the Verifier contract to determine the correctness of the validity proof, the sequencer can steal funds (as explained earlier).
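The proxy pattern and the upgrade risk can be mimicked in plain Python. This is a simplified model, not Solidity: `Proxy.call` imitates delegatecall by running the logic contract's code against the proxy's own storage, and the class names (`LogicV1`, `LogicV2Malicious`) and the "multisig" admin label are hypothetical.

```python
class LogicV1:
    """Business logic only; it holds no state of its own."""

    @staticmethod
    def withdraw(storage, caller, amount):
        if storage["balances"].get(caller, 0) < amount:
            raise ValueError("insufficient balance")
        storage["balances"][caller] -= amount
        return amount

class LogicV2Malicious:
    """Same function name, but the caller's balance check is gone."""

    @staticmethod
    def withdraw(storage, caller, amount):
        victim = max(storage["balances"], key=storage["balances"].get)
        storage["balances"][victim] -= amount  # pays out of someone else's funds
        return amount

class Proxy:
    def __init__(self, implementation, admin):
        self.storage = {"balances": {}}       # the data lives in the proxy
        self.implementation = implementation  # address of the logic contract
        self.admin = admin

    def call(self, function_name, caller, *args):
        """Mimic delegatecall: run the logic code on the proxy's storage."""
        function = getattr(self.implementation, function_name)
        return function(self.storage, caller, *args)

    def upgrade_to(self, new_implementation, caller):
        if caller != self.admin:
            raise PermissionError("only the admin can upgrade")
        self.implementation = new_implementation  # the entire upgrade

proxy = Proxy(LogicV1, admin="multisig")
proxy.storage["balances"]["alice"] = 100

try:
    proxy.call("withdraw", "mallory", 100)
except ValueError as error:
    print("V1 rejects the theft:", error)

proxy.upgrade_to(LogicV2Malicious, caller="multisig")
print(proxy.call("withdraw", "mallory", 100))  # 100
print(proxy.storage["balances"]["alice"])      # 0
```

Users keep interacting with the same proxy address before and after the upgrade, so the only thing protecting their balances is whoever is allowed to call `upgrade_to`.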

However, as explained earlier, Rollup contracts cannot simply be made non-upgradable. Therefore, in most cases, Rollups use DAO governance, security committees, or multisig authorization to decide whether to upgrade the Rollup contract. In addition, a time lock (Timelock) is set to impose a delay window on contract upgrades.

Most DAO proposals are executed automatically (through on-chain contracts), so a contract upgrade must first obtain sufficient votes and then wait out the delay specified by the time lock (often many days) before it can be executed. Anyone who wants to upgrade the contract maliciously must therefore mount a successful governance attack (such as the one that hit Tornado Cash). This approach is costly and requires acquiring enough tokens, so it is unlikely to succeed under normal circumstances. Even if the governance attack succeeds, the time lock gives users enough time to withdraw their assets from L2 and gives the Rollup team enough time to take emergency measures.

It seems that the time lock is the magic weapon against malicious contract upgrades. However, the so-called “emergency measures that the Rollup team can take” actually bypass DAO governance and the time lock, upgrading the Rollup contract immediately through multisig or security committee authorization. Considering that mainstream Rollups now custody billions of dollars in user assets, the ability to “upgrade the contract immediately” under multisig or security committee authorization is the ultimate emergency measure, but it is also a sword of Damocles hanging over all users.
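The two upgrade paths can be contrasted in a short model. Everything here is an assumption for illustration: the `UpgradeController` class, the 10-day delay, and the 9-of-12 signer threshold do not describe any specific Rollup.

```python
class UpgradeController:
    """Toy model of a timelocked DAO path next to an instant emergency path."""

    def __init__(self, delay_days, signers, threshold):
        self.delay_days = delay_days
        self.signers = set(signers)
        self.threshold = threshold
        self.queued = {}  # upgrade id -> earliest day it may execute

    def queue_via_dao(self, upgrade_id, today):
        """Called after a successful vote; starts the time lock."""
        self.queued[upgrade_id] = today + self.delay_days
        return self.queued[upgrade_id]

    def can_execute_via_dao(self, upgrade_id, today):
        ready_day = self.queued.get(upgrade_id)
        return ready_day is not None and today >= ready_day

    def can_execute_emergency(self, approvals):
        """No queue and no delay: enough valid signatures is all it takes."""
        valid = self.signers & set(approvals)
        return len(valid) >= self.threshold

signers = [f"signer{i}" for i in range(12)]  # hypothetical committee
controller = UpgradeController(delay_days=10, signers=signers, threshold=9)

controller.queue_via_dao("upgrade-1", today=0)
print(controller.can_execute_via_dao("upgrade-1", today=3))    # False
print(controller.can_execute_via_dao("upgrade-1", today=10))   # True

approvals = signers[:9]
print(controller.can_execute_emergency(approvals))                     # True
print(controller.can_execute_emergency(approvals[:8] + ["outsider"]))  # False
```

On the DAO path users get the full delay to react. On the emergency path the only barrier is whether enough signers agree, which is a question about people rather than code.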

Obviously, this demands maximal trust: you have to trust that the Rollup team has no intention of stealing your assets. From a trustless perspective (in Nick Szabo’s sense), all Rollups controlled by multisig and security committees are insecure. Avalanche founder Emin Gün Sirer, Solana founder Anatoly Yakovenko, and Cyber Capital founder Justin Bons have all emphasized this type of issue.

Which Rollups are controlled by multisig/committee?

According to the report “Upgradeability of Ethereum L2s” published by the well-known L2 research institution L2BEAT and the L2BEAT data visualization website, mainstream Rollups such as Arbitrum, Optimism, Loopring, zkSync Lite, zkSync Era, Starknet, and Polygon ZKEVM all have upgradable contracts that can be upgraded under multisig or committee authorization, bypassing time lock restrictions.

Although dYdX has an EOA address that can upgrade contracts while bypassing DAO governance, it is still subject to a time lock (a delay of at least 2 days). Immutable X has a 14-day contract upgrade delay. Therefore, according to L2BEAT, dYdX and Immutable X are more trustless than other mainstream Rollups already deployed on the mainnet.

So how can we reduce the trust risk brought by multisig and security committees? The answer echoes the lesson of the Multichain incident: it comes down to an anti-sybil problem. The multisig/committee must be controlled by multiple entities with no significant overlapping interests and a low risk of collusion. Currently, besides making DAO governance more mature and inviting reputable individuals or institutions to join the multisig/committee, there does not seem to be a better solution. Such arrangements are also common in real-world democratic politics.

Of course, it is also possible to limit the contract upgrade behavior managed by multisig/committee through time locks, but this requires weighing many factors because the purpose of multisig/committee is to quickly handle some emergency situations. At the same time, if the Rollup project team does not have a firm determination on the issue of trustlessness, this problem cannot be solved.

Therefore, although different Rollup projects can guarantee the security of user assets in the vast majority of cases through clever mechanism design, the probability of black swan events occurring in Rollups is not zero due to the existence of multisig and committees. Even if the probability of collusion among multisig and committee members on any given day is only one in ten thousand, considering the value of assets hosted on L2 (assume 10 billion US dollars), the expected daily loss of L2 user assets is still as high as 1 million US dollars. This is reminiscent of the Multichain incident and is truly chilling.
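The arithmetic behind this estimate is a plain expected-value calculation. The sketch below assumes the one-in-ten-thousand figure is a per-day probability (the reading needed for the result to be a daily figure) and that a successful collusion takes all hosted assets. Both inputs are the article's illustrative numbers, not measured data.

```python
from fractions import Fraction

# Illustrative inputs from the text, not measurements.
collusion_probability_per_day = Fraction(1, 10_000)
assets_on_l2_usd = 10_000_000_000  # assumed 10 billion US dollars

# Expected loss = probability of the event x amount lost if it happens.
expected_daily_loss_usd = collusion_probability_per_day * assets_on_l2_usd
print(expected_daily_loss_usd)  # 1000000
```

`Fraction` is used instead of a float only to keep the result exact.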

So personally, I believe that, as Polynya has said, most of the funds within the Ethereum ecosystem will still tend to circulate and be locked on L1 rather than L2. The Rollup ecosystem cannot capture the majority of the value within the Ethereum ecosystem in the long term. For large holders and whales, the Ethereum mainnet is obviously a more suitable and reliable destination for funds than L2. So the question many people used to ask, “Will the rise of L2 lead to the decline of L1?”, already has its answer.

As Keigo Higashino said in his work, understanding human nature is much more difficult than understanding mathematical formulas. It is more complex, harder to change, and harder to comprehend. Many things cannot be solved by purely technical means. Whenever it involves “human nature,” it will always be the most uncontrollable, unpredictable, and serious problem that needs to be treated with caution. In this regard, let us remember the immortal words on Kant’s tombstone:

“There are two things that constantly surround my mind. The deeper I ponder them, the more wonder and awe they evoke in my heart: the inner moral law and the brilliant starry sky above my head.”
