[Keynotes] Possible futures of the Ethereum protocol, part 2: The Surge
The Ethereum project initially focused on two key approaches to improve scalability: sharding and layer 2 (L2) protocols. Sharding, proposed early on, aimed to distribute transaction verification across nodes so that each node would only handle a small subset of all transactions. This model mirrors how peer-to-peer networks like BitTorrent manage data. The other approach, layer 2 protocols, sought to operate independently atop Ethereum while still benefiting from its security, with iterations evolving from state channels (2015) to Plasma (2017) to rollups (2019). Rollups proved more powerful but required considerable on-chain data bandwidth. By 2019, sharding research had tackled the data availability issue, allowing both approaches to merge into the “rollup-centric” roadmap, Ethereum’s current scaling strategy.
The rollup-centric approach envisions Ethereum’s base layer (L1) as a secure, decentralized foundation, with L2s handling scaling. This design parallels other hierarchical systems, like how courts ensure justice while leaving societal growth to individual entrepreneurs. Key developments in 2024, such as the bandwidth boost from EIP-4844 blobs and several EVM rollups reaching maturity, have advanced this roadmap. Each L2 effectively acts as a shard with its own internal rules and logic, while L1 remains robust and decentralized.
Ethereum’s scaling goals, termed “The Surge,” target 100,000 transactions per second across L1 and L2s, safeguarding L1’s decentralization while enabling L2s to inherit Ethereum’s core qualities: trustlessness, openness, and resistance to censorship. Interoperability among L2s is essential, uniting Ethereum’s ecosystem.
The “scalability trilemma,” introduced in 2017, argues that decentralization, scalability, and security are in tension, so a simple design can fully achieve only two of the three. Claims by other blockchains to have solved the trilemma without significant architectural changes have proven misleading, as those chains typically sacrifice decentralization. Ethereum addresses the trilemma with SNARKs and data availability sampling, which let a node verify that data is available and that computation was executed correctly while downloading and running only a small part of it. Plasma architectures offer another potential solution by shifting responsibility for data to users, an approach made more feasible by SNARKs.
Further progress in data availability sampling
Problem Overview
Since the Dencun upgrade in March 2024, the Ethereum blockchain can process three ~125 kB blobs per 12-second slot, or ~375 kB of data availability bandwidth per slot. If transaction data is published on-chain directly and an ERC-20 transfer takes roughly 180 bytes, this caps rollups at about 173.6 transactions per second (TPS). Even with Ethereum’s calldata added, the theoretical maximum is only 607 TPS, which does not meet long-term scalability goals. With the PeerDAS strategy, Ethereum aims to reach 16 MB per slot, which combined with improved rollup data compression would allow roughly 58,000 TPS.
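The blob arithmetic above can be reproduced in a few lines. This is a minimal sketch, assuming roughly 180 bytes per ERC-20 transfer (the transaction size that yields the 173.6 TPS figure); the function and constant names are illustrative only.

```python
SLOT_SECONDS = 12          # slot time stated in the text
BLOB_BYTES = 125_000       # ~125 kB per blob, as stated in the text
BYTES_PER_TRANSFER = 180   # assumption: approximate size of one ERC-20 transfer


def rollup_tps(bytes_per_slot, bytes_per_tx=BYTES_PER_TRANSFER,
               slot_seconds=SLOT_SECONDS):
    """Upper bound on TPS if every byte in the slot carries transaction data."""
    return bytes_per_slot / slot_seconds / bytes_per_tx


blob_bytes_per_slot = 3 * BLOB_BYTES                 # three blobs per slot
print(blob_bytes_per_slot)                           # 375000
print(round(rollup_tps(blob_bytes_per_slot), 1))     # 173.6
```

The result is a ceiling rather than a forecast: it ignores per-batch overhead and assumes every transaction is a simple transfer.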
Mechanics of PeerDAS
PeerDAS, a one-dimensional (1D) sampling mechanism, represents each blob as a polynomial and broadcasts “shares” of it over subnets. Each subnet carries one particular sample of every blob; a node listens on a small number of subnets and requests the other shares it needs from its peers. This approach could support a maximum of 256 blobs (enough to reach the 16 MB-per-slot target) while keeping the sampling bandwidth per node manageable at around 1 MB per slot. Further scaling, however, will require two-dimensional (2D) sampling, which adds redundancy across blobs as well as within them and makes distributed block construction feasible without any participant holding the full data.
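The idea behind polynomial shares is erasure coding: the data defines a polynomial, extra evaluations of it are published, and any sufficiently large subset of evaluations rebuilds the original. The toy sketch below illustrates that property only. It is not the PeerDAS construction: the small prime field, the 2x extension factor, and all function names are assumptions chosen for readability, and it omits KZG commitments entirely.

```python
P = 65537  # toy prime field (assumption); the real scheme uses a far larger field


def lagrange_eval(points, x):
    """Evaluate at x the unique polynomial of degree < len(points)
    passing through the given (xi, yi) points, working modulo P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def extend(data):
    """Turn k data values into 2k shares: evaluations at x = 0 .. 2k-1.
    The first k shares are the data itself."""
    points = list(enumerate(data))
    return [lagrange_eval(points, x) for x in range(2 * len(data))]


def recover(known_shares, k):
    """Rebuild the k original values from any k (index, value) shares."""
    if len(known_shares) < k:
        raise ValueError("not enough shares to reconstruct")
    points = known_shares[:k]
    return [lagrange_eval(points, x) for x in range(k)]


data = [5, 17, 42, 7]
shares = extend(data)                                # 8 shares from 4 values
surviving = [(i, shares[i]) for i in (1, 4, 6, 7)]   # half of the shares are lost
assert recover(surviving, k=len(data)) == data
```

Because any half of the shares is enough, a node that successfully samples a handful of random shares gains high confidence that the whole blob can be reconstructed by the network.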
Remaining Steps and Trade-offs
PeerDAS is being expanded and improved incrementally on the way to full scalability. Longer-term goals include formalizing 2D DAS, ideally with a quantum-resistant alternative to the current KZG commitment scheme. Three main strategies for data availability are under consideration: advancing to 2D DAS, retaining 1D DAS and accepting reduced scalability, or pivoting to Plasma as the core Layer 2 architecture.
Interaction with Ethereum’s Roadmap
Data compression could reduce or delay the need for 2D DAS, and widespread Plasma adoption would reduce it further. The final decision on DAS will shape the protocols for distributed block building and the fork choice rule.
Data Compression for Rollups
Problem Overview
Each rollup transaction currently takes up a significant amount of on-chain data: an ERC-20 transfer is roughly 180 bytes. Even with 16 MB of data per slot, this caps scalability at approximately 7,407 TPS. Reducing transaction size would allow higher TPS within the same bandwidth constraints.
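To see how sensitive the cap is to transaction size, the calculation can be run in both directions. This is a sketch that assumes 12-second slots and about 180 bytes per uncompressed transfer; the helper names are hypothetical.

```python
SLOT_SECONDS = 12
BYTES_PER_SLOT = 16_000_000   # the 16 MB per-slot figure from the text


def tps(bytes_per_tx):
    """Maximum TPS at a given average transaction size."""
    return BYTES_PER_SLOT / SLOT_SECONDS / bytes_per_tx


def bytes_per_tx_for(target_tps):
    """Average transaction size needed to hit a target TPS."""
    return BYTES_PER_SLOT / SLOT_SECONDS / target_tps


print(int(tps(180)))                          # 7407
print(round(bytes_per_tx_for(58_000), 1))     # 23.0
```

Under these assumptions, reaching the roughly 58,000 TPS mentioned earlier would mean shrinking the average transaction from about 180 bytes to about 23 bytes.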
Compression Techniques
- Zero-byte compression replaces each long run of zero bytes with a short encoding of the run length.
- Signature aggregation replaces many individual signatures with a single BLS signature.
- Address pointers replace a previously used 20-byte address with a shorter pointer to where it appeared in history.
- Custom serialization for values gives common transaction values, which usually have few significant digits, a more compact representation.
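As an illustration of the first technique in the list, here is a minimal run-length sketch for zero bytes. The wire format (a `0x00` marker followed by a one-byte run length) is an assumption made for this example, not the encoding used by any particular rollup.

```python
def compress_zero_runs(data: bytes) -> bytes:
    """Replace each run of zero bytes with 0x00 followed by the run length (1..255)."""
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i] == 0:
            run = 1
            while i + run < len(data) and data[i + run] == 0 and run < 255:
                run += 1
            out += bytes([0, run])
            i += run
        else:
            out.append(data[i])
            i += 1
    return bytes(out)


def decompress_zero_runs(blob: bytes) -> bytes:
    """Inverse of compress_zero_runs."""
    out = bytearray()
    i = 0
    while i < len(blob):
        if blob[i] == 0:
            out += bytes(blob[i + 1])   # bytes(n) yields n zero bytes
            i += 2
        else:
            out.append(blob[i])
            i += 1
    return bytes(out)


word = (1000).to_bytes(32, "big")       # a 32-byte word: 30 zero bytes, then 0x03e8
packed = compress_zero_runs(word)
print(len(word), len(packed))           # 32 4
assert decompress_zero_runs(packed) == word
```

In this format an isolated zero byte grows from one byte to two, so the scheme only pays off when the data contains long zero runs, as padded 32-byte words often do.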
Remaining Steps and Trade-offs
These techniques carry costs. Switching to BLS signatures takes significant effort and reduces compatibility with certain hardware. Compression based on address pointers makes client code more complex, and publishing state diffs instead of transactions reduces auditability.
Roadmap Interactions
ERC-4337 could support signature aggregation, and integrating parts of it in L2 would accelerate compression adoption.
Generalized Plasma
Problem Overview
For high-bandwidth applications like payments and social media, Ethereum needs more scalability. Plasma, a design where operators publish Merkle roots on-chain instead of the full block, could help by enabling users to verify and withdraw assets with on-chain proofs in the event of operator failure.
Working of Plasma
Operators provide Merkle proofs for asset states, allowing users to withdraw assets even if some data is unavailable. With SNARK verification, Plasma systems could secure more general types of assets and enable fast, challenge-free withdrawals when operators act honestly.
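The Merkle-proof mechanism can be sketched as follows. This is an illustrative toy, assuming SHA-256 as the hash, a power-of-two number of leaves, and made-up leaf contents; real Plasma designs differ in hashing, tree layout, and what a leaf encodes.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root_and_proofs(leaves):
    """Return (root, proofs), where proofs[i] lists the sibling hashes
    from leaf i up to the root. Requires a power-of-two leaf count."""
    n = len(leaves)
    if n == 0 or n & (n - 1):
        raise ValueError("leaf count must be a power of two")
    level = [h(leaf) for leaf in leaves]
    proofs = [[] for _ in leaves]
    positions = list(range(n))           # each leaf's node index at the current level
    while len(level) > 1:
        for leaf_index, pos in enumerate(positions):
            proofs[leaf_index].append(level[pos ^ 1])   # sibling node
            positions[leaf_index] = pos // 2
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0], proofs


def verify(leaf: bytes, index: int, proof, root: bytes) -> bool:
    """Check a leaf against the root using only its sibling path."""
    node = h(leaf)
    for sibling in proof:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root


# Hypothetical asset states; the operator publishes only `root` on-chain.
leaves = [b"alice:coin#1", b"bob:coin#2", b"carol:coin#3", b"dave:coin#4"]
root, proofs = merkle_root_and_proofs(leaves)
assert verify(leaves[1], 1, proofs[1], root)
assert not verify(b"mallory:coin#2", 1, proofs[1], root)
```

A user who keeps their own leaf and proof can demonstrate ownership against the published root even if the operator later withholds every other piece of data.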
Remaining Steps and Trade-offs
Building Plasma systems for general use requires balancing complexity against performance while keeping trust assumptions low. Hybrid Plasma-rollup designs such as Intmax put only a very small amount of data per user on-chain (about 5 bytes), which at 16 MB per slot could raise the ceiling to approximately 266,667 TPS.
Interaction with Roadmap
Effective Plasma solutions would reduce L1’s data demands and MEV pressure, especially if Plasma becomes the primary scaling tool, lowering reliance on high-performance L1 data availability.
Maturing L2 Proof Systems
What problem are we solving?
- Current State: Most rollups are not yet fully trustless; they often rely on a security council that can override the proof system. Some proof systems are inactive or only advisory.
- Challenge: Bugs in code are a major barrier to fully trustless rollups. However, truly trustless rollups are essential, and overcoming this barrier is crucial.
How does it work?
Stage System:
- Stage 0: User nodes can sync the chain, but validation is centralized.
- Stage 1: A trustless proof system validates transactions, but a security council can override with a supermajority (75% vote). Upgrades must allow sufficient delay for user exit.
- Stage 2: Trustless validation with council intervention allowed only in provable bug cases. Upgrades require very long delays.
Goal: Achieve Stage 2 by establishing confidence in the proof system through:
- Formal Verification: Mathematical techniques (e.g., using Lean 4 or AI-assisted proving) ensure only valid transactions are approved.
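- Multi-Provers: Funds are controlled by a 2-of-3 arrangement between multiple proof systems and a security council. If the proof systems agree, the council has no power; if they disagree, the council can only choose between their answers.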
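The multi-prover rule can be expressed as a plain 2-of-3 vote over proposed state roots. The sketch below assumes two proof systems plus a council, with `None` meaning "no answer yet"; the function name and the string roots are hypothetical.

```python
from collections import Counter
from typing import Optional


def settle(prover_a: Optional[str], prover_b: Optional[str],
           council: Optional[str]) -> Optional[str]:
    """Return the state root backed by at least two of the three parties,
    or None if no root has two votes."""
    votes = Counter(v for v in (prover_a, prover_b, council) if v is not None)
    if not votes:
        return None
    root, count = votes.most_common(1)[0]
    return root if count >= 2 else None


# Provers agree: the council's opinion is irrelevant.
assert settle("0xaa", "0xaa", "0xbb") == "0xaa"
# Provers disagree: the council can only side with one of them.
assert settle("0xaa", "0xbb", "0xbb") == "0xbb"
# The council cannot impose a root that neither prover produced.
assert settle("0xaa", "0xbb", "0xcc") is None
```

The design choice worth noting is that the council never gains unilateral power: a root it invents on its own can never collect two votes.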
Tradeoffs and Remaining Tasks:
- Formal Verification: Needs a fully verified SNARK prover for EVM; a simplified route involves verifying a minimal VM (e.g., RISC-V or Cairo).
- Multi-Provers: Requires confidence in individual proof systems and their independent failure patterns. High gas costs are a tradeoff for increased security.
- Interaction with Roadmap: Moving activity to L2 can reduce MEV pressure on L1.
Cross-L2 Interoperability Improvements
What problem are we solving?
- Current Challenge: Navigation within the L2 ecosystem is complex and often reintroduces trust through centralized bridges or RPC clients. For a unified Ethereum experience, this needs to change.
How does it work?
Categories of Improvements:
- Chain-Specific Addresses: The chain (L1, Optimism, Arbitrum) becomes part of the address itself, so a wallet can work out how to route a cross-L2 transaction from the address alone.
- Chain-Specific Payment Requests: Enables standardized requests (e.g., “send me X tokens on chain Z”) for payments and dapp funding.
- Cross-Chain Swaps and Gas Payment: Standardized protocol for cross-chain transactions, such as ERC-7683 and RIP-7755.
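- Light Clients: Users should be able to verify the chains they interact with instead of trusting RPC providers. Helios already does this for Ethereum itself; the same trustless verification needs to be extended to L2s.
- Keystore Wallets: Streamline key updates so that a key change is made once rather than separately on every L2.
- Shared Token Bridge: Create a minimal rollup for cross-L2 token transfers, reducing gas fees and eliminating the need for liquidity providers.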
- Synchronous Composability: Allows synchronous calls between L2s or L2 and L1 for improved DeFi efficiency.
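To make the chain-specific address idea from the list above concrete, here is a small parser for strings of the form `shortName:address`. The format is modeled on ERC-3770, which this summary does not name, so treat both the syntax and the short-name table as assumptions for illustration.

```python
import re

# Illustrative short-name table (assumption); a wallet would use a maintained registry.
CHAIN_IDS = {"eth": 1, "oeth": 10, "arb1": 42161}

ADDRESS_RE = re.compile(r"^(?P<chain>[a-z0-9-]+):(?P<address>0x[0-9a-fA-F]{40})$")


def parse_chain_address(text: str):
    """Split 'shortName:0x...' into (chain_id, address), rejecting unknown chains."""
    match = ADDRESS_RE.match(text)
    if match is None:
        raise ValueError(f"not a chain-specific address: {text!r}")
    chain = match["chain"]
    if chain not in CHAIN_IDS:
        raise ValueError(f"unknown chain short name: {chain!r}")
    return CHAIN_IDS[chain], match["address"]


recipient = "arb1:0x" + "ab" * 20        # made-up address on Arbitrum
chain_id, address = parse_chain_address(recipient)
print(chain_id)                          # 42161
```

With the chain encoded in the address, a wallet can pick the right network, or start a bridge or swap, without the user selecting a chain by hand.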
Tradeoffs and Remaining Tasks:
- Standardization Dilemmas: Premature standardization risks limiting progress, while delays lead to fragmentation. Social cooperation among L2s, wallets, and L1 is essential.
- Interaction with Roadmap: Most improvements affect higher layers and have minimal impact on L1, though shared sequencing could significantly impact MEV.
Scaling Execution on L1
What problem are we solving?
L1 vs. L2 Scaling: If L1 capacity remains limited, risks arise, including weakened ETH economics, disincentives for L2 adoption, and limited fallback capacity in L2 failure scenarios. L1 scaling is therefore valuable for long-term resilience and functionality.
How does it work?
Strategies for Scaling:
- Increase Gas Limit: Effective but risks L1 centralization. Complementary tech (e.g., history expiry, statelessness) and optimized client software help mitigate risks.
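- Cheaper Operations: Make specific kinds of computation cheaper without raising the cost of running a node, for example by lowering gas costs for operations that are overpriced relative to their real cost (e.g., addition compared with multiplication). Examples include EOF, multidimensional gas pricing, and EVM-MAX with SIMD for modular arithmetic.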
- Native Rollups: Run parallel EVM instances within L1 for improved capacity, akin to what rollups offer but directly integrated.
Tradeoffs and Remaining Tasks:
Each strategy has different trade-offs. Native rollups add capacity but, like ordinary rollups, do not allow a single transaction to act synchronously across several of them. Increasing the gas limit could reduce decentralization, while making certain operations cheaper may complicate the EVM further.
Interaction with Roadmap: L1 scaling affects MEV and slot times, and it depends heavily on successful L1 verification via “the Verge.”
You can read the full article on Vitalik Buterin’s website.