What are the principles shaping the development of Ethereum 2.0, Serenity? Where are the landmarks and constellations that will guide us? What’s the map for the journey?
Author: Ben Edgington
Every project should have design goals. They capture the overall vision for the project, shaping and constraining its direction. Design goals provide a grid against which individual design decisions can be evaluated. They will be a measure of the project’s overall success or failure.
Back in November 2018, after some conversations at DevCon, Danny Ryan documented five explicit design goals for Ethereum 2.0. These design goals haven’t come from nowhere; they have been floating in the ether (sorry) for some time and reflect the priorities of the researchers formulating Ethereum 2.0. For more background, Danny’s Bitcoin Podcast episode is a good listen.
The list reads as follows: (The summary headings are mine, and I’ve rearranged the list to start with those I find the most interesting.)
- Decentralisation: to allow for a typical consumer laptop with O(C) resources to process/validate O(1) shards (including any system level validation such as the beacon chain).
- Resilience: to remain live through major network partitions and when very large portions of nodes go offline.
- Security: to utilize crypto and design techniques that allow for a large participation of validators in total and per unit time.
- Simplicity: to minimize complexity, even at the cost of some losses in efficiency.
- Longevity: to select all components such that they are either quantum secure or can be easily swapped out for quantum secure counterparts when available.
The design goals as stated encapsulate the “what” of Ethereum 2.0 — what it will look like — but they don’t really cover the “why”. In this article, I want to explore why each of these matters, and how it has already shaped the protocol development.
In truth, any one of these topics could merit a full article of its own, so this will be a paddle in the shallows, but I hope to give a reasonable view of where we’re heading. On the way, we’ll be touching on many of the new features of Ethereum 2.0.
To allow for a typical consumer laptop with O(C) resources to process/validate O(1) shards (including any system level validation such as the beacon chain).
An Internet map [Image — cropped from original]
One of the foundational principles of Ethereum has always been that it should be as decentralised as possible. In practice, this means that the barrier to participating in the protocol (i.e. mining in Ethereum 1.0) should be as low as possible, and that the benefits of centralising participation should be minimal.
Proof-of-work mining fails the latter test, and likely the former as well. We clearly see PoW mining becoming concentrated in large installations with access to economies of scale in hardware, cheap power and plentiful cooling. As mining moves towards requiring relatively costly ASICs, it is no longer cheap for individuals to participate. Though the point is arguable, most agree that proof-of-stake protocols will not suffer such strong centralising pressures, and this is one of the reasons driving Ethereum 2.0’s move to PoS.
Striving to enable decentralisation has guided every part of the design of Ethereum 2.0.
For example, a decision was made last summer to break free from the limitations of the current Ethereum 1.0 Mainnet and, instead, to implement the Ethereum 2.0 sharded system on a brand new proof-of-stake beacon chain. This immediately allowed for far more validators (the Ethereum 2.0 analogue of miners) to be supported within the system, reducing the minimum stake amount from 1000 Ether to 32 Ether per validator. Thirty-two Ether is a much, much lower barrier to participation than one thousand. In addition, much milder penalties are now imposed on validators that are occasionally offline, which opens participation to a whole range of individuals with consumer-grade hardware who can’t promise five-nines uptime.
Having said that, this design goal is not actually about proof-of-stake. It is about ensuring that everything essential for maintaining the Ethereum 2.0 system can be performed on a widespread network of commodity hardware. To state the converse: no participant will be required to have big-iron in order to be a full participant of the system.
The Ethereum 2.0 protocol design is deliberately free of any kind of “super-node” that requires particularly high CPU or memory requirements. Such nodes might exist, for example, at exchanges or block-explorers, but they have no in-protocol advantage over regular validator nodes. Instead, extensive use of light-client protocols will allow validating nodes to focus their resources on securing the beacon chain and the particular shard they are currently working on, while maintaining only a small amount of “need-to-know” information on the rest of the system.
In terms of the detailed statement of the design goal, the only computing resource you need to be a full participant in the Ethereum 2.0 protocol, securing the system and receiving staking rewards, is a consumer-grade laptop or a small hosted VPS.
Radically decentralised design is not easy — it is much easier to design highly performant centralised protocols — but it is a non-negotiable for Ethereum.
To remain live through major network partitions and when very large portions of nodes go offline.
What happens to a blockchain when 80% of its nodes go offline for an extended period?
There is an expectation that the Ethereum blockchain will one day underpin critical infrastructure: payments, identity, power generation, and more. It is not acceptable for infrastructure like this to be unavailable. It must keep running, even in a catastrophe. Even if some huge country were to cut off all protocol traffic via its national firewall one day. Even in World War III.
An elementary theorem of distributed systems, the CAP theorem, states that any system can provide only two of the following three guarantees: consistency, availability, and partition tolerance. Since network partitions are hard to avoid — nodes go offline or become unreachable for all sorts of reasons — in practice this is a choice between consistency (do all nodes give the same answer to a query?) and availability (does the network keep running?). Basically, the only way to guarantee consistency in the event of a partition is to shut down the network.
This goal has strongly influenced the design of the random number generator in the beacon chain. The current gold standard of distributed random number generation is Dfinity’s threshold relay, which relies on cryptographic signatures to generate (pseudo-)randomness in a way that can’t be manipulated by participants. However, the threshold relay cannot keep operating through network partitions, so Ethereum 2.0 had to find an alternative approach. Instead, we use a classical RANDAO, in which participants contribute randomness that is combined together. RANDAOs are known to be slightly vulnerable to manipulation (the last-actor problem), so the consensus mechanism was designed to be robust against attempts to manipulate the randomness: an attacker would have to, by chance, control a large number of consecutive blocks at just the right time to have any opportunity to harm the system, and the protocol design makes this possibility overwhelmingly unlikely under the assumption that two-thirds of the participants are honest.
In any case, this RANDAO construction favours liveness: the blockchain can keep running, even if many participants are not available.
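A minimal sketch of the RANDAO idea, in Python. The function name and the use of SHA-256 over opaque byte strings are illustrative: in the beacon chain the contribution folded in is the hash of a verifiable BLS signature reveal, so a proposer cannot choose its value freely, only choose whether to reveal it.

```python
from hashlib import sha256

def mix_in(randao_mix: bytes, reveal: bytes) -> bytes:
    """Fold one participant's contribution into the running mix.

    XOR-ing in the hash of each reveal means no single contributor
    can control the final value -- at best, the last one can choose
    between revealing and withholding (the last-actor problem).
    """
    digest = sha256(reveal).digest()
    return bytes(a ^ b for a, b in zip(randao_mix, digest))

# Start from a zero mix and fold in three contributions in turn.
mix = bytes(32)
for contribution in [b"alice", b"bob", b"carol"]:
    mix = mix_in(mix, contribution)

print(mix.hex())
```

Because XOR is commutative, the final mix depends only on the set of contributions, not their order; this is part of what keeps the construction simple and partition-tolerant.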
Another example of this principle in practice is the “inactivity penalty quotient”, which used to delight in the name “inverse sqrt e drop time”. In order to finalise a block on the beacon chain, a super-majority of (the stake held by) validators has to cast a vote for it. If a large number of validators is offline, or behind a network partition, then it may become impossible to reach this super-majority. The solution is to progressively cut the stakes of validators that fail to show up while the blockchain is not finalising. This reduction takes place over a relatively long period, around 18 days, to allow time for the network to come back together if possible. Eventually, however, the stakes held by non-participating validators become low enough that the remaining active validators once again constitute a super-majority, and the blockchain can resume finalising blocks.
(Note that this behaviour with respect to liveness gives users and applications the most choice: during periods when the chain is not finalising blocks, individual applications can decide on whether to use it or not, based on their own criteria. If the chain just stops, it takes away this flexibility. Thanks to Danny Ryan for this insight.)
One of the promises of Ethereum has always been that it is “unstoppable”. We are striving to deliver this in Ethereum 2.0 as well.
To utilize crypto and design techniques that allow for a large participation of validators in total and per unit time.
Almost a third of the faces are frowny, but they do not dominate any subset.
Now, let’s consider security in Ethereum 2.0, by which I mean that it is as hard as possible for attackers to make the network behave in unexpected ways (such as a 51% attack in the PoW protocol).
In Ethereum 2.0, this kind of security comes from having a large pool of validators, organised in turn into large committees: each committee signs off on protocol activities, such as checking data availability and voting to finalise transactions.
The advantages of having a large number of available validators and large committees are several. First, a large validator pool allows more opportunity for decentralisation and therefore diversity of participants, which makes collusion less likely and more difficult. Nonetheless, the threat model assumes that an attacker could potentially end up controlling up to a third of all the validators.
Second, if something that violates the protocol occurs — perhaps finalising two blocks at the same height — it means that many validators must have disobeyed the rules (by voting for two blocks). This behaviour is detectable, and the misbehaving group will be punished by having their stakes wiped out. Requiring that a great many validators sign off on any decision therefore gives strong “economic security”: a lot of money will be lost by the bad actors if they misbehave. This is not the case in proof-of-work chains: 51% attackers bear only the marginal costs of running their hardware; their ASIC farms don’t burn down.
The third reason is mathematical. If we assume that up to a third of our total pool of validators is dishonest then, the larger the committees we select from that pool, the less likely it becomes that any single committee will have a majority of dishonest members. To illustrate: say we have 1000 validators, 333 of which are dishonest. If I randomly select a committee of one member, there is a 33.3% chance that its sole member is dishonest. If I randomly select a committee of three members, there is a 25.9% chance that at least two of the three are dishonest. A committee of thirteen members gives a 10% chance that a majority is dishonest, and so on. At the other end of the scale, if I select a committee of 667 members then there is a 0% chance that more than half are dishonest: a majority needs 334 dishonest members, and only 333 exist.
In Ethereum 2.0, an attacker would need to gain a 2/3 majority within a committee to do serious damage. With the chosen minimum committee size of 128, selected from a pool of several thousand validators, there is less than a one-in-a-trillion chance of this happening, even if the attacker manages to control a third of the whole validator pool.
Achieving this design goal enables the widespread use of validator committees within the protocol: basically every protocol action is voted on by a committee of randomly selected members. These committees are constantly shuffled, in some cases as frequently as every 384 seconds. A cryptographic innovation that has made this possible is the use of BLS aggregate signatures: committee members can individually sign off on decisions, and these individual signatures can be combined into a single signature that is easy to verify. Without this capability, the time taken to verify signatures from individual committee members would severely limit the number of validators that could participate in any decision.
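To see why aggregation works, here is a deliberately insecure toy in Python. It mimics only the linearity that BLS aggregation relies on, using integer arithmetic modulo a prime in place of elliptic-curve pairings; all names and constants are illustrative, and this has no cryptographic security whatsoever.

```python
import random

# Toy additive "signatures" mod a prime, purely to illustrate the
# linearity behind aggregation. Real BLS signatures live on a pairing-
# friendly elliptic curve; do not use anything like this in practice.
P = 2 ** 61 - 1   # an arbitrary Mersenne prime
G = 7             # stand-in for the group generator

def keygen():
    sk = random.randrange(1, P)
    return sk, sk * G % P            # pk = sk * G

def sign(sk, h_msg):
    return sk * h_msg % P            # sig = sk * H(m)

def verify(pk, h_msg, sig):
    # sig * G == pk * H(m), since both equal sk * H(m) * G.
    return sig * G % P == pk * h_msg % P

h = 123456789  # pretend hash of the message the committee signs

keys = [keygen() for _ in range(128)]
sigs = [sign(sk, h) for sk, _ in keys]

# Aggregation is just addition: one check verifies all 128 signatures.
agg_sig = sum(sigs) % P
agg_pk = sum(pk for _, pk in keys) % P
assert verify(agg_pk, h, agg_sig)
```

The point is that verifying the single aggregate is as cheap as verifying one signature, however many committee members contributed.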
To minimize complexity, even at the cost of some losses in efficiency.
Ethereum 2.0 is not a Rube Goldberg machine [image: public domain]
This is, perhaps, the easiest of the goals to justify. Complexity is the enemy of security. To ensure that Ethereum 2.0 always functions as intended, we must be able to reason about its protocols. It must be possible to analyse its behaviours to root out the corner cases and perverse incentives. We should be able to perform formal verification as much as possible.
If you look over the current specification, “simplicity” may not be your first impression. But simplicity is not just about lines of code; it is primarily about the concepts we are implementing. Any blockchain technology already sits at the intersection of three notoriously tricky disciplines: distributed systems, cryptography and game theory. Even in a relatively straightforward cryptoeconomic system such as Ethereum 1.0, unintended consequences can arise. GasToken is an example of this: the gas refund mechanism, designed to reduce the amount of data stored in the blockchain’s state, has instead resulted in a large increase in data stored. The simpler our protocol, the better we can analyse and defend against oddities like this.
The blockchain world is alive with innovation. It’s a rare day when I don’t hear about an amazing new consensus protocol, a funky new cryptographic primitive, a groovy new cryptoeconomic gadget. By comparison, the design of Ethereum 2.0 can look conservative, although it does in fact embody a tremendous amount of innovation. The conservatism is deliberate: the design must be as simple as it can be.
As an example, there are similarities between the design of Polkadot and the design of Ethereum 2.0. (Which is unsurprising as Gavin Wood did some of the early thinking around Ethereum sharding.) Both are sharded protocols: in Polkadot each shard can run a different protocol, whereas in Ethereum 2.0 every shard runs the same protocol. There are attractions to Polkadot’s heterogeneous design, but in the end, for Ethereum, minimising complexity wins.
The idea is to put only the minimum necessary apparatus into Ethereum 2.0’s Layer 1, the mission-critical consensus layer, and to push complexity further up the stack. This is one reason why features like identity and privacy are not baked into the protocol. Vitalik’s article, “Layer 1 Should Be Innovative in the Short Term but Less in the Long Term”, is great further reading on this.
To select all components such that they are either quantum secure or can be easily swapped out for quantum secure counterparts when available.
If and when sufficiently capable quantum computers become available, they will severely weaken much of the cool cryptography currently available, in particular BLS and ECDSA digital signatures. No-one knows when this might happen, but it’s best to be prepared if we intend our blockchain to live longer than a few decades.
Some elements of the Ethereum 2.0 design, such as the hash-based RANDAO, are already believed to be post-quantum secure. Others, such as the signature aggregation using elliptic curve based BLS signatures, are known to be potentially vulnerable.
One approach under development is STARKs: Scalable, Transparent ARguments of Knowledge. These exist in zero-knowledge form as ZK-STARKs (useful for applications providing privacy), but are also useful as-is for efficiently proving that protocol actions have been performed correctly, without the verifier having to check every detail of the action.
The key point here about STARKs, aside from all their other cool properties, is that they are designed to be “post quantum secure”.
However, we need to be cautious. The general rule of thumb used by applied cryptographers is, “don’t trust it until it’s had at least a decade of scrutiny”. STARKs are very new, and still evolving. Work on their security properties is only just getting started.
There’s time yet, and the main point is to ensure that the cryptographic mechanisms that we currently know to be quantum vulnerable are easily swappable for post-quantum algorithms in future. Concrete examples of where STARKs could replace elements of the Ethereum 2.0 protocol are in creating data-availability proofs, and with the signature aggregation protocol mentioned above.
Decentralisation, resilience, security, simplicity, longevity — these are the principles underlying the design of Ethereum 2.0. These make up its DNA. These design goals ensure that Ethereum 2.0 inherits a uniquely “Ethereum” identity, derived from the vision of its community. These goals, or principles, or values, are what sets Ethereum 2.0 apart from other blockchain designs.
At PegaSys, a ConsenSys project, we are building Artemis, an implementation of the Ethereum 2.0 protocol, alongside working on fundamental research around the protocols for Ethereum 2.0. We are working hard, helping to make these ambitions a reality.
Many thanks to Danny Ryan for his review, comments and input! Also thank you to colleagues at PegaSys for review, feedback and thoughtful questions: Olivier Bégassat, Tim Beiko, Gautam Botrel, and Horacio Mijail Anton Quiles.