WPRC 2026#027

Alpenglow: Solana's Consensus Upgrade

WPRC-027· SF· 2025. 06· INFRASTRUCTURE

Alpenglow addresses Solana's ~12.8-second time-to-finality by introducing a new consensus mechanism that finalizes transactions in roughly 100–150 ms. It replaces Tower BFT with a simpler, more efficient protocol and removes Proof of History while maintaining security.

Contributors

Solar (Solana Chinese Community) · Rongxin

The WhitePaper Reading Club Singapore [28] - 24 June 2025
Alpenglow: Solana’s largest upgrade to date	Solar (Solana Chinese Community), Rongxin

Summary

Alpenglow addresses Solana's ~12.8-second time-to-finality by introducing a new consensus mechanism that finalizes transactions in roughly 100–150 ms. It replaces the complex Tower BFT with a simpler, more efficient protocol and removes Proof of History while "maintaining security" (under a reduced 20% Byzantine-stake assumption, discussed below).

Overview

Alpenglow is a new consensus protocol for Solana that dramatically improves speed and efficiency. Co-developed in 2024–2025 by Anza, Solana's research and development spin-off, it discards legacy components (TowerBFT and the PoH hash clock) and introduces Votor for fast voting and Rotor for data broadcast. Alpenglow reaches block finality in ~150 ms (median) under good conditions, roughly 100× faster than Solana's current consensus. It also moves validator voting off-chain (using cryptographic vote certificates) while keeping security guarantees.
(i) ~100–150 ms Deterministic Finality: Reduces Solana's time-to-finality from ~12.8 seconds to ~0.1–0.15 seconds. Finality time now depends on network latency and on node failures rather than on a fixed number of slots. A block may need one, two, or more rounds of votes (as fast as ~100 ms in optimal cases).
(ii) Simplified Architecture (No PoH or On-Chain Votes): Replaces Proof of History and TowerBFT with fixed 400 ms slots and a two-mode BFT voting scheme. Votor handles consensus off-chain via lightweight BLS signature certificates, eliminating per-block vote transactions and their fees. This cuts validator overhead and complexity while slowing ledger growth (fewer extraneous transactions).
(iii) High-Speed Block Propagation (Rotor): Uses erasure-coded, one-hop broadcast (inspired by Solana's Turbine) to send blocks. The leader's block is split into shreds, which are sent to relay peers chosen by stake weight.
(iv) "20+20" Resilience Model: Guarantees safety if up to 20% of stake is Byzantine (malicious) and liveness even if an additional 20% of nodes are down or unresponsive. Consensus can still finalize blocks under extremely adverse conditions. This is a deliberate trade-off: Solana gives up some Byzantine fault tolerance (from 33% down to 20% adversarial stake) to gain much faster finality and better crash-fault handling.
Question: Is this still reaching a max of 60% or ⅔ good network participants? It's just that now the adversary is split into those that are
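The "20+20" budget in item (iv) can be checked with simple arithmetic. The sketch below is only an illustration: the function name `resilience` and its return keys are made up for this note, and it assumes that safety requires strictly less than 20% Byzantine stake and that Byzantine validators simply withhold votes.

```python
def resilience(byzantine_pct: float, offline_pct: float) -> dict:
    """Classify a fault scenario under the "20+20" model (stake in percent)."""
    # Stake that is both honest and reachable; only this stake is
    # guaranteed to vote when a valid block arrives.
    honest_online = 100 - byzantine_pct - offline_pct
    return {
        # Assumption: safety needs strictly less than 20% Byzantine stake.
        "safe": byzantine_pct < 20,
        # Honest online stake alone can fill a 60% certificate.
        "slow_path_live": honest_online >= 60,
        # Honest online stake alone can fill an 80% certificate.
        "fast_path_live": honest_online >= 80,
    }


if __name__ == "__main__":
    print(resilience(19, 20))  # worst case the model is designed for
    print(resilience(5, 10))   # healthy network: fast path available
    print(resilience(25, 0))   # too much Byzantine stake: safety not guaranteed
```

Under these assumptions, 20% Byzantine plus 20% offline leaves exactly the 60% honest online stake that the slow path needs, which is where the 60% figure in the question comes from.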

Background

1. Liveness: The system continues to make progress and eventually produces new decisions or outputs, so consensus is never stalled indefinitely. Example: In Bitcoin, liveness means miners continually produce new blocks. Even if temporary delays happen (network congestion or mining power fluctuation), the protocol eventually progresses, adding new blocks.
2. Consistency (Safety): All honest nodes agree on the same final state; no conflicting states or decisions are committed. Example: In Ethereum's Proof-of-Stake (Casper), safety ensures that two finalized blocks never conflict. Once a block is finalized, it is irreversible, and all validators agree on the chain's history.
3. Network timing assumptions:
(a) Synchronous Networks: Assume a known, fixed upper bound on message delivery time. Example: Tendermint progresses in timeout-driven rounds (timeouts on the order of seconds), expecting messages to arrive within a predictable timeframe. Strictly speaking, Tendermint targets the weaker partially synchronous model, in which the delay bound is only guaranteed to hold eventually.
(b) Asynchronous Networks: Assume no bound on message delays; messages can be delayed arbitrarily without the system failing. Example: HoneyBadgerBFT can achieve consensus without knowing any time bounds on message delivery. It ensures agreement even if the network slows significantly, as it makes no timing assumptions.

Team

(i) Professor Roger Wattenhofer: Distributed systems researcher and professor at ETH Zurich. He won the Prize for Innovation in Distributed Computing in 2012 and authored "Blockchain Science," a well-known textbook. He co-authored a 2024 analysis of Solana ("Halting the Solana Blockchain with epsilon fraction of stake") that identified liveness issues in TowerBFT and set the stage for this redesign. He leads research at Anza and the architecture of Alpenglow.
(ii) Jakub "Kobi" Sliwinski: ETH Zurich PhD under Wattenhofer. Kobi worked at Dfinity (the Internet Computer), focusing on high-throughput consensus data dissemination. He contributed to both the theoretical design of Rotor/Votor and the practical aspects (Rust implementation).
(iii) Quentin Kniep: ETH Zurich PhD who previously contributed to Sui, Ethereum, and Algorand. At Anza, Quentin was instrumental in implementing the reference code for Alpenglow (the GitHub repository is under his account).
(iv) Solana Labs engineers and Jump Crypto's Firedancer team were likely involved in review and feedback.

Components

(Key Innovations - focus on the innovations, and key parts)

Rotor	Responsible for sending new blocks from the leader to all validators. It is an evolution of Solana's existing Turbine protocol, simplified to a single-layer broadcast rather than a multi-hop tree. The core idea is to remove the leader as a bottleneck by splitting the block into fragments and leveraging the bandwidth of many nodes in parallel:
(1) Erasure Coding: When a leader produces a block, Rotor breaks it into small pieces called "shreds". Parity shreds are then generated using erasure coding (e.g. Reed-Solomon codes), so that nodes can reconstruct the block from any sufficiently large subset (e.g. 60 out of 100 pieces). This provides redundancy: the block can still be recovered even if some pieces are lost in transit.
(2) Stake-Proportional Relay: Rather than the leader sending all shreds to every other validator (which would overwhelm its bandwidth), Rotor uses a one-hop relay model. The leader sends different subsets of shreds to a selection of relay nodes (ideally one relay per shred, or per group of shreds). These relays are chosen in proportion to stake and network closeness, so validators with higher stake (and better connectivity) relay more shreds. Each relay then multicasts its shred to all other validators. The network operates like a distributed sprinkler: the leader "waters" a few key nodes with data, and those nodes spray it out to everyone.
(3) Minimal Network Hops: By flattening the Turbine tree into a single layer, Rotor ensures that most validators receive block data just one network hop away from the initial relays. This is motivated by the observation that network latency, not throughput, is the dominant factor in block propagation. Fewer hops mean lower overall latency, since each hop adds propagation delay. Rotor's approach is effectively asymptotically optimal in its use of total network bandwidth: all nodes' upload capacity is used simultaneously to broadcast pieces of the block.
(4) Relay Selection & Resilience: Alpenglow introduces new techniques for picking relay nodes for each block, varying and randomizing the relay assignment (likely using a low-variance sampling strategy across stake) to avoid single points of failure. Even if some relays behave maliciously or drop packets, the other relays' transmissions plus the erasure coding mean the block can still reach everyone. This improves on Turbine's fixed paths, making Rotor more resilient to adversarial interference or regional outages. Relays might also be incentivized (through protocol rewards) for their service, aligning their interests with fast propagation.
Question: Does this provide a strong path for DoubleZero's approach?
Note: It appears Solana has decided to be at the forefront of network latency reduction and research.
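To make the erasure-coding and relay steps concrete, here is a minimal sketch. It is not Rotor's actual code: it uses a Reed-Solomon-style code over the prime field GF(257) (real implementations typically work over GF(2^8)), treats each byte as one "shred", and picks relays by plain stake-weighted sampling, which is an assumption rather than the whitepaper's sampling scheme. The names `encode`, `decode`, and `pick_relays` are hypothetical.

```python
import random

P = 257  # smallest prime above 255, so every byte value is a field element


def _interpolate(points, x):
    """Evaluate at x the unique polynomial through `points` (Lagrange, mod P)."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+).
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def encode(data: bytes, n: int):
    """Return n shreds as (index, value); the first len(data) carry the data."""
    points = list(enumerate(data))
    parity = [(x, _interpolate(points, x)) for x in range(len(data), n)]
    return points + parity


def decode(shreds, k: int) -> bytes:
    """Rebuild the k data symbols from any k shreds with distinct indices."""
    if len(shreds) < k:
        raise ValueError("not enough shreds to reconstruct")
    subset = shreds[:k]
    return bytes(_interpolate(subset, x) for x in range(k))


def pick_relays(stakes: dict, n: int, seed: int = 0):
    """One relay per shred, sampled with probability proportional to stake."""
    rng = random.Random(seed)
    names = list(stakes)
    return rng.choices(names, weights=[stakes[v] for v in names], k=n)


if __name__ == "__main__":
    data = b"block!"                      # k = 6 data symbols
    shreds = encode(data, 10)             # 6 data + 4 parity shreds
    survivors = shreds[4:]                # first four shreds lost in transit
    print(decode(survivors, len(data)))   # original data recovered
    print(pick_relays({"a": 50, "b": 30, "c": 20}, 10))
```

Any 6 of the 10 shreds are enough to rebuild the data here, which mirrors the "any sufficiently large subset" property described in item (1).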
Blokstor	A specialized data store that holds incoming shreds and assembles them into full blocks:
(1) Shred Assembly: As a validator receives shreds from Rotor's relays, Blokstor keeps track of which shred indices of the current block it has and verifies each shred's integrity (checking the leader's signature on the shred and a Merkle path proof that the shred fits into the block's Merkle tree). Once Blokstor collects enough shreds to complete the block (e.g. ≥60% of shreds, given erasure coding), it reconstructs the full block b and marks it as complete for that slot. Note: Reminds me of some of the work on QMDB by LayerZero (WPRC KL-01): https://docs.google.com/document/d/13kDWZBWo4kWqq_83ntWw0yCdrH8mb8Xu9xRAjgSe8-s
(2) Event Emission: When a block for a slot is assembled (Question: are slots still based on Proof of History, a synced clock, or the specified 400 ms?), Blokstor emits an event (e.g. Block(slot, block_hash, parent_hash)) to tell the rest of the system, specifically the voting component, that a candidate block is ready. This triggers the validator to begin voting on that block. If multiple different blocks were somehow proposed for the same slot (due to a malicious or equivocating leader), Blokstor can handle them as separate sets of shreds, each identified by a different block hash. It is essentially a local "warehouse" of block data waiting to see which block becomes canonical.
(3) Persistence and Speed: Blokstor likely uses an efficient in-memory structure or a fast database to store shreds because timing is critical. It must handle a flood of incoming packets (potentially tens of thousands) and assemble the block within the 400 ms slot time. This component effectively decouples data propagation from consensus logic: once a block is in Blokstor, the consensus engine can treat it as available. Note: I feel Monad is probably working on something similar for the TX storage DB?
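The shred-assembly and event-emission steps can be sketched as follows. This is an illustrative model only: the `Blokstor` class, its method names, and the Merkle layout (SHA-256, last node duplicated on odd levels) are assumptions made for this note, the leader's signature check is omitted, and the emitted event carries only the slot and Merkle root.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_levels(leaves):
    """Build all tree levels bottom-up; odd levels duplicate their last node."""
    levels = [[_h(b"\x00" + leaf) for leaf in leaves]]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        if len(cur) % 2:
            cur.append(cur[-1])  # pad in place so every node has a sibling
        levels.append([_h(b"\x01" + cur[i] + cur[i + 1])
                       for i in range(0, len(cur), 2)])
    return levels


def merkle_proof(levels, index):
    """Sibling hashes on the path from leaf `index` up to the root."""
    proof = []
    for level in levels[:-1]:
        proof.append(level[index ^ 1])
        index //= 2
    return proof


def verify_shred(shred, index, proof, root):
    """Recompute the root from the shred and its Merkle path."""
    node = _h(b"\x00" + shred)
    for sibling in proof:
        pair = node + sibling if index % 2 == 0 else sibling + node
        node = _h(b"\x01" + pair)
        index //= 2
    return node == root


class Blokstor:
    """Collects verified shreds per (slot, root); emits one event per block."""

    def __init__(self, needed):
        self.needed = needed      # shreds required to reconstruct a block
        self.shreds = {}          # (slot, root) -> {index: shred}
        self.complete = set()

    def add_shred(self, slot, root, index, shred, proof):
        if not verify_shred(shred, index, proof, root):
            return None           # reject shreds that do not fit the tree
        key = (slot, root)
        self.shreds.setdefault(key, {})[index] = shred
        if len(self.shreds[key]) >= self.needed and key not in self.complete:
            self.complete.add(key)
            return ("Block", slot, root.hex())
        return None


if __name__ == "__main__":
    shreds = [f"shred-{i}".encode() for i in range(5)]
    levels = merkle_levels(shreds)
    root = levels[-1][0]
    store = Blokstor(needed=3)
    for i in (4, 0, 2):
        print(i, store.add_shred(7, root, i, shreds[i], merkle_proof(levels, i)))
```

Keying the store by (slot, root) is what lets two conflicting blocks for the same slot be tracked as separate shred sets, as described in item (2).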
Votor	Votor determines whether a block is valid and should be finalized, using a two-tier voting process that commits in one round if enough validators respond quickly, or in two rounds otherwise. It replaces TowerBFT's vote-locking with a faster, round-based agreement. NOTE: Should look into TowerBFT to understand the improvement.
(1) Consensus Model: Votor is inspired by recent academic research (the "Simplex" consensus approach for partially synchronous networks) and adapts it to a proof-of-stake setting. For each slot (Question: 400 milliseconds?), validators expect either a block (from the leader) or a skip (if the leader fails). Votor ensures that one of three outcomes occurs for every slot: (a) a block is notarized (pre-approved) by a supermajority of stake, (b) a block is finalized (committed) by sufficient votes, or (c) a Skip Certificate finalizes an empty slot if no block was produced. This way, the chain progresses slot by slot without ambiguity.
(2) 1-Round vs 2-Round Voting: The protocol optimistically attempts to finalize in a single round: as soon as validators have the block, they cast a Notarization vote ("yes, I got block b and it looks valid") for it. If the block quickly gathers ≥80% of the total stake in votes during this first round, Votor considers it finalized immediately with a Fast Finalization Certificate [NOTE: haha - like skip-grade students]. This is safe because the threshold is high enough that no competing fork can also reach it without honest validators voting twice. If the first round does not reach 80% (for instance, because some validators were slow or offline) but at least 60% voted, the block proceeds to a second round (the slow path), which runs in parallel. In round two, validators who saw the 60% threshold reached cast another vote (a Finalization vote) for the same block. If this second round reaches ≥60% of stake (the protocol's counterpart to the usual ⅔ BFT supermajority, lowered to 60% under the 20% Byzantine assumption), the block is finalized with a standard Finalized Certificate [Question: does the 2nd round also need to be completed within 400 ms?]. The two voting paths run concurrently, so validators do not wait idly: second-round voting starts as soon as the 60% notarization threshold is observed, and whichever condition is satisfied first (80% in one round, or 60% + 60% in two rounds) finalizes the block. This design minimizes latency: if the network is highly responsive, finality happens in ~100 ms (single round), and if the network is a bit slower or some validators lag, finality still happens in ~150 ms (two overlapping rounds). Question: How? Wouldn't a second round AT LEAST double the latency?
(3) Direct Vote Transmission (No Gossip): Validators neither submit votes as on-chain transactions nor flood them via random gossip. Instead, each validator broadcasts a signed vote message as a single UDP packet directly to a set of peers (which can be all other validators or a subset weighted by stake for efficiency). This is essentially a structured mesh for votes, ensuring that every vote can reach every other validator quickly without clogging the system. [Question/Note: Crazy, so an off-chain subnetwork just for voting?] Any node that observes enough votes can aggregate them into a certificate. The Boneh–Lynn–Shacham (BLS) signature scheme is used to aggregate signatures into one compact certificate once a quorum threshold is reached. For example, when a validator sees distinct vote signatures for a block worth ≥60% of stake, it can combine them into a single BLS multi-signature attesting to that fact. This certificate (just a few hundred bytes) is then broadcast to all validators and ultimately anchored on-chain in a block header, instead of thousands of individual vote transactions. By moving vote communication off-chain and using cryptographic aggregation, Votor drastically reduces the network and compute overhead of consensus while preserving verifiability: anyone can later verify that the BLS aggregated signature corresponds to a valid supermajority of votes.
(4) Certificates and Finality: Nodes produce four types of certificates: (i) Notarization certificate (≥60% of 1st-round votes), (ii) Fast Finalization certificate (≥80% of 1st-round votes), (iii) Finalized certificate (≥60% in the 2nd round), and (iv) Skip certificate (≥60% skip votes). As soon as one valid finality certificate is known, validators consider that block finalized and build subsequent blocks on top of it. A validator that was not watching the votes in real time learns of finality when it receives a certificate from a peer; that is equivalent to hearing "the block was finalized", and it can then safely consider the block committed. KEY PROOF: The quorum thresholds (80%, or 60% twice) are set so that two conflicting blocks cannot both be finalized. Two different blocks can each gather 60% only if at least 20% of stake voted for both; honest validators never do that, and the adversary controls less than 20%, so it cannot push two forks past the threshold. This preserves safety: honest nodes will never disagree on a finalized block.
(5) Timeouts and Local Clock: To handle non-responsive leaders or slow networks, Votor employs a timeout mechanism instead of PoH. Each validator uses a local clock (synchronized loosely by standard means) to enforce a universal 400 ms slot time. When a validator finishes processing the previous slot's block, it starts a timer for the next one [Note: this removes Proof of History, which previously provided a unified clock]. If the leader's block data arrives before the timeout, the validator votes on it (NotarVote). Skip Vote: If the 400 ms passes and no block was received, validators assume the leader failed and issue a SkipVote for that slot. If ≥60% of stake sends SkipVotes, a Skip Certificate is formed, marking that slot as empty but finalized. This allows the network to rapidly skip over inactive leaders without manual intervention. The next leader can then proceed with the subsequent slot. Importantly, these local timeouts act as an upper bound: they are never reset, so even under network stress, honest nodes stop waiting and move on after a known maximum delay. This replaces PoH's function as a clock: rather than relying on a cryptographic delay, nodes use real time limits (since Rotor and one-hop messaging are fast enough to trust a 400 ms bound). Dropping TowerBFT along with PoH also means no more "voting lockout" periods; validators can vote on any new fork if needed (e.g. if a malicious leader tried something odd) without being penalized by lockouts, which simplifies the protocol greatly. [Question: Why is timing between blocks so important in maintaining security?]
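Three of the points above lend themselves to a small numeric sketch: the timeout rule for the first-round vote, the quorum-overlap argument behind the KEY PROOF, and the latency question about the second round. Everything here is a simplified model with hypothetical names. In particular, `finality_ms` assumes finality takes the smaller of (time to hear from 80% of stake) and (twice the time to hear from 60% of stake), which is how the two concurrent paths are usually summarized, and the millisecond inputs are invented for illustration.

```python
SLOT_TIMEOUT_MS = 400  # slot time quoted in the text


def first_round_vote(block_arrival_ms):
    """NotarVote if the block arrived before the local timer fired, else SkipVote.

    `block_arrival_ms` is measured from when the slot timer started;
    None means no block was received at all.
    """
    if block_arrival_ms is not None and block_arrival_ms < SLOT_TIMEOUT_MS:
        return "NotarVote"
    return "SkipVote"


def min_overlap(quorum_a_pct, quorum_b_pct):
    """Smallest stake (in %) that must appear in both quorums."""
    return max(0, quorum_a_pct + quorum_b_pct - 100)


def conflicting_certs_possible(byzantine_pct, quorum_a_pct, quorum_b_pct):
    """Two conflicting quorums need every double voter to be Byzantine."""
    return byzantine_pct >= min_overlap(quorum_a_pct, quorum_b_pct)


def finality_ms(delta_60, delta_80):
    """Simplified model: the fast path waits for 80% of stake once,
    the slow path waits for 60% of stake twice; the earlier one wins."""
    return min(delta_80, 2 * delta_60)


if __name__ == "__main__":
    print(first_round_vote(120), first_round_vote(None))
    print(min_overlap(60, 60), min_overlap(80, 60))
    print(conflicting_certs_possible(19, 60, 60))
    # Responsive network: the fast path wins.
    print(finality_ms(delta_60=60, delta_80=100))
    # Slow tail of validators: the slow path wins at 2 x 60 ms, not 2 x 250 ms.
    print(finality_ms(delta_60=60, delta_80=250))
```

The last two calls suggest an answer to the latency question: the second round only has to wait for the fastest 60% of stake, so two slow-path rounds can still finish sooner than one fast-path round that must wait for the slowest validators in the 80% set.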
Pool	An in-memory ledger each validator keeps that logs every vote and the resulting certificates for every slot. Like Ethereum’s mempool, but for consensus votes: as soon as the Pool shows a quorum of stake for a slot, the node builds (or grabs) the corresponding certificate, gossips it once, and records it so no duplicate certificates circulate. By tracking stake weight, certificate uniqueness, and local finality in one place, the Pool gives every validator an up-to-the-second view of “who has voted for what” and which blocks are already notarized or finalized—much like the mempool gives every Ethereum node a live view of “who wants to do what” before those transactions land on-chain.
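A minimal sketch of the Pool's bookkeeping, assuming a fixed stake table and the thresholds listed for the four certificate types. The class layout and names are hypothetical, BLS aggregation is left out (the certificate is just a label), and equivocation handling (one validator voting for two blocks in a slot) is not modeled.

```python
from collections import defaultdict

# Certificate thresholds in percent of total stake, per vote kind.
THRESHOLDS = {
    "notar": [("Notarization", 60), ("FastFinalization", 80)],
    "final": [("Finalization", 60)],
    "skip": [("Skip", 60)],
}


class Pool:
    """Tallies votes by stake and reports each certificate exactly once."""

    def __init__(self, stakes):
        self.stakes = stakes                 # validator -> stake
        self.total = sum(stakes.values())
        self.voters = defaultdict(set)       # (slot, kind, block) -> validators
        self.issued = set()                  # (slot, certificate, block)

    def add_vote(self, slot, kind, block, validator):
        """Record a vote; return the list of certificates it newly completes."""
        key = (slot, kind, block)
        self.voters[key].add(validator)      # a set ignores repeated votes
        stake = sum(self.stakes[v] for v in self.voters[key])
        new_certs = []
        for cert, pct in THRESHOLDS[kind]:
            reached = stake * 100 >= self.total * pct
            if reached and (slot, cert, block) not in self.issued:
                self.issued.add((slot, cert, block))
                new_certs.append(cert)
        return new_certs


if __name__ == "__main__":
    pool = Pool({"a": 30, "b": 30, "c": 20, "d": 20})
    for validator in ("a", "b", "b", "c"):
        print(validator, pool.add_vote(5, "notar", "blk", validator))
    # Skip votes use block=None because there is no block to name.
    print(pool.add_vote(6, "skip", None, "a"), pool.add_vote(6, "skip", None, "b"))
```

Recording issued certificates in one place is what prevents duplicates: the repeated vote from "b" changes nothing, and each threshold fires only the first time it is crossed.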

Simulation Stats

(i) Finality: Median ~150 ms. Fast path: ~100 ms; slow path: ~150–180 ms; worst case: ~270 ms; global average: ~2× network latency. Compared to Tower BFT: roughly 80× improvement.
(ii) Validator location: 65% of stake is <50 ms from Zurich, giving near-instant finality for the majority. Validators in distant regions (e.g., South America) finalize within ~270 ms.
(iii) Throughput & Efficiency: ~4k average / 65k peak TPS; consensus is no longer a bottleneck; fork risk is sharply reduced; vote transactions are removed.
(iv) Resource Utilization: Lower CPU load (no PoH hashing); higher but manageable network load; off-chain votes mean smaller, BLS-aggregated packets; Rotor uses the full network's bandwidth efficiently, not just the leader's.
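The "80×" and "100×" improvement figures quoted in these notes can be reconciled with a quick calculation. The sketch below assumes the legacy figure of ~32 slots of 400 ms each (the glossary's Tower BFT entry) and uses the ~100 ms and ~150 ms Alpenglow figures from the stats above; the variable and function names are made up for this note.

```python
SLOT_MS = 400          # slot duration
TOWER_BFT_SLOTS = 32   # slots Tower BFT needs to finalize (per the glossary)

tower_ms = SLOT_MS * TOWER_BFT_SLOTS   # 12,800 ms, i.e. the ~12.8 s figure


def speedup(alpenglow_ms: float) -> float:
    """How many times faster Alpenglow finality is than Tower BFT's."""
    return tower_ms / alpenglow_ms


if __name__ == "__main__":
    print(tower_ms)
    print(round(speedup(150), 1))  # against the ~150 ms median
    print(round(speedup(100), 1))  # against the ~100 ms fast path
```

Measured against the ~150 ms median the ratio is about 85, and against the ~100 ms fast path it is 128, so "80×" and "100×" are both rounded readings of the same comparison.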

Glossary

Proof of History (PoH): Hash-chain clock removed in Alpenglow; fixed-duration slots and local timers simplify timing and cut attack surface.
Epoch: Multi-day period with fixed stake and leader schedule; stake only reshuffles at epoch boundaries.
Slot: ~400 ms time slice; exactly one leader may propose a block.
Leader / Leader Window: Stake-weighted validator that produces blocks for a small batch of consecutive slots (e.g., 4 slots ≈ 1.6 s) to amortize hand-off costs.
Shred & Slice: A block is cut into slices, and each slice is erasure-coded into small shreds; any sufficiently large subset of a slice's shreds can reconstruct it.
Blokstor: Node module that stores shreds, reassembles full blocks, repairs missing pieces, and prunes non-final forks.
Votor: 2-round BFT engine: ≥60 % stake notarizes; ≥80 % in round 1 or two rounds of 60 % finalizes. Handles voting, certificates, and fork choice.
Notarization / Finalization: 60 % yes-votes → Notarized (pre-commit); fast 80 % or two 60 % rounds → Finalized (irreversible).
Rotor: One-hop relay that pushes shreds to all validators with minimal latency, replacing Turbine.
Certificate: BLS-aggregated proof of stake weight. Types: Notarization (60 %), Fast Finalization (80 % round 1), Finalization (60 % round 2), Skip (60 %).
Pool: In-memory ledger of all recent votes and certificates; lets each validator see when thresholds are met and prevents duplicate certificates.
"20 + 20" Resilience: Liveness with ≥60 % honest online stake; up to 20 % may be Byzantine and another 20 % offline.
Skip Vote / Certificate: Empty-block vote used when a leader is silent; 60 % stake forms a Skip cert so the chain can advance.
Partial Synchrony: Safety always; progress once messages arrive within Δ ≈ 400 ms.
Tower BFT (legacy): PoH-timed, lockout-based Solana consensus needing ~32 slots to finalize; replaced by Votor.

