The core-protocol team is starting work on ‘Flow’s Dynamic Protocol State’. In a nutshell, this work is a prerequisite for the Flow network to autonomously defend itself against malicious nodes within the network. On a technical level, the Dynamic Protocol State is the foundation for countermeasures such as slashing misbehaving nodes or entirely revoking their authorization to participate.
Below, you’ll find broader context on the Dynamic Protocol State and specific technical goals for the work. We will follow up shortly with a more complete proposal as we work through the technical details. Community contributions are warmly welcome, including ideas, feedback, change requests, and PRs.
Context
The Protocol State maintains the necessary information about the operation of the Flow network protocol, including:
• identity table listing all staked nodes, their public keys, roles, etc. (for the current epoch and, once available, the upcoming epoch)
• blocks produced by nodes in the Flow network
• finality and sealing status of blocks
At the moment, the identity table is fixed for an entire epoch, which is largely an engineering shortcut. Therefore, the following important features are currently blocked by the limitations of the protocol state:
- Ability for node operators to revoke their node’s keys if they suspect that the keys were compromised (so-called self-ejection).
- Slashing nodes for protocol violations (incl. ejection from the network).
- Efficient retrieval of the identity table at a given block, without the need to locally reconstruct the state from cross-epoch history (in principle this is possible today, but it would require a very convoluted, hacky approach).
Note: The Protocol State interface (→ code) is already mostly in its mature form. Only the implementation backing the interface contains many shortcuts.
Goal
The protocol state supports tracking and updating the Identity Table throughout the epoch:
- Identity Table is updated in a fork-aware manner
- method of applying updates needs to be BFT
- root hash of the identity table should be included in each block (thereby making it possible to retrieve node identity information about the upcoming epoch once it is incorporated into the protocol state)
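The goals above can be illustrated with a minimal Go sketch. It is not the flow-go implementation: the types Identity, Block and State, the function TableID, and the flat SHA-256 commitment (standing in for a real root hash) are all assumptions made for illustration. Each block carries a commitment to the identity table valid as of that block, so two forks can hold different tables, and a lookup at a given block needs no replay of history.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// Identity is a simplified, hypothetical identity-table entry.
type Identity struct {
	NodeID  string
	Role    string
	Ejected bool
}

// TableID returns a deterministic commitment to an identity table.
// Assumption: a flat SHA-256 over the sorted entries stands in for a root hash.
func TableID(table []Identity) string {
	sorted := append([]Identity(nil), table...) // copy, so the input stays untouched
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].NodeID < sorted[j].NodeID })
	h := sha256.New()
	for _, id := range sorted {
		fmt.Fprintf(h, "%q|%q|%t\n", id.NodeID, id.Role, id.Ejected)
	}
	return hex.EncodeToString(h.Sum(nil))
}

// Block carries the commitment to the identity table as of this block.
type Block struct {
	ID       string
	ParentID string
	TableID  string
}

// State indexes identity tables by block, so each fork has its own view.
type State struct {
	blocks map[string]Block
	tables map[string][]Identity // keyed by TableID
}

func NewState() *State {
	return &State{blocks: map[string]Block{}, tables: map[string][]Identity{}}
}

// Extend records a new block together with the identity table as of that block.
func (s *State) Extend(blockID, parentID string, table []Identity) Block {
	tid := TableID(table)
	s.tables[tid] = table
	b := Block{ID: blockID, ParentID: parentID, TableID: tid}
	s.blocks[blockID] = b
	return b
}

// IdentitiesAt returns the identity table as of the given block.
func (s *State) IdentitiesAt(blockID string) ([]Identity, bool) {
	b, ok := s.blocks[blockID]
	if !ok {
		return nil, false
	}
	table, ok := s.tables[b.TableID]
	return table, ok
}

func main() {
	s := NewState()
	genesis := []Identity{{NodeID: "A", Role: "consensus"}, {NodeID: "B", Role: "collection"}}
	s.Extend("G", "", genesis)
	// Two competing forks on top of G: only fork X ejects node B.
	s.Extend("X", "G", []Identity{{NodeID: "A", Role: "consensus"}, {NodeID: "B", Role: "collection", Ejected: true}})
	s.Extend("Y", "G", genesis)

	x, _ := s.IdentitiesAt("X")
	y, _ := s.IdentitiesAt("Y")
	fmt.Println(x[1].Ejected, y[1].Ejected) // prints: true false
}
```

In this sketch, block Y reuses the commitment of its parent G because nothing changed, while block X commits to a different table. A production design would additionally need the commitment to be verified by consensus nodes before a block is accepted.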
Scope
- Design of mature implementation with specific emphasis on BFT.
- Investigate whether it is helpful to include the GitHub issue #3668 in this work stream.
- Functionality to track and persist Protocol State on a block-by-block basis (fork aware).
  - We want to restrict our attention to the components of the Protocol State that stay constant on the happy path and are node-independent:
    - identity table
    - epoch information (current, next, previous → EpochQuery interface)
    - global protocol parameters (→ GlobalParams interface)
    - Flag for tracking EECC
    The resulting data structure changes rarely throughout an epoch. Let’s refer to this information set as ProtocolStateSubstrate.
  - Therefore, we want to de-duplicate identical ProtocolStateSubstrate instances in the database.
- Root hash of the ProtocolStateSubstrate should be included in each block.
- Interface for applying updates to the ProtocolStateSubstrate (so-called ‘identity-changing operations’)
  - Happy-path for applying EpochSetup and EpochCommit service events
  - verification logic (consensus nodes) for checking correctness of proposed updates
  - API should be general purpose, i.e. it supports applying any updates resulting from slashing adjudications in the future.
- We have considered structuring node Identity into an immutable and a mutable part. Including this cleanup work here is probably a good idea (see #6232 for further details).
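To make the scope more concrete, here is a hedged Go sketch of how the pieces could fit together. Everything in it is hypothetical and heavily simplified: the Substrate fields, the Update interface, the EpochSetup and EpochCommit structs (reduced to an epoch counter), the Store, and the Verify function are illustrative names, not the flow-go API, and a flat SHA-256 hash stands in for a root hash. The sketch shows three ideas from the list: identical substrates are stored once, updates are applied through one general-purpose interface, and a verifying node re-applies the proposed updates and compares the result with the commitment claimed in the block.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"errors"
	"fmt"
)

// Substrate is a simplified stand-in for the ProtocolStateSubstrate (hypothetical fields).
type Substrate struct {
	Identities         []string // node IDs only, for brevity
	EpochCounter       uint64
	NextEpochSetup     bool // EpochSetup for the next epoch was applied
	NextEpochCommitted bool // EpochCommit for the next epoch was applied
	EECC               bool
}

// ID returns a commitment to the substrate (assumption: flat hash of its JSON encoding).
func (s Substrate) ID() string {
	enc, err := json.Marshal(s)
	if err != nil {
		panic(err) // cannot happen for this struct; kept for completeness
	}
	sum := sha256.Sum256(enc)
	return hex.EncodeToString(sum[:])
}

// Update is a hypothetical general-purpose interface for identity-changing operations.
type Update interface {
	Apply(parent Substrate) (Substrate, error)
}

type EpochSetup struct{ Counter uint64 }

func (e EpochSetup) Apply(parent Substrate) (Substrate, error) {
	if e.Counter != parent.EpochCounter+1 || parent.NextEpochSetup {
		return Substrate{}, errors.New("unexpected EpochSetup")
	}
	parent.NextEpochSetup = true // parent is a copy; the caller's value is untouched
	return parent, nil
}

type EpochCommit struct{ Counter uint64 }

func (e EpochCommit) Apply(parent Substrate) (Substrate, error) {
	if e.Counter != parent.EpochCounter+1 || !parent.NextEpochSetup || parent.NextEpochCommitted {
		return Substrate{}, errors.New("unexpected EpochCommit")
	}
	parent.NextEpochCommitted = true
	return parent, nil
}

// Store keeps each distinct substrate once and maps blocks to substrate IDs.
type Store struct {
	byID    map[string]Substrate
	byBlock map[string]string
}

func NewStore() *Store {
	return &Store{byID: map[string]Substrate{}, byBlock: map[string]string{}}
}

// Put indexes the substrate for a block; identical substrates share one entry.
func (st *Store) Put(blockID string, s Substrate) string {
	id := s.ID()
	st.byID[id] = s
	st.byBlock[blockID] = id
	return id
}

// Distinct reports how many different substrates are stored.
func (st *Store) Distinct() int { return len(st.byID) }

// Verify re-applies the updates to the parent substrate and checks the claimed commitment.
func Verify(parent Substrate, updates []Update, claimedID string) (Substrate, error) {
	next := parent
	for _, u := range updates {
		n, err := u.Apply(next)
		if err != nil {
			return Substrate{}, fmt.Errorf("invalid update: %w", err)
		}
		next = n
	}
	if next.ID() != claimedID {
		return Substrate{}, errors.New("substrate commitment mismatch")
	}
	return next, nil
}

func main() {
	parent := Substrate{Identities: []string{"A", "B"}, EpochCounter: 7}
	st := NewStore()
	st.Put("block-1", parent)
	st.Put("block-2", parent) // unchanged substrate: no second copy is stored

	updates := []Update{EpochSetup{Counter: 8}, EpochCommit{Counter: 8}}
	proposed, _ := updates[0].Apply(parent)
	proposed, _ = updates[1].Apply(proposed)
	_, err := Verify(parent, updates, proposed.ID())
	st.Put("block-3", proposed)
	fmt.Println(st.Distinct(), err) // prints: 2 <nil>
}
```

Because the substrate rarely changes within an epoch, most blocks in such a scheme would point at an already stored entry. Returning an error for an out-of-order event is a simplification of this sketch; how invalid service events should be handled is a design question this example does not settle.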
Out of scope
- Beyond EpochSetup and EpochCommit, we do not implement any other identity-changing operations.
This means that the ‘Dynamic Protocol State’ work stream provides low-level primitives for implementing slashing later.
Timeline & Milestones
The following is a rough outline for the entire work stream.
- Completion of Design and scoping
- MVP (not fit for mainnet)
  Seeing the light at the end of the tunnel; might still miss some features necessary for mainnet, but it already covers the core changes. At this point, we will have a more reliable estimate when the first mainnet deployment will be possible.
- First version suitable for mainnet deployment
  Contains all features necessary for mainnet (allowed to still contain notable technical debt; integration testing not included).
- Integration Testing
  At the end of this milestone, we will have a first version deployable to mainnet (actual deployment not included).
- First production version
  The resulting version cleans up all significant technical debt.
Riskiest Assumptions
Conceptually, the problem is well understood and we have high confidence that the outlined direction will lead to a mature solution.
However, there are many details that still need to be worked through. This work stream has a notable research component. The implementation will probably be intricate, with many requirements to consider. In all likelihood, we will encounter implementation challenges and technical debt that needs to be cleaned up. Hence, by nature of the work stream, timelines have high uncertainty. I recommend applying a factor of 3x for translating the scoped work to projected timelines.
Key & Secondary Metrics
- Primary metric: milestones completed
- Secondary metric: number of story points completed toward the next milestone
Anti-Goals
- slashing