
Network Team Update

Marcin Szamotulski
Marcin Wójtowicz

Overview of sprints 100 and 101

Cardano Incident

On November 12th, Cardano experienced a fork due to a bug in ledger (de-)serialisation. Some nodes accepted an invalid transaction while others rejected it, which split the chain.

The network team worked closely with the other Cardano teams, Intersect, Cardano Foundation, Emurgo, and stake pool operators to monitor and resolve the incident.

The network layer was not affected by the incident, and its resiliency played a role in the recovery of the network. We identified some areas that can further improve the network layer's robustness in such situations, and we'll be working on addressing these issues.

Churn Mitigation

We rolled out a churn mitigation, included in the latest cardano-node-10.5 and 10.6 releases. The change ensures that hot peers are churned at least as fast as established peers, and established peers at least as fast as known peers. This avoids a situation where, over a long period of time, the established set accumulates already-tried hot peers and the cold set accumulates already-tried established peers. See #5238 for more details.
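The ordering constraint on churn speeds can be illustrated with a small sketch. This is not the cardano-node implementation. The function name `enforce_churn_ordering`, the representation of churn speed as "peers replaced per hour", and the choice to raise the faster tier (rather than slow down the slower one) are all assumptions made for this example.

```python
def enforce_churn_ordering(hot, established, known):
    """Return churn rates adjusted so that hot >= established >= known.

    All three arguments are hypothetical rates (peers replaced per hour).
    Assumption: a violated constraint is repaired by raising the rate of
    the more active tier, never by lowering the less active one.
    """
    # Established peers must churn at least as fast as known peers.
    established = max(established, known)
    # Hot peers must churn at least as fast as established peers.
    hot = max(hot, established)
    return hot, established, known
```

With rates that already satisfy the ordering, the function returns them unchanged; otherwise the faster tiers are lifted just enough to restore it.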

The issue was identified using CF's cardano-ignite tool, with analysis provided by Karl Knutsson.

DMQ-Node

We are setting up a public repository to host the dmq-node codebase.

We removed the KES evolution configuration; as a consequence, the genesis file option is no longer needed in the dmq-node configuration. See #5244.

Ouroboros Leios development

Leios Demo

We have contributed improvements to the consensus team's Leios demo, which includes minimal prototypes of a few mini-protocols and a simple network emulation layer for exchanging data between a Leios server and patched cardano nodes. Our updates addressed unrealistic results that stemmed from using toxiproxy to model bandwidth and delays. We have also improved our packet capture tooling and gathered some preliminary data for analysis.

Based on this early investigation, we have identified a few areas where non-trivial changes may be needed in the network stack to support the new protocol, as well as new features that could be introduced to handle the greatly expanded traffic requirements while maintaining the timeliness guarantees of the base Praos protocol. More details were provided at the recent November Leios demo presentation, and further work will continue along these lines.

Server-side re-ordering for a request-response mini-protocol

Leios requires support for Freshest-First delivery. To that end, we started working on a prototype implementation of a request-response mini-protocol that allows server-side re-ordering while maintaining the typed-protocols safety guarantees of deadlock and livelock freedom.
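The idea of server-side re-ordering can be sketched with a toy model. This is only an illustration of freshest-first scheduling, not the prototype mini-protocol. The class `FreshestFirstServer`, its methods, and the integer `freshness` value (for example, a slot number) are hypothetical, and the sketch does not model the typed-protocols guarantees at all.

```python
import heapq


class FreshestFirstServer:
    """Toy server that answers queued requests freshest-first.

    Assumption: each request carries an integer `freshness` (larger means
    newer) and a `request_id` that lets the client match responses that
    arrive out of request order.
    """

    def __init__(self):
        # heapq is a min-heap, so freshness is negated to pop the newest first.
        self._pending = []

    def receive(self, request_id, freshness):
        """Queue a request without answering it yet."""
        heapq.heappush(self._pending, (-freshness, request_id))

    def respond_next(self):
        """Return the id of the freshest pending request, or None if idle."""
        if not self._pending:
            return None
        _, request_id = heapq.heappop(self._pending)
        return request_id


server = FreshestFirstServer()
server.receive("req-a", freshness=10)
server.receive("req-b", freshness=12)
server.receive("req-c", freshness=11)
order = [server.respond_next() for _ in range(3)]
```

Here the requests arrive as a, b, c but are answered as b, c, a, because the server is free to pick the freshest outstanding request each time it responds.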

Peer Selection Improvements

Refined peer selection for local root peers behind firewalls: instead of polling, it now waits for incoming connections and reuses them outbound. See #5241 for details.
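As a rough illustration of the behaviour described above, the decision can be sketched as a small function. The function name, the action labels, and the `behind_firewall` flag are hypothetical and chosen for this example only; the actual logic is in #5241.

```python
def plan_local_root(peer, behind_firewall, inbound_connections):
    """Pick how to reach a local root peer.

    Returns one of three hypothetical action labels:
      "reuse-inbound":    the peer already connected to us, reuse that connection
      "await-inbound":    the peer is behind a firewall, wait for it to connect
      "connect-outbound": ordinary case, open a connection ourselves
    """
    if peer in inbound_connections:
        return "reuse-inbound"
    if behind_firewall:
        # Assumption: no polling; the peer is left alone until it dials in.
        return "await-inbound"
    return "connect-outbound"
```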