Stake-Driven Data Diffusion Release for Relays
The IOG networking team decided to release Stake-Driven Data Diffusion with
Robust Optimised Peer Selection, more commonly known as P2P.  In the last
update we reported a performance regression; it turns out that it only affects
block producers, so we strongly advise against running P2P on such nodes.
Further investigation is required to find its cause.
On IOG's benchmarking cluster we have seen a good performance improvement in
block propagation itself.  The cluster runs a static topology with valency 6
(each node is connected to 6 other nodes), in which all 50 nodes are block
producers.  The setup of this network is the same as mainnet.  We've seen a
40-50% improvement in block propagation compared to the same cluster deployed
with the same topology but using non-P2P nodes.  We think this improvement is
caused by the use of full duplex connections: quite likely the transaction
traffic flowing in both directions on the same TCP connection helps to keep the
TCP window open.  Note that in a cluster of 50 nodes with valency 6 the
probability of having at least one duplex connection is more than 50%.  We
don't expect the same improvement on mainnet because the network is much wider
and the transaction traffic is not as large.
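The "more than 50%" figure can be reproduced with a back-of-the-envelope calculation.  The sketch below is not taken from the benchmarking setup; it assumes the figure is meant per node, and that every node picks its 6 outbound peers uniformly at random and independently of the others (the cluster's static topology may have been built differently).  The function name is made up for illustration.

```python
from fractions import Fraction


def p_at_least_one_duplex(nodes: int, valency: int) -> Fraction:
    """Probability that a given node has at least one duplex connection.

    Assumed model (not from the source): each node connects to `valency`
    distinct peers chosen uniformly at random among the other `nodes - 1`
    nodes, independently of every other node's choice.
    """
    # Chance that one particular peer we connect to also connects back to us.
    p_back = Fraction(valency, nodes - 1)
    # A node has `valency` outbound connections; none of them is duplex only
    # if none of those peers connects back.
    return 1 - (1 - p_back) ** valency


p = p_at_least_one_duplex(nodes=50, valency=6)
print(f"{float(p):.3f}")
```

Under these assumptions the result is 1 - (43/49)^6, roughly 0.54, which is consistent with the claim above.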
Just before the release we squashed three small issues:
- issue #4163 - top level integration bug in keep-alive;
- issue #4177 - a bug in outbound-governor;
- PR #4165 - a fix for cardano-ping support of NodeToNodeV_10.
Peer Sharing
We were carrying out a review of the peer sharing PR.
DeltaQ
Neil Davies was invited to give a guest lecture entitled Avoiding System Catastrophes at UCLouvain.
What we achieved last sprint
- issue #4163: we found out that a control message was not passed to the
keep-alive mini-protocol; as a result every demotion ran into the demotion
timeout rather than terminating gracefully.  With the fix the node will no longer log:
  { "kind": "PeerStatusChangeFailure"
  , "peerStatusChangeType": "WarmToCold (ConnectionId {localAddress = 192.168.0.10:7000, remoteAddress = 3.129.186.40:3000})"
  , "reason": "TimeoutError"
  }
 
 
- issue #4177: we fixed an assertion failure in the outbound-governor; we no
longer try to demote peers which are already being demoted.
 
- PR #4155: we refactored the ouroboros-network packages.  There's a top level
ouroboros-consensus-diffusion package which integrates network & consensus
code.  We also introduced:
 - ouroboros-network-api package which contains the API shared between
   network & consensus;
 - ouroboros-network-mock package which contains a mock API used for testing
   (e.g. a mock chain & chain producer, etc.);
 - ouroboros-network-protocols package which contains the implementation of
   all (but handshake) mini-protocols, exposes a testlib and contains test and
   cddl components.
 This made the dependency tree of the network & consensus packages much
 cleaner.
 
- PR #4169: we described the usage of release branches in the CONTRIBUTING.md doc.
 
- PR #4165: we fixed cardano-ping support of the NodeToNodeV_10 protocol.
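To check whether a node is still hitting the demotion timeout described under issue #4163 above, one could count matching entries in its log.  The following is a minimal sketch, not a tool from the repository.  It assumes the log has one JSON object per line with the three fields of the quoted entry at the top level; a real node log may nest these fields, in which case the lookup has to be adapted.  The function names and the non-matching sample entries are hypothetical.

```python
import json


def is_demotion_timeout(entry: dict) -> bool:
    """True for entries shaped like the WarmToCold timeout failure quoted above."""
    return (
        entry.get("kind") == "PeerStatusChangeFailure"
        and str(entry.get("peerStatusChangeType", "")).startswith("WarmToCold")
        and entry.get("reason") == "TimeoutError"
    )


def count_demotion_timeouts(lines) -> int:
    """Count matching entries in an iterable of log lines (assumed JSON lines)."""
    count = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore lines that are not JSON
        if isinstance(entry, dict) and is_demotion_timeout(entry):
            count += 1
    return count


# The first line mirrors the entry quoted above; the other two are made up.
sample = [
    '{"kind": "PeerStatusChangeFailure", "peerStatusChangeType": '
    '"WarmToCold (ConnectionId {localAddress = 192.168.0.10:7000, '
    'remoteAddress = 3.129.186.40:3000})", "reason": "TimeoutError"}',
    '{"kind": "SomethingElse"}',
    "not json",
]
print(count_demotion_timeouts(sample))
```

On a node running the fix, the count for newly written log lines would be expected to stay at zero for this particular failure.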
 
DeltaQ
The abstract of the talk:
An essential step to ensuring that distributed systems are fit for
purpose.
Distributed systems have become an integral part of our society and
daily lives. We are, both implicitly and explicitly, individually as well as
collectively, placing ever more trust in them.
Are they worthy of this trust?  Our need for them to be ‘fit-for-purpose’ goes
well beyond notions of functional correctness (i.e. never getting the wrong
answer). We need them to deliver the desired outcomes in a timely, robust,
reliable, resilient fashion, at scale and in a sustainable way (both
economically and environmentally).
This all sounds like a worthy aspiration, but what would be a practical
approach to capturing and reasoning about these issues? How can we ensure that
systems can meet their fit-for-purpose objectives, not just in their design but
as they are deployed, encounter the imperfect world, are scaled to become
economic, and proceed into ongoing maintenance?
This talk will illustrate how the notions of Outcomes and Quality Attenuation
(as captured by ‘∆Q’) are being used to both frame the necessary notions and
provide a basis for assuring the refinement and reification of such systems,
from initial concept to operational infrastructure.
You can download the slides from here.