
59 posts tagged with "network"


· 2 min read
Marcin Szamotulski

High-level summary

In the last sprint we received a performance report from the P2P performance-testing cluster, which consists of 50 nodes. It shows a performance regression in the header notification metric. Because the P2P cluster is constructed with the same topology as the non-P2P reference one, this indicates a regression which needs to be investigated further. This poses a risk for releasing P2P.

We also continued to work on peer sharing: pull #4019.

We continued working on dynamic block production, which is required for the P2P release for block-producing (BP) nodes: pull #3159.

We simplified the P2P topology format: issue #4559, pull #3888.

We added a new trace point for asynchronous demotions of local peers with Warning severity. This trace is important for SPOs.

Detailed description

Performance regression

Below we include a graph which shows the performance regression of the P2P code base versus the non-P2P one.

On the x axis is time in seconds which measures the delay from the start of the slot to when a header was received. The y axis is the percentile of nodes that received a header. We are currently investigating possible causes of the regression.
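To make the reading of such a graph concrete, here is a minimal sketch of how a curve like this can be computed from per-node header delays. The function names and the sample values are hypothetical and are not taken from the performance report.

```python
import math
from bisect import bisect_right

def fraction_received_by(delays, t):
    """Fraction of nodes whose header arrived within t seconds of slot start."""
    ordered = sorted(delays)
    return bisect_right(ordered, t) / len(ordered)

def delay_at_percentile(delays, p):
    """Smallest observed delay by which at least p percent of nodes had the header."""
    ordered = sorted(delays)
    # Number of nodes needed to reach p percent, at least one.
    k = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[k - 1]

# Made-up delays in seconds for ten nodes (illustration only, not measurements).
sample = [0.21, 0.35, 0.18, 0.52, 0.40, 0.95, 0.30, 0.27, 0.61, 0.33]

print(fraction_received_by(sample, 0.35))
print(delay_at_percentile(sample, 90))
```

Comparing `delay_at_percentile` for the same percentile across two clusters is one simple way to quantify a shift between two such curves.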

New P2P topology format

The new topology file format is described in issue #4559.

Tracing improvements

  • We improved handshake error reporting, pull #4136.
  • We added TraceDemoteLocalAsynchronous, rendered as DemoteLocalAsynchronous in JSON format, pull #4127. Such demotions should be investigated by the pool operator. They can indicate a problem in the deployed system, but they could also indicate a remote problem in arranged connections with other SPOs.
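As an illustration of how an operator might pick these demotions out of a log, here is a minimal sketch. It assumes the node writes one JSON object per line, which is an assumption about the logging setup; only the string DemoteLocalAsynchronous comes from the text above, and the function name is hypothetical.

```python
import json

def find_async_demotions(lines, marker="DemoteLocalAsynchronous"):
    """Return the log entries that mention an asynchronous local demotion."""
    hits = []
    for line in lines:
        if marker not in line:
            continue
        try:
            # Assumed log shape: one JSON object per line.
            hits.append(json.loads(line))
        except json.JSONDecodeError:
            # Keep unparseable lines too, so that nothing is silently dropped.
            hits.append({"raw": line})
    return hits
```

Given an open log file `f`, `find_async_demotions(f)` returns the matching entries for manual inspection.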

Open Source Improvements

We improved documentation of io-sim and typed-protocols for open-source contributors and/or maintenance tasks: pull #22, pull #45, pull #48.

· 3 min read
Marcin Szamotulski

High-level summary

The team has focused on debugging and fixing bugs for the P2P single relay release, which included:

  • diagnosing, fixing and writing tests for a bug in peer-state-actions which fortunately hasn't been released;
  • diagnosing and preventing misconfiguration of DNS.

We also focused on developing peer sharing. We also held a session with the scientists on eclipse evasion.

Detailed description

P2P Network Stack

During the past two weeks the team focused on the P2P single relay release and peer sharing. We found and fixed an important bug recently introduced in one of the components of the P2P networking stack (fortunately never released). Together with the fix, we designed a unit test in the diffusion simulation as well as a QuickCheck property test, both of which can reproduce the bug. We also changed the code so that if such a bug is reintroduced in the future, it will be easy to diagnose. For more see:

An initial benchmarking run of the P2P code was executed. The results were unlike what we see on mainnet. We found a possible misconfiguration of the cluster (caused by a 0 TTL on domain names), which could be the direct cause. We wrote a PR which rules out such a misconfiguration. We are awaiting the next benchmarking results. See more at:

ouroboros-network#4106
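One common way to make a resolver robust against a 0 TTL is to clamp the TTL into a sane range before scheduling the next lookup. The sketch below shows that idea only; the post does not describe how the PR achieves this, and the function name and bounds are hypothetical.

```python
def effective_ttl(ttl_seconds, floor=60, ceiling=86400):
    """Clamp a DNS TTL so that a 0 TTL cannot trigger constant re-resolution.

    The floor and ceiling are illustrative values, not the ones used by the node.
    """
    return max(floor, min(ceiling, ttl_seconds))

print(effective_ttl(0))     # falls back to the floor
print(effective_ttl(300))   # a reasonable TTL is kept unchanged
```

With a floor in place, a record published with a 0 TTL is re-resolved at most once per floor interval instead of on every use.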

We also started working on the P2P single relay release. The PR ouroboros-network#4120 includes 108 patches cherry-picked from the master branch. We started integrating these changes against the release branch of cardano-node. Early next week we ought to have an early version of cardano-node with non-experimental P2P support!

For a more detailed release plan, please see the P2P - Single Relay issue.

Consensus

We identified and fixed missing error reporting in the consensus initialisation phase. See more at ouroboros-network#4015.

Cardano Node

We also made changes in cardano-node to give node operators a better experience. This includes updating the severities of some of the traces as well as implementing the new format of the P2P topology file. For more see:

Peer Sharing

We continued working on the implementation of peer sharing. We have an early implementation which will be reviewed and analysed in the coming weeks. We started working on the cardano-node integration. We need PR #4392 to be merged before such an integration can land in cardano-node, although this is not blocking us currently. See more at:

Eclipse Evasion

We held a session which included Alexander Russel, Sandro Coretti-Drayton and Nick Frisby from the consensus team. We discussed the high-level design of the eclipse evasion scheme, which is important for the design and implementation of ouroboros-genesis. We got positive feedback from the researchers.

IO-Sim

In this period we made little progress towards releasing IO-Sim on Hackage: the only change was a single PR which added a few missing instances for the STM monad.

Open Source

We made sure that CI runs for PRs which come from forks (which is important for accepting contributions from third parties).

Mithril Cardano Integration

We held initial discussions with Arnaud Bailly about a possible path to integrate Mithril into cardano-node and take advantage of the ouroboros-network diffusion layer.

· 3 min read
Marcin Szamotulski

Network Update

Ouroboros Network

Ouroboros Consensus

  • Recently we found out that consensus does not log exceptions thrown during initialisation. This was fixed in PR input-output-hk/ouroboros-network#4015. As part of this pull request we also made all exceptions rethrown by the connection handler thread wrapped in ExceptionInHandler.
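The wrapping idea can be sketched as follows. This is a Python analogue for illustration, not the Haskell code from the PR: only the name ExceptionInHandler comes from the text above, while its fields and the `run_handler` helper are assumptions.

```python
class ExceptionInHandler(Exception):
    """Wraps an exception that escaped a connection handler (fields are assumed)."""

    def __init__(self, remote_address, original):
        super().__init__(f"exception in handler for {remote_address}: {original!r}")
        self.remote_address = remote_address
        self.original = original

def run_handler(remote_address, handler):
    """Run a handler; rethrow any failure wrapped, so its origin stays visible."""
    try:
        return handler()
    except Exception as exc:
        # 'from exc' keeps the original traceback attached as __cause__.
        raise ExceptionInHandler(remote_address, exc) from exc
```

A caller that catches `ExceptionInHandler` can tell at once that the failure came from a connection handler and for which peer, and can still inspect the original error via `.original`.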

Some older items, which were not announced

  • We identified and fixed an issue related to socket activation (socket options were not set for sockets passed through socket activation): PR input-output-hk/cardano-node#3979. This fix will be included in the next cardano-node release.

Cardano Node

  • We extended the NixOS service module so that one can modify the socketPath, runtimeDir, databasePath, traceSocketPathAccept, traceSocketPathConnect and stateDir options: PR input-output-hk/cardano-node#4196.

IO-Sim

We resolved a number of issues ahead of the release of io-sim on Hackage:

See PR #24.

We also improved the experience for contributors to io-sim and typed-protocols by adding issue templates.

Typed Protocols

Input Endorsers Simulation

New features include:

  • Histograms of block arrival frequency, for both network (inbound) and CPU (block validation). These let us check that we are not overloading the CPU block validation capacity or the network link capacity, or alternatively observe the behaviour in an overload situation if we set the block generation rate high enough.

  • Pie chart of utilisation of TCP links. This shows how small a fraction of links are being used at any one time, and shows that once the system "warms up" and is operating stably, most block delivery is ballistic.

  • Showing off the new screen layout combinators, which let us put multiple charts, titles, etc. on screen at once and scale them to whatever screen or video resolution we like without having to tweak numbers (this example is scaled to fit 1080HD video resolution).
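For readers unfamiliar with arrival-frequency histograms, the following minimal sketch shows one way to compute one. It is not the simulator's code; the function name, the fixed-interval binning and the sample times are assumptions made for illustration.

```python
from collections import Counter

def frequency_histogram(arrival_times, interval, duration):
    """Map 'blocks per interval' to the number of intervals with that count.

    arrival_times are non-negative offsets in seconds from the start of the run.
    """
    n_intervals = int(duration // interval)
    # Count the arrivals that fall into each fixed-width interval.
    per_interval = Counter(
        int(t // interval) for t in arrival_times if t < n_intervals * interval
    )
    # Count how many intervals saw 0, 1, 2, ... arrivals.
    hist = Counter(per_interval.get(i, 0) for i in range(n_intervals))
    return dict(sorted(hist.items()))

# Made-up arrival times in seconds (illustration only).
print(frequency_histogram([0.1, 0.4, 1.2, 3.5, 3.6, 3.9], interval=1.0, duration=5.0))
```

A long tail towards high counts in such a histogram would suggest bursts that could exceed validation or link capacity.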

· One min read
Marcin Szamotulski

The networking team took an active part in the project iteration (PI) planning session, see cardano-node backlog for detailed outcomes.

  • We started working on a detailed design / implementation plan for gossip.

  • We merged input-output-hk/ouroboros-network#3859 which sets the ouroboros-network repository for the single relay release.

  • We identified a bug in the network simulator, which is fixed in input-output-hk/ouroboros-network#3852. That PR has been reviewed.

  • We set the tracing configuration for the nodes which we deploy, and identified and fixed some deployment hiccups. We identified some bugs in the RT view, which were registered by the maintainers: input-output-hk/ouroboros-network-ops#4

  • We fixed typos in the network-mux library: input-output-hk/ouroboros-network#3921

  • For ease of debugging we renamed a trace point: input-output-hk/ouroboros-network#3922

  • Duncan iterated on his simulation / visualisation. He was also able to identify and fix a bug in the simulator. The simulation contains 50 nodes. Dashed lines indicate an established connection, while solid lines indicate a TCP connection with a fully open TCP window.