43 posts tagged with "network"

· 2 min read
Marcin Szamotulski

High-level overview of sprint 47

Bootstrap Peers

We continued our review of bootstrap peers, see ouroboros-network#4555.

CI / Tests

We investigated our CI issues. We found a memory leak in a typed-protocols function used for testing codecs, which triggered the out-of-memory (OOM) manager on some platforms (typed-protocols#43). We also found a bug in the connection manager which resulted in CI timeouts (see connection-manager-fix).

KeepAlive client

We found two small issues with the keep-alive client, which were addressed by Karl Knutsson (Cardano Foundation), ouroboros-network#4689.

Galois

We merged two large PRs prepared by Galois:

Cardano Network Service Assurance (CNSA)

Galois made the following progress on CNSA:

  • a simple InfluxDB database backend has been added;
  • the documentation has been updated;
  • internal improvements to the code;
  • progress on a new "CNSA analysis" that provides, for each sampler node, the block download throughput in bytes over time.
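
To make "block download throughput in bytes over time" concrete, here is a minimal sketch of one way such a series could be computed. It is not the CNSA implementation: the sample shape `(node, timestamp, block size)`, the function name `throughput_per_window` and the 10-second window are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical sample shape: (sampler node name, time in seconds, block size in bytes).
Sample = Tuple[str, float, int]


def throughput_per_window(
    samples: Iterable[Sample], window_seconds: float = 10.0
) -> Dict[str, Dict[float, float]]:
    """Group downloaded bytes into fixed time windows, per sampler node.

    Returns {node: {window_start_seconds: bytes_per_second}}.
    """
    totals: Dict[str, Dict[float, int]] = defaultdict(lambda: defaultdict(int))
    for node, timestamp, size in samples:
        # Start of the window this sample falls into.
        window_start = (timestamp // window_seconds) * window_seconds
        totals[node][window_start] += size
    return {
        node: {
            start: total / window_seconds
            for start, total in sorted(windows.items())
        }
        for node, windows in totals.items()
    }


example = [
    ("sampler-a", 0.0, 1000),
    ("sampler-a", 5.0, 3000),
    ("sampler-a", 12.0, 2000),
    ("sampler-b", 1.0, 500),
]
series = throughput_per_window(example)
```

A fixed window keeps the series easy to plot; a sliding window would smooth it at the cost of more bookkeeping.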

New CHaP Release

We cut a new release of ouroboros-network packages to CHaP: chap#547.

More details

CI / Tests

We improved the memory footprint of some of our tests by analysing a stream of IOSim traces without retaining them, see ouroboros-network#4696.
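
The general idea of analysing a stream without retaining it can be sketched as follows. This is a language-agnostic illustration in Python, not the IOSim code from the PR: `TraceEvent`, `simulate` and both checks are hypothetical stand-ins. The point is that each check consumes events one at a time and keeps only a constant amount of state.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class TraceEvent:
    # Hypothetical event shape, for illustration only.
    time: float
    label: str


def simulate(n: int) -> Iterator[TraceEvent]:
    """Stand-in for a simulator: yields events lazily instead of building a list."""
    for i in range(n):
        yield TraceEvent(time=i * 0.5, label="send" if i % 2 == 0 else "recv")


def times_non_decreasing(events: Iterable[TraceEvent]) -> bool:
    """Single-pass property check that remembers only the previous timestamp."""
    previous = float("-inf")
    for event in events:
        if event.time < previous:
            return False
        previous = event.time
    return True


def count_label(events: Iterable[TraceEvent], label: str) -> int:
    """Single-pass count; already-seen events become garbage immediately."""
    return sum(1 for event in events if event.label == label)


ok = times_non_decreasing(simulate(100_000))
sends = count_label(simulate(10), "send")
```

The trade-off is that each pass consumes the stream, so a second analysis needs either a fresh run or a combined fold over the same pass.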

As a safety measure, we introduced an upper bound for heap memory used by test artefacts in our nix tests. We use a 200MB limit for all tests except the network-mux tests, which use a 350MB limit, see ouroboros-network#4702.

We refactored one of our tests to use ephemeral ports, which allows it to run concurrently, see ouroboros-network#4702.
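
As a reminder of why ephemeral ports help here: binding to port 0 asks the operating system to pick a free port, so two test instances never compete for a hard-coded one. The sketch below shows the idea with Python's standard `socket` module. It is not the test code from the PR, and `listen_on_ephemeral_port` is a hypothetical helper name.

```python
import socket
from typing import Tuple


def listen_on_ephemeral_port(host: str = "127.0.0.1") -> Tuple[socket.socket, int]:
    """Open a listening TCP socket on a port chosen by the operating system."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind((host, 0))  # port 0: let the OS choose a free port
    server.listen(1)
    port = server.getsockname()[1]  # read back the port that was assigned
    return server, port


# Two listeners can coexist because neither one hard-codes its port.
server_a, port_a = listen_on_ephemeral_port()
server_b, port_b = listen_on_ephemeral_port()

# A client learns the port from the listener rather than from a constant.
client = socket.create_connection(("127.0.0.1", port_a), timeout=5)
client.close()
server_a.close()
server_b.close()
```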

We merged ouroboros-network#4623 which fixes a bunch of test failures.

All of them were due to a bug in test logic rather than a bug in production code.

Release Process

We updated our release process & associated scripts, see ouroboros-network#4705.

· One min read
Marcin Szamotulski

High-level overview of sprint 46

Bootstrap Peers

We continued reviewing bootstrap peers, see ouroboros-network#4555.

Towards Typed Protocols 0.2.0.0

We diagnosed the performance regression of the new design. The work on typed-protocols will be postponed. For more details, see typed-protocols#3. As an outcome of the performance debugging, we prepared a PR which updates the demo-ping-pong and demo-chain-sync applications.

Peer Sharing

We made progress in the review of ouroboros-network#4644, which simplifies peer sharing and fixes the ouroboros-network#4642 issue.

Tech Debt

We reviewed the ouroboros-network#3836 PR which inspects all the uses of error in ouroboros-network. The PR was prepared by Galois.

· 3 min read
Marcin Szamotulski

High-level overview of sprint 45

Bootstrap Peers

We started reviewing the bootstrap peers PR, ouroboros-network#4615.

Towards Typed Protocols 0.2.0.0

We discovered a performance regression when using typed-protocols-0.2.0.0, and we started investigating where it comes from. Currently, we see that typed-protocols-0.2.0.0 can outperform typed-protocols-0.1.0.0 when running in isolation with a simple ping-pong protocol, so the regression might be in the new block fetch implementation which comes with typed-protocols-0.2.0.0. See typed-protocols#3.

Tech Debt

We merged two PRs written by Galois engineers:

  • a pull request which refactors the main entry function for P2P, see ouroboros-network#3834;
  • a pull request which reviews the usage of unsafe functions in the network code base.

Galois also made progress with the following two issues:

IO-Sim

IOSimPOR

We found and fixed a bug in IOSimPOR. We'd like to thank Prof. John Hughes (Quviq AB) for helping us with debugging the issue.

We also provided a more uniform API for IOSimPOR, and added ways to make debugging similar problems easier in the future.

Technical Details on IOSim refactoring
We removed the usage of `unsafePerformIO` from `IOSimPOR`, which also means removing the parallel evaluation of discovered races. We found that parallel evaluation gave only 25% better performance. In the future, QuickCheck will offer running different test cases in parallel, which should provide better performance: there are no dependencies between the evaluation of different test cases, whereas schedules are discovered while running, which limits the possible gains from running them concurrently. Performance was not the only factor, though. When using parallelism in the lazy `ST` monad, we would need to rely on the memory guarantees of `STRef`s. In `GHC-9.6` they share their implementation with `IORef`s, but that might not be the case in the future.

IOSim

To prepare for the next release, we are consolidating packages, taking advantage of the public sublibraries now supported by both cabal and Hackage. This is a work in progress, see io-sim#114.

Cardano Network Service Assurance

Galois made the following progress:

  • A test run of spinning up a CNSA instance was done; as a result, the documentation was updated.
  • Based on the IOG code review of the CNSA code, updates to the CNSA code were made.
  • Galois has started the design for adding a CNSA analysis for "fetched bytes over time while node is syncing".

P2P adoption

In the last two weeks, we've seen an increase in P2P adoption.

[Figure: P2P relays]

The following graphs show several different versions of relays running on the mainnet. The green line NodeToNodeVersionV10.True denotes P2P relays.

[Figure: node versions]

Open Source

We upstreamed our FFI bindings for Windows named pipes to the Win32 package; the PR was accepted and merged.

We also received an external contribution which enhanced our documentation, see ouroboros-network#4676.

· 2 min read
Marcin Szamotulski

High-level overview of sprint 44

Bootstrap Peers

In this sprint, we focused on developing bootstrap peers.

Thanks to the input from Samuel Leathers (IOG) and John Lotoski (IOG), we identified a possible improvement to bootstrap peers. A more detailed description is available here.

Cardano-Node-8.4.0 Release

We were also responsible for the cardano-node-8.4.0-pre release. A final integration PR is currently being merged. We published new versions of ouroboros-consensus, cardano-api and cardano-cli.

Towards Typed Protocols 0.2.0.0

We also updated the future typed-protocols-0.2.0.0 and its integration with cardano-node. This works towards the goal we planned for the next quarter. The identified tasks are to fix breaking tests, and then to measure and address possible performance regressions.

Tech Debt

Mark Tullsen (Galois) submitted two more PRs: ouroboros-network-#4663, ouroboros-network-#4664. We provided feedback on their other pull requests: ouroboros-network-#4661 and ouroboros-network-#4660.

P2P adoption

In the last two weeks, there was a regression in P2P adoption as measured by the number of SPOs or by stake, although the overall number of P2P relays has increased. Karl Knutsson (Cardano Foundation) is investigating this issue.

[Figure: P2P relays]

The following graphs show several different versions of relays running on the mainnet. The green line NodeToNodeVersionV10.True denotes P2P relays, which slowly increase over time. V9 and earlier versions of the node-to-node protocol indicate node versions 1.35.x or earlier.

[Figure: node versions]

Data has been kindly provided by Cardano Foundation and their mainnet monitoring infrastructure.

Open Source

We are in the process of upstreaming our FFI bindings for the Windows Named Pipes API to the Win32 package, see win32-220.

· 3 min read
Marcin Szamotulski

High-level overview of sprint 43

In this sprint, we received contributions from CF & Galois. Karl Knutsson (CF) addressed various issues regarding peer churning in P2P, timeouts and our WireShark dissector, while the Galois developers focused on addressing issues from their review last year. See below for more details.

We continued working on bootstrap peers, see ouroboros-network-#4661.

We refactored our test suites: they are split into io-tests, which need to be run natively on all platforms (these are mostly tests that require IO system calls), and sim-tests, which are platform independent. We run io-tests natively on all supported platforms (e.g. x86_64-linux, x86_64-darwin, aarch64-darwin and x86_64-w64-mingw32 (Windows)). The sim-tests are not executed on Windows due to memory limitations on GitHub Actions runners. See ouroboros-network-#4653.

We also started rebasing typed-protocols refactoring branches.

Marcin was appointed as the cardano-node release engineer for the 8.4.0-pre version. So far he has integrated cardano-ledger-conway-1.8 and ouroboros-network-0.9.1.0 into ouroboros-consensus, cardano-cli and cardano-api. Once we have an integration branch for cardano-node, the cardano-ledger-conway-1.8 and ouroboros-consensus packages can be released to CHaP, and PRs can be merged once they go through review & CI.

We also fixed two smaller issues regarding peer sharing (both were discovered by Karl from CF). More details are included below.

Progress on P2P adoption

SPO relays

There are currently ~2000 relays running P2P-enabled nodes that belong to 557 pools with a combined stake of 7900Mil Ada. On the 16th of August there were ~1700 relays belonging to 531 pools with a combined stake of 7700Mil Ada.

P2P relays

The following graphs show several different versions of relays running on the mainnet. The green line NodeToNodeVersionV10.True denotes P2P relays, which slowly increase over time. V9 and earlier versions of the node-to-node protocol indicate node versions 1.35.x or earlier.

[Figure: node versions]

Data has been kindly provided by CF and their mainnet monitoring infrastructure.

IOG relays

As of this week, 90% of IOG relays are running a P2P setup. In the next sprint all IOG relays will be running P2P.

Detailed description

In this sprint, we got a few contributions from CF:

  • Karl made the peer churning mechanism less aggressive, see ouroboros-network-#4656;
  • he added timeouts for idle states in the ChainSync & KeepAlive mini-protocols. These timeouts help a node remove idle connections from the responder (server) side, see ouroboros-network-#4648; and
  • he improved the WireShark dissector by adding support for the peer-sharing mini-protocol, see ouroboros-network-#4656.
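
The idle-state timeouts mentioned above can be pictured with a small sketch: a responder waits for the next message, and if nothing arrives within the limit it stops, which lets the server side drop the connection. This is an illustration using Python's `asyncio`, not the ouroboros-network implementation. The `responder` function, the message values and the 0.05 second limit are assumptions made for the example.

```python
import asyncio
from typing import Tuple


async def responder(inbox: asyncio.Queue, idle_timeout: float) -> Tuple[str, int]:
    """Handle messages until the peer says "done" or stays idle for too long.

    Returns the reason for stopping and the number of handled requests.
    """
    handled = 0
    while True:
        try:
            # Idle state: wait for the next message, but only for a limited time.
            message = await asyncio.wait_for(inbox.get(), timeout=idle_timeout)
        except asyncio.TimeoutError:
            # The peer stayed idle; the caller can now close the connection.
            return ("idle-timeout", handled)
        if message == "done":
            return ("done", handled)
        handled += 1


async def demo() -> Tuple[str, int]:
    inbox: asyncio.Queue = asyncio.Queue()
    await inbox.put("request")  # one request, then the peer goes silent
    return await responder(inbox, idle_timeout=0.05)


result = asyncio.run(demo())
```

Placing the timeout on the responder side means a server does not depend on well-behaved clients to free its resources.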

Galois has been making progress in addressing some of the issues they raised in their review (last year):

Peer Sharing

  • Light peer sharing is only enabled when peer sharing is turned on ouroboros-network-#4652;
  • Handshake incorrectly reports peer sharing value. It's supposed to relay the remote value, but instead, it returns the local value. ouroboros-network-#4642 (in review).

Async Demotion Test Fix

  • We fixed an async demotion test failure which turned out to be a weakness of the test itself rather than a bug in the connection manager. ouroboros-network-#4655