· 4 min read
Michael Karg

High level summary

  • Benchmarking: Release benchmarks for Node 9.1; UTxO-HD in-memory benchmarks; typed-protocols feature benchmarks.
  • Development: Correct resource trace emission for CPU 85% spans metric. Governance action benchmarking still under development.
  • Workbench: Preparations for bumping nixpkgs. Started removal of the container-based podman backend. Support GHC9.8 nix shells.
  • Infrastructure: Test and validate an upcoming change in node-to-node submission protocol.
  • Tracing: cardano-tracer: Support of non-systemd Linux was merged; safe restart of internal monitoring servers.

Low level overview

Benchmarking

We've run and analyzed a full set of release benchmarks for Node version 9.1. Comparing with the mainnet release 9.0, we could not observe any performance regression.

Additionally, we've performed feature benchmarks for an upcoming new API for typed-protocols. Those did not exhibit any regression either in comparison with the baseline using the current API.

Furthermore, we've performed various benchmarks for the UTxO-HD in-memory backend on Node versions 9.0 and 9.1. Based on those observations, a rare race condition could be eliminated, in which block producers occasionally failed to fork off a thread for the forging loop. The overall network performance of the UTxO-HD in-memory backend shows a slight improvement over the regular node, but currently comes with slightly increased RAM usage.

Development

We've spotted an inconsistency in one of our benchmarking metrics - CPU 85% spans - which measures the average number of consecutive slots where CPU usage spikes to 85% or higher (however short the spike itself might be). There was a difference between the legacy tracing system (which yielded the correct value) and the new one, for which a fix has already been devised.
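To make the definition of the metric concrete, here is a minimal sketch of how such a value could be computed. It assumes the input is one peak CPU percentage per slot; the function names and the sample data are hypothetical and do not reflect the actual implementation in either tracing system.

```python
from typing import List


def cpu_spans(slot_cpu_peaks: List[float], threshold: float = 85.0) -> List[int]:
    """Return the length of every run of consecutive slots at or above the threshold.

    Assumption: each entry is the peak CPU usage (in percent) seen within one slot,
    so even a very short spike marks the whole slot.
    """
    spans = []
    current = 0
    for peak in slot_cpu_peaks:
        if peak >= threshold:
            current += 1          # the run continues
        elif current:
            spans.append(current)  # the run just ended
            current = 0
    if current:                    # a run that lasts until the final slot
        spans.append(current)
    return spans


def average_span(slot_cpu_peaks: List[float], threshold: float = 85.0) -> float:
    """Average run length in slots; 0.0 when the threshold is never reached."""
    spans = cpu_spans(slot_cpu_peaks, threshold)
    return sum(spans) / len(spans) if spans else 0.0


# Made-up sample data: three runs of length 2, 1 and 3.
peaks = [40.0, 90.0, 88.0, 30.0, 85.0, 20.0, 99.0, 97.0, 91.0]
print(cpu_spans(peaks))     # [2, 1, 3]
print(average_span(peaks))  # 2.0
```

Because the metric counts whole slots, two systems that sample or attribute CPU usage to slots differently can report different averages for the same run.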

The implementation of Conway governance action workloads for benchmarking is ongoing.

Workbench

With a nixpkgs bump on the horizon, we're adjusting and testing how the workbench uses packages that change their status, lose their support, or require pinning to a specific version.

Additionally, we'll remove a container-based backend for workbench, which ties in OCI image usage on podman with Nomad. It was a precursor to the current Nomad backend, which is containerless and can directly build Nomad jobs using nix.

Last but not least, we've merged a small PR which enables our workbench to build nix shells with GHC9.8, as this pulls in not only the compiler, but also much of the Haskell development toolchain. The correct version couplings between compiler and toolchain components are now declared explicitly from GHC8.10.7 up to GHC9.8.

Infrastructure

We've tested and validated an upcoming change in ouroboros-network which requires any node-to-node submission client to hold the connection for at least one minute before it can submit transactions. The change works as expected and does not interfere with special functionality required by benchmarking.
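From a client's point of view, the rule can be pictured as a simple time gate. The sketch below is purely illustrative: the class name, its methods and the injectable clock are hypothetical, and it says nothing about how ouroboros-network enforces the rule internally.

```python
import time

MIN_HOLD_SECONDS = 60.0  # "at least one minute", taken from the description above


class SubmissionGate:
    """Hypothetical client-side helper: tracks how long a connection has been held."""

    def __init__(self, clock=time.monotonic):
        # A monotonic clock is not affected by wall-clock adjustments.
        self._clock = clock
        self._connected_at = clock()

    def may_submit(self) -> bool:
        """True once the connection has been held for the minimum duration."""
        return self._clock() - self._connected_at >= MIN_HOLD_SECONDS

    def seconds_remaining(self) -> float:
        """Time left until submission becomes possible (never negative)."""
        elapsed = self._clock() - self._connected_at
        return max(0.0, MIN_HOLD_SECONDS - elapsed)


# Usage with a fake clock, so the example does not actually wait a minute.
fake_now = [0.0]
gate = SubmissionGate(clock=lambda: fake_now[0])
print(gate.may_submit())         # False
fake_now[0] = 61.0
print(gate.may_submit())         # True
```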

Tracing

The trace consumer service for the new tracing system used to require systemd on Linux to build and operate. There are, however, Linux environments that choose to not use systemd. It is now possible to configure the desired flavour of that service, cardano-tracer, at build time, thus adding support for those Linuxes - cardano-node#5021.

cardano-tracer consumes not just traces, but also metrics. With the new tracing system, this shifts running a metrics server from the node to the consumer process. One possible setup in the new system is operating only one consumer service and connecting multiple nodes to it. In its current design, this requires safely shutting down and restarting the monitoring server, using the metrics store of whichever connected node has been requested. We're currently battle-testing the built-in behaviour of ekg (the monitoring package that's being used) and exploring solutions in case it does not fully meet requirements.

· 2 min read
Marcin Szamotulski

High-level overview of sprint 68

Peer Sharing

Karl Knutsson (CF) produces graphs which show how peer sharing usage expands on mainnet.

Peer Sharing: discovered unique peers

Private relays in the last graph are relays for which we are not certain whether they are registered on the chain.

Typed Protocols

We investigated whether the newly proposed typed-protocols version (see typed-protocols#52) introduces any performance regression. No regression was found when running a cardano-node (v9.1.0 vs a modified version using the new typed-protocols API) on the benchmarking cluster. No regression was observed when syncing mainnet either. The graph below shows the accumulated size of blocks downloaded over time for both nodes:

Accumulated block size over time

The following draft PRs are open:

cardano-cli ping

Fixed a bug in which cardano-cli ping exited with the wrong exit code when a wrong network magic was supplied, see ouroboros-network#4865.

cardano-cli ping will also now report the remote IP address and port when querying the tip:

> cardano-cli ping -h backbone.mainnet.cardanofoundation.org -p3001 -t -j -q | jq
{
  "tip": [
    {
      "addr": "2a01:2a8:a23d:16::17",
      "blockNo": 10699400,
      "hash": "f37649c4a6ae0c8b208da7c46d4e04312518969e612af0a8dbfdadcbd7180dd2",
      "port": 3001,
      "rtt": 0.013192945,
      "slotNo": 131991843
    },
    {
      "addr": "2a0e:dc0:3:b122::1",
      "blockNo": 10699400,
      "hash": "f37649c4a6ae0c8b208da7c46d4e04312518969e612af0a8dbfdadcbd7180dd2",
      "port": 3001,
      "rtt": 0.024089979,
      "slotNo": 131991843
    },
    {
      "addr": "2001:15e8:110:4aae::1",
      "blockNo": 10699400,
      "hash": "f37649c4a6ae0c8b208da7c46d4e04312518969e612af0a8dbfdadcbd7180dd2",
      "port": 3001,
      "rtt": 0.034663209,
      "slotNo": 131991843
    }
  ]
}

See ouroboros-network#4931.
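With the address and port included, the JSON output becomes easier to post-process. As a rough sketch, the snippet below checks whether all resolved addresses report the same tip and picks the one with the lowest round-trip time. It relies only on the field names visible in the output above; the function name is hypothetical, and the sample is an abbreviated copy of that output.

```python
import json


def summarize_tips(ping_json: str):
    """Return (all_agree, fastest_addr, fastest_port) for a `tip` query result.

    Assumes the JSON shape shown above: {"tip": [{"addr", "blockNo", "hash",
    "port", "rtt", "slotNo"}, ...]} with at least one entry.
    """
    tips = json.loads(ping_json)["tip"]
    # All peers agree if they report one and the same (block number, hash, slot).
    distinct = {(t["blockNo"], t["hash"], t["slotNo"]) for t in tips}
    fastest = min(tips, key=lambda t: t["rtt"])
    return len(distinct) == 1, fastest["addr"], fastest["port"]


sample = """
{"tip": [
  {"addr": "2a01:2a8:a23d:16::17", "blockNo": 10699400,
   "hash": "f37649c4a6ae0c8b208da7c46d4e04312518969e612af0a8dbfdadcbd7180dd2",
   "port": 3001, "rtt": 0.013192945, "slotNo": 131991843},
  {"addr": "2a0e:dc0:3:b122::1", "blockNo": 10699400,
   "hash": "f37649c4a6ae0c8b208da7c46d4e04312518969e612af0a8dbfdadcbd7180dd2",
   "port": 3001, "rtt": 0.024089979, "slotNo": 131991843}
]}
"""

print(summarize_tips(sample))  # (True, '2a01:2a8:a23d:16::17', 3001)
```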

Tx-Submission

We continued writing tests for the new tx-submission application.

We started extending typed-protocols codec to have access to both raw bytes and decoded transactions in the tx-submission mini-protocol. See ouroboros-network#4934.

· One min read
Jordan Millar

· One min read
Noon van der Silk

High-level summary

We fixed a bug that occurred when keeping a Head alive during the Conway hardfork on preview, by adding some code to handle the cost calculations. We also released 0.18.0 featuring incremental decommits. We continued with some items supporting Hydra Doom and with general maintenance to keep our code compatible with our upstream dependencies. Next, we'll be looking to release our new homepage, and carry on with network testing and general upgrades to our ledger and dependencies.

What did the team achieve?

  • Fixed bug to allow Head closing on Conway #1545
  • Fixed bug around transactions during a decommit #1540
  • Released 0.18.0!
  • Working on a new landing page
  • TLS support for the API server #1555
  • Use some types from upstream to make maintenance easier #1563

What's next?

  • Publish new landing page to our homepage: https://hydra.family/
  • Get pumba testing our network resilience #1532
  • PlutusV2 -> PlutusV3 upgrade investigations #1523
  • Switch ledger to Conway #1178
  • Support Hydra demo at Rare Evo.

· 2 min read
Alexey Kuleshevich

High level summary

Some minor new features have been added, namely ledger state queries needed to determine the votes on current proposals, and functionality for computing the over-the-wire size of a transaction. Other than that, most of the focus remained on improving Conway test coverage and adding conformance tests.

Low level summary

Conway

  • pull-4514 - Add governance related state queries
  • pull-4521 - Added method to compute over-the-wire CBOR encoded transaction size

Testing

  • pull-4518 - Made conformsToImpl discard generator failures
  • pull-4508 - Make Imp tests setup more realistic
  • pull-4496 - Enable conformance testing for RATIFY
  • pull-4544 - Updated translation of UnRegDRep deposit

Infrastructure and releasing

  • pull-4531 - Free up disk space in the GHA CI runner before building
  • pull-4526 - cabal.project: Bump index-states and remove allow-newer
  • pull-4532 - Fix cardano-ledger-core version
  • pull-4536 - Bump plutus to 1.32.0.0
  • pull-4537 - GHA: fix cabal version mismatch between build and test job
  • pull-4540 - Free up disk space in the GHA CI runner before testing
  • pull-4545 - Update formal-ledger-specifications SRP