· One min read
Samuel Leathers

Node Release Update

2022-11-02 - 2023-01-13

Executive Summary

A 1.35.5 release for single-relay P2P is nearly complete and should be published this month. This release is based on the release/1.35 branch and does not bump cardano-ledger.

The team successfully integrated an interim dependency bump of ledger and consensus into cardano-node master. This work will not ship in a node release on its own; it will be carried forward by the full dependency bump currently in progress.

We anticipate that once this dependency bump is completed, regular two-week releases will be feasible again.

The 1.35.4 release is being run by more than 70% of stake pools. Planning for the mainnet hard fork date is in progress.


· One min read
Dorin Solomon

High level summary

During the last two weeks we made more improvements to our Test Framework and ran some sanity tests related to the P2P Single Relay functionality.

We also updated the Node & DB-Sync sync tests to build with Nix, as the prebuilt files are no longer available at the PR level.

Workstreams

Framework improvements:

  • extended the cardano-node-tests repository so that anybody can fork it and run all our System Tests on GitHub Actions
  • added 2 new nightly pipelines - nightly-mixed and nightly-p2p - details here
  • optimized how our regression tests are scheduled on pytest workers and how cluster instances are assigned to the tests, which roughly halved the total run time:

before: === 743 passed, 67 skipped, 24 xfailed in 9166.64s (2:32:46) ===
after: === 753 passed, 67 skipped, 14 xfailed in 4654.80s (1:17:34) ===

Node:

  • ran a couple of sanity runs of CLI & sync tests on a local branch with P2P Single Relay enabled
  • started the preparations for testing the next tag - details here

DB-Sync:

  • made some improvements to the db-sync sync tests

· 2 min read
Jean-Philippe Raynaud

High level overview

The Mithril team has been designing a mechanism for handling seamless updates of the Mithril networks in case of breaking changes that require a synchronous update of the signer nodes. This design has been formalized in an ADR. They have been working on a proof of concept that relies on an on-chain transaction to synchronously trigger the version switch of all the signer nodes. They have also worked on prototype solutions that minimize the need for breaking changes where soft updates are possible.
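
To make the mechanism concrete, here is a minimal sketch of the on-chain era-activation idea. Mithril itself is written in Rust; this illustration uses Haskell, and all names are hypothetical, not the design formalized in the ADR:

```haskell
-- Minimal sketch of on-chain era activation (hypothetical names; the
-- real design is in the ADR and proofs of concept referenced below).
-- A marker recorded on chain announces the epoch at which every
-- signer must switch, making the version switch synchronous.
newtype Epoch = Epoch Word deriving (Eq, Ord, Show)

data EraMarker = EraMarker
  { eraBefore    :: String  -- era in force before the switch
  , eraAfter     :: String  -- era in force from the activation epoch on
  , activationAt :: Epoch   -- epoch at which all signers switch
  } deriving Show

-- Every signer applies the same pure rule to the marker it read from
-- the chain, so all honest signers switch at the same epoch boundary.
eraInForce :: Epoch -> EraMarker -> String
eraInForce now marker
  | now >= activationAt marker = eraAfter marker
  | otherwise                  = eraBefore marker
```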

Finally, they worked on upgrading the devnet and fixing some flakiness in the end-to-end tests of the CI.

Low level overview

  • Wrote an ADR for handling graceful updates of the Mithril network #671
  • Worked on a proof of concept to handle backward compatibility of exchanged messages with protobuf #677 (the principle is illustrated in the sketch after this list)
  • Worked on a proof of concept to handle backward compatibility of exchanged messages with avro #678
  • Worked on a proof of concept for reading/writing era activation markers with a Cardano chain transaction #672
  • Worked on upgrading the Cardano node of the Mithril devnet, as well as fixing flakiness of the CI #523
  • Prepared and tested the new 2302 distribution pre-release 2302.0-prerelease
  • Updated the documentation for SPOs on how to build a signer node, to better reflect the new release process #681
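
The backward-compatibility principle behind the protobuf and avro proofs of concept is, roughly, that newly added message fields must be optional so that messages from not-yet-upgraded nodes still decode. Mithril itself is Rust; the sketch below shows the same idea in Haskell using aeson's optional fields, and all names are made up:

```haskell
{-# LANGUAGE OverloadedStrings #-}
-- Sketch of backward-compatible message evolution: a field added in a
-- newer version is optional, so messages from older senders that omit
-- it still decode. Hypothetical message shape, not Mithril's actual
-- wire format.
import Data.Aeson

data SignerMessage = SignerMessage
  { partyId  :: String        -- present since the first version
  , newField :: Maybe String  -- added later; Nothing for old senders
  } deriving Show

instance FromJSON SignerMessage where
  parseJSON = withObject "SignerMessage" $ \o ->
    SignerMessage
      <$> o .:  "party_id"
      <*> o .:? "new_field"   -- optional: absent in old messages

-- An old message without the new field still decodes:
-- >>> decode "{\"party_id\":\"p1\"}" :: Maybe SignerMessage
-- Just (SignerMessage {partyId = "p1", newField = Nothing})
```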

· 3 min read
Damian Nadales

High level summary

The consensus team is resuming its activities after the Christmas break. During these weeks we focused on cleaning up and benchmarking the UTxO-HD prototype, and on discussing with the Ledger team the changes that might be required for the next iterations. The pull request that adds the Conway era is awaiting a second review round, and we hope to merge it soon. On the technical-debt side, we are looking into a property-test failure found in the iterators and investigating whether the error is in the model or in the implementation. We also improved the documentation of our testing code.

Workstreams

UTxO HD Prototype

We worked with the Ledger team to start preparing the next versions of UTxO-HD. The Ledger team is concerned that for the remaining maps we might need the full ledger state on epoch boundaries. Since the main consumer of the ledger rules is Consensus, the code that requires access to a full state could be moved from the ledger to some Ledger-Consensus bridge. For example, the traversal of rewards could take place in such a bridge, instead of querying the ledger for the values that are required in the epoch-transition computations.
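
A hedged sketch of what such a bridge could look like (made-up names; this is not an agreed design): Consensus traverses the full map once and hands the ledger only the summary the epoch transition needs.

```haskell
-- Sketch of a Ledger/Consensus "bridge": instead of the ledger rules
-- querying for values needed at the epoch transition, Consensus
-- traverses its full in-memory view once and passes the ledger a
-- summary. All names here are hypothetical.
import           Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map

newtype Coin = Coin Integer
newtype StakeCredential = StakeCredential String deriving (Eq, Ord)

-- The map that, under UTxO-HD, may no longer live inside the ledger
-- state in full at the epoch boundary.
type Rewards = Map StakeCredential Coin

-- Summary the epoch-transition rules actually need.
data RewardSummary = RewardSummary
  { totalRewards :: Coin
  , rewardCount  :: Int
  }

-- The bridge computes the summary in a single traversal, so the
-- ledger rules never need the whole map themselves.
summariseRewards :: Rewards -> RewardSummary
summariseRewards rs = RewardSummary
  { totalRewards = Coin (Map.foldr (\(Coin c) acc -> c + acc) 0 rs)
  , rewardCount  = Map.size rs
  }
```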

We relocated some UTxO-HD definitions, in preparation for merging the prototype into master.

We also ran updated local benchmarks comparing the replay time and memory consumption of:

  • the baseline node (f2fc76ef45647275c98634da1718290b976ff364)
  • the UTxO-HD node with the in-memory backend
  • the UTxO-HD node with the LMDB backend

The results show that the LMDB node barely reaches 8GB of memory, but takes 1.78 times longer to replay the chain. The in-memory backend is about 30 minutes faster than LMDB, but still slower than the baseline version. This slowdown is inherent to the approach: maintaining sequences of differences of the last k ledger states is what allows us to perform rollbacks and roll-forwards. We are in the process of measuring sync-from-scratch times.
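
The mechanism can be sketched as follows (hypothetical names; the prototype's actual types in ouroboros-consensus are considerably more refined): the node anchors a full map k blocks back and keeps one diff per block, so rolling back is simply dropping the newest diffs.

```haskell
-- Minimal sketch of keeping the last k ledger-state differences so
-- that rollback never needs a full copy of each ledger state.
import           Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map
import           Data.Sequence (Seq, (|>))
import qualified Data.Sequence as Seq

-- A diff over the UTxO map: keys to delete, key/value pairs to insert.
data Diff k v = Diff { deletions :: [k], insertions :: Map k v }

applyDiff :: Ord k => Map k v -> Diff k v -> Map k v
applyDiff m (Diff dels ins) = Map.union ins (foldr Map.delete m dels)

-- A full state anchored k blocks back, plus at most k diffs to the tip.
data DiffSeq k v = DiffSeq { anchor :: Map k v, diffs :: Seq (Diff k v) }

-- Extend with a new block's diff, flushing the oldest diff into the
-- anchor once k diffs are already retained.
pushDiff :: Ord k => Int -> Diff k v -> DiffSeq k v -> DiffSeq k v
pushDiff k d (DiffSeq a ds)
  | Seq.length ds < k = DiffSeq a (ds |> d)
  | otherwise = case Seq.viewl ds of
      Seq.EmptyL         -> DiffSeq a (Seq.singleton d)
      oldest Seq.:< rest -> DiffSeq (applyDiff a oldest) (rest |> d)

-- Roll back n blocks by dropping the newest n diffs; no state copies.
rollback :: Int -> DiffSeq k v -> DiffSeq k v
rollback n (DiffSeq a ds) = DiffSeq a (Seq.take (Seq.length ds - n) ds)

-- The view at the tip: the anchor with all retained diffs applied.
tipView :: Ord k => DiffSeq k v -> Map k v
tipView (DiffSeq a ds) = foldl applyDiff a ds
```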

We also added StaticEither accessors, which helped us simplify the UTxO-HD prototype.
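
The idea behind StaticEither, sketched here from the concept rather than the prototype's exact definition, is an Either whose branch is fixed by a type-level Boolean, so code that statically knows which case it is in never has to handle the impossible one:

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}
-- Sketch of the StaticEither idea: the type-level index determines
-- the constructor, ruling the other branch out statically. The
-- prototype's actual definition may differ in detail.
import Data.Kind (Type)

data StaticEither (b :: Bool) (a :: Type) (c :: Type) where
  StaticLeft  :: a -> StaticEither 'False a c
  StaticRight :: c -> StaticEither 'True  a c

-- Total accessors: no Maybe and no error case, because the index
-- makes the other constructor unrepresentable.
fromStaticLeft :: StaticEither 'False a c -> a
fromStaticLeft (StaticLeft a) = a

fromStaticRight :: StaticEither 'True a c -> c
fromStaticRight (StaticRight c) = c
```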

New Conway era

We incorporated the feedback on the pull request and rebased the branch on top of master. The PR is pending a second review round, and we hope to merge it soon.

Technical debt

We are investigating a property-test failure involving iterators. Solving this requires understanding the expected behavior of the iterators in the counterexample found by QuickCheck, in order to determine whether the error is in the model or in the implementation.
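
The shape of such a test, sketched with made-up names (the real ChainDB iterator tests are far richer), pits a trivial list model against the implementation on the same inputs:

```haskell
-- Model-based testing sketch: the same query runs against a trivial
-- list model and the implementation, and QuickCheck compares them.
import Test.QuickCheck

-- Model: iterating between two bounds over a sorted chain is just
-- the sublist of blocks within the bounds.
modelIterate :: Int -> Int -> [Int] -> [Int]
modelIterate lo hi = takeWhile (<= hi) . dropWhile (< lo)

-- "Implementation" under test (here a stand-in computation; in the
-- real tests this would drive the actual iterator API).
implIterate :: Int -> Int -> [Int] -> [Int]
implIterate lo hi = filter (\b -> b >= lo && b <= hi)

-- The property: on any sorted chain, model and implementation agree.
-- A failing counterexample then has to be diagnosed as either a
-- model bug or an implementation bug, which is exactly the situation
-- described above.
prop_iteratorAgreesWithModel :: OrderedList Int -> Int -> Int -> Property
prop_iteratorAgreesWithModel (Ordered chain) lo hi =
  modelIterate lo hi chain === implIterate lo hi chain

main :: IO ()
main = quickCheck prop_iteratorAgreesWithModel
```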

Fostering collaboration

We moved the contents of docs/Testing.md closer to the code, so that the explanations about the tests are easier to find in the relevant modules, and the documentation is easier to keep up to date.

· 2 min read
Serge Kosyrev

High level summary

Since our last update, we focused on infrastructure work: benchmark enablement, the tracing system, the benchmark environment merge, and open-source support:

  1. SECP benchmarking enablement: enabling SECP runs in our cardano-ops benchmarking environment is still in progress.
  2. The new tracing system: the improved API of the new tracing system was implemented, and we're now porting the tracing integration layer over.
  3. Infrastructure: the mainnet protocol parameter history is now encoded in the workbench profile machinery at epoch-level granularity, which gives us a systematic way to describe past and future benchmarks.
  4. New benchmark deployment infrastructure: we've made some progress on the Nomad deployment backend, which is shared by both the data publishing and the benchmarking needs.
  5. Legacy benchmarking: we've started merging the legacy benchmark deployment infrastructure into the workbench.
  6. Open sourcing: the benchmarking data publishing tool was adapted to the Nomad execution environment provided by SRE and is pending final deployment.

Performance

The AWS cluster infrastructure necessary for SECP benchmarking is still being worked on.

Tracing

The improved tracing internals were implemented, and we're now updating the tracing integration layer, which is also mostly done.

Infrastructure

Thanks to a collaboration with the DevX team, we have identified and pursued a design that enables our Nomad workbench backend to deploy both the benchmarking cluster and our data publishing components.

On the benchmark parametrisation front, we have eliminated a long-standing weakness in the way we specified protocol parameters. We now have a clear and granular method for tracking protocol parameter evolution -- e.g. the mainnet history changes are now tracked at epoch granularity, while also allowing for systematically described change overlays. This makes the benchmark profile definition much clearer and more robust against mistakes.
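
As an illustration of the overlay idea (hypothetical names; the workbench encodes this in its own profile machinery), the parameters in force at a given epoch can be obtained by folding every overlay up to that epoch over the genesis values:

```haskell
-- Sketch of epoch-granular parameter history with change overlays.
-- A base parameter set plus a map of per-epoch overlays, folded in
-- order up to the epoch a benchmark targets. Hypothetical fields.
import           Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map
import           Data.Maybe (fromMaybe)

data ProtocolParams = ProtocolParams
  { maxBlockSize :: Int
  , maxTxSize    :: Int
  } deriving Show

-- An overlay changes only the fields it mentions.
data Overlay = Overlay
  { setMaxBlockSize :: Maybe Int
  , setMaxTxSize    :: Maybe Int
  }

applyOverlay :: ProtocolParams -> Overlay -> ProtocolParams
applyOverlay p o = ProtocolParams
  { maxBlockSize = fromMaybe (maxBlockSize p) (setMaxBlockSize o)
  , maxTxSize    = fromMaybe (maxTxSize p)    (setMaxTxSize o)
  }

-- Parameters in force at an epoch: fold every overlay whose epoch is
-- at or before the target over the genesis values, in epoch order.
paramsAt :: ProtocolParams -> Map Word Overlay -> Word -> ProtocolParams
paramsAt genesis history epoch =
  foldl applyOverlay genesis
    [ o | (e, o) <- Map.toAscList history, e <= epoch ]
```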

We also started merging the legacy benchmarking environment (based on cardano-ops) into the workbench. The separation between the two environments was too costly: it forced us to implement every benchmarking change twice, first in the workbench during development and then in cardano-ops, and maintaining the compatibility code slowed down benchmark data analysis development. Once this merge is complete, it will sharply cut the benchmark development cycle and its overheads.