
· 2 min read
Alexey Kuleshevich

High level summary

Aside from more testing and overall quality-of-life improvements on the ledger test suite side, we have implemented a couple of important features that will be enabled after the next intra-era hard fork:

  • Translation of RegTxCert and UnRegTxCert to the PlutusV3 script context will now be done correctly, which means that the deposit and refund, respectively, will actually be translated.
  • Treasury withdrawals that are empty or sum up to zero will no longer be allowed.
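The treasury withdrawal rule can be pictured as a simple check over a proposal's withdrawal map. The following is an illustrative Python sketch only, not the ledger's actual Haskell implementation; the function name and the use of plain strings and integers for reward accounts and amounts are assumptions made for the example.

```python
from typing import Mapping


def treasury_withdrawals_allowed(withdrawals: Mapping[str, int]) -> bool:
    """Return True if the withdrawals map is non-empty and its total is non-zero.

    Keys stand in for reward accounts and values for amounts in lovelace;
    both are simplifications assumed for this sketch.
    """
    if not withdrawals:
        return False  # empty withdrawals are rejected
    return sum(withdrawals.values()) != 0  # withdrawals summing to zero are rejected
```

Under this sketch, an empty map and a map whose amounts are all zero are both rejected, while any map with a positive total passes.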

Some performance improvements and a bug fix to a ledger event were also implemented during this period.

Low level summary

Features

  • pull-4623 - Change GovInfoEvent's "unclaimed" field from Set to a Map
  • pull-4627 - Fix Conway implementation of RegTxCert and UnRegTxCert
  • pull-4643 - Improve certificate performance
  • pull-4630 - Disallow empty withdrawals
  • pull-4646 - Don't return ZeroTreasuryWithdrawals failure during bootstrap

Testing

  • pull-4497 - Show coloured tree-diff output in ImpTests
  • pull-4615 - Prevent HSpec from messing with ImpSpec colors
  • pull-4625 - Additional DELEG tests
  • pull-4599 - Move TxInfo golden tests to new package
  • pull-4629 - Add TxInfo golden test for Conway
  • pull-4575 - Ts salvage newtylespecs

Infrastructure and releasing

· 2 min read
Jean-Philippe Raynaud

High level overview

This week, the Mithril team released the new distribution 2437.1. This release includes stable support for Cardano transaction certification and stake distribution certification in both the signer and aggregator, a breaking change in the Mithril client WASM related to handling unstable features, along with bug fixes and performance improvements.

The team also continued working on decentralizing the signature orchestration of the Mithril network. They completed the implementation of a buffer store for individual signatures that may arrive before being processed by an aggregator and finished refactoring the signer state machine. They also worked on developing a mechanism to support specific configurations for signing Cardano transactions and focused on the autonomous computation of the messages to be signed by the signer.
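The idea of buffering individual signatures that arrive before the aggregator can process them can be sketched roughly as follows. This is a hypothetical Python illustration, not Mithril's actual implementation; the class name, method names, and the use of strings for message keys and signatures are all assumptions.

```python
from collections import defaultdict
from typing import DefaultDict, List


class SignatureBuffer:
    """Holds single signatures that arrive before their open message exists."""

    def __init__(self) -> None:
        # Maps a message key to the signatures received for it so far.
        self._pending: DefaultDict[str, List[str]] = defaultdict(list)

    def buffer(self, message_key: str, signature: str) -> None:
        """Store a signature under the key of the message it signs."""
        self._pending[message_key].append(signature)

    def drain(self, message_key: str) -> List[str]:
        """Remove and return all signatures buffered for a message, in arrival order."""
        return self._pending.pop(message_key, [])
```

Once the corresponding open message becomes known, the aggregator would drain the buffer for that key and process the signatures as if they had just arrived.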

Finally, they worked on a refactoring of the service which computes the messages to certify in the signer and aggregator, and fixed the problem preventing the consistent certification of Cardano transactions in the pre-release-preview network.

Low level overview

  • Released the new distribution 2437.1
  • Published a dev blog post about the Mithril client WASM breaking change in unstable features
  • Completed the issue Release 2437 distribution #1901
  • Completed the issue Cardano transactions certification stopped in pre-release-preview #1938
  • Completed the issue Aggregator buffers signatures for unknown open message #1900
  • Completed the issue Refactor state machine of the signer #1922
  • Completed the issue Retrieve custom signing configurations with epoch settings in signer #1923
  • Completed the issue Refactor signable builder services to compute full protocol message in signer/aggregator #1941
  • Worked on the issue Aggregator advertises constant signing configurations for an epoch #1924
  • Worked on the issue Signer computes what to sign on its own #1925
  • Worked on the issue Breaking change in crane fails Hydra CI #1928

· 4 min read
Michael Karg

High level summary

  • Benchmarking: Release benchmarks for Node 9.2.0. Validating the new "age of Voltaire" performance baseline.
  • Development - New Tracing System: A space leak in the forwarding mechanism was fixed; a log rotation bug is being investigated.
  • Workbench: Large refactoring of workbench, optimizing nix closure size and adding profile flake outputs. Adjusted Nomad backend was merged.
  • Infrastructure: Dropping Vault for the Nomad cluster was tested and merged.
  • Tracing: Further alignment of metrics names; compliance with the OpenMetrics specification; annotations for Prometheus metrics; routing for internal monitoring servers has entered testing.

Low level overview

Benchmarking

We've run and analyzed a full set of release benchmarks for Node version 9.2.0. In comparison with Mainnet release 9.1.1, we could not observe any performance regression.

Moreover, we've validated the stability of our new "age of Voltaire" performance baseline on 9.1.1. Currently, we're running a cross-comparison between baselines and Node versions 9.1.1 and 9.2.0 to ascertain that the new baseline arrives - at scale - at the same performance observations and predictions as the previous one.

Development - New Tracing System

Forwarding traces and metrics in the new system exhibited a tiny space leak. Under conventional operation, this leak would only become noticeable after running uninterrupted for days or even weeks. It took very hard pressure on the system, and additional profiling, to make it visible. It was fixed by avoiding unnecessary allocations of continuations: the buffer of objects to forward inherently carries the position of the next object to process, such that a fully evaluated closure can trivially be reused to handle any subsequent forwarding request. This has led to new versions of packages trace-forward-2.2.7 and ekg-forward-0.6. Huge thanks to John Lotoski and Javier Sagredo, whose meticulous information helped to swiftly address the issue.

On the benchmarking cluster, we've observed cardano-tracer's log rotation to occasionally misbehave: under certain circumstances, the service leaks handles by not redirecting output to the latest log file in the rotation. We've located the issue and are working towards a fix.

Workbench

We've been working on a major refactoring of workbench code. The main benefit of this endeavour is being able to pull in a very heavy dependency optionally, only when required, when building and running the workbench shell. This will especially facilitate runs on CI machines after garbage collections, but also building a local shell on individual developer machines. Additionally, benchmarking profiles designed for the cluster are now provided as nix flake outputs. This allows for building a more versatile automation in the future, where workbench and cardano-node commits won't need to be tied to each other. Last but not least, the refactoring simplified the way the shell commands are evaluated, doing away with nested calls in many instances. The refactoring PR has been thoroughly tested and merged.

Furthermore, the workbench is now prepared for a nixpkgs upgrade and has dropped the container-based Nomad / podman backend - the respective PR was merged successfully.

Infrastructure

Removal of the Vault service for managing benchmarking cluster credentials has been successfully tested and merged. The service is scheduled for final shutdown at the end of the month, reducing hardware cost and maintenance effort.

Tracing

We've received initial feedback regarding the alignment of metrics names between new and legacy tracing systems. Based upon that feedback, we're currently working on some further adjustments to the naming schema.

The implementation for hosting multiple EKG monitors in one single service has been finished and is currently in the testing phase. The dynamic routing to monitoring data, now used both for EKG and Prometheus, reflects the nodes that are connected to cardano-tracer. We've also added a JSON response format, which makes it easier to query and scrape existing routes as part of automations. Finally, this PR also removes the dependency on the snap server framework and transitively on HsOpenSSL (which is prone to cause build issues in the future).

Currently, we're working on various improvements to the Prometheus metric expositions in cardano-tracer. We aim to implement full compliance with the OpenMetrics specification, which should greatly enhance integration processes. Furthermore, metrics will be augmented with # TYPE and # HELP annotations, as tracked in issue cardano-node#5021.
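To illustrate what annotated metrics look like on the wire, here is a minimal Python sketch that renders a metric with # HELP and # TYPE lines and closes the exposition with the `# EOF` marker that OpenMetrics requires. It is not cardano-tracer code; the function names and the metric used in the usage note are hypothetical.

```python
def render_metric(name: str, metric_type: str, help_text: str, value: float) -> str:
    """Render one metric sample preceded by its # HELP and # TYPE annotations."""
    return (
        f"# HELP {name} {help_text}\n"
        f"# TYPE {name} {metric_type}\n"
        f"{name} {value}\n"
    )


def render_exposition(metrics) -> str:
    """Join rendered metrics and end with the '# EOF' marker used by OpenMetrics."""
    body = "".join(render_metric(*m) for m in metrics)
    return body + "# EOF\n"
```

For example, `render_exposition([("example_blocks", "gauge", "Hypothetical example metric.", 42)])` yields the two annotation lines, the sample line `example_blocks 42`, and the closing `# EOF` line.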

Last but not least, we've closed off issue cardano-node#3988. For adding an optional prefix to metrics names, the Node config option TraceOptionMetricsPrefix can now be used.

· 2 min read
John Lotoski

High level summary

The SRE team continues work on Cardano environment improvements and general maintenance.

Some notable recent changes, updates or improvements include:

  • All environments have been upgraded to cardano-node 9.2.0.

  • All IOE-run cardano-parts clusters (i.e., sanchonet, preview, preprod, etc. testnets, mainnet, and network-team clusters) have been upgraded to support IPv4/IPv6 dual-stack operations. This includes each Cardano network's respective public access or backbone DNS, which now offers AAAA records for IPv6 connections.
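One way to check whether a given DNS name is reachable over both address families is to resolve it and group the results. The Python sketch below uses only the standard library; the helper names are assumptions for illustration, and no specific Cardano hostname is implied.

```python
import ipaddress
import socket
from typing import Dict, Iterable, List


def split_by_family(addresses: Iterable[str]) -> Dict[str, List[str]]:
    """Group IP address strings into IPv4 and IPv6 lists."""
    result: Dict[str, List[str]] = {"ipv4": [], "ipv6": []}
    for addr in addresses:
        version = ipaddress.ip_address(addr).version
        result["ipv4" if version == 4 else "ipv6"].append(addr)
    return result


def resolve(hostname: str) -> Dict[str, List[str]]:
    """Resolve a hostname and report which address families it offers."""
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return split_by_family(sorted({info[4][0] for info in infos}))


def is_dual_stack(families: Dict[str, List[str]]) -> bool:
    """True when at least one IPv4 and one IPv6 address are present."""
    return bool(families["ipv4"]) and bool(families["ipv6"])
```

Calling `is_dual_stack(resolve(hostname))` for a backbone or public access name should return True wherever both A and AAAA records are published and the local resolver returns both.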

Repository Work

Cardano-parts

  • Sets cardano-node to 9.2.0. Adds ipv6 tf, module and recipe support for ipv4/ipv6 dual stack operations. Updates alerts and dashboards for the new tracing system to reflect metrics name changes and legacy metric prefix normalization. Adds misc fixes and improvements. More detail is available in the PR description: cardano-parts-pull-48

Cardano-playground

  • Deploys cardano-node to 9.2.0. Converts all relevant cluster resources and machines to ipv4/6 dual-stack operations. Updates alerts and dashboards for the new tracing system to reflect metrics name changes and legacy metric prefix normalization. More detail is available in the PR description: cardano-playground-pull-32

Cardano-mainnet

  • Deploys cardano-node to 9.2.0. Converts all relevant cluster resources and machines to ipv4/6 dual-stack operations. Adds new bootstrap scaling machine startup and shutdown recipes. Updates alerts and dashboards for the new tracing system to reflect metrics name changes and legacy metric prefix normalization. More detail is available in the PR description: cardano-mainnet-pull-22

Ouroboros-network-ops

  • Deploys cardano-node to 9.2.0. Converts all relevant cluster resources and machines to ipv4/6 dual-stack operations. Updates alerts and dashboards for the new tracing system to reflect metrics name changes and legacy metric prefix normalization. More detail is available in the PR description: ouroboros-network-ops-18

· 2 min read
Damian Nadales

High level summary

  • Debunked our working theory on the cause of performance degradation when taking a ledger snapshot. We are now back to the UTXO set as the first contributing cause to said degradation, and together with the Ledger team we have proposed a way to decrease the number of allocations when serializing the ledger state.
  • Developed the first and second draft scripts for estimating the bandwidth necessary to ensure the CPU is the bottleneck when syncing (#1240). This is informing us and the Networking Team how to refine BlockFetch for the syncing node (especially for Genesis).
  • On the UTXO-HD front:
    • After addressing several issues found during benchmarking and testing, the performance team ran benchmarks on the utxo-hd-9.1 branch, yielding positive results. The nodes function without errors. The memory and CPU usage is almost on par with the 9.1 node.
    • A tool has been provided to convert ledger state snapshots from pre-UTxO-HD nodes to UTxO-HD nodes, allowing users to use UTxO-HD right away without needing to replay the chain (since they can use their locally stored ledger state after converting it with the aforementioned tool).
    • The SDET team will run integration tests on the utxo-hd-9.1 branch. If the tests pass, we will start working on wrapping up the documentation and preparing the branch for merging once it is decided to release this feature.
    • Bear in mind that:
      • This UTxO-HD release uses an LMDB backend (but it also provides an in-memory backend). The LSM-tree backend should arrive in Q1 2025.
      • UTxO-HD is just the first step of a bigger initiative for moving parts of the ledger state to the disk storage, lowering the memory requirements of the node and contributing to long term sustainability of Cardano.
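The bandwidth estimate mentioned in the summary above (#1240) can be framed as a simple ratio: for the CPU to remain the bottleneck while syncing, the network has to deliver blocks at least as fast as the node can validate them. The Python sketch below shows that framing only; it is not the team's script, and the function name and any figures fed into it are hypothetical.

```python
def min_bandwidth_bytes_per_s(block_sizes_bytes, validation_times_s):
    """Lowest average download rate at which fetching keeps pace with validation.

    Both arguments are equal-length sequences with one entry per block:
    the block's size in bytes and the CPU time needed to validate it.
    """
    total_bytes = sum(block_sizes_bytes)
    total_cpu_time = sum(validation_times_s)
    if total_cpu_time <= 0:
        raise ValueError("total validation time must be positive")
    return total_bytes / total_cpu_time
```

With made-up inputs of 1000 and 3000 bytes validated in 0.5 and 1.5 seconds, the function returns 2000.0 bytes per second. Any sustained download rate below that would make the network, rather than the CPU, the limiting factor over that window.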