· 2 min read
Jean-Philippe Raynaud

High level overview

This week, the Mithril team continued implementing the incremental certification of the Cardano database. They completed the cloud synchronization of artifacts, introduced enhancements and optimizations for artifact production, and adapted the explorer to accommodate these changes. Additionally, they finalized the design for splitting the mithril-common crate and re-spun the testing-sanchonet network.

Other progress includes starting work on compressing aggregator HTTP responses, fixing a bug that prevented debug logs from being produced on the nodes, and resolving an issue with Prometheus data recording in the infrastructure.

Low level overview

  • Completed the issue Implement artifacts cloud synchronization in Incremental Cardano DB with GCP #2211
  • Completed the issue Design mithril-common split & re-organization in repository #2175
  • Completed the issue Upgrade testing-sanchonet for respin with Cardano 10.1.4 #2209
  • Completed the issue Mithril client does not work in Windows Power Shell #2199
  • Completed the issue Missing debug and info logs in Mithril nodes #2227
  • Completed the issue Signer does not handle properly signature signed entity timeout #2229
  • Completed the issue Grafana aggregator dashboard is not working on release networks #2230
  • Worked on the issue Incremental Cardano DB artifacts production enhancements #2234
  • Worked on the issue Update explorer for Incremental Cardano DB #2212
  • Worked on the issue Activate compression of aggregator HTTP responses #2225
  • Worked on the issue OpenAPI examples check is not working #2235
  • Worked on the issue Activate Pythagoras Mithril era #2034

· 4 min read
Marcin Szamotulski
Karl Knutsson

Overview of sprint 78 & sprint 79

Documentation

We reviewed the technical report. We closed a number of issues, the most important are:

And a few smaller issues:

This was done in the Network Spec Update PR. We also fixed many grammar and spelling errors (network-spec: language).

SRV Record Support

We worked on SRV record support in ouroboros-network (issue #2780, PR 5018). We will merge it after reusable diffusion.

Querying Network State through Node-to-Client Protocol

The aim is to make P2P network state available through the Local State Query Mini-Protocol.

We opened a draft PR, also see the issue, where we mentioned all the branches where the work is progressing. See below for more technical details.

Extensible Ouroboros Network Diffusion Stack

The work stream reached the review phase. See issue #5016.

Tx-Submission

The Consensus team agreed to implement needed mempool performance optimisations and is making progress on them. See ouroboros-consensus#1359.

Ouroboros-Network-0.19 Release

We cut ouroboros-network-0.19 and 0.19.1 releases.

Configuration Changes for Block Propagation Times

Block propagation times are influenced by the number of TCP round trips required to transmit a block.

In mid-December, we published a post discussing configuration changes to the Linux IP stack. These adjustments involved increasing the initial TCP congestion window to 42 segments and ensuring that the congestion window remained open for idle connections.

IOG applied these changes to four stake pools located in Brazil, South Africa, Dubai, and Japan around December 15th.

The Cardano Foundation manages a standard peer-to-peer (P2P) node in Paris, which operates without manual connections to other Cardano Foundation nodes or IOG nodes. After implementing the configuration changes, we noted a statistically significant improvement in the propagation times for blocks larger than 10 segments (about 14,480 bytes) produced by IOG's pools.

Block Size (bytes)     Improvement (ms)
14,480 - 28,960        -132 to -78
28,960 - 57,920        -197 to -130
>57,920                -255 to -176

Block Propagation Times

These results demonstrate that a Stake Pool Operator (SPO) can enhance the propagation times of their own pool's blocks by applying config changes targeting TCP's congestion window.
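
To make the link between the initial congestion window and the number of round trips concrete, here is a rough Haskell sketch. It assumes an idealised slow start (the window doubles every round trip, with no loss and no receive-window limit), a segment payload of 1,448 bytes (the 14,480 bytes per 10 segments quoted above), and a baseline initial window of 10 segments, which is the usual Linux default. All function names are illustrative only.

```haskell
module Main where

-- Assumed segment payload in bytes (14,480 bytes / 10 segments).
mss :: Int
mss = 1448

-- Number of segments needed to carry a payload of the given size.
segmentsFor :: Int -> Int
segmentsFor bytes = (bytes + mss - 1) `div` mss

-- Round trips needed to send a payload under idealised slow start:
-- the sender may emit `cwnd` segments per round trip, and the window
-- doubles after every round trip (no loss, no receive-window limit).
roundTrips :: Int -> Int -> Int
roundTrips initCwnd bytes = go initCwnd (segmentsFor bytes) 0
  where
    go cwnd remaining n
      | remaining <= 0 = n
      | otherwise      = go (cwnd * 2) (remaining - cwnd) (n + 1)

main :: IO ()
main = mapM_ report [14480, 28960, 57920]
  where
    report size =
      putStrLn $ show size ++ " bytes: "
              ++ show (roundTrips 10 size) ++ " round trip(s) with initcwnd 10, "
              ++ show (roundTrips 42 size) ++ " with initcwnd 42"
```

Under these assumptions, a 57,920-byte block needs three round trips with an initial window of 10 segments and a single round trip with a window of 42, which is consistent with larger blocks showing the biggest improvement in the table above.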

Low-level summary

Querying Network State through Node-to-Client Protocol

In the first iteration, we will make it possible to query the node-to-node state through the LocalStateQuery mini-protocol (part of the node-to-client protocol).

data ConnectionManagerState peeraddr = ConnectionManagerState {
    connectionMap :: Map (ConnectionId peeraddr) AbstractState,
    -- ^ map of connections, without outbound connections in
    -- `ReservedOutboundSt` state.

    registeredOutboundConnections :: Set peeraddr
    -- ^ set of outbound connections in the `ReservedOutboundSt` state.
  }
  deriving (Eq, Show)

data InboundState peeraddr = InboundState {
    remoteHotSet  :: !(Set (ConnectionId peeraddr)),
    remoteWarmSet :: !(Set (ConnectionId peeraddr)),
    remoteColdSet :: !(Set (ConnectionId peeraddr)),
    remoteIdleSet :: !(Set (ConnectionId peeraddr))
  }
  deriving (Eq, Show)

data OutboundState peeraddr = OutboundState {
    coldPeers :: Set peeraddr,
    warmPeers :: Set peeraddr,
    hotPeers  :: Set peeraddr
  }
  deriving (Eq, Show)

data NetworkState peeraddr = NetworkState {
    connectionManagerState :: ConnectionManagerState peeraddr,
    inboundGovernorState   :: InboundState peeraddr,
    outboundGovernorState  :: OutboundState peeraddr
  }
  deriving (Eq, Show)
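
As a rough illustration of what a client could do with such a query result, the sketch below condenses a NetworkState value into a few counters. It is self-contained only because it re-declares the records above with stand-in definitions of ConnectionId and AbstractState (the real ones live in ouroboros-network); the PeerCounts type, the summarise function, and the sample data are hypothetical and not part of the draft PR.

```haskell
module Main where

import           Data.Map (Map)
import qualified Data.Map as Map
import           Data.Set (Set)
import qualified Data.Set as Set

-- Stand-ins (assumptions) for types defined in ouroboros-network.
data ConnectionId peeraddr = ConnectionId
  { localAddress  :: peeraddr
  , remoteAddress :: peeraddr
  } deriving (Eq, Ord, Show)

data AbstractState = PlaceholderState deriving (Eq, Show)

-- Record shapes copied from the text above.
data ConnectionManagerState peeraddr = ConnectionManagerState
  { connectionMap                 :: Map (ConnectionId peeraddr) AbstractState
  , registeredOutboundConnections :: Set peeraddr
  } deriving (Eq, Show)

data InboundState peeraddr = InboundState
  { remoteHotSet  :: !(Set (ConnectionId peeraddr))
  , remoteWarmSet :: !(Set (ConnectionId peeraddr))
  , remoteColdSet :: !(Set (ConnectionId peeraddr))
  , remoteIdleSet :: !(Set (ConnectionId peeraddr))
  } deriving (Eq, Show)

data OutboundState peeraddr = OutboundState
  { coldPeers :: Set peeraddr
  , warmPeers :: Set peeraddr
  , hotPeers  :: Set peeraddr
  } deriving (Eq, Show)

data NetworkState peeraddr = NetworkState
  { connectionManagerState :: ConnectionManagerState peeraddr
  , inboundGovernorState   :: InboundState peeraddr
  , outboundGovernorState  :: OutboundState peeraddr
  } deriving (Eq, Show)

-- Hypothetical summary a monitoring client might display.
data PeerCounts = PeerCounts
  { openConnections :: Int
  , outboundHot     :: Int
  , outboundWarm    :: Int
  , outboundCold    :: Int
  , inboundHot      :: Int
  } deriving (Eq, Show)

summarise :: NetworkState peeraddr -> PeerCounts
summarise ns = PeerCounts
  { openConnections = Map.size (connectionMap (connectionManagerState ns))
  , outboundHot     = Set.size (hotPeers  (outboundGovernorState ns))
  , outboundWarm    = Set.size (warmPeers (outboundGovernorState ns))
  , outboundCold    = Set.size (coldPeers (outboundGovernorState ns))
  , inboundHot      = Set.size (remoteHotSet (inboundGovernorState ns))
  }

-- Made-up sample data using plain strings as peer addresses.
sample :: NetworkState String
sample = NetworkState
  { connectionManagerState = ConnectionManagerState
      { connectionMap = Map.fromList
          [ (ConnectionId "local:3001" "peer-a:3001", PlaceholderState)
          , (ConnectionId "local:3001" "peer-b:3001", PlaceholderState) ]
      , registeredOutboundConnections = Set.fromList ["peer-c:3001"] }
  , inboundGovernorState = InboundState
      { remoteHotSet  = Set.fromList [ConnectionId "local:3001" "peer-a:3001"]
      , remoteWarmSet = Set.empty
      , remoteColdSet = Set.empty
      , remoteIdleSet = Set.empty }
  , outboundGovernorState = OutboundState
      { coldPeers = Set.fromList ["peer-c:3001", "peer-d:3001"]
      , warmPeers = Set.fromList ["peer-b:3001"]
      , hotPeers  = Set.fromList ["peer-a:3001"] }
  }

main :: IO ()
main = print (summarise sample)
```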

· 4 min read
Michael Karg

High level summary

  • Benchmarking: Release benchmarks for Node 10.1.4; performance evaluation of ledger metrics trace location.
  • Development: Database-backed quick queries for locli analysis tool.
  • Infrastructure: Voting workload definition merged to master, work on Haskell profile definition now continues.
  • Tracing: C library for trace forwarding and documentation ongoing; improved fallback configs.
  • Community: new Discord channel #tracing-monitoring supporting new tracing system rollout.

Low level overview

Benchmarking

We've run and analyzed a full set of release benchmarks for Node version 10.1.4. We could not observe any performance risks, and expect network performance to very closely match that of previous 10.1.x releases.

Furthermore, we've been investigating the location on the 'hot code path' where metrics from ledger are traced - such as UTxO set size or delegation map size. This currently happens at slot start, when the block forging loop is kicked off. We aim to decouple emitting those traces from the forging loop and instead move them to a separate thread. This thread could potentially wake up after a pre-defined time has passed, e.g. 2/3 of a slot's duration. That would ensure getting those values out of ledger does not occur simultaneously with block production proper.

Moreover, as a new feature, it would enable tracing those metrics on nodes that do not run a forging loop themselves. Last but not least, it would clear the way for providing additional metrics at the new location - like DRep count, or DRep delegations - without negatively affecting performance. Initial prototyping has yielded promising results so far.
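
The timing idea can be sketched in a few lines of Haskell. This is not the node's implementation: it assumes a one-second slot aligned to whole wall-clock seconds (a real node derives slot boundaries from the system start time), and traceLedgerMetrics is a hypothetical placeholder for reading and emitting the ledger values.

```haskell
module Main where

import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (replicateM_)
import Data.Time.Clock.POSIX (getPOSIXTime)

-- Assumed slot length: one second, in microseconds.
slotLengthMicros :: Int
slotLengthMicros = 1000000

-- Wake-up point inside each slot: 2/3 of the slot's duration.
traceOffsetMicros :: Int
traceOffsetMicros = (2 * slotLengthMicros) `div` 3

-- Microseconds to sleep until the next wake-up point, given how far
-- into the current slot we are. At or past the offset, wait for the
-- offset of the following slot.
microsUntilNextTrace :: Int -> Int
microsUntilNextTrace elapsedInSlot
  | elapsedInSlot < traceOffsetMicros = traceOffsetMicros - elapsedInSlot
  | otherwise = slotLengthMicros - elapsedInSlot + traceOffsetMicros

-- Assumption: slots start on whole wall-clock seconds.
elapsedInCurrentSlot :: IO Int
elapsedInCurrentSlot = do
  now <- getPOSIXTime
  let micros = floor (now * 1000000) :: Integer
  pure (fromIntegral (micros `mod` fromIntegral slotLengthMicros))

-- Hypothetical placeholder for emitting ledger metrics.
traceLedgerMetrics :: IO ()
traceLedgerMetrics = putStrLn "tracing ledger metrics (placeholder)"

main :: IO ()
main = do
  done <- newEmptyMVar
  -- Dedicated thread, independent of any forging loop; two rounds for the demo.
  _ <- forkIO $ do
    replicateM_ 2 $ do
      elapsed <- elapsedInCurrentSlot
      threadDelay (microsUntilNextTrace elapsed)
      traceLedgerMetrics
    putMVar done ()
  takeMVar done
```

Keeping the delay computation pure makes the wake-up policy easy to test separately from the thread that uses it.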

Development

Parametrizable quick queries, a new feature of our analysis tool locli, have commenced development. They rely on the new database storage backend for raw benchmarking data to be efficient. These quick queries are based on a filter-reduce framework with composable reducers, which provides a clean way to expose very specific data points or correlations from the raw benchmarking data.

The quick query feature also incorporates ad-hoc plotting of the query results, and will incorporate exporting the result into exchange formats like CSV or JSON in the future.
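
To give a feel for what a filter-reduce framework with composable reducers can look like, here is a minimal Haskell sketch. It is not locli's actual API: the Reducer type, runQuery, and the BlockSample record with its sample values are all hypothetical, and the composition simply uses an Applicative instance so that several reducers share one pass over the filtered data.

```haskell
{-# LANGUAGE ExistentialQuantification #-}
module Main where

import Data.List (foldl')

-- A reducer folds inputs of type a into a result of type b,
-- hiding the type s of its internal accumulator.
data Reducer a b = forall s. Reducer (s -> a -> s) s (s -> b)

instance Functor (Reducer a) where
  fmap f (Reducer step s0 done) = Reducer step s0 (f . done)

-- Combining two reducers runs both over the same input in one pass.
instance Applicative (Reducer a) where
  pure x = Reducer (\() _ -> ()) () (const x)
  Reducer stepF f0 doneF <*> Reducer stepX x0 doneX =
    Reducer (\(f, x) a -> (stepF f a, stepX x a))
            (f0, x0)
            (\(f, x) -> doneF f (doneX x))

countR :: Reducer a Int
countR = Reducer (\n _ -> n + 1) 0 id

sumR :: (a -> Double) -> Reducer a Double
sumR field = Reducer (\acc x -> acc + field x) 0 id

-- Built from the two reducers above; yields NaN on empty input.
meanR :: (a -> Double) -> Reducer a Double
meanR field = (/) <$> sumR field <*> (fromIntegral <$> countR)

-- Filter first, then reduce what is left.
runQuery :: (a -> Bool) -> Reducer a b -> [a] -> b
runQuery keep (Reducer step s0 done) = done . foldl' step s0 . filter keep

-- Hypothetical raw benchmarking record with made-up values.
data BlockSample = BlockSample
  { sizeBytes     :: Int
  , propagationMs :: Double
  }

samples :: [BlockSample]
samples =
  [ BlockSample 1000 100
  , BlockSample 50000 300
  , BlockSample 60000 500
  ]

main :: IO ()
main =
  -- Count and mean propagation time of all blocks above 40,000 bytes.
  print (runQuery ((> 40000) . sizeBytes)
                  ((,) <$> countR <*> meanR propagationMs)
                  samples)
```

With the sample data above, the query yields a count of 2 and a mean of 400.0.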

Infrastructure

The voting workload definition has been cleanly integrated with the workbench. This also includes an abstract definition of concurrent workloads - which was previously unnecessary, as exactly one workload would be handled by exactly one and the same service. The integration, along with the added flexibility, has been merged to master.

We're now actively working again on the Haskell definition of benchmarking workloads, including a test suite. Most of this improvement had already been done; it still needs final realignment with the current state of all existing workloads. It will allow us to trade hard-to-maintain large jq definitions for concise testable code, and to replace recursive shell script invocations with a single call to a well-defined command line interface.

Tracing

Good progress has been made on the small, self-contained C library that implements trace forwarding. It will allow processes in any language that can call into C via a foreign function interface to use cardano-tracer as a target to forward traces and metrics. The initial prototype has already evolved into a library design, which intends to offer the host application a simple way to encode to Cardano's schema of trace messages - and to use its forwarding protocol asynchronously, so as to minimize interruption of the application's native control flow.

In preparation of the new tracing system's release, we've also revisited the fallback configuration values the system will use if it is accidentally misconfigured by the user. The forwarder component uses a bounded queue buffer for trace output to compensate for a possibly unreliable connection to cardano-tracer. The fallback bounds were chosen to conserve trace output at all cost - as it turns out, too high of a memory cost, if trace forwarding does not happen at all, due to faulty configuration. We've adjusted this and other fallback values to sensible defaults to guarantee a functional system even in case of configuration errors.
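
The memory trade-off behind a bounded trace buffer can be illustrated with a small pure sketch. This is not the forwarder's code: the TraceBuffer type and its functions are hypothetical, and the drop-oldest overflow policy is an assumption, since the text above does not say how the real queue behaves when full.

```haskell
module Main where

import           Data.Foldable (toList)
import           Data.Sequence (Seq, (|>))
import qualified Data.Sequence as Seq

-- A buffer that never holds more than `capacity` messages, so its
-- memory use stays bounded even if nothing is ever forwarded.
data TraceBuffer a = TraceBuffer
  { capacity :: Int
  , pending  :: Seq a
  , dropped  :: Int   -- how many messages were discarded so far
  } deriving (Eq, Show)

emptyBuffer :: Int -> TraceBuffer a
emptyBuffer cap = TraceBuffer (max 0 cap) Seq.empty 0

-- Assumed policy: when full, discard the oldest message to make room.
enqueue :: a -> TraceBuffer a -> TraceBuffer a
enqueue msg buf
  | capacity buf == 0 =
      buf { dropped = dropped buf + 1 }
  | Seq.length (pending buf) < capacity buf =
      buf { pending = pending buf |> msg }
  | otherwise =
      buf { pending = Seq.drop 1 (pending buf) |> msg
          , dropped = dropped buf + 1 }

main :: IO ()
main = do
  -- Ten messages arrive while the forwarding connection is down.
  let buf = foldl (flip enqueue) (emptyBuffer 3) [1 .. 10 :: Int]
  print (toList (pending buf), dropped buf)
```

A smaller capacity loses more trace output during an outage but caps memory use, which is the balance a fallback value has to strike.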

Community

Our team will host a new channel #tracing-monitoring on IOG's Technical Community Discord server. The migration to the new tracing system might affect existing automations built by the community, or require existing configurations to be adjusted to achieve the intended outcome. In the channel, we'll offer support for the community in all those regards, as well as answer more general questions regarding the Node's tracing systems.

Additionally, we're currently releasing our documentation improvements to the excellent Cardano Developer Portal, linked below.

· 2 min read
John Lotoski

High level summary

The SRE team continues work on Cardano environment improvements and general maintenance.

Some notable recent changes, updates or improvements include:

  • Sanchonet was respun after a community scheduled test hardfork to PV11

  • Buildkite agent modules were added to cardano-playground

  • A latest tag GHA action was added to cardano-node, which runs upon new release publication

Repository Work

Cardano-parts

  • Cardano-node has been updated to 10.1.4, cardano-cli and the -ng variant to 10.1.1.0 and 10.2.0.0 respectively, and mithril to v2450. Colmena has been updated to utilize a new pure flake evaluation approach. New nix jobs were added for a new "next-gen" network spin up method, which supports network creation with a fork directly to Conway and then retirement of the genesis bootstrap pool in favor of on-chain registered backbone pools. CI tests to support these new jobs were added. The recipe to query governance actions was updated with the latest voting calculations and the output was improved with color and additional reporting totals. A psql prepared statement for voting activity over time was added to the postgres module. Other small miscellaneous improvements and clean up were made with details available in the release notes: cardano-parts-release-v2025-01-17

Cardano-playground

  • Cardano-node has been updated to 10.1.4, mithril to v2450 and all envs deployed. Buildkite modules were added to support fast buildkite agent scaling in any AWS region. Sanchonet was respun after a planned community hard fork test. The recipe to query governance actions was updated with the latest voting calculations. A new start-demo-ng recipe was added to utilize a new "next-gen" spin up method. More detail is available in the PR description: cardano-playground-pull-39

Cardano-mainnet

  • Cardano-node has been updated to 10.1.4 and mithril to v2450 and all machines deployed. The mainnet canary dashboard was updated with a governance voting analysis panel. The recipe to query governance actions was updated with the latest voting calculations. More detail is available in the PR description: cardano-mainnet-pull-29

Cardano-node

  • Adds GHA steps to push a latest tag for node and api containers on release events where the tag is the latest release. Updates the docker-compose to default to the latest tag and bumps iohk-nix for an updated target number of established peers. Fixes related configs to pass CI checks. cardano-node-pull-6057

· 5 min read
Alexey Kuleshevich

High level summary

Due to the holiday season, this Ledger report covers a period of 6 weeks instead of the usual 2 weeks. That said, this is also the time of year when everyone goes on vacation. The report is therefore larger than usual, but not as large as it would be had two reporting periods been skipped at a normal time of year.

Most of the effort was spent on polishing up some of the Conway features before the upcoming Plomin hard fork that is scheduled to happen some time in January, as well as continued testing of the Conway features in order to improve our confidence in the upcoming hard fork. Because of this effort we nailed a couple of serious bugs, fixes for which are included in the latest release, which is why an upgrade for all SPOs to the newest version of cardano-node-10.1.4 is highly advisable.

Another big allocation of effort was towards tackling some of the technical debt accrued over the years.

The most significant change by far in this report is the removal of crypto parameterization from every era definition in Ledger. This change was not only a huge simplification for the Ledger codebase, but it will be just as big of a simplification for all of the downstream users of Ledger. Most importantly, this change will finally allow us to switch to the newer version of the GHC compiler, because it addresses the performance regression that the newer compiler version was susceptible to.

One more major accomplishment that we can share is a drastic change to how serialization of UTxO happens in the ledger state. This change is planned to solve a long-standing problem with blocks being missed due to the garbage collector kicking in at the time when the ledger snapshot was being created. Moreover, this change will have a significant positive impact on UTxOHD performance when it is finally released.

Another big milestone with respect to tackling technical debt is the release of our cryptographic library, which had been undergoing major changes over the last couple of years. It was finally released and integrated into Ledger, with all of the other downstream components set to follow.

We can also finally conclude our work on defining the CDDL specification in Haskell, as it is now completely generated from a Haskell definition for all of the eras. Thanks to this effort we not only have better confidence in the accuracy of our CDDL specification, due to the extra type checking and testing we now get from Haskell, but we also reduce the duplication and complexity that used to stem from manually defining the serialization specification for every Ledger era.

Low level summary

Features

  • pull-4778 - Huddle for Alonzo
  • pull-4790 - Add functions to convert hashes to and from VRFVerKeyHash
  • pull-4785 - CDDL:babbage: Switch to using Huddle/Cuddle
  • pull-4792 - Refactor Conway CDDL to reuse Babbage CDDL
  • pull-4776 - Create CLI for plutus-debug
  • pull-4788 - Get rid of crypto parametrization
  • pull-4800 - Move Crypto class to cardano-protocol-tpraos
  • pull-4810 - Deprecate AuxiliaryDataHash
  • pull-4813 - Add a check to MEMPOOL rule that prevents unelected CC from voting
  • pull-4828 - Fix cddl for update_committee cold credential
  • pull-4831 - Cleanup pointer serialization
  • pull-4811 - Integration of MemPack

Testing

  • pull-4783 - Fixed the certStateSpec
  • pull-4780 - Fix issues that prevent basic submitTx from passing conformance
  • pull-4766 - Use non-zero costmodels in Imp tests
  • pull-4791 - Move the list of predicate failures inside OpaqueErrorString
  • pull-4796 - Made it possible to use Imp logging in the conformance hook
  • pull-4740 - Constrained generators for EPOCH rule
  • pull-4732 - Tools for constrained generation of types that need witnessing
  • pull-4812 - Enumerate individual conway tests in conformance Imp
  • pull-4801 - Updated SpecTranslate instance of AlonzoScript, debug info improvements
  • pull-4817 - Included the hash in plutus script translation
  • pull-4821 - Enable Imp conformance for DELEG
  • pull-4822 - Improve error handling in constrained genFromSpec
  • pull-4819 - Removed hash size proofs

Infrastructure and releasing

  • pull-4787 - Use cabal-gild to format cabal files
  • pull-4793 - Fix bounds on quickcheck-instances and cardano-crypto-class
  • pull-4795 - Update haskellNix and CHaP and upgrade ghc-9.8.2 to 9.8.4
  • pull-4699 - Upgrade cardano-base dependency
  • pull-4803 - Add missing version bump in cardano-ledger-shelley-ma-test
  • pull-4805 - Add missing version bump in cardano-ledger-alonzo-test
  • pull-4809 - Fix formal-ledger-specifications SRP check in ci
  • pull-4816 - Backport release cardano-ledger-conway-1.18.1.0
  • pull-4815 - Backport release cardano-ledger-conway-1.17.4.0
  • pull-4824 - Pin ghc version in gen-hie CI job
  • pull-4825 - Bump jinja2 from 3.1.4 to 3.1.5 in /doc
  • pull-4833 - cabal.project: Update index-states