· 4 min read
Michael Karg

High level summary

  • Benchmarking: Node versions 8.9.3 and 8.11.0; new PlutusV3 plus additional DRep benchmarks; re-evaluation of network latency.
  • Development: BLST workload for PlutusV3 was implemented; improved error/shutdown behaviour for tx-generator is in testing phase.
  • Workbench: UTxO-HD tracer configs harmonized. New plutusv3 profiles supporting experimental budgets. Work on Haskell profile definition is in validation phase.
  • Tracing: New metrics and handle registry feature merged to master. Work on metrics naming ongoing. Factoring out RTView component has begun.

Low level overview

Benchmarking

Runs and analyses of full sets of release benchmarks have been performed for Node versions 8.9.3 and 8.11.0.

To compare how the Conway ledger performs with large numbers of DReps and delegations injected versus with zero DReps, we've run additional configurations with existing workloads from release benchmarking. So far we've found that the number of DReps in the ledger scales well and does not lead to notable performance penalties.

Additionally, we've successfully run the baseline for the upcoming PlutusV3 benchmarks on our Nomad cluster. Those will, given the new V3 cost model, serve to determine headroom, or constraint, regarding resource usage and network metrics when operating under various execution budgets.

Last but not least, with much-appreciated support and feedback from the network team, we performed a re-evaluation of the network latency matrix for our benchmarking cluster. The cluster stretches over three regions globally. Due to unknown changes in the underlying hardware infrastructure, a slight delay between the Europe and Asia/Pacific regions could be measured. We needed to adjust some existing baselines accordingly - otherwise, this delay could be falsely attributed to a software regression.
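A re-evaluation like this comes down to comparing two latency matrices link by link. The sketch below is a minimal illustration, not the tooling actually used for the cluster: the region labels, the millisecond values and the tolerance are made-up assumptions, and `changed_links` is a hypothetical helper.

```python
from typing import Dict, Tuple

# (source region, destination region) -> latency in milliseconds.
LatencyMatrix = Dict[Tuple[str, str], float]


def changed_links(old: LatencyMatrix, new: LatencyMatrix,
                  tolerance_ms: float = 2.0) -> Dict[Tuple[str, str], float]:
    """Return the latency change for every link whose delta exceeds the tolerance."""
    deltas = {}
    for link, old_ms in old.items():
        if link not in new:
            continue  # link was not re-measured; nothing to compare
        delta = new[link] - old_ms
        if abs(delta) > tolerance_ms:
            deltas[link] = delta
    return deltas


# Illustrative numbers only; "region-3" stands in for the unnamed third region.
previous = {("eu", "ap"): 150.0, ("eu", "region-3"): 90.0, ("ap", "region-3"): 120.0}
current = {("eu", "ap"): 165.0, ("eu", "region-3"): 90.5, ("ap", "region-3"): 120.0}

drifted = changed_links(previous, current)
print(drifted)
```

Any baseline that depends on a link reported here would need to be re-recorded or shifted before runs are compared, so that the extra delay is not mistaken for a software regression.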

Development

We have implemented a benchmarking workload using PlutusV3's new BLST internals. As those do little memory allocation, but require more CPU steps, this workload will allow us to focus on that particular aspect of block and transaction budgets.

The tx-generator service will now label each submission thread with its submission target. Additionally, it has been equipped with custom signal handlers. This will improve both how gracefully shutdowns can be performed, and how precisely errors are reported when losing connection to a submission target. Last but not least, the service now supports a configurable KeepAlive timeout for the NodeToNode mini-protocol - accounting for very long major GC pauses on submission targets under very specific benchmarking workloads. Those features have entered the testing phase.

Workbench

Thanks to feedback from the consensus team, we've harmonized tracing configurations for our benchmarks between regular and UTxO-HD nodes. As the latter are more verbose by default, this is a confounding factor for our metrics: We're analysing north of 90 traces per second per cluster node, so all node flavours are required to be equally verbose.

The benchmarks based on the BLST workload now additionally support scaling budget components up or down at will. This means we can run a given cost model against custom execution budgets, controlling the point where the workload will exhaust it. This enables comparison of performance impact of potential changes to those budgets.
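To make the idea of scaling budget components concrete, here is a minimal sketch. It assumes a workload with a fixed start-up cost plus a constant, positive cost per loop iteration, and it uses illustrative numbers rather than real cost model output; `Budget`, `scaled` and `max_iterations` are hypothetical names, not workbench APIs.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class Budget:
    cpu_steps: int
    memory_units: int

    def scaled(self, cpu_factor=1, mem_factor=1) -> "Budget":
        """Scale each budget component independently (exact rational arithmetic)."""
        return Budget(int(self.cpu_steps * Fraction(cpu_factor)),
                      int(self.memory_units * Fraction(mem_factor)))


def max_iterations(budget: Budget, fixed: Budget, per_iteration: Budget) -> int:
    """Number of loop iterations that fit before either component is exhausted.

    Assumes both components of per_iteration are positive.
    """
    cpu_left = budget.cpu_steps - fixed.cpu_steps
    mem_left = budget.memory_units - fixed.memory_units
    if cpu_left < 0 or mem_left < 0:
        return 0
    return min(cpu_left // per_iteration.cpu_steps,
               mem_left // per_iteration.memory_units)


# Illustrative values: a CPU-heavy workload that allocates little memory.
base = Budget(cpu_steps=10_000_000_000, memory_units=14_000_000)
fixed = Budget(cpu_steps=100_000_000, memory_units=50_000)
per_iteration = Budget(cpu_steps=40_000_000, memory_units=10_000)

for cpu_factor in (Fraction(1, 2), 1, 2):
    budget = base.scaled(cpu_factor=cpu_factor)
    print(cpu_factor, max_iterations(budget, fixed, per_iteration))
```

With these numbers the CPU component is always the limiting one, so doubling or halving it moves the exhaustion point while the memory component stays far from its limit.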

Porting our performance workbench's profile definitions to Haskell has been nearly completed, and an adequate test suite has been implemented. This new component has now entered the validation phase to make sure it correctly replicates all existing profile content.

Tracing

Two new metrics for cardano-node have landed in master - both for new and legacy tracing systems. They provide detailed build info, and indicate whether the node is a block producer or not.

We're now working on closing the gap in the metric naming schema between new and legacy tracing. The aim is to allow for a seamless interchange, without additional configuration required, so that all existing monitoring services can rely on identical metric names with identical semantics.
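One simple way to see such a naming gap is to compare the sets of metric names each system exposes. The following is a minimal sketch under that assumption; the metric names are invented placeholders, not actual cardano-node metrics, and `naming_gap` is a hypothetical helper.

```python
from typing import Iterable, Set, Tuple


def naming_gap(legacy: Iterable[str], new: Iterable[str]) -> Tuple[Set[str], Set[str]]:
    """Return (names only exposed by legacy tracing, names only exposed by new tracing)."""
    legacy_set, new_set = set(legacy), set(new)
    return legacy_set - new_set, new_set - legacy_set


# Placeholder names for illustration only.
legacy_names = ["example_block_height", "example_mempool_txs", "example_peers"]
new_names = ["example_block_height", "Example.Mempool.Txs", "example_peers"]

only_legacy, only_new = naming_gap(legacy_names, new_names)
print(sorted(only_legacy), sorted(only_new))
```

Both result sets being empty is only a necessary condition for a seamless interchange: identical names still need identical semantics, which a name comparison cannot check.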

Furthermore, work has begun to factor out the RTView ("real-time view") component of cardano-tracer in the new tracing system. Its aim is to provide an interactive, real-time dashboard based on metrics from all nodes connected to cardano-tracer. Unfortunately, the component has remained in prototype stage for over a year, and has revealed some design shortcomings. The current design has all front-end code baked into the backend service, requiring a rebuild of the entire service in Haskell even for simple changes in the dashboard. We decided to isolate the component in the current code base, which still allows for optionally enabling it for a build. The long-term goal, however, is to convert it into a downstream service: It will ingest metrics by reforwarding, or by querying a REST API, and will provide a clear separation of frontend-facing code. This way anyone, including us, can use their favourite web technology for visualization of metrics.

· 2 min read
John Lotoski

High level summary

The SRE team continues work on cardano environment improvements and general environment maintenance.

Some notable recent changes, updates or improvements include:

  • Sanchonet was respun for cardano-node 8.11.0-pre

  • Private chain was respun twice for pre-sancho respin testing and short epoch testing with cardano-node 8.11.0-pre

  • Shelley-qa, two-thirds of preview and one-third of preprod networks were deployed to cardano-node 8.11.0-pre

  • Sanchonet, private chain and shelley-qa networks had dbsync sancho-4-3-0 deployed

  • A dbsync show_current_forging prepared statement was added to the cardano-parts profile-cardano-postgres nixosModule to aid with debugging chain quality issues

  • Three documents were added to cardano-playground to better explain some operations procedures: KES rotation, chain quality debugging and new network creation. Found at: docs/explain

  • A new mithril dashboard template is available in cardano-parts

Lower level summary

Capkgs

  • Avoid git API rate limit errors on update github action via netrc usage and corresponding secret: capkgs-commit

Cardano-parts

  • Sets cardano-node-ng to 8.11.0-pre and cardano-db-sync-ng to sancho-4-3-0. Adds a dbsync prepared statement, mithril dashboard template, updates the node application dashboard template, improves justfile recipe templates and tunes some systemd dependencies. Iohk-nix-ng was updated for sanchonet and private chain respins. More detail is available in the PR description: cardano-parts-pull-41

Cardano-mainnet

  • Rotates KES, pins iogp4 as -ng, adds a mithril dashboard, updates the node application dashboard, improves justfile recipes and tunes systemd node and mithril services to avoid some edge case errors. See the PR description for more details: cardano-mainnet-pull-15

Cardano-ogmios

Cardano-playground

  • Respins sancho and private chains and deploys cardano-node 8.11.0-pre and cardano-db-sync sancho-4-3-0 to appropriate envs and machines. Adds a mithril dashboard template, updates the node application dashboard template, improves justfile recipe templates. Adds three new explainer readme documents. See the PR description for more details: cardano-playground-pull-24

· 2 min read
Alexey Kuleshevich

High level summary

Most of the focus was on conformance testing this time around. We completed conformance tests for the CERT and RATIFY rules and progressed on some of the others. This also resulted in some improvements to the constrained-generators framework. Besides that, we've also fixed the Stake Pool Operator stake distribution calculation that is used for voting, by including proposal deposits that are currently locked in the system. One of the Ledger team members was also performing the duties of a release engineer, so we also facilitated the latest cardano-node-8.11 release.
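The effect of counting locked proposal deposits can be illustrated with a small sketch. It assumes that a deposit is attributed to the pool that the proposal's return credential delegates to; that attribution rule, the data shapes and the name `spo_voting_stake` are assumptions made for illustration, and the actual change is the one in pull-4324.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def spo_voting_stake(stake: Dict[str, int],
                     pool_delegation: Dict[str, str],
                     proposal_deposits: Iterable[Tuple[str, int]]) -> Dict[str, int]:
    """Sum delegated stake per pool, then add locked proposal deposits.

    stake:             credential -> lovelace held
    pool_delegation:   credential -> pool the credential delegates to
    proposal_deposits: (return credential, deposit in lovelace) per open proposal
    """
    totals = defaultdict(int)
    for credential, amount in stake.items():
        pool = pool_delegation.get(credential)
        if pool is not None:
            totals[pool] += amount
    for credential, deposit in proposal_deposits:
        pool = pool_delegation.get(credential)
        if pool is not None:  # deposits of undelegated credentials count for no pool
            totals[pool] += deposit
    return dict(totals)


# Illustrative values only.
stake = {"alice": 1000, "bob": 500, "carol": 300}
pool_delegation = {"alice": "pool-a", "bob": "pool-b"}
proposal_deposits = [("alice", 100), ("carol", 100)]

distribution = spo_voting_stake(stake, pool_delegation, proposal_deposits)
print(distribution)
```

Without the deposit term, funds locked in an open proposal would drop out of the pool's voting stake for as long as the proposal is pending.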

Low level summary

Features and fixes

  • pull-4324 - Proposal deposits in SPO voting stake
  • pull-4316 - Complete EraScript hierarchy with missing classes
  • pull-4287 - Fix various minor issues in the Shelley & Babbage specs

Testing

  • pull-4320 - CERT conformance
  • pull-4334 - RATIFY conformance
  • pull-4337 - Fix RATIFY conformance
  • pull-4325 - constrained-generators: soundness tests and bugfixes
  • pull-4323 - constrained-generators: clean up interface
  • pull-4336 - constrained-generators: Introduce fromList_ :: (HasSpec fn a, Ord a) => Term fn [a] -> Term fn (Set a)

Infrastructure and releasing

  • pull-4333 - Fix babbage-test and conway-test versions
  • pull-4332 - Update CHANGELOGs
  • pull-4343 - Bump requests from 2.31.0 to 2.32.0 in /doc

· 2 min read
Jean-Philippe Raynaud

High level overview

This week, the Mithril team continued implementing the certification of Cardano transactions in Mithril networks. They worked on scaling proof generation for mainnet by prototyping optimizations and benchmarking performance improvements. They also made progress on low-latency certification by completing the retrieval of the chain tip and importing transactions from the Cardano mini-protocol with Pallas. Additionally, they worked on a new explorer page to display in/out SPOs for the latest Cardano epochs.

Finally, the team upgraded the testing-sanchonet network following the SanchoNet network respin, created a module for building test transactions, and began removing the deprecated snapshot command from the client CLI.

Low level overview

  • Completed the issue Aggregator stress test crashes during signer registration #1676
  • Completed the issue Prune Cardano transactions stored on signer #1645
  • Completed the issue ChainObserver supports retrieving the Chain Point of the tip of the chain #1589
  • Completed the issue Prepare testing-sanchonet for respin with Cardano 8.11-pre #1694
  • Completed the issue MacOS Rust tests are flaky in CI #1556
  • Worked on the issue Prototype optimizations for increasing Cardano transactions proof generation throughput #1687
  • Worked on the issue Retrieve Cardano blocks with chainsync in pallas PoC #1590
  • Worked on the issue Explorer display in/out SPOs in registered signers page #1686
  • Worked on the issue Create a test Cardano transactions builder #1667
  • Worked on the issue Cardano signatures are not produced on testing-sanchonet and testing-mainnet #1681
  • Worked on the issue Remove snapshot command in client CLI #1690
  • Worked on the issue Mithril Signer Local Error Policy : Error 182 - MuxError #1632

· One min read
Damian Nadales

High level summary

  • Released Consensus for Node 8.11 (#1101)
  • Improved the Praos chain order:
    • Restricted VRF tiebreaker based on slot distance (#1047)
    • Small tweak to the issue number tiebreaker (#1086)
  • Wrote overview on the statistics on the leader schedule (#1096)
  • Integrated robustness refinement for concluding that a node is caught up in the context of bootstrap peers (#1031)
  • The P&T team managed to complete the UTXO-HD benchmarks using the LMDB backend and the results are promising.
  • We're working on setting up the Consensus Technical Working Group within Intersect, so if you'd like to participate please reach out to Damian Nadales.