Skip to main content

· One min read
John Lotoski

High level summary

The SRE team continues work on cardano environment improvements and general environment maintenance.

Some notable recent changes, updates or improvements include:

  • Sanchonet environment was re-spun starting from slot 7171200 and updated to cardano-node 8.4.0-pre.
  • The use of cardano-node docker hub will be deprecated in preference of GHCR

Lower level summary

Capkgs

  • Refactor parsing scripts, add github action automation, various bugfixes and cleanup: capkgs-compare

Cardano-parts

  • Updates secrets layout scheme, adds sops enc/dec for jobs, adds cloud monitoring profile, updates flake templates and other improvements/fixes: cardano-parts-pull-8

Cardano-playground

  • Updates for new cardano-parts secrets handling and layout, TF workspace handling, group multivalue DNS support, grafana cloud monitoring and other improvements: cardano-playground

Cardano-world

· One min read
Jean-Philippe Raynaud

High level overview

The Mithril team has released a new distribution 2337.0, which includes the following enhancements: support for zstandard compression of snapshot archives, support for the Cardano node version in snapshot metadata, and support for recording snapshot download statistics in the aggregator.

They also completed the refactoring and standardization of the errors in the Mithril nodes and published an Architectural decision record on the documentation website. Additionally, they kept working on adding Cloudflare protection to the infrastructure.

Finally, the team fixed a performance issue on the stress test tool for the aggregator and made some improvements to the documentation for SPOs.

Low level overview

  • Completed the issue Release new 2337 distribution #1219
  • Completed the issue Errors refactoring #798
  • Completed the issue Client traffic creates performance bottleneck in aggregator #1207
  • Completed the issue Record statistics about the downloaded snapshot in the aggregator #1127
  • Completed the issue Create a SPO checklist for KES keys update #1267
  • Worked on the issue Spike: Run client in browser WASM PoC #1254
  • Worked on the issue Benchmark aggregator performances #1220
  • Worked on the issue Activate Cloudflare protection of infrastructure #1230

· 2 min read
Carlos LopezDeLara

2023-09-13 - 2023-09-26

High level summary

  • cardano-node 8.4.0-pre release suitable for SanchoNet.
  • CLI continues making progress integrating governance features. During this sprint we integrated the info and new-committee governance actions.
  • The team continued moving to the ERA top-level commands structure. Removed --conway-era flag from the legacy commands making conway era commands only accessible via cardano-cli conway.
  • stake-pool command is now under the ERA top level structure.
  • API continues integration with governance features, it is worth to higlight that now ProposeNewCommitee uses the right key type (cc-cold)

cardano-cli

cardano-api

cardano-node

cardano-testnet

docs

CI & project maintenance

· 2 min read
Sebastian Nagel

High-level summary

This week, the Hydra team conducted the monthly review meeting in collaboration with Mithril, enhancing project coordination.

The team improved the gen-hydra-key node command for smoother usability and identified concrete steps to enhance network resiliency in feature items #188, #1080, and #1079. Additionally, they contributed the aiken-mode editor integration to the aiken-lang organization, updated dependencies to utilize cardano-api 8.20, and published the Hydra security advisory CVE-2023-42806 with a workaround available for users.

These efforts demonstrate the team\'s commitment to project improvement, security, and open-source community collaboration.

What did the team achieve this week

  • Conducted the monthly review meeting together with Mithril
  • Improved gen-hydra-key node command #1077
  • Established a clear plan to improve resiliency of network and manifested feature items #188, #1080 and #1079
  • Moved aiken-mode (created by SN) to aiken-lang organization
  • Updated dependencies to using cardano-api 8.20 #1075
  • Published security advisory CVE-2023-42806 (workaround available)

What are the goals of next week

  • Write-up the monthly report for September
  • Finish "network resilience to disconnects" #188
  • Finish kupo integration with hydra #1078
  • Discuss and decide on using aiken or not
  • Address the published security advisory CVE-2023-42806 (to not require workaround)
  • Ideally, release 0.13.0

· 3 min read
Michael Karg

High level summary

  • Benchmarking: We've performed both low-level network and high-level variance analysis of our benchmarking clusters.
  • Infrastructure: Our reporting pipeline was adjusted to classify various workloads easily reducing rework time.
  • Tracing: Work on machine-readable tracing of tracer configuration is ongoing.
  • Nomad backend: We've been able to eliminate several possible confounders on the nomad cluster.
  • Team: We're currently onboarding a new team member: Welcome to Cardano Performance & Tracing, Baldur Blöndal!

Low level overview

Benchmarking

As part of the effort to bring the Nomad backend into production use, we've been equipping both that and the existing benchmarking backend with means to measure and document network latency for each run. Furthermore we've implemented means to capture TCP packets for a limited time window during a benchmarking run - which will allow us to spot differences in the behaviour of the underlying networking stack at OS level.

Additionally, we're running variance analysis in parallel on both backends to ascertain confidence in metrics originating from either. We've concluded that baseline profile runs aren't directly comparable between the two, so we decided to compare standard deviations instead to validate the measurements from nomad.

Infrastructure

Reporting on benchmarks does require human time and effort to rework the final document. Improvements to the reporting pipeline have been merged to master. They reduce the time necessary to do so by various changes to the template and the workload classification logic in analysis.

Beyond that, we've looked into issues where services would quit with an unjustified exit failure upon shutdown - under rare circumstances. By reworking shutdown logic for trace-dispatcher and tx-generator we were able to address those issues.

Tracing

After various steps in constructing a configuration upon node startup, it is vital to document which runtime configuration the node arrived eventually. We're working on providing a machine-readable JSON/YAML trace message for that purpose.

This will facilitate hot-reloading a node's tracer configuration in the future: users will be able to take such a trace message, apply their intended change and hot-reload it immediately into the node.

Nomad backend

As with the existing benchmarking cluster, nomad is currently under scrutiny with regard to the reliability of metrics it produces, as well as the behaviour of its OS-level network stack. For instance, differing kernel versions can have an impact on our measurements, as we'd be basically using two different instruments to take them.

Along the way we've already been successful in eliminating some possible confounders that had been introduced by the nomad service or the slightly different system architecture of the new cluster.

New team member

Baldur Blöndal is an extremely capable and experienced Haskell developer. Also, he's an excellent fit for our existing team. So I'm very pleased to welcome him onboard with IOG, and with Performance & Tracing. He will be working on cardano-tracer, the component receiving, processing and making available node traces and metrics.