· 3 min read
Jean-Philippe Raynaud

High level overview

The Mithril team completed the design of the signer deployment model that SPOs will use to run Mithril on their Cardano mainnet infrastructure, and implemented the associated Mithril Relay in the Mithril networks. They started designing and implementing a stress test tool for benchmarking the aggregator's performance. They worked on refactoring the Mithril Stake Distribution entity and on standardizing the date types used in the nodes. They also worked on a new tool command in the aggregator and its first sub-command, which helps avoid a re-genesis of the certificate chain when the structure of the certificate is updated. Additionally, they worked on monitoring for the Mithril infrastructure and on a retry mechanism for artifact creation in the aggregator.

Finally, they fixed some bugs and completed the upgrade of the Mithril networks to Cardano node v8.1.1.
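
To illustrate why a change in the certificate structure affects the whole certificate chain, and how recomputing hashes can avoid a re-genesis, here is a minimal sketch in Python. It is not the Mithril implementation or its certificate format: the `Certificate` fields, the `version` parameter (standing in for a structural change), and all function names are assumptions made for illustration only.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Certificate:
    # Hypothetical, heavily simplified certificate: real certificates carry more data.
    epoch: int
    payload: str
    previous_hash: str
    hash: str = ""


def compute_hash(cert: Certificate, version: int) -> str:
    # "version" stands in for the certificate structure: changing it changes every hash.
    data = f"{version}|{cert.epoch}|{cert.payload}|{cert.previous_hash}"
    return hashlib.sha256(data.encode()).hexdigest()


def build_chain(payloads, version):
    # Build a chain where each certificate links to the hash of the previous one.
    chain, previous = [], ""
    for epoch, payload in enumerate(payloads):
        cert = Certificate(epoch, payload, previous)
        cert.hash = compute_hash(cert, version)
        chain.append(cert)
        previous = cert.hash
    return chain


def recompute_hashes(chain, version):
    # Walk the chain from its first certificate, recomputing each hash under the
    # new structure and relinking each certificate to its recomputed predecessor.
    new_chain, previous = [], ""
    for cert in chain:
        updated = Certificate(cert.epoch, cert.payload, previous)
        updated.hash = compute_hash(updated, version)
        new_chain.append(updated)
        previous = updated.hash
    return new_chain


def is_valid(chain, version):
    # A chain is valid if every link and every hash matches the given structure.
    previous = ""
    for cert in chain:
        if cert.previous_hash != previous or cert.hash != compute_hash(cert, version):
            return False
        previous = cert.hash
    return True
```

In this sketch, a chain built under one structure fails validation under another until `recompute_hashes` is applied; the certificate contents are preserved and only the hashes and links change.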

Low level overview

  • Worked on the epic that prepares the Mithril infrastructure for mainnet #767:
    • Worked on the issue Add infrastructure monitoring #987
  • Completed the epic Prepare Mithril Signer deployment model for SPO #862:
    • Completed the issue Design recommended deployment model for SPOs on 'mainnet' and 'preview'/'preprod' #961
    • Completed the issue Adapt infrastructure to use Mithril Relay #1018
    • Completed the issue Announce the new signer deployment model in a dev blog post #1017
  • Worked on the epic Benchmark performances of Mithril Aggregator #904:
    • Worked on the issue Design & implement basic stress test tool for aggregator #991
  • Worked on bugs:
    • Completed the issue Aggregator does not exit on critical error #993
    • Completed the issue Computation of master certificate of an epoch is incorrect #1006
    • Completed the issue End to end tests are flaky #954
    • Worked on the issue 'testing-preview' network does not create certificates #1015
  • Worked on optimizations:
    • Completed the issue Dates format is not standardized #946
    • Completed the issue Add 'recompute-certificates-hash' command to aggregator #1001
    • Completed the issue Add a retry mechanism for artifact creation in aggregator #984
    • Completed the issue Log node version at startup in Aggregator/Signer #944
    • Completed the issue Reactivate Publish Results job in CI #978
    • Completed the issue Clean 'pending_snapshot' directory of aggregator #983
    • Completed the issue Update OpenAPI spec examples #1000
  • Worked on refactoring:
    • Completed the issue Refactor 'MithrilStakeDistribution' entity #967
    • Completed the issue Refactoring client #982
    • Completed the issue Refactor download code in client #1010
    • Worked on the issue Factorize protocol crypto operations #669
  • Worked on dependencies:
    • Completed the issue Upgrade Cardano node to '8.1.1' #973
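
As an illustration of the retry mechanism for artifact creation mentioned above (#984), the following is a minimal sketch of a bounded retry loop in Python. It is not the aggregator's code: the function names, the attempt limit, and the delay are assumptions chosen for the example, and the actual mechanism may differ.

```python
import time


def retry(operation, max_attempts=3, delay_seconds=0.0):
    # Call "operation" until it succeeds or the attempt budget is exhausted.
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception as error:
            last_error = error
            # Wait before the next attempt, but not after the final one.
            if attempt < max_attempts:
                time.sleep(delay_seconds)
    raise RuntimeError(f"failed after {max_attempts} attempts") from last_error


def make_flaky_builder(failures_before_success):
    # Hypothetical stand-in for an artifact builder that fails a few times first.
    state = {"calls": 0}

    def build():
        state["calls"] += 1
        if state["calls"] <= failures_before_success:
            raise IOError("transient failure")
        return "artifact"

    return build
```

With these definitions, `retry(make_flaky_builder(2), max_attempts=3)` succeeds on the third attempt, whereas a builder that fails three times in a row exhausts the budget and raises `RuntimeError`.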

· 2 min read
Damian Nadales

High level summary

During the past two weeks, the team working on the Genesis implementation continued to engage with the researchers, which resulted in various simplifications of the correctness argument for the historical Genesis window. They also decided on an approach that lets a syncing node determine that it is (no longer) caught up. This functionality was requested by the Networking team.

The team working on the UTxO-HD implementation ran ad-hoc benchmarks that showed performance issues, which are being investigated. They also merged several improvements required for the first UTxO-HD release, and added a package for easing integration with other downstream components.

Regarding our support activities, we integrated the latest Ledger changes into Consensus in preparation for release 8.2 of the node.

Genesis

  • We continued to engage with the researchers on our probabilistic model for the historical Genesis window, resulting in various simplifications that make the correctness argument clearer without being excessively conservative.

  • We decided on an approach for implementing functionality requested by the Networking team; namely, how a syncing node can safely conclude that it is (no longer) caught up. Certain parameters are still subject to discussion with the researchers, and we still have to agree on a concrete API for this functionality with the Networking team.

UTxO-HD

  • We merged the last of the PRs that were part of UTxO-HD improvements for version 0.1: expose UTxO-HD configuration options in the node, refactor ledger tables, and expose a method of computing the UTxO set size.
  • We added a new "legacy" cardano block in a new ouroboros-consensus-cardano-legacy-block package that should ease the transition for some downstream packages to UTxO-HD, like db-sync. This is really only useful for downstream packages that use the parts of consensus that don't involve the storage components, in which case we can largely ignore ledger tables. Ignoring ledger tables could also make functionality like block (re-)application more performant for the legacy Cardano block as compared to the actual (UTxO-HD compatible) Cardano block.
  • We performed ad-hoc benchmarks of the UTxO-HD implementation, observing a regression in sync speed in the LMDB implementation as well as a regression in memory usage on the in-memory implementation. We are investigating this.

· 2 min read
Michael Karg

High level summary

  • Benchmarking: We've performed several new benchmarks and a performance investigation in preparation for switching the default compiler to GHC9.
  • Infrastructure: The first batch of refactoring and documentation for our tx-generator has been merged to master.
  • Tracing: We've looked into an issue where the tracing system's concurrency could prevent a graceful node shutdown.
  • Nomad backend: Our new cloud backend has seen various improvements regarding deployment and monitoring; validation runs for the backend are ongoing.

Low level overview

Benchmarking

The compiler switch to GHC9 as the default build platform for cardano-node and its components still has noticeable effects on system-wide performance metrics. An investigation into the resource usage profiles of the different compiler versions suggests that GHC9's significantly different inlining behaviour may produce those effects. We're currently locating the specific places in component code where this has the largest effect.

Using the forge-stress approximation we set up, we could determine that the above effect is not due to a range of RTS parameters, such as the number of capabilities used by the node.

Infrastructure

The tx-generator is a crucial part of our tooling responsible for producing very specific workloads for our benchmarking cluster. In an effort to flesh out an API to make it reusable for more general use cases, a first set of refactorings has been merged to master. Additionally, this merge contained systematic documentation both for internal and for exposed areas of the code base.

Tracing

Under certain conditions, the tracing system's concurrency could prevent a graceful shutdown of the node. This issue occurred only after adding specific new traces on a development branch. We were able to locate and address it.

Nomad backend

With the data gathered from running the new Nomad cloud backend, we've been able to make many small and medium-sized improvements. The deployment process has been restructured for better efficiency, and the healthcheck system has been fine-tuned to recognize the severity of the various conditions that might occur. Optimization of fetching all run data from the cloud for evaluation is in progress.

Additionally, we're continuing the new backend's validation by setting up test runs and looking into comparative analyses with metrics gathered from the current cluster backend.

· One min read
James Chapman

The team works on applied research and consulting in formal methods that is directly applicable to evidence-based engineering in Core Tech and beyond.

High level summary

This sprint, the team presented two papers at ICE 2023.

Details

· One min read
Franco Testagrossa
Pascal Grange

High-level summary

This week, the Hydra team shared progress updates during the monthly review meeting (monthly report and video recording available soon) and started experimenting on the preview network with the new feature for committing from an external wallet.

What did the team achieve this week

  • Monthly report & review meeting, demonstrating commit from external wallet
  • Published regular benchmarks for Hydra
  • Moved forward the journey for external commits using multiple script UTxOs #903
  • Changed the API to put only transaction ids in snapshots, instead of the full transactions #922; this has now evolved into fully addressing #728
  • Fuel marking is now optional as one can now commit from an external wallet #924
  • Added a flag option to display the node version in the TUI #934
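
The idea behind #922, putting transaction ids in snapshots rather than full transactions, can be sketched as follows. This Python example is illustrative only and does not reflect the hydra-node API or its message schema: the `Transaction` and `Snapshot` types and the `resolve` helper are hypothetical names introduced for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    # Hypothetical simplified transaction: an id plus an opaque body.
    tx_id: str
    body: str


@dataclass(frozen=True)
class Snapshot:
    # The snapshot references confirmed transactions by id only.
    number: int
    confirmed_tx_ids: tuple


def make_snapshot(number, transactions):
    # Keep only the ids, so the snapshot stays small however large the bodies are.
    return Snapshot(number, tuple(tx.tx_id for tx in transactions))


def resolve(snapshot, known_transactions):
    # A consumer that already saw the transactions looks them up by id.
    # Raises KeyError if an id is unknown to this consumer.
    return [known_transactions[tx_id] for tx_id in snapshot.confirmed_tx_ids]
```

The trade-off in this sketch is that a consumer needs its own record of the transactions it has seen in order to recover their full contents from a snapshot.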

What are the goals of next week

  • Complete external commits using multiple script UTxOs #903
  • New release 0.11.0
  • Dirt-road solution for improved persistence performance #913