Performance & Tracing Update

· 5 min read
Michael Karg
Performance and Tracing Team Lead

High level summary

  • Benchmarking: Compiler benchmarks on 10.6.2; Trace evaluation feature benchmarks.
  • Development: Started new project tx-centrifuge: A tx submission service generating extremely high, continuous workload.
  • Infrastructure: Small maintenance items, such as fixing profiled nix builds for local benchmarking.
  • Tracing: New tracing system now its own project: Hermod Tracing; New library cardano-timeseries-io, which accumulates metrics into queryable timeseries, released.
  • Leios: cardano-recon-framework (formerly LTL Trace Verifier) integrated and in use.
  • Node Diversity: Formal trace schema definition nearing merge; Trace forwarding in native Rust on hiatus.

Low level overview

Benchmarking

We've repeated the GHC 9.12 compiler benchmarks on Node 10.6.2, which we now know to be completely free of regressions and space leaks. This confirmed our earlier findings: the code generated by GHC 9.12 is on par performance-wise as far as block production, diffusion and adoption metrics go, but it exhibits unexplained increases in CPU time used, allocations and minor GCs. A profiled build has identified several potential suspects. However, many of those will be replaced or changed in the 10.7 release, so this benchmark will have to be re-run on Node 10.7.

The feature for new tracing, which forces a lazy trace value in a controlled section of code, is slated for inclusion in Node 10.7. To that end, we backported it to Node 10.6.2 and performed feature benchmarks for it, to ensure it won't distort the upcoming 10.7 performance baseline. Indeed, we found the performance impact of that feature to be negligible in all categories of observed metrics.

Development

We've started a new project, tx-centrifuge, for transaction submission (i.e. workload generation) during benchmarks and other scenarios. It is meant to be complementary to the existing tx-generator. The latter is tailored very much to our Praos benchmarking use case, and its implementation is based on a rather monolithic design. tx-centrifuge takes a different approach: it's built for seamless scaling, both horizontally and vertically. Thanks to its massive tx output, it will be able to saturate a network running Leios over extended periods of time. Furthermore, it's able to cut down the setup phase (where UTxOs are created for benchmarking) and immediately launch into the benchmark phase. This also enables it to function as a potentially long-running, configurable submission service for scenarios other than benchmarking. The implementation is currently in prototype stage.

Infrastructure

As far as infrastructure is concerned, we've addressed various small-sized maintenance tasks. This includes fixing profiled nix builds for local benchmarks, migrating benchmarking profiles and configs to the upcoming Node 10.7 release and increasing robustness of the locli analysis tool in dealing with incomplete / partial trace output.

Tracing

Our new tracing system has been set up as its own project, named the Hermod Tracing System. As of now, we've only migrated the core package trace-dispatcher. This marks the first step towards moving all tracing and metrics related packages out of the cardano-node project, and bundling them with consistent branding, API and documentation. Eventually, the system will be generalized so that it can be used by any Haskell application, not just cardano-node. Seeing that the dmq-node has already adopted it, we have reason to assume the broader community might consider it a go-to choice for adding principled observability to an application.

We've built and released a new Haskell library, cardano-timeseries-io (cardano-node PR#6495). The library builds and stores timeseries of metrics from multiple source applications, much like Prometheus. It can process queries over those timeseries in a query language quite similar to PromQL. Integration into cardano-tracer, the trace / metrics processing service, is ongoing work. It will allow for custom monitoring solutions and alerts directly from cardano-tracer, without the need to scrape metrics and maintain them externally. It is not meant to replace existing Prometheus endpoints, but rather to provide richer functionality out of the box if desired: cardano-node PR#6473.
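To make the idea of accumulating metrics into queryable timeseries more concrete, here is a minimal sketch in Python. It is not the cardano-timeseries-io API (that library is written in Haskell and has its own query language); the class `TimeseriesStore`, its methods, and the metric name `blocks_forged` are all hypothetical and chosen for illustration only. The `rate` method loosely mirrors what a PromQL-style `rate` computes, without handling counter resets or extrapolation.

```python
from bisect import bisect_left, bisect_right


class TimeseriesStore:
    """Toy in-memory store (hypothetical, not the real library's API).

    Keeps one list of (timestamp, value) samples per (source, metric) pair.
    """

    def __init__(self):
        self._series = {}

    def insert(self, source, metric, timestamp, value):
        # Assumption: samples arrive with strictly increasing timestamps
        # per series, so appending keeps each list sorted.
        self._series.setdefault((source, metric), []).append((timestamp, value))

    def query_range(self, source, metric, start, end):
        """Return all samples with start <= timestamp <= end."""
        samples = self._series.get((source, metric), [])
        times = [t for t, _ in samples]
        lo = bisect_left(times, start)
        hi = bisect_right(times, end)
        return samples[lo:hi]

    def rate(self, source, metric, start, end):
        """Average per-second increase of a counter over the window.

        Returns None when fewer than two samples fall into the window.
        """
        window = self.query_range(source, metric, start, end)
        if len(window) < 2:
            return None
        (t0, v0), (t1, v1) = window[0], window[-1]
        return (v1 - v0) / (t1 - t0)


# Usage: one source application reporting a counter every 10 seconds.
store = TimeseriesStore()
for ts, val in [(0, 0), (10, 2), (20, 5), (30, 6)]:
    store.insert("node-1", "blocks_forged", ts, val)

samples = store.query_range("node-1", "blocks_forged", 10, 30)
avg_rate = store.rate("node-1", "blocks_forged", 0, 20)
```

Keeping the series inside the processing service is what would allow alerts to be evaluated locally, instead of relying on an external scraper to collect and retain the samples.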

Leios

We've released cardano-recon-framework, formerly known as the Linear Temporal Logic (LTL) Trace Verifier (cardano-node PR#6454). It's already seen adoption, and is used productively to verify system properties and conformance exclusively based on live trace output. We've been asked by Formal Methods Engineering to extend the LTL fragment the framework uses, such that a wider range of properties can be expressed; work on that is already ongoing.
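As an illustration of what checking a temporal property against trace output can look like, the following Python sketch evaluates a small LTL fragment over a finite, recorded trace. It is not the cardano-recon-framework implementation, which works on live trace output and defines its own LTL fragment; the combinator names and the event kinds `HeaderReceived` and `BlockAdopted` are hypothetical.

```python
# Each formula is a function (trace, position) -> bool.
# Finite-trace semantics: "always" and "eventually" range over the
# remaining suffix of the recorded trace.

def atom(pred):
    """Holds at position i if the predicate holds for the event there."""
    return lambda trace, i: pred(trace[i])

def implies(f, g):
    return lambda trace, i: (not f(trace, i)) or g(trace, i)

def always(f):
    return lambda trace, i: all(f(trace, j) for j in range(i, len(trace)))

def eventually(f):
    return lambda trace, i: any(f(trace, j) for j in range(i, len(trace)))

def check(formula, trace):
    """Evaluate the formula at the start of a (non-empty) trace."""
    return formula(trace, 0)


def kind_is(name):
    return atom(lambda event: event["kind"] == name)

# Hypothetical property: whenever a header is received, some block
# adoption follows later in the trace. (Simplified: block identities
# are not matched up.)
prop = always(implies(kind_is("HeaderReceived"),
                      eventually(kind_is("BlockAdopted"))))

good_trace = [{"kind": "HeaderReceived"}, {"kind": "BlockAdopted"},
              {"kind": "HeaderReceived"}, {"kind": "BlockAdopted"}]
bad_trace = [{"kind": "HeaderReceived"}, {"kind": "BlockAdopted"},
             {"kind": "HeaderReceived"}]

good_result = check(prop, good_trace)  # every header is followed by an adoption
bad_result = check(prop, bad_trace)    # the last header is never followed by one
```

Extending the supported fragment, as requested above, would correspond to adding further operators (for example "next" or "until") alongside the ones sketched here.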

Node Diversity

The comprehensive formal schema definition of all the Node's existing trace messages is nearing integration / merging. The initial version will be able to extract all definitions from the actual implementation into a fully validated JSON schema. Future work will address completing the automated verification suite, adding a mechanism to amend the extracted schema manually (e.g. with comments or refinement types) and a pipeline to facilitate usage, such as automatic derivation of a parser, or rendering of a human-readable specification PDF.
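To sketch how a JSON schema for trace messages could be put to use, the Python example below validates a message against a tiny hand-rolled subset of JSON Schema (`type`, `required`, `properties`). Both the schema and the message fields (`at`, `ns`, `data`, `slot`) are hypothetical stand-ins, not the extracted schema itself, and a real pipeline would more likely use a full JSON Schema validator.

```python
# Supported subset of JSON Schema "type" keywords, mapped to Python types.
_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
}


def validate(instance, schema, path="$"):
    """Return a list of error strings; an empty list means valid."""
    expected = schema.get("type")
    if expected is not None:
        # bool is a subclass of int in Python, so exclude it explicitly
        # unless the schema really asks for a boolean.
        is_bool = isinstance(instance, bool)
        ok = isinstance(instance, _TYPES[expected]) and (
            expected == "boolean" or not is_bool
        )
        if not ok:
            return [f"{path}: expected {expected}"]

    errors = []
    if isinstance(instance, dict):
        for key in schema.get("required", []):
            if key not in instance:
                errors.append(f"{path}.{key}: missing required field")
        for key, sub_schema in schema.get("properties", {}).items():
            if key in instance:
                errors.extend(validate(instance[key], sub_schema, f"{path}.{key}"))
    return errors


# Hypothetical schema for a single trace message.
trace_schema = {
    "type": "object",
    "required": ["at", "ns", "data"],
    "properties": {
        "at": {"type": "string"},
        "ns": {"type": "string"},
        "data": {
            "type": "object",
            "required": ["slot"],
            "properties": {"slot": {"type": "integer"}},
        },
    },
}

good_msg = {"at": "2026-01-01T00:00:00Z", "ns": "Forge.Loop", "data": {"slot": 12}}
bad_msg = {"at": "2026-01-01T00:00:00Z", "data": {"slot": "12"}}

good_errors = validate(good_msg, trace_schema)
bad_errors = validate(bad_msg, trace_schema)
```

A schema in machine-readable form is also what makes the follow-up items plausible: a parser or a human-readable specification can be derived from the same definitions that drive validation.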

Due to resourcing issues, the trace / metrics forwarding mini-protocol implementation in native Rust unfortunately had to be put on hiatus for the foreseeable future.

Hydra Team Update

· One min read
Noon van der Silk
Software Engineering Lead

What did the team achieve?

  • Continued progress on partial fanout #1667, #2538
  • Final touches on the "Directly Open Heads" feature #1329
  • Investigating a blocker when upgrading to cardano-api 10.21 #2520
  • Working on more resilient snapshots #2551

What's next?

  • Merge approach for resilient snapshots #2551
  • Merge "Directly Open Heads" #1329
  • Get more structure in our benchmarks #2439
  • Release 2.0.0!

Consensus Team Update

· One min read
Damian Nadales
Consensus Team Lead

High level summary

  • Peras protocol development (Treasury Funding Initiative 17: Maintenance and Support):
    • Reviewed and merged the implementation of a model of the committee selection scheme, which defines how voters are chosen for Peras's fast-finality voting rounds (#1839).
    • Reviewed and merged the implementation of the node tracking Peras certificate progress on-chain, needed for the protocol's chain-selection rule (#1864).
  • Node improvements (Treasury Funding Initiative 17: Maintenance and Support):
    • Released ouroboros-consensus-1.0.0.0 (#1926).
    • Reworked internal resource management for ledger state access, improving node robustness (#1910).
    • Fixed a bug in the immutable DB's chunk enumeration logic which only affected test coverage (#1923).
    • Integrated the Networking team's ouroboros-network 1.0 release (#1918, #1927, #1929).

Mithril Team Update

· 4 min read
Jean-Philippe Raynaud
Mithril Tech Lead

High level overview

This week, the Mithril team completed several SNARK-related milestones: they finished implementing the Halo2 circuit in the STM library, completed work on SNARK aggregation primitives for creating and verifying SNARK proofs, and finalized circuit refactoring, including modularity enhancements and a switch to a transcript hash function. They also completed the technical report for the recursive Halo2 circuit. The team continued work on the full review of the recursive SNARK circuit prototype, the impact assessment of SNARK on Mithril protocol security, wiring the SNARK proof into the aggregate signature, and activating the SNARK prover in the dev networks.

They also completed implementing the new prover for Cardano blocks and transactions, the client library, and enhanced the signed entity type configuration. They continued progressing on the client CLI implementation, partial block range support, and usage examples for Cardano blocks and transactions. Additionally, they kept working on the release process improvements, the removal of the legacy Cardano database backend in the client, and the enhancement of the protocol security page on the website.

Finally, they completed static builds of Mithril nodes in CI, finalized testing of the DMQ node 0.3.0 pre-release, and updated the Midnight ZK library audit status.

Low level overview

Features

  • Completed the issue Implement Halo2 circuit in STM library #2895
  • Completed the issue Refactor SNARK circuit - Modularity enhancement with gadgets #3039
  • Completed the issue SNARK aggregation primitives: Create SNARK proof with circuit #3040
  • Completed the issue SNARK aggregation primitives: Verify SNARK proof #3041
  • Completed the issue Switch transcript hash function of circuit #3067
  • Completed the issue Prepare technical report for recursive Halo2 circuit #2981
  • Completed the issue Implement new prover for Cardano Blocks and Transactions #2987
  • Completed the issue Implement Cardano Blocks and Transactions in client library #3031
  • Completed the issue Enhance the support for signed entity types with configuration #3030
  • Worked on the issue Full review of recursive SNARK circuit prototype #2982
  • Worked on the issue Impact of SNARK on Mithril protocol security #2803
  • Worked on the issue SNARK aggregation primitives: Wire SNARK proof in aggregate signature #3042
  • Worked on the issue Activate SNARK prover in dev network #3104
  • Worked on the issue Implement Cardano Blocks and Transactions in client CLI #3032
  • Worked on the issue Support partial block range in Cardano blocks and transactions #3099
  • Worked on the issue Implement examples for Cardano Blocks and Transactions #3100

Protocol maintenance

  • Completed the issue Implement static build of Mithril nodes in CI #2989
  • Completed the issue Test DMQ node 0.3.0 pre-release in Mithril nodes #3053
  • Completed the issue Update Midnight ZK library audit status #2983
  • Worked on the issue Remove v1 backend for Cardano database in client library and CLI #3080
  • Worked on the issue Update release process to anticipate on unreleased Cardano node #3070
  • Worked on the issue Enhance protocol security page on website #2703

Network Team Update

· 4 min read
Marcin Szamotulski
Network Team Lead

Overview of sprints: 108 and 109.

Cardano Node 10.7.0

We released:

  • ouroboros-network-1.0.0.0 (later refined with ouroboros-network-1.1.0.0)
  • cardano-diffusion-1.1.0.0

Both are integrated with the ongoing cardano-node-10.7 release.

See release board for all PRs included in these releases.

Leios

| PR / Issue | Status |
|---|---|
| Mux egress prioritisation | in progress |
| Performance improvement for tx-submission v2 | in progress |
| TxSubmissionV2 demo | in progress |
| Expand network section | in review |

Ouroboros-Network

| PR / Issue | Status | Notes |
|---|---|---|
| tx-submission robustness and testing enhancements | merged | |
| RNG handling in SRV lookups for public root peers | closed | contribution by Saviour Uzoukwu |
| cardano ping implemented with ouroboros-network | blocked | |
| mux: refactored ReadBuffer & socket API changes | merged | |
| ledger-peers: introduced SignLedgerPeersKind | merged | |
| cardano-ping: output valid JSON when no pongs are received | merged | |
| Add {To,From}JSON instances for TxSubmissionLogicVersion | merged | |
| Cardano Topology: module exports (pr#5339) | merged | |
| Haddock improvements | merged | |

DMQ-Node

| PR / Issue | Status | Notes |
|---|---|---|
| SigSubmission v2 | merged | |
| Integration with cardano-tracer | in progress | |
| Churn test (#5303) | merged | |
| Updated dependencies of dmq-node | merged | |
| Improved documentation | merged | |
| Fixed number of unacknowledged sigIds on the outbound side | merged | |
| Added pattern synonyms for EraIndex (CardanoEras c) (pr#1919) | merged | ouroboros-consensus change needed for Ledger peers |
| Ledger peers | merged | |
| dmq-node-0.3.0.0 pre-release (pr#37) | merged | |
| Haddocks, docs & release scripts | merged | |

IOSim

| PR / Issue | Status |
|---|---|
| Added flushEventLog to MonadEventlog | merged |
| io-sim improvements | merged |
| PR review comments | merged |
| ghc-9.14 support | merged |
| io-sim: io-classes constraint | merged |