
· 2 min read
Jean-Philippe Raynaud

High level overview

The Mithril team has released the new distribution 2248.1 of their nodes. They have published the first version of the Mithril cryptographic library on crates.io, the Rust community’s crate registry. They have implemented an optimization of the individual signatures, which no longer embed the verification key and stake. They have also enhanced their testing strategy by implementing a workflow that checks that the client binaries produced for multiple platforms (Linux, macOS, and Windows) are able to verify and restore snapshots.
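The signature optimization is protocol-level rather than Rust-specific: signers already register their verification key and stake with the aggregator, so an individual signature only needs to reference its signer. The sketch below illustrates the idea in plain Haskell with hypothetical, simplified types; it is not the actual Rust API of the Mithril library.

```haskell
module SingleSigSketch where

import qualified Data.Map.Strict as Map

-- Hypothetical, simplified types; the real implementation lives in the Rust library.
type PartyId         = Int
type VerificationKey = String
type Stake           = Word
type Sigma           = String   -- the raw signature bytes

-- Before: each single signature embedded the signer's key and stake.
data SingleSigOld = SingleSigOld
  { oldSigma :: Sigma
  , oldKey   :: VerificationKey
  , oldStake :: Stake
  }

-- After: the signature only references the signer; key and stake are taken
-- from the registration data the aggregator already holds.
data SingleSig = SingleSig
  { sigSigma :: Sigma
  , sigParty :: PartyId
  }

type Registrations = Map.Map PartyId (VerificationKey, Stake)

-- The aggregator resolves the key and stake at verification time.
lookupSigner :: Registrations -> SingleSig -> Maybe (VerificationKey, Stake)
lookupSigner regs sig = Map.lookup (sigParty sig) regs
```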

Finally, they have kept on simplifying the aggregator node's multi-signer by removing the signer registration and the certificate creation from its responsibilities.

Low level overview

  • Implemented removing verification key and stake from single signatures #619
  • Completed the extraction of the signer registration from the multi-signer #642
  • Completed the extraction of the certificate creation from the multi-signer #638
  • Implemented a workflow to test client binaries (Linux / macOS / Windows) #601
  • Completed the signature of the artifacts produced by the CI #587
  • Fixed the protocol parameters transition #627
  • Worked on optimizing the snapshot digest computation #510
  • Worked on enforcing the API protocol versions in the client and signer #633
  • Worked on deactivating the non-certified signer registration mode #621
  • Worked on the re-genesis of the test networks #651

· 3 min read
Damian Nadales

High level summary

During the past two weeks, the Consensus team finalized the QSM tests for the backing store and Mempool on the UTxO-HD branch, with important discoveries regarding parallel QSM testing. We also worked with the Ledger team to envisage the modifications required in Ledger and Consensus to accommodate the changes in the VRF and KES crypto. The db-analyser now supports benchmarking the ledger operations, which will allow us to identify, debug, and profile potential performance problems. We drafted a document that defines how to manage the versions of Consensus-related packages. The top-level documentation of ouroboros-network now features a description of the consensus components and provides a hyperlinked map to the module documentation.

Workstreams

UTxO HD prototype

Whereas we had passing sequential state-machine tests for the mempool, the parallel case proved more challenging than we thought. Adding a list of transactions to the mempool is not an atomic operation; as a result, transactions from other threads can be added in between the ones in the list. The mempool implementation handles this correctly; however, it required us to redesign our parallel model to take this lack of atomicity into account.
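The following toy example (plain Haskell with a list-backed mempool, not the actual Consensus implementation) shows the kind of interleaving the parallel model now has to accept: each single insertion is atomic, but a batch insertion is a sequence of atomic steps, so a concurrent batch can end up interleaved with it.

```haskell
module MempoolInterleaving where

import Control.Concurrent      (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Concurrent.STM  (TVar, atomically, modifyTVar', newTVarIO, readTVarIO)

type Tx      = String
type Mempool = TVar [Tx]

-- Adding a *single* transaction is atomic...
addTx :: Mempool -> Tx -> IO ()
addTx pool tx = atomically (modifyTVar' pool (++ [tx]))

-- ...but adding a *list* of transactions is a sequence of atomic steps,
-- so transactions from other threads can slip in between them.
addTxs :: Mempool -> [Tx] -> IO ()
addTxs pool = mapM_ (addTx pool)

main :: IO ()
main = do
  pool  <- newTVarIO []
  doneA <- newEmptyMVar
  doneB <- newEmptyMVar
  _ <- forkIO (addTxs pool ["a1", "a2", "a3"] >> putMVar doneA ())
  _ <- forkIO (addTxs pool ["b1", "b2", "b3"] >> putMVar doneB ())
  takeMVar doneA
  takeMVar doneB
  -- One possible outcome: ["a1","b1","a2","b2","a3","b3"].
  readTVarIO pool >>= print
```

Any interleaving that preserves each thread's own insertion order is a valid outcome, which is exactly what the redesigned parallel model has to allow for.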

Backing store property tests

We finished refactoring the backing store property tests. The second review round is ongoing.

LSM tree implementation

We are working on benchmarking (in terms of time and number of I/O operations) the fetching and looking up of data from disk.

Genesis

We worked on the design of a mechanism to prevent a DoS attack on our Genesis design related to rollbacks. This was arguably the biggest outstanding question.

During the discussions around Genesis, we noticed a design boundary that nicely delineates a fundamental component. We almost have a full Haskell prototype of it. It will be very nicely self-contained, perhaps even usable in the ultimate implementation!

New VRF and KES crypto integration

We collaborated with the Ledger team on preparing the ledger state and crypto types to avoid huge allocation on the epoch boundary when changing aspects of the crypto that will only manifest in headers, not in the ledger states.
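A deliberately simplified illustration of the concern, using hypothetical Haskell types rather than the real cardano-ledger and ouroboros-consensus definitions: if the ledger state shares a single crypto parameter with the header, swapping the header-only VRF/KES crypto changes the type of the ledger state and forces a costly conversion at the epoch boundary; keeping the parameters separate avoids that.

```haskell
module HeaderCryptoSplitSketch where

-- If one crypto parameter 'c' covers both the header crypto (VRF/KES) and the
-- crypto used inside the ledger state, then swapping the header crypto at an
-- era boundary changes the *type* of the ledger state, forcing a conversion
-- (and hence a large allocation) of the whole state.
newtype LedgerStateCoupled c = LedgerStateCoupled { coupledUtxo :: [(String, Int)] }

-- If the ledger state only depends on the crypto it actually uses ('lc'),
-- and the header carries the VRF/KES material separately ('hc'), a
-- header-only crypto change leaves 'LedgerState lc' untouched.
newtype LedgerState lc = LedgerState { utxo :: [(String, Int)] }

data Header hc = Header
  { headerVrfProof :: hc
  , headerKesSig   :: hc
  }
```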

Technical debt

We merged the pull request that adds support to db-analyser for benchmarking the ledger operations. This will allow us to identify, debug, and profile potential performance problems. The benchmark focuses on the five main ledger operations involved in chain syncing, block forging, and block validation (a rough timing sketch follows the list below), namely:

  1. Forecast.
  2. Header tick.
  3. Header application.
  4. Block tick.
  5. Block application.
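The sketch below shows, with placeholder types and stand-in functions (none of which are the real db-analyser or Consensus APIs), how one might time these five operations per block; the actual benchmark drives the real ledger through the stored chain.

```haskell
module LedgerOpsBenchSketch where

import Control.Exception (evaluate)
import Data.Time.Clock   (NominalDiffTime, diffUTCTime, getCurrentTime)

-- Placeholder types; the real benchmark uses the actual Consensus/Ledger APIs.
data Block       = Block
data LedgerState = LedgerState

-- Time the evaluation (to WHNF) of a result.
timed :: a -> IO (a, NominalDiffTime)
timed x = do
  start <- getCurrentTime
  x'    <- evaluate x
  end   <- getCurrentTime
  pure (x', end `diffUTCTime` start)

-- For each block, time the five operations in the order in which they occur
-- during chain syncing and block validation.
benchBlock :: LedgerState -> Block -> IO [(String, NominalDiffTime)]
benchBlock st blk = do
  (_, tForecast) <- timed (forecast st)
  (_, tHdrTick)  <- timed (headerTick st)
  (_, tHdrApply) <- timed (applyHeader st blk)
  (_, tBlkTick)  <- timed (blockTick st)
  (_, tBlkApply) <- timed (applyBlock st blk)
  pure [ ("forecast", tForecast)
       , ("header tick", tHdrTick)
       , ("header apply", tHdrApply)
       , ("block tick", tBlkTick)
       , ("block apply", tBlkApply)
       ]
  where
    -- Stand-ins for the real ledger operations.
    forecast, headerTick, blockTick :: LedgerState -> LedgerState
    forecast   = id
    headerTick = id
    blockTick  = id
    applyHeader, applyBlock :: LedgerState -> Block -> LedgerState
    applyHeader s _ = s
    applyBlock  s _ = s
```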

The following figure shows a plot of the benchmarking results for the first 65 million blocks (approximately) of the Cardano chain. The thin yellow lines under the x-axis show the epoch boundaries, whereas the thick yellow lines correspond to the era transitions.

As we can see in this figure, era and epoch boundaries require more computation time. The Ledger team is aware of this problem, and we are working to improve this situation.

Fostering collaboration

We drafted a document motivating and defining how Consensus (and possibly other core teams) will/should manage package versions. This pull request sparked many great discussions among our team members and other teams: Sebastian Nagel, Arnaud Bailly, Michael Peyton-Jones, Ziyang Liu, et al. We want to thank you all for your input; we found the discussion very enlightening!

We merged the pull request that adds an overview of consensus to the top level documentation of ouroboros-network. This overview describes the consensus components and adds a hyperlinked map to the modules documentation.

· One min read
Kostas Dermentzis

High level summary

The DBSync team continued testing release 13.1.0.0. The QA team has reported that no issues have been found. The DBSync team also worked on cherry-picks back to master and on fixing bugs.

Lower level summary

  • The release was cherry-picked back to master, so master now uses the new rollback mechanism based on reverse indexes, same as the release #1320. This also fixes a number of issues on master (a rough sketch of the reverse-index idea follows this list).
  • Dependencies upgrade and CHaP integration #1324
  • AdaPots fix #1323. This fixes an issue where the per-epoch AdaPots didn't match the epoch boundary, because they also included changes from the first block of the epoch.
  • Deposits Event fix #3212. This PR adjusts the Deposits ledger events so that they can be better used by db-sync. This can reduce the number of queries that db-sync performs during syncing and make syncing faster.
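As a rough, in-memory sketch of the reverse-index idea mentioned in the first bullet (hypothetical types, not the actual db-sync schema or code): each block records the rows it inserted, so a rollback only deletes the rows of the blocks being unwound instead of scanning whole tables for stale data.

```haskell
module ReverseIndexRollbackSketch where

import qualified Data.Map.Strict as Map

-- Hypothetical, in-memory stand-ins for database tables and row identifiers.
type BlockNo = Word
type RowId   = Int

-- The "reverse index": for every block, remember which rows it inserted.
type ReverseIndex = Map.Map BlockNo [RowId]

-- Record the rows inserted while processing a block.
recordInserts :: BlockNo -> [RowId] -> ReverseIndex -> ReverseIndex
recordInserts bn rows = Map.insertWith (++) bn rows

-- Rolling back to 'target' only touches the rows listed for later blocks.
-- Returns the rows to delete together with the pruned index.
rollbackTo :: BlockNo -> ReverseIndex -> ([RowId], ReverseIndex)
rollbackTo target ix =
  let (keep, dropped) = Map.partitionWithKey (\bn _ -> bn <= target) ix
  in (concat (Map.elems dropped), keep)
```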

· 2 min read
Jordan Millar

2022-12-14 - 2022-12-27

High level summary

PRs merged in this sprint focused on clean-up and resolving existing issues. The majority of the time during this sprint was spent on the In Progress PRs, which have dependencies on consensus. This has since been rectified, i.e. the cardano-node dependencies have been bumped.

Completed

docs

CI & project maintenance

Developer experience

cardano-cli

cardano-api

cardano-node

cardano-testnet

In Progress

CI & project maintenance

cardano-cli

cardano-api

cardano-node

cardano-testnet

· 4 min read
Serge Kosyrev

High level summary

  1. SECP benchmarking enablement was completed: we are now able to do local runs of the SECP workloads. The next step is to port this to the AWS environment.
  2. A new workstream for Plutus cost modeling improvement: we've planned and started implementing the smart contract call overhead measurement machinery.
  3. The new tracing system: after doing more benchmarking to address inter-run variance, we discovered that the regression, while still there, is small enough not to be release critical. Nevertheless, we're continuing with the further performance-oriented rework of the internals.
  4. Infrastructure: a significant refactoring of the workbench internals was merged. We also started improving the denotation for the ever-evolving protocol parameters. Implementation of comparative analysis for multi-run batches has started.
  5. Open sourcing: our plans matured sufficiently so that we now expect actual deployment work to start this week.

Performance

The SECP benchmarking workload has been fully implemented in the workbench. We are now porting it over to AWS, and after that we'll be running the model cluster workload.

We've also started implementing mechanics for the upcoming investigation of the Plutus smart contract call overhead, which is expected to lead us to improved Plutus cost modeling.

Tracing

After the initial model-scale performance data caused us to panic, we have, among other things, done more benchmarks, and it turned out that an increase in inter-run variance was the culprit. The actual regression averages out to a barely noticeable 1-2% in key metrics, which is certainly not release critical.

To understand the impact of the new tracing system, we have to bear in mind the extra functionality it provides:

  1. We are now processing all messages generated by the system, without taking any of the shortcuts that the old system had to resort to. That causes the new tracing to do more work, but it is more useful for all users and developers involved, since it leads to a simple, non-confusing configuration. Incidentally, that's also the area where we are reworking the internals, to deduce and enable the optimisations that are implied by the particular configuration.
  2. The new tracing system is benchmarked with remote tracing as the default backend (whereas the old one was using the local, built-in log storage mechanism). In some sense it's the fairer benchmark, because that's the way we expect SPOs to set up tracing. That, however, also causes it to do more work.

All that said, since we've established the performance of the new system to be adequate for the release, we won't be delaying it much further.

In addition, we're still pursuing our performance-enhancing rework of the new tracing internals.

Infrastructure

After implementing the multi-backend capability in the workbench, we got the opportunity to reassess the generic/backend boundaries and perform some long-awaited cleanups and simplifications in that area. The results of this work have been merged and will serve as a solid foundation for the CI and cloud backends.

Moving to analysis, we've also improved the provenance of the raw data by collecting more identifying information and statistics about it. This means, for example, that we now record checksums, message frequencies, and timestamps from the log files coming into analysis. This will enable us to spot data anomalies earlier and to lift that information directly into the generated reports.

A new feature is now under implementation: the ability to provide comparative analysis of multi-run batches. Previously we only had automation for the two aspects separately, so we could either:

  • compare individual runs (used for different node configurations / versions)
  • collect variance statistics from a batch of runs (used to enhance statistical confidence for a single node configuration / version)

Naturally, combining these two capabilities was a long-desired feature of our analysis pipeline; a toy sketch of such a batch comparison follows.
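As an illustration of what the combined analysis could report (made-up numbers, plain Haskell, not the actual workbench analysis pipeline): per-batch mean and standard deviation of a metric, plus the relative change between the two batch means.

```haskell
module BatchComparisonSketch where

-- Per-batch variance statistics plus a comparison of two batches
-- (e.g. two node versions). The metric values below are made up.

mean :: [Double] -> Double
mean xs = sum xs / fromIntegral (length xs)

stddev :: [Double] -> Double
stddev xs = sqrt (mean [ (x - m) ** 2 | x <- xs ])
  where m = mean xs

-- Relative change of the batch means, in percent.
relativeChange :: [Double] -> [Double] -> Double
relativeChange baseline candidate =
  100 * (mean candidate - mean baseline) / mean baseline

main :: IO ()
main = do
  let baseline  = [612, 598, 607, 615, 601]   -- a metric from 5 runs of version A
      candidate = [618, 611, 605, 622, 609]   -- the same metric from 5 runs of version B
  putStrLn $ "baseline:  mean " <> show (mean baseline)  <> ", stddev " <> show (stddev baseline)
  putStrLn $ "candidate: mean " <> show (mean candidate) <> ", stddev " <> show (stddev candidate)
  putStrLn $ "relative change: " <> show (relativeChange baseline candidate) <> " %"
```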