Damian Nadales

Consensus Quarterly Update

2022-09 - 2022-11

Main achievements

UTxO HD

  • As a consequence of the errors observed when running distributed mempool benchmarks, we re-designed the UTxO HD mempool integration, which fixed these errors and led to a simpler and more maintainable design.

  • We focused on increasing test coverage for the UTxO HD prototype. In particular, we added property tests for:

    • Backing store (work ongoing)
    • Era transitions
  • The property tests we added uncovered several bugs, which is a great result given that the cost of finding bugs increases exponentially the closer they are to deployment.

  • One of the errors found by our tests required us to work on improvements in the Haskell bindings for LMDB. This work is ongoing.

  • We started working on the mempool property tests that will exercise the new code paths that UTxO HD introduced.

  • We developed, benchmarked and tested an implementation of sequences of differences based on "anti-diffs". Performance results of diff sequence operations show that we achieved a speedup of about 4x across several scenarios. Note: this speedup accounts for diff sequence operations only, so the consensus-wide speedup is less than 4x.

  • We integrated the "anti-diff" prototype into the UTxO HD feature branch.
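
The report does not describe how the "anti-diff" sequences work, so the following is only a minimal sketch of one plausible reading: each diff has an inverse (its "anti-diff"), which lets a sequence keep a cached cumulative diff and update it by subtraction when old diffs are flushed, instead of re-summing everything that remains. As a simplifying assumption, integer deltas per key stand in for real UTxO diffs, and all names (`DiffSeq`, `add`, `negate`, `flush_oldest`) are hypothetical rather than taken from the prototype.

```python
from collections import deque


def add(a, b):
    """Combine two diffs (key -> integer delta), dropping entries that cancel out."""
    out = dict(a)
    for key, delta in b.items():
        total = out.get(key, 0) + delta
        if total:
            out[key] = total
        else:
            out.pop(key, None)
    return out


def negate(d):
    """Return the 'anti-diff' of d: combining it with d yields the empty diff."""
    return {key: -delta for key, delta in d.items()}


class DiffSeq:
    """A sequence of diffs with a cached cumulative diff (hypothetical model)."""

    def __init__(self):
        self.diffs = deque()
        self.total = {}

    def push(self, d):
        """Append a new diff and fold it into the cached total."""
        self.diffs.append(d)
        self.total = add(self.total, d)

    def flush_oldest(self, n):
        """Remove the n oldest diffs and return their sum.

        The cached total is repaired by adding the anti-diff of the flushed
        prefix, so the remaining diffs never have to be re-summed.
        """
        flushed = {}
        for _ in range(n):
            flushed = add(flushed, self.diffs.popleft())
        self.total = add(self.total, negate(flushed))
        return flushed


seq = DiffSeq()
seq.push({"a": 5})
seq.push({"a": -2, "b": 3})
seq.push({"b": -3, "c": 1})
flushed = seq.flush_oldest(2)
print(flushed, seq.total)
```

In this model, flushing costs time proportional to the flushed prefix rather than to the whole sequence. Whether the prototype's measured speedup comes from the same effect is not stated in the report.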

Genesis

  • We wrote a simulator that demonstrates soundness of an abstract implementation of the new chain selection rule.
  • We elaborated a draft specification for the Genesis implementation (currently awaiting feedback from other architects).
  • We elaborated a draft specification for the ChainSync Jumping optimization. In particular, this includes a proof sketch that the latter preserves liveness and safety in all cases.
  • With the Networking team, we co-designed the eclipse avoidance mechanism, specifically its coherence with the Genesis implementation plan's security and its dependence on the new ChainSync Jumping optimization.
  • We implemented a prototype for ChainSync Jumping. Initial benchmarks showed a performance degradation with respect to the baseline. Our optimization attempts so far have brought the performance closer to the baseline, but not yet to parity.

Conway era

  • We did most of the heavy lifting required to integrate the Conway era into the Consensus layer.

Technical debt

  • We started working on enabling CI nightly tests, which revealed several test failures due to thunks being found in data structures used by the ledger and consensus. We made a lot of progress fixing those thunk errors, but some errors still remain.

  • We elaborated a db-analyser benchmark for the ledger operations. This allowed us to identify high processing times at epoch boundaries. We did not observe any performance degradation that can be attributed to era changes.

  • We fixed a source of flakiness in the ChainDB QSM test.

  • We clarified a common source of confusion around VRF tie-breaking and cross-era chain selection.

  • We fixed a bug in the maximum-allowed ledger major protocol version.

Fostering collaboration

  • We spent time making cardano-updates the central source of information for the core teams' stakeholders.
  • We went through the Galois gap analysis and extracted actionable points to take on next.
  • Bart and Yogesh continued with their onboarding and started making substantial contributions to consensus.

Next steps

UTxO HD

  • Finish the mempool property tests.
  • Benchmark the latest version of the prototype.
  • Elaborate a document that describes new integration test scenarios and pass it to the SDET team.
  • Bring query UTxO by address command performance on par with the baseline version.

Genesis

  • Receive and incorporate Duncan's feedback on the first draft specification for the Genesis implementation.
  • Begin prototyping the first Genesis implementation, unless the first draft needs major changes.
  • Draft a second revision of the Genesis report.
  • Review the second revision with a wider audience, which includes at least Alexander Russell. That feedback will drive a third and hopefully final revision.
  • Investigate how to mitigate the ~30% slowdown we have observed so far in the ChainSync Jumping prototype. In particular, we might need to optimize the existing BlockFetch logic.

Tech debt

  • Enabling nightly CI tests.

Fostering collaboration

  • Merge the tutorial document Galois wrote; requires CI integration.
  • Come up with our own documentation improvements, many of which were suggested in the Galois gap analysis.
  • Try to hire a new team member.

Marcin Szamotulski

Network Quarterly Update

2022-09 - 2022-11

Summary of most important improvements

During this quarter the networking team delivered a low-level specification of peer sharing & eclipse evasion. We held a session with the consensus team & the scientists, and we got positive feedback on the design.

Further, we focused on the implementation of peer sharing. We produced a detailed design and an early implementation.

We prepared the P2P Single Relay Release (cardano-node-1.35.5). It includes over 130 patches of network stack improvements over the previous version 1.35.4, which were accomplished over a longer period of time. Among them are both bug fixes and UX improvements for stake pool operators, such as a simplified format of the topology file and improvements in the logged messages.

We also provide better integration with systemd (socket activation improvements) and improvements in the networking stack:

  • exit policies,
  • peer metrics improvements,
  • DNS TTL improvements (which make it harder to misconfigure the system, an issue discovered by the performance & monitoring team),
  • do not trigger inbound idle timeout for node-to-client connections (pr #3844), an issue reported to us by Matthias Benkort from Cardano Foundation.

Duncan has been making progress with the input endorsers demo. His simulation provides a useful animated visualisation and live quantification of behaviour of the modeled design.

We also improved our e2e diffusion simulation by implementing header-body split, similar to what the real implementation does.

We also made some advances towards our future goals of P2P release for block producer nodes (pr #3800 - in review) & for Daedalus users (pr #3690 - merged).

Detailed log

  • We expanded the diffusion simulation with the block-fetch protocol, bringing it closer to the production system.

  • We addressed some additional technical debt in the diffusion simulation.

  • We slightly improved documentation & CI of io-sim and typed-protocols repositories for open-source contributors.

  • We closed a number of issues towards publishing io-sim on Hackage (only two essential issues are left open).

  • We pushed a branch of typed-protocols which captures one of the developer UX problems in the API which we need to solve.

  • We identified and fixed an issue related to systemd sockets.

  • We identified and fixed an issue in consensus initialisation not giving feedback on early errors.

  • We deployed RT View and identified a number of issues, which were communicated to the performance & monitoring team.

  • We finished the high level & detailed design of peer sharing. A very early implementation of peer sharing is done (note that peer sharing cannot be safely deployed without eclipse evasion & genesis).

  • We finished high level design of eclipse evasion, and started working on a detailed design.

  • We were assigned the role of release engineer for the 1.35.5 release (the P2P single relay release). We prepared cardano-node for the 1.35.5 release, which contains more than 130 patches of just network stack improvements done over the last few months.

  • We diagnosed and fixed a tricky bug in the peer state actions (a component which sits between the outbound governor and the connection manager). That bug was introduced earlier this year and never released. It was caught by the QA testing framework. We expanded our diffusion simulation to cover such cases and also reduced the chance of reintroducing such a bug in the future.

  • We identified and quite likely mitigated a misconfiguration in the benchmarking cluster (next benchmarking run will confirm our hypothesis).

  • We simplified the format of the P2P topology file and got positive feedback from SPOs.

  • We raised severities of some of the logging messages, which is an important improvement for SPOs, exchanges and other users of the system.

  • We worked on input endorsers simulation which gives both animated and quantified live feedback on network operation, using a simplified model of a TCP/IP network.

Next quarter

  • Release the Single Relay P2P Release 1.35.5.

  • Carry on with Peer Sharing (review, testing).

  • Deliver a talk at the Conference on Principles of Distributed Systems 2022 in Brussels, Belgium.

  • Present Detailed Design of Eclipse Evasion and start implementation phase.

  • Work on P2P Block Producer release.

  • Carry on with publishing of io-sim on Hackage.

Jared Corduan

Ledger Quarterly Update

2022-09 - 2022-11-04

  • We finished a minimal ledger era capable of master key rotation. This will be re-purposed for our upcoming work.
  • We have the humble beginnings of a proper ledger API.
  • We improved the problematic cost model serialization (recall the song and dance about updating the cost model one epoch after the hard fork).
  • We have added benchmarks for problematic areas.
  • Massive repository restructure and cleanup.
    • Unified and consistent variable name schemes (not completely finished, but nearly there).
    • Massive reduction in type constraints, which were causing a lot of developer friction, in our code and also downstream.
    • More organized module structures.
    • Improved generators for our property tests.
    • We removed our dependency on cardano-prelude.
  • The formal ledger model has come a long way.
    • We created a fork of Agda that provides some meta-programming support for the ledger rules.
    • We have a large amount of the basic UTxO support in the model.
    • We can generate a good looking PDF from the model.
    • We can produce Haskell from the model.
    • We have a nice finite set theory library that we can use for many of the ledger rules.
    • We have nix support for the model.

Next steps

  • Individual tracking of deposits. [issue-3113]
  • Versioned CBOR encoders/decoders. [issue-3014]
  • New ledger era transaction body (and the surrounding work associated with it).
  • Designs for the next ledger era.

Jordan Millar

Node-Api-Cli Quarterly Update

2022-09 - 2022-11-04

  • Various improvements to tests/CI/GHC 9.2.4 preparations/upgrade to cabal-3.8.1.0
  • Major clean up of stale issues + PRs.
  • Implementation of stale-bot to mitigate against a proliferation of outdated issues and PRs
  • cardano-api refactoring with the aim of exposing more user friendly functions, particularly concerning transaction construction and querying the node.
  • cardano-cli refactoring with the aim of moving reusable functions to cardano-api. We have made strides here and have managed to improve the interface of transaction construction and validation.
  • General documentation updates and improvements
  • Addition of tx-mempool command which allows users to:
    • Query the node about the current mempool's capacity and sizes
    • Request the next transaction from the mempool's current list
    • Query if a particular transaction exists in the mempool
  • Initial refactoring of cardano-testnet

Next quarter

  • cardano-api
    • Working with Konstantinos and his team to make cardano-api better for dapp developers - we have a google doc for this, I can send it to you privately.
  • cardano-testnet
  • Serenity
    • Continued refactoring of cardano-api and cardano-cli, with the particular focus on extracting re-usable components of cardano-cli and moving them to cardano-api. This is harder to define but will manifest in stuff moving from cardano-cli to cardano-api and is tied in to the cardano-api work specified above.
  • General bug fixing and smaller feature requests for the api/cli that are always coming in. Robert is primarily handling this at the moment as he is relatively new.