· 3 min read
Alexey Kuleshevich

High level summary

Some of the most important and final Conway features were implemented since the last report:

  • Bootstrap phase is fully implemented
  • HardForkInitiation governance action will now correctly take us into the era that follows the Conway era.
  • DRep stake distribution now also includes the amount in the reward account and deposits that were left for the governance proposals.
  • CostModels updates for Plutus scripts were made more flexible, which will allow us to add new primitives for all Plutus versions starting with the Conway era.
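
To make the DRep stake distribution item above concrete, here is a minimal sketch of how a DRep's voting stake could be summed from the three sources mentioned: delegated UTxO stake, reward account balances, and deposits left for governance proposals. The dictionary-based model and all names are assumptions made for illustration, not the ledger's actual data structures.

```python
from collections import defaultdict


def drep_stake_distribution(delegations, utxo_stake, rewards, proposal_deposits):
    """Sum, per DRep, the stake of every credential delegated to it.

    All arguments are plain dicts keyed by stake credential (a hypothetical
    simplification). A credential missing from a dict contributes zero.
    """
    distribution = defaultdict(int)
    for credential, drep in delegations.items():
        distribution[drep] += (
            utxo_stake.get(credential, 0)           # stake held in UTxOs
            + rewards.get(credential, 0)            # reward account balance
            + proposal_deposits.get(credential, 0)  # deposits on proposals
        )
    return dict(distribution)
```

Credentials that are not delegated to any DRep are simply ignored in this sketch.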

As always, besides new features, we also wrote a lot of testing functionality. We now have our first fully functional conformance test for a GOV rule, with a few more in the works. We also made many improvements and bug fixes to the constraint-based generators. Last but not least, we made a major and long-awaited improvement to our CI setup that makes it much easier to spot failing tests and deal with potential flakiness.

Low level summary

Conway

  • pull-4275 - Restrict gov actions during bootstrap
  • pull-4253 - Hardfork Initiation into a new era
  • pull-4273 - DRepDistr: Iterate over the DRep delegations in UMap
  • pull-4309 - Add proposal deposits to DRep active voting stake.
  • pull-4284 - Flexible costmodel params
  • pull-4328 - Disable drep thresholds in bootstrap

Testing

  • pull-4295 - Improve generator in ImpTestsState
  • pull-4292 - constrained-generators: add genHint for maps
  • pull-4298 - constrained-generators: utility function for asserting over a reified value
  • pull-4300 - constrained-generators: hotfix of latest derp...
  • pull-4297 - constrained-generators: Fix ifElse dependencies
  • pull-4301 - constrained-generators: Add monitoring capability to get a handle on test case distribution
  • pull-4315 - constrained-generators: Improve error messages and make the tree generator reasonably sized
  • pull-4317 - constrained-generators: Fix bug in reifies
  • pull-4299 - Fix strange CI failure.
  • pull-4285 - Start Conway Imp tests with an initial committee and constitution
  • pull-4303 - Fix test caused by erroneous merge
  • pull-4310 - Fix OMap.assocList
  • pull-4268 - Enable conformance tests for GOV rule

Infrastructure and releasing

  • pull-4276 - Use a separate job for each test suite in GitHub CI
  • pull-4304 - Ensure the CI complete step fails when tests fail
  • pull-4308 - Add a CI status check to prevent merging PRs that contain merges
  • pull-4305 - Use the correct iohk action for installing Haskell in GitHub CI
  • pull-4322 - Bump jinja2 from 3.1.3 to 3.1.4 in /doc

· 5 min read
Michael Karg

High level summary

  • Benchmarking: We've performed and analysed benchmarks in the Conway era, with DReps injected.
  • Development: Tracing DRep data has been implemented; improved error reporting in tx-generator and analysis quick queries are ongoing work.
  • Workbench: We now fully support the new CLI create-testnet-data command and DRep injection into Conway genesis. Haskell profile definition work is ongoing.
  • Tracing: Various additions to Node metrics are being worked on, such as build info and block producer role. Metrics naming will be further harmonized.
  • UTxO Growth: We've finalized analysis and reports of all benchmarks targeting UTxO scaling scenarios.
  • UTxO-HD / LMDB: We've performed multiple runs benchmarking the LMDB (on-disk) backend of UTxO-HD.

Low level overview

Benchmarking

We've run and analyzed a full set of benchmarks comparing the Conway ledger against the Babbage one, on Node 8.10.1-pre. For Conway, our additional goal was to measure a vanilla ledger state against one with a large amount of DReps - and delegations to those DReps - present. The benchmarks used our existing value and Plutus workloads to remain comparable to each other.

Development

Additional ledger queries for the tracing system have been implemented and merged to master. Those capture the number of DReps, and the number of existing delegations to them, as trace output. This enables creating a metric on top of it, which can then be monitored.

The (in our case) non-deterministic nature of shutting down different cluster setups - both local and cloud-based - carries the possibility that our transaction generation service occasionally misclassifies a regular shutdown as an error. Furthermore, in the case of network malfunctions, the service's errors are too unspecific. By implementing thread labels for submission threads, corresponding to each submission target, and by adding custom smart signal handlers, we'll improve the generator's error reporting significantly.
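
The idea in the paragraph above can be sketched as follows: each submission thread is labelled after its target, and a signal handler records that a shutdown was requested, so a failure during a regular shutdown is not reported as an error. This is an illustration in Python under assumed names (`run_submissions`, `submission_worker`, the `submit:` label prefix); it is not the tx-generator's code, which is written in Haskell.

```python
import signal
import threading

# Set once a termination signal has been received.
shutdown_requested = threading.Event()


def handle_termination(signum, frame):
    """Signal handler: remember that the shutdown was asked for."""
    shutdown_requested.set()


def install_handlers():
    """Register the handler; must be called from the main thread."""
    signal.signal(signal.SIGTERM, handle_termination)
    signal.signal(signal.SIGINT, handle_termination)


def submission_worker(target, submit, results):
    """Run one submission and file the outcome under the thread's label."""
    label = threading.current_thread().name
    try:
        submit(target)
        results[label] = "ok"
    except Exception as exc:
        if shutdown_requested.is_set():
            # A failure during a requested shutdown is not a real error.
            results[label] = "stopped by regular shutdown"
        else:
            results[label] = f"error: {exc}"


def run_submissions(targets, submit):
    """Start one labelled thread per submission target and collect results."""
    results = {}
    threads = [
        threading.Thread(
            target=submission_worker,
            args=(target, submit, results),
            name=f"submit:{target}",  # label identifies the target
        )
        for target in targets
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Because every result is keyed by the thread label, a network failure can be attributed to one specific submission target instead of the service as a whole.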

The initial tests for quick queries are being developed further. We're moving towards a principled, and generalized, syntax that supports both prepared, parametrizable queries from the application code, as well as ad-hoc queries stated e.g. on the command line.

Workbench

The performance workbench now fully supports the new cardano-cli command create-testnet-data. We use it to inject both stake delegated to stake pools into genesis, and - recently added - stake delegated to DReps as well. It has proven very useful and versatile so far, and will eventually replace the current create-staked command.

Work on porting our performance workbench's profile definitions to Haskell, and providing them with an appropriate test suite, is still ongoing; currently, we're integrating all new profile families that came out of the UTxO growth scenarios.

Tracing

New metrics are being implemented for the tracing system. They will also be part of Prometheus output and as such accessible to monitoring services. There'll be cardano-node's detailed build info, as well as a node's block producer status, meaning the presence of forger credentials. Those new metrics are being backported to the legacy tracing system, too.
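
As a rough illustration of what such metrics could look like in Prometheus' text exposition format, the sketch below renders a build info metric and a block producer flag. The metric names `node_build_info` and `node_block_producer`, as well as the label keys passed in, are hypothetical; the names actually used by cardano-node may differ.

```python
def render_metrics(build_info, has_forger_credentials):
    """Render two gauges in Prometheus text exposition format.

    build_info is a dict of label names to values; label values are assumed
    to need no escaping in this sketch.
    """
    labels = ",".join(f'{key}="{value}"' for key, value in sorted(build_info.items()))
    lines = [
        "# TYPE node_build_info gauge",
        # Build details travel as labels; the sample value is a constant 1.
        f"node_build_info{{{labels}}} 1",
        "# TYPE node_block_producer gauge",
        # 1 if forger credentials are present, 0 otherwise.
        f"node_block_producer {1 if has_forger_credentials else 0}",
    ]
    return "\n".join(lines) + "\n"
```

Exposing build details as labels on a constant-valued gauge is a common Prometheus convention, since label values can be strings while sample values must be numeric.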

Furthermore, we've determined the need to revisit metrics naming. There's still a divergence between naming in the legacy and the new system. While this could be mitigated by passing in extra config options, we think that a transition to the new system should not impose any unnecessary effort for node operators. A design to fully harmonize the existing naming schemata is currently being set up.

UTxO Growth

The UTxO Growth benchmarking series has been finalized. We've finished analyses and reports for all scenarios that were tested and explored.

The overarching questions were, given a network of 32GB host systems, how large can the UTxO set grow in general, how large can it grow before the nodes have to operate close to the RAM limit over extended periods of time, and how does scaling the UTxO set size affect network metrics, such as block diffusion.

A dedicated "UTxO Scaling Squad" was set up to drive the entire process, and we enjoyed a very focused and productive collaboration with them.

UTxO-HD / LMDB

Last but not least, we were able to benchmark UTxO-HD's on-disk backend on a network of block producing nodes, on a recent 8.9.1 version of cardano-node. The setup allowed using a direct access SSD device for performance critical disk I/O, whereas the bulk of ChainDB and ledger snapshots remained on a standard AWS EBS volume.

The benchmarks comprised both optimistic and pessimistic RAM assumptions for the host OS to further optimize I/O via page cache, as well as medium and large UTxO set sizes - the latter almost tripling current mainnet's size. The results were promising; the LMDB backend has proven able to accommodate large UTxO sets using significantly less RAM than the default all-in-memory node - and with a more than reasonable trade-off performance-wise. Furthermore, when running with pessimistic assumptions, the performance impact on LMDB was only very moderate.

· 3 min read
Marcin Szamotulski

High-level overview of sprint 60

Edited on 8th of May: new EKG counters will be included in `cardano-node-8.9.3`; added links to the `cardano-node-8.9.3` PR and the `ouroboros-network-0.15` release.

Peer-Sharing Improvements

We continued working on improving peer sharing. As part of this work, light peer sharing (e.g. including inbound peers in the known set of the outbound governor) was restructured. Now, sending more peers than the peer-sharing client requested is a protocol error, and the connection will be terminated. This hasn't been a resource attack vector, since we always limited the number of peers taken by the outbound governor, and the number of peers has always been limited by the size of the mux ingress queue reserved for the peer-sharing mini-protocol. These changes will be released in cardano-node-8.9.3. See ouroboros-network#4868.
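
The rule described above can be sketched as a small check on the client side. This is a simplified illustration in Python with assumed names (`ProtocolError`, `check_share_reply`, `governor_limit`), not the ouroboros-network implementation.

```python
class ProtocolError(Exception):
    """Raised when the remote side violates the peer-sharing protocol."""


def check_share_reply(requested, reply, governor_limit):
    """Validate a peer-sharing reply and cap what the governor takes.

    requested      -- number of peers the client asked for
    reply          -- list of peer addresses sent by the remote side
    governor_limit -- upper bound on peers the outbound governor accepts
    """
    if len(reply) > requested:
        # Over-delivering is a protocol error; the caller is expected to
        # terminate the connection when this exception is raised.
        raise ProtocolError(
            f"received {len(reply)} peers but only {requested} were requested"
        )
    # Independently of the protocol check, never take more than the limit.
    return reply[:governor_limit]
```

For example, `check_share_reply(3, ["a", "b"], 10)` returns both peers, while a reply of four peers to a request for three raises `ProtocolError`.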

We also merged the work on outbound governor counters, which initially started as just an extension for peer-sharing counters but turned into a larger refactorisation. We announced it in the previous report. These changes will be included in 8.9.3. See ouroboros-network#4845, ouroboros-network#4861.

Light peer sharing (inbound peers) refactorisation allowed us to refactor the inbound governor loop: we restructured it so that the internal state is kept pure (and thus not shared with other threads), while the public part is computed incrementally (with good amortised costs and thus leading to good performance) and exposed to other components (e.g. the outbound-governor), see ouroboros-network#4871 (which is built on top of ouroboros-network#4868).

The PR [cardano-node#5831] integrates ouroboros-network-0.15 with the cardano-node-8.9.x branch. All included PRs / issues in ouroboros-network-0.15 are listed here.

Genesis

We implemented the API needed by the consensus layer for Genesis; see ouroboros-network#4815, ouroboros-network#4846.

We continued working on outbound governor changes to support Genesis:

Bootstrap Peers

Karl Knutsson ([CF]) found and fixed some problems related to big-ledger and public root peers. Here's an excerpt from the changelog file:

  • updated the big-ledger retry state in case of an exception;
  • reset public root retry state when transitioning between LedgerStateJudgements;
  • reduced public root retry timer;
  • don't classify a config file with public-root/bootstrap-peers IP addresses only as a DNS error. See ouroboros-network#4867.
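
The last item can be illustrated with a small sketch: literal IP addresses are separated from domain names up front, so a configuration containing only IP addresses never reaches the DNS lookup and therefore cannot produce a DNS error. The function names and the injected `lookup` callback are assumptions for illustration; this is not the ouroboros-network code.

```python
import ipaddress


def split_peer_addresses(addresses):
    """Separate literal IP addresses from domain names."""
    ips, domains = [], []
    for address in addresses:
        try:
            ips.append(ipaddress.ip_address(address))
        except ValueError:
            # Not a literal IPv4/IPv6 address, so treat it as a domain name.
            domains.append(address)
    return ips, domains


def resolve_public_roots(addresses, lookup):
    """Resolve configured peers; only domain names can yield DNS errors.

    lookup is a hypothetical callback mapping a domain name to a list of
    IP address strings, raising OSError when resolution fails.
    """
    ips, domains = split_peer_addresses(addresses)
    resolved = list(ips)  # literal IPs need no lookup
    dns_errors = []
    for domain in domains:
        try:
            resolved.extend(ipaddress.ip_address(a) for a in lookup(domain))
        except OSError as exc:
            dns_errors.append((domain, exc))
    return resolved, dns_errors
```

With an IP-only configuration, `lookup` is never called and `dns_errors` stays empty.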

Churn

We merged a refactorisation which synchronises churn with the outbound governor, see ouroboros-network#4617.

Minor Improvements

A few other minor improvements were merged:

Testing

We added the quickcheck-monoids package and also submitted an upstream patch to QuickCheck to include a version of the standard All / Any monoids, which are helpful when writing more complex properties. We will use quickcheck-monoids until the upstream PR is released. It will be available from CHaP. See quickcheck#397.
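
For readers unfamiliar with the All / Any monoids, the following Python analogue shows the idea on plain booleans: many small checks are combined into one result, with `True` as the identity for All and `False` as the identity for Any. It is not the quickcheck-monoids API, and the `mconcat` helper here is only an assumed stand-in for the Haskell function of the same name.

```python
from functools import reduce


class All:
    """Conjunction monoid: identity is True, combination is logical 'and'."""

    def __init__(self, value=True):
        self.value = bool(value)

    def __add__(self, other):
        return All(self.value and other.value)


class Any:
    """Disjunction monoid: identity is False, combination is logical 'or'."""

    def __init__(self, value=False):
        self.value = bool(value)

    def __add__(self, other):
        return Any(self.value or other.value)


def mconcat(monoid, items):
    """Fold items with '+', starting from the monoid's identity element."""
    return reduce(lambda left, right: left + right, items, monoid())
```

A property over a collection then becomes a single fold, for example `mconcat(All, (All(x > 0) for x in xs)).value`, which holds trivially for an empty collection.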

· One min read
Sebastian Nagel

High-level summary

This week, the Hydra team has been working on refactoring and detecting network protocol version mismatches. They have also merged the /commit endpoint changes including a follow-up fix about fee calculation. Besides this, they applied minor workflow fixes by adding docker images to nix checks and disabling mithril integration testing on preview (until mithril 2418 is released).

What did the team achieve this week

  • Refactor connectivity and detect network protocol version mismatches #1381
  • Merged and completed #1350, including a follow-up fix about fee calculation
  • Add docker images to nix checks
  • Disable mithril-client testing on Preview

What are the goals of next week

  • Restructure documentation, including a how-to about streaming plugins #1325
  • Add arm64 docker images as requested in #1404
  • Release 0.17.0

· One min read
Damian Nadales

High level summary

  • Reworked the argument for the different databases used in Consensus, in preparation for UTxO-HD (#1059).
  • Helped review the first Peras Innovation draft report.
  • Continued working on VRF restriction based on slot distance. The corresponding PR (#1047) went through its first round of reviews.
  • Provided support to the Networking team to review their work on querying big ledger peers (#1067).
  • Continued working on open-sourcing fs-api and fs-sim.
  • Performed other minor refactorings in the codebase (#1073 and #1070).