
2 min read
Jean-Philippe Raynaud

High level overview

This week, the Mithril team released a new version of the Protocol Insights dashboard. They also completed implementing the new status route in the aggregator and upgraded the explorer to display its information. Additionally, the team completed the refactoring of the beacon used to snapshot the Cardano database and started working on the activation of the Pythagoras Mithril era on the pre-release-preview network.

Finally, they worked on removing the legacy store adapters in the signer and aggregator and explored solutions for signer registration when multiple aggregators are running on a Mithril network.

Low level overview

  • Published a dev blog post about the new Protocol Insights dashboard
  • Completed the issue Create a new /status route in aggregator #2071
  • Completed the issue Remove network field from CardanoDbBeacon #1957
  • Completed the issue Refactor pruning with upkeep service in signer/aggregator #2075
  • Completed the issue Implement the new metrics in the Mithril Protocol Insights dashboard #2076
  • Completed the issue Add command to create Genesis keypair in aggregator #2074
  • Completed the issue testing-preview and testing-sanchonet aggregators panic with FOREIGN KEY constraint failed error #2120
  • Completed the issue Display aggregator status information in explorer #2073
  • Completed the issue Failures of some STM property based tests #2109
  • Worked on the issue Make client WASM npm package compatible with NodeJS #2091
  • Worked on the issue Get rid of store adapters in signer and aggregator #2118
  • Worked on the issue Activate Pythagoras Mithril era #2034
  • Worked on the issue Schedule nightly builds with a workflow dispatcher #2092
  • Worked on the issue Explore Signer Registration Solutions #2029

4 min read
Michael Karg

High level summary

  • Benchmarking: Further Governance action / voting benchmarks on Node 10.0.
  • Development: New prototype for a database-backed persistence layer in our analysis tool locli.
  • Workbench: More fine-grained genesis caching; export cluster topology for Leios simulation.
  • Tracing: Final round of metrics alignment complete; prepared for typed-protocols-0.3 bump; new tracing system rollout starting with Node 10.2.

Low level overview

Benchmarking

We've been working on improving the voting workload for benchmarks along two axes: first, reducing the (slight) overhead that decentralized vote submission induces; second, introducing a scaling parameter, namely the number of votes submitted per transaction, and hence the number of proposals considered simultaneously for tallying and ratification. Along the way, we improved the timing of submissions, as mistimed submissions had caused benchmarks to abort mid-run every now and then: in those cases, a newly created UTxO entry simply hadn't settled across the cluster yet when it was due to be reused for consumption.
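To make the new scaling knob concrete, here's a minimal sketch using hypothetical types (nothing below is cardano-api or the actual workload code): packing one vote per proposal into transaction-sized batches means the votes-per-transaction parameter directly bounds how many proposals are in flight for tallying at once.

```haskell
-- Illustrative only: hypothetical types, not the actual benchmark workload.
newtype ProposalId = ProposalId Int deriving Show
newtype Vote       = Vote ProposalId deriving Show

-- Pack one vote per proposal into transaction-sized batches;
-- 'votesPerTx' is the scaling parameter described above.
batchVotes :: Int -> [ProposalId] -> [[Vote]]
batchVotes votesPerTx props = chunk (map Vote props)
  where
    chunk [] = []
    chunk xs = let (batch, rest) = splitAt votesPerTx xs in batch : chunk rest

main :: IO ()
main = mapM_ print (batchVotes 3 (map ProposalId [1 .. 10]))
```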

Scaling of the voting workload is currently under analysis.

Development

Our analysis and reporting tool, locli ("LogObject CLI"), has a few drawbacks as far as system resource usage goes: it requires a huge amount of RAM, and initialization (i.e. loading and parsing trace output) is quite slow. Moreover, there is no intermediate, potentially exposable or queryable, representation of the data besides the trace messages themselves.

We're working on a prototype that introduces a database persistence layer as that intermediate representation. Not only does that open up raw benchmarking data to other means of querying or processing outside locli; initializing the tool from the database has also been shown to require much less RAM and to shorten the initialization phase. Furthermore, the on-disk representation is much more efficient that way, which is no small benefit when raw benchmarking data for a single run can occupy north of 64 GiB.

The prototype has yet to be fully integrated into the analysis pipeline for validation; initial observations, however, are promising.
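For illustration only, here's a minimal sketch of such an intermediate representation, assuming an SQLite file accessed via sqlite-simple; the prototype's actual schema and database backend may well differ.

```haskell
{-# LANGUAGE OverloadedStrings #-}

-- Minimal sketch, assuming an SQLite backend via sqlite-simple;
-- not the actual locli prototype.
import Database.SQLite.Simple

main :: IO ()
main = do
  conn <- open "bench-run.db"
  -- One row per parsed trace message: a queryable intermediate form.
  execute_ conn
    "CREATE TABLE IF NOT EXISTS trace_event (at TEXT, ns TEXT, payload TEXT)"
  execute conn
    "INSERT INTO trace_event (at, ns, payload) VALUES (?, ?, ?)"
    ( "2024-11-18T12:00:00Z"      :: String
    , "Forge.Loop.AdoptedBlock"   :: String
    , "{}"                        :: String )
  -- Raw benchmarking data is now open to querying outside locli itself.
  counts <- query_ conn
    "SELECT ns, COUNT(*) FROM trace_event GROUP BY ns" :: IO [(String, Int)]
  mapM_ print counts
  close conn
```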

Workbench

For our benchmarks, we rely on staked geneses, as the cluster needs to control all stake, and as such, block production, to yield meaningful performance metrics. As creating a staked genesis of that extent is an expensive operation, we use a caching mechanism for those. Small changes in the benchmarking profile, such as protocol version or parameters, Plutus cost models or execution budgets, would usually trigger the creation of a new cache entry. We've now factored out of cache entry resolution all those variables that do not impact staking itself, and created a mechanism to patch those changes into genesis files after cache retrieval, when preparing them for a benchmarking run. This adds flexibility for creating profiles and reduces the time to deploy a run to the cluster.
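As a small sketch of the idea, with hypothetical profile fields: only staking-relevant variables feed the cache key, while everything else is patched into the genesis files after the cached entry has been retrieved.

```haskell
-- Illustrative sketch with hypothetical profile fields, not the actual
-- workbench code.
data Profile = Profile
  { nPools          :: Int         -- impacts staking -> part of the cache key
  , nDelegators     :: Int         -- impacts staking -> part of the cache key
  , protocolVersion :: (Int, Int)  -- patched in after cache retrieval
  , plutusBudgets   :: String      -- patched in after cache retrieval
  }

cacheKey :: Profile -> String
cacheKey p = show (nPools p, nDelegators p)

main :: IO ()
main = do
  let a = Profile 52 10000 (9, 0)  "v2-default"
      b = Profile 52 10000 (10, 0) "v3-experimental"
  -- Same key: b reuses a's cached genesis; version and budgets get patched.
  print (cacheKey a == cacheKey b)
```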

We also delivered a comprehensive description of our cluster to the Leios innovation team. This includes the definition of our artificially constrained topology, as well as a latency matrix for node connections in that topology, assigning a weight to all edges in the graph. The Leios team intends to use that material to implement a large-scale simulation of the algorithm, and thus gain representative timings for diffusion and propagation.
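For illustration, a hypothetical shape for that export: the edges of the artificially constrained topology, each carrying a latency weight (all names and numbers below are made up).

```haskell
-- Hypothetical export format: weighted edges of the cluster topology.
type NodeId = String

data Edge = Edge
  { from      :: NodeId
  , to        :: NodeId
  , latencyMs :: Double  -- weight assigned to this edge of the graph
  } deriving Show

clusterTopology :: [Edge]
clusterTopology =
  [ Edge "eu-central-1-node-00" "us-east-1-node-12"      88.0
  , Edge "eu-central-1-node-00" "ap-southeast-2-node-31" 255.0
  ]

main :: IO ()
main = mapM_ print clusterTopology
```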

Tracing

The alignment of metrics names between the legacy and the new tracing system is now complete, which should minimize the migration effort for the community's existing dashboards. The only differences that remain are motivated by increased compliance with existing standards such as OpenMetrics. Furthermore, a few metrics that were still missing in the new system have now been ported over, such as node.start.time or served.block.latest.

We're all set for the expected bump to typed-protocols-0.3: both forwarder packages for the new tracing system, trace-forward and ekg-forward, have been adapted to the new API and are passing all tests.

Last but not least, we've settled on a rollout plan for the new tracing system. The new system is set to become the default with the upcoming Node release 10.2. This is achieved by a change of configuration only; there is no need for different Node builds. The cardano-node binary will contain both tracing systems for a considerable grace period of 3 to 6 months after release. This should give the community ample time to adjust for necessary changes in downstream services or dashboards that consume trace or metrics output.
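As a sketch of what that configuration-only switch can look like, assuming the UseTraceDispatcher flag the node configuration currently exposes (the exact key and its default in 10.2 may differ):

```yaml
# Hypothetical excerpt of a cardano-node configuration file: switching
# tracing systems is a configuration change, not a different build.
UseTraceDispatcher: true   # true = new tracing system, false = legacy
```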

We'll provide a comprehensive hands-on migration guide summarizing those changes for the user.

2 min read
John Lotoski

High level summary

The SRE team continues work on Cardano environment improvements and general maintenance.

Some notable recent changes, updates or improvements include:

  • Cardano-node release 10.1.2 was deployed to all environments

  • Dbsync release 13.6.0.1 was deployed to all environments

  • Just recipe query-gov-action-status was added to aid in live voting analysis of governance actions

  • The mainnet bootstrap cluster was scaled temporarily to accommodate a significant increase in client load which developed during the past week

  • With scheduled end of year vacation time and holidays starting, the cadence of work is expected to slow a bit in the following few node SRE updates

Repository Work

Capkgs

Cardano-parts

  • Sets cardano-node to 10.1.2, dbsync to 13.6.0.1, mithril to v2445.0, faucet to 10.1. Governance recipes were moved to their own governance recipe file and a query-gov-action-status recipe for live vote analysis was added. New tracing system module improvements were made to prevent unexpected metrics export stoppage along with other miscellaneous improvements. More detail is available in the release notes: cardano-parts-release-v2024-11-18

Cardano-playground

  • Sets cardano-node to 10.1.2, dbsync to 13.6.0.1, mithril to v2445.0, faucet to 10.1. Governance recipes were moved to their own governance recipe file and a query-gov-action-status recipe for live vote analysis was added. KES rotations were done for multiple environments. More detail is available in the PR description: cardano-playground-pull-36

Cardano-mainnet

  • Sets cardano-node to 10.1.2, dbsync to 13.6.0.1, mithril to v2445.0. Governance recipes were moved to their own governance recipe file and a query-gov-action-status recipe for live vote analysis was added. Bootstrap threshold alerts were adjusted and blockperf was added to temporary bootstrap scaling machines. More detail is available in the PR description: cardano-mainnet-pull-26

One min read
Noon van der Silk

High-level summary

These last few weeks have been focused on incremental commits, re-writing more validators in Aiken, and the associated changes that have come about as our script sizes increase. We continue to prioritise incremental commits and a 0.20.0 release, as well as some repository cleanup and additional functionality based on user requests.

What did the team achieve?

  • Benchmarked memory limits on number of Txns #1724
  • Re-wrote Initial validators script to Aiken #1734
  • Bump to PlutusV3 #1734
  • Continued progress on incremental commits #199

What's next?

  • Move hydra-explorer out of the mono-repo #1716
  • Add ability to filter the API by UTxO address #1719
  • Continued work on incremental commits #199
  • Investigate options for customised ledger in a Hydra Head #1727
  • Plan the 0.20.0 release
  • Continue to support Hydra Doom

One min read
Damian Nadales

High level summary

  • Reviewed the UTxO HD PR and started addressing review comments.
  • Engaged with researchers to discuss the HFC simplification proposal.
  • Reverted the Babbage->Conway era transition workaround, clarifying the semantics around stake from pointer addresses (see #1297).
  • Well-Typed worked on two features for lsm-tree:
    • snapshots (for persisting ledger snapshots)
    • table union (for storing more parts of the ledger state on disk)
  • Addressed minor tech debt issues (#1269).