2 posts tagged with "performance-tracing"

· 4 min read
Michael Karg

2023-07 - 2023-09

Main achievements

  • Release benchmarking
  • Developing and running UTxO-HD benchmarks - in-memory flavour
  • P2P benchmarks, facilitating rollout
  • Production-readiness of the new Nomad cluster has been reached
  • Optimization of and introspection capability for the new tracing system
  • GHC9 performance investigation (and possible remedy)
  • Consensus QTAs: first real-world application of prototype

Release benchmarking

Ongoing release benchmarking is a crucial performance safeguard for cardano-node's release cycle. Throughout Q3, we performed and analyzed benchmarks for node versions 8.2.x to 8.5.

UTxO-HD benchmarks

Targeting a specific new feature in benchmarks requires development effort and fine-tuning the machinery. In Q3, we achieved that for the in-memory flavour of UTxO-HD, enabling benchmark delivery.

P2P benchmarks

In Q3, we performed additional P2P benchmarks to facilitate the comprehensive rollout of that feature.

New Nomad cluster

The new hardware cluster for benchmarks, which is controlled through the new Nomad backend, went through several rounds of validation and adjustment in Q3, and its integration with the rest of our pipeline was finalized. The confidence in metrics gathered on the cluster is now sufficient for us to consider it ready for production use.

New tracing system

Our new tracing system has received various rounds of optimization in Q3. We could verify in our benchmarks that it is roughly on par with the legacy system while offering a richer feature set and greater flexibility.

Additionally, in Q3 we equipped the system with an introspection capability. This is now used for generating end-user documentation that stays in sync with the definitions in code, and for automated consistency checking of the entire system.
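To illustrate the general idea of documentation that stays in sync with code, here is a minimal Python sketch. It is not the tracing system's actual API: the `TraceDoc` record, the example namespaces, the severity names and both helper functions are hypothetical and chosen for illustration only. The point is that a single list of definitions feeds both the generated documentation and the consistency check.

```python
from dataclasses import dataclass


# Hypothetical record describing one trace message (assumption, not the real API).
@dataclass(frozen=True)
class TraceDoc:
    namespace: str     # hierarchical message name (illustrative)
    severity: str      # default severity (illustrative)
    description: str   # human-readable explanation


# Single source of truth: documentation and checks are both derived from this list.
DEFINITIONS = [
    TraceDoc("Example.Storage.BlockAdded", "Info", "A block was added to storage."),
    TraceDoc("Example.Forge.LoopStarted", "Debug", "The forging loop started."),
]

# Assumed set of severities for this sketch.
VALID_SEVERITIES = {"Debug", "Info", "Notice", "Warning", "Error"}


def render_docs(defs):
    """Generate end-user documentation directly from the definitions."""
    ordered = sorted(defs, key=lambda d: d.namespace)
    return "\n".join(f"{d.namespace} [{d.severity}]: {d.description}" for d in ordered)


def check_consistency(defs):
    """Return a list of problems found in the definitions (empty if consistent)."""
    problems = []
    seen = set()
    for d in defs:
        if d.namespace in seen:
            problems.append(f"duplicate namespace: {d.namespace}")
        seen.add(d.namespace)
        if d.severity not in VALID_SEVERITIES:
            problems.append(f"unknown severity for {d.namespace}: {d.severity}")
        if not d.description.strip():
            problems.append(f"missing description: {d.namespace}")
    return problems


if __name__ == "__main__":
    print(render_docs(DEFINITIONS))
    print(check_consistency(DEFINITIONS))
```

Because the documentation is rendered from the same data the checks run on, adding or renaming a definition cannot leave the documentation stale in this sketch.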

GHC9 performance

In Q3, a joint investigation with DevX into GHC9's behaviour revealed where and how GHC9 misses opportunities for optimizing generated code. This led to an approach of annotating our codebase to re-enable those optimizations; that approach is still being validated.

Consensus QTAs

In collaboration with Consensus and DevX, we advanced the Consensus QTAs prototype, which captures the performance characteristics of ledger operations. It is now being applied to a real-world task: gathering evidence of the effect of the aforementioned code changes that are meant to allow for performant GHC9 builds.

Next steps

Benchmarking:

In Q4, the focus will be on:

  • facilitating the next mainnet release
  • benchmarking runs in the Conway era
  • developing benchmarks / workloads for Conway-exclusive actions
  • implementing a specialized benchmark setup for the UTxO-HD on-disk variant
  • developing new Plutus benchmarks to safeguard Plutus V3
  • benchmarks regarding the rollout of P2P

Performance

For certain blocking performance issues, we've located the cause, or even found a solution, in a cross-team effort. In Q4 we'll advance that work to keep the ongoing release cycle for mainnet on track, and to make GHC9 a viable release platform.

New tracing system

For the new tracing system, we'll finalize optimization - current results are already on par with the legacy system. Furthermore, we will finish comprehensive documentation, as well as a description of a recommended setup for which we can provide initial support.

UTxO-HD monitoring

We'll augment our analysis pipeline so it can process monitoring data from UTxO-HD nodes connected to mainnet in a meaningful way.

Nomad backend

From Q4 on, this backend will be in production use. We plan on adding various UX and flexibility improvements, and on further fine-tuning some profiles for Nomad.

Workbench

We will prepare for a future move of our performance workbench into a separate project. This entails restructuring, refactoring and reimplementing a few components that currently assume they are always in sync with cardano-node.

Consensus component QTAs (co-development)

In Q4 there will be ongoing work with and support for the existing prototype. We plan to identify a fixed set of input data that yields results of high informative value, and to formalize the process to a point that enables future automation.

· 3 min read
Michael Karg

2023-10 - 2024-01

Main achievements

  • Release benchmarking, leading up to next mainnet release
  • Conway benchmarking of existing Babbage workloads
  • P2P benchmarks, validating viability as default topology
  • Added basic PlutusV3 capability to our tooling
  • Publication of benchmarking reports accompanying a mainnet release
  • GHC9 performance investigation
  • Finalized and validated all optimizations for the new tracing system
  • New Nomad benchmarking cluster: production use
  • Adjustment of our infrastructure to cover the migration to IntersectMBO
  • Consensus QTAs: prototype developed into alpha-stage benchmark
  • Successful on-boarding of a new team member

Release benchmarking

Throughout Q4, we've performed and analyzed benchmarks for node versions 8.6.x to 8.7.3, the latter of which is projected to be the next mainnet release. Along the way, we have identified, located and handled all performance blockers.

Additionally, we've started publishing benchmarking reports here on Cardano Updates. The format is meant to increase transparency and provide insight into those measurements that accompany mainnet releases - demonstrating the absence of performance regressions and development of specific metrics over time.
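As a rough illustration of what "absence of performance regressions" can mean for a single metric, the following Python sketch compares a candidate run against a baseline using a relative tolerance. It is not our analysis pipeline: the function names, the 5% tolerance and the sample values are assumptions made up for this example.

```python
from statistics import mean


def relative_change(baseline, candidate):
    """Relative change of the candidate's mean versus the baseline's mean."""
    base = mean(baseline)
    return (mean(candidate) - base) / base


def is_regression(baseline, candidate, tolerance=0.05, higher_is_worse=True):
    """Flag a regression if the metric moved in the bad direction by more than `tolerance`.

    `tolerance` is an assumed threshold (5% by default), not a value from the reports.
    """
    change = relative_change(baseline, candidate)
    if higher_is_worse:
        return change > tolerance
    return change < -tolerance


if __name__ == "__main__":
    # Made-up sample values for a hypothetical latency-like metric, in seconds.
    baseline_run = [1.00, 1.02, 0.98]
    candidate_ok = [1.01, 1.03, 0.99]
    candidate_slow = [1.10, 1.12, 1.08]

    print(is_regression(baseline_run, candidate_ok))
    print(is_regression(baseline_run, candidate_slow))
```

A real comparison would also have to account for run-to-run variance; this sketch only compares means.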

Conway benchmarks

Furthermore, we've done the first-ever benchmarking of the Conway ledger. To that end, we've ported our Babbage workloads to Conway for immediate comparability. Fortunately, we have not found any performance regression in the Conway ledger.

P2P benchmarks

In Q4, we've validated P2P topology to be viable as default for both relay and block producer nodes. As a consequence, we've switched to P2P topology for benchmarking baselines ourselves.

GHC9 performance

In Q4, we completed the evaluation of GHC9.2's and GHC9.6's optimizers in the context of the Cardano code base. Ultimately, GHC9.6 has proven to be much more suitable from a performance perspective. We're convinced that with a few select annotations in the code, GHC9.6's optimizer can produce a result on par performance-wise with GHC8.10, which simply was a great release in that regard. With GHC9.2, unfortunately, the changes would have to be more invasive, and thus more time-consuming.

New Nomad cluster

We’ve moved the new Nomad cluster into production use and established new baselines for each workload on it. Additionally, we’ve shut down the legacy cardano-ops benchmarking cluster, and archived all raw data from it.

Consensus component QTAs

We’ve developed the existing prototype into an automatable, self-contained benchmark called beacon, and systematized workloads and run structure for it. Moreover, we’ve demonstrated the usefulness and reproducibility of the metrics, and identified domains viable for QTAs with system-level benchmarks.

New team member

We're happy to welcome a new joiner to our team! We've successfully onboarded him in Q4; he has taken over the cardano-tracer service - the node-external component of the new tracing system - and has already landed several valuable contributions.