
44 posts tagged with "performance-tracing"


· 4 min read
Michael Karg

High level summary

  • Benchmarking: Release benchmarks for Node 10.1.4; performance evaluation of ledger metrics trace location.
  • Development: Database-backed quick queries for locli analysis tool.
  • Infrastructure: Voting workload definition merged to master, work on Haskell profile definition now continues.
  • Tracing: C library for trace forwarding and its documentation ongoing; improved fallback configs.
  • Community: new Discord channel #tracing-monitoring supporting new tracing system rollout.

Low level overview

Benchmarking

We've run and analyzed a full set of release benchmarks for Node version 10.1.4. We could not observe any performance risks, and expect network performance to very closely match that of previous 10.1.x releases.

Furthermore, we've been investigating the location on the 'hot code path' where metrics from the ledger are traced - such as UTxO set size or delegation map size. This currently happens at slot start, when the block forging loop is kicked off. We aim to decouple emitting those traces from the forging loop and instead move them to a separate thread. This thread could potentially wake up after a pre-defined time has passed, e.g. 2/3 of a slot's duration. That would ensure that getting those values out of the ledger does not occur simultaneously with block production proper.
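
As a rough illustration, such a decoupled metrics thread could look like the following minimal Haskell sketch - all names and types are hypothetical, not the actual cardano-node API:

```haskell
import Control.Concurrent (forkIO, threadDelay)
import Control.Monad (forever, void)

-- Hypothetical record standing in for the values read from the ledger.
data LedgerMetrics = LedgerMetrics
  { utxoSetSize       :: Int
  , delegationMapSize :: Int
  }

-- | Spawn a thread that wakes up a fixed fraction into each slot and
-- traces ledger metrics, decoupled from the block forging loop.
startLedgerMetricsThread
  :: Int                        -- ^ slot duration in microseconds
  -> IO LedgerMetrics           -- ^ query the ledger (UTxO set size, ...)
  -> (LedgerMetrics -> IO ())   -- ^ emit the trace message / metrics
  -> IO ()
startLedgerMetricsThread slotDuration queryLedger emit =
  void $ forkIO $ forever $ do
    -- wake up at roughly 2/3 of the slot, clear of block production
    threadDelay (slotDuration * 2 `div` 3)
    queryLedger >>= emit
```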

Moreover, as a new feature, it would enable tracing those metrics on nodes that do not run a forging loop themselves. And last but not least, it would pave the way for providing additional metrics at the new location - like DRep count, or DRep delegations - without negatively affecting performance. Initial prototyping has yielded promising results so far.

Development

Development has commenced on parametrizable quick queries, a new feature of our analysis tool locli. To be efficient, they rely on the new database storage backend for raw benchmarking data. These quick queries are based on a filter-reduce framework with composable reducers, which provides a clean way to extract very specific data points or correlations from the raw benchmarking data.
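
To illustrate the concept, here is a minimal, hypothetical sketch of the filter-reduce idea - not locli's actual API:

```haskell
-- A reducer folds raw data points into a result; reducers compose.
data Reducer a b = Reducer
  { initial :: b
  , step    :: b -> a -> b
  }

-- | Run two reductions in one pass over the same filtered stream.
both :: Reducer a b -> Reducer a c -> Reducer a (b, c)
both r1 r2 = Reducer (initial r1, initial r2)
                     (\(x, y) a -> (step r1 x a, step r2 y a))

-- | A quick query = filter the raw data, then reduce what's left.
quickQuery :: (a -> Bool) -> Reducer a b -> [a] -> b
quickQuery keep r = foldl (step r) (initial r) . filter keep
```

A query for, say, the mean block size within a given slot range would then combine a slot-range filter with a count reducer and a size-sum reducer.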

The quick query feature also incorporates ad-hoc plotting of query results, and will in the future support exporting results to exchange formats like CSV or JSON.

Infrastructure

The voting workload definition has been cleanly integrated with the workbench. This also includes an abstract definition of concurrent workloads - which was previously unnecessary, as exactly one workload used to be handled by exactly one service. The integration, along with the added flexibility, has been merged to master.

We're now actively working again on the Haskell definition of benchmarking workloads, including a test suite. Most of this improvement had already been done; it still needs final realignment with the current state of all existing workloads. It will allow us to replace hard-to-maintain, large jq definitions with concise, testable code - and recursive shell script invocations with a single pass through a well-defined command line interface.
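
For a flavour of what this buys us, a workload expressed as plain Haskell data could look like the following sketch - all field names are invented for the example; the real profile definition is richer:

```haskell
-- Hypothetical, simplified workload definition.
data Workload = Workload
  { wlName    :: String
  , wlTps     :: Double          -- submission rate in transactions/s
  , wlTxCount :: Int             -- total transactions to submit
  , wlScript  :: Maybe FilePath  -- Plutus script to exercise, if any
  } deriving (Show, Eq)

-- The "value-only" release workload as plain data (tx count invented):
valueOnly :: Workload
valueOnly = Workload
  { wlName    = "value-only"
  , wlTps     = 12.0
  , wlTxCount = 100000
  , wlScript  = Nothing
  }
```

Plain data values like these are trivially unit-testable - something much harder to achieve with nested jq programs.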

Tracing

Good progress has been made on the small, self-contained C library that implements trace forwarding. It will allow processes in any language that can call into C via a foreign function interface to use cardano-tracer as a target for forwarding traces and metrics. The initial prototype has already evolved into a library design, which aims to offer the host application a simple way to encode to Cardano's schema of trace messages - and to use its forwarding protocol asynchronously, so as to minimize interruption of the application's native control flow.

In preparation for the new tracing system's release, we've also revisited the fallback configuration values the system will use if it is accidentally misconfigured by the user. The forwarder component uses a bounded queue as buffer for trace output, to compensate for a possibly unreliable connection to cardano-tracer. The fallback bounds were originally chosen to preserve trace output at all costs - at, as it turned out, too high a memory cost if trace forwarding does not happen at all due to a faulty configuration. We've adjusted this and other fallback values to sensible defaults that guarantee a functional system even in the case of configuration errors.
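
The following sketch captures the underlying trade-off, assuming a simple bounded STM queue - the forwarder's actual implementation differs:

```haskell
import Control.Concurrent.STM
import Numeric.Natural (Natural)

-- Create the forwarding buffer with an explicit bound: memory use is
-- capped even if no cardano-tracer ever connects.
newForwardQueue :: Natural -> IO (TBQueue msg)
newForwardQueue = newTBQueueIO

-- Once the queue is full, drop the oldest message in favour of the
-- newest, instead of blocking or growing without bound.
enqueueTrace :: TBQueue msg -> msg -> STM ()
enqueueTrace q msg = do
  full <- isFullTBQueue q
  if full
    then readTBQueue q >> writeTBQueue q msg
    else writeTBQueue q msg
```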

Community

Our team will host a new channel #tracing-monitoring on IOG's Technical Community Discord server. The migration to the new tracing system might affect existing automations built by the community, and existing configurations may need adjusting to achieve the intended outcome. In the channel, we'll offer support for the community in all those regards, and answer more general questions about the Node's tracing systems.

Additionally, we're currently releasing our documentation improvements to the excellent Cardano Developer Portal, linked below (links on the website may not have been updated yet).

· 5 min read
Michael Karg

High level summary

  • Benchmarking: Finalized voting benchmarks on Node 10.0; workload implementation being generalized to be mergeable.
  • Development: Database backend for our analysis tool locli merged; several metrics improvements with new tracing.
  • Tracing: C library for trace forwarding started; documentation improved; timing issue in forwarder fixed.

Low level overview

Benchmarking

The voting benchmarks have now finished. The exact implementation of how the voting workload is set up and submitted has been finalized and is currently being prepared for merging into master. This will add those benchmarks to the repertoire we can run on any future node version, and track potential performance changes of voting over time.

The setup allows us to add voting as an additional workload on top of the existing release benchmarking workloads - typically "value-only" and "Plutus loop". The value workload operates at 12 TPS and always results in full blocks; we can draw a straight-line comparison when a certain, constant percentage of each block is filled with vote transactions. The Plutus workload, however, is throttled by exhausting the block execution budget, and not so much by transaction size and TPS - contrary to voting submissions. This results in a large variance in the block sizes the network produces, and restricting analysis to blocks that are actually comparable to each other greatly reduces the sample size.

This means that in practice, we've found "voting on top of value-only" to represent the performance implications of voting most accurately. This workload will serve as a base for comparison over time, and will be run selectively on new versions, whenever the proposal / voting feature of the Conway ledger is touched.

As a conclusion to those benchmarks we've ascertained that:

  1. there is a performance cost to voting, vote tallying and proposal enactment
  2. on the system level, this cost is very reasonable and poses no operational risk
  3. on the service level, processing an individual voting transaction is even slightly cheaper performance-wise than a transaction consuming and creating multiple UTxO entries

Development

The analysis and reporting tool, locli ("LogObject CLI") now comes equipped with a database-backed persistence layer. This new backend has been validated by using it to re-analyse past benchmarks. Performance workbench integration has been completed, and by means of a new envvar this backend can be enabled for use in automations. It currently co-exists in locli with the default file system based persistence backend.

Apart from opening up raw benchmarking data to the full power of SQL queries, or quickly marshalling it into another format to feed into other applications, the new storage backend has considerable advantages regarding execution speed and resource usage. It both requires less RAM (around 30% less) during runtime, and less disk space - about 90% less! Standard analysis of a cluster run can now be performed in less than an hour, whereas it took around 2h before.
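
Purely as an illustration - assuming, say, an SQLite database and a hypothetical slot_stats table, neither of which is confirmed by the actual backend - raw run data then becomes directly queryable:

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Database.SQLite.Simple (open, query_, Only (..))

main :: IO ()
main = do
  conn <- open "run.sqlite3"  -- hypothetical per-run database file
  -- e.g. mean chain density past slot 1000; table and column names
  -- are invented for this example
  [Only density] <- query_ conn
    "SELECT avg(chain_density) FROM slot_stats WHERE slot >= 1000"
  print (density :: Double)
```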

Currently, we're working on implementing parametrizable quick queries of that raw data - complete with adding plotting capabilities to locli. The queries are meant to easily extract and visualize very specific correlations that are not part of standard analysis, catering to the oftentimes investigative nature of performance analysis.

Furthermore, the new tracing system now provides direct insight into the chain tip's hash, exposing tipBlockHash, tipBlockParentHash and tipBlockIssuerVerificationKeyHash both as trace messages and metrics. Additionally, we've merged a fix for issue cardano-node#5751: the metric forging_enabled now correctly takes the CLI option --non-producing-node into account.

Tracing

The new tracing system allows for trace and metrics forwarding from some process to cardano-tracer. For any Haskell application, the forwarder package can easily be included as a library. For applications written in other programming languages, we've decided that a small, self-contained C library handling the forwarding is a viable way to provide this functionality to a much wider range of ecosystems. The C library will implement the protocol handshake and possibly muxing, the forwarder protocol messages in use, and the CBOR-based encodings of trace messages and metrics - all of which currently exist in Haskell only. We've just started the prototype.
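
As a hypothetical illustration of the encoding side that exists in Haskell today (using the cborg package; the field names "ns", "at" and "data" are invented here, the real schema is defined by the tracing packages):

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Codec.CBOR.Encoding (Encoding, encodeMapLen, encodeString, encodeWord64)
import qualified Codec.CBOR.Write as CBOR
import qualified Data.ByteString.Lazy as LBS
import Data.Text (Text)
import Data.Word (Word64)

-- Encode a trace message as a CBOR map of namespace, timestamp and body.
encodeTraceMessage :: Text -> Word64 -> Text -> Encoding
encodeTraceMessage namespace timestamp body =
     encodeMapLen 3
  <> encodeString "ns"   <> encodeString namespace
  <> encodeString "at"   <> encodeWord64 timestamp
  <> encodeString "data" <> encodeString body

-- The C library would have to produce equivalent CBOR for the same
-- logical message.
serialize :: Encoding -> LBS.ByteString
serialize = CBOR.toLazyByteString
```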

We've been working hard on updating and improving the documentation for the new tracing system on https://developers.cardano.org (not merged yet). The aim was to provide a quick start guide to "just get it set up and running", without presupposing any knowledge of tracing, or Haskell. Moreover, for users coming from the legacy tracing system, we wanted to highlight the key differences between systems - and possibly different assumptions when operating them.

Last but not least, we caught a very interesting timing issue in the forwarder: each service connected to cardano-tracer bears both an internal and an external name for the connection (both unique), where the external name is chosen by the service itself. Upon forwarder initialization, so-called data points are set up within the service, into which data can then be traced (such as that external name), and which are actively polled / queried by cardano-tracer. As these are all concurrent events, the external name wasn't yet available in the data point if initialization of forwarding happened "too fast". Once located, the issue was trivial to fix by enforcing a relative ordering of the concurrent events during initialization only.
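
A simplified model of the race and its fix - not the actual forwarder code - could rest on an MVar, whose read naturally blocks until initialization has published the value:

```haskell
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, readMVar)

-- An empty MVar models a data point whose value hasn't been published yet.
newtype DataPoint a = DataPoint (MVar a)

newDataPoint :: IO (DataPoint a)
newDataPoint = DataPoint <$> newEmptyMVar

-- Called once during forwarder initialization.
publish :: DataPoint a -> a -> IO ()
publish (DataPoint v) = putMVar v

-- Polled by cardano-tracer: blocks until the value has been published,
-- so a query can never observe a missing external name.
queryDataPoint :: DataPoint a -> IO a
queryDataPoint (DataPoint v) = readMVar v
```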

Happy New Year!

It's been an amazing year for the Performance & Tracing team. We're proud to have contributed to Cardano's transition into the age of Voltaire, and reliably safeguarded performance of the Cardano network - and to have finalized our new tracing system. A huge thanks to all those who've been helpful, supportive - and who've presented us with new ideas and challenges.

Have a Happy New Year 2025!

· 4 min read
Michael Karg

High level summary

  • Benchmarking: Further Governance action / voting benchmarks on Node 10.0.
  • Development: New prototype for a database-backed persistence layer in our analysis tool locli.
  • Workbench: More fine-grained genesis caching; export cluster topology for Leios simulation.
  • Tracing: Final round of metrics alignment complete; prepared for typed-protocols-0.3 bump; new tracing system rollout starting with Node 10.2.

Low level overview

Benchmarking

We've been working on improving the voting workload for benchmarks along two axes: firstly, reducing the (slight) overhead that decentralized vote submission induces; secondly, introducing a scaling parameter - namely the number of votes submitted per transaction, and hence the number of proposals to be considered simultaneously for tallying and ratification. Along the way, we improved the timing of submissions, as it had caused benchmarks to abort mid-run every now and then: in those cases, a newly created UTxO entry just hadn't settled across the cluster by the time it was supposed to be reused for consumption.

Scaling of the voting workload is currently under analysis.

Development

Our analysis and reporting tool, locli ("LogObject CLI") has a few drawbacks as far as system resource usage goes; it requires a huge amount of RAM, and initialization (i.e. loading and parsing trace output) is quite slow. Moreover, there is no intermediate, potentially exposable or queryable, representation of data besides the trace messages themselves.

We're working on a prototype that introduces a database persistence layer as that intermediate representation. Not only does that open up raw benchmarking data to other means of querying or processing outside locli; initializing the tool from the database has also been shown to require much less RAM and to shorten the initialization phase. Furthermore, the on-disk representation is much more efficient that way - no small benefit when raw benchmarking data for a single run can occupy north of 64GiB.

The prototype has yet to be fully integrated into the analysis pipeline for validation; however, initial observations are promising.

Workbench

For our benchmarks, we rely on staked geneses, as the cluster needs to control all stake - and as such, block production - to yield meaningful performance metrics. As creating a staked genesis of that extent is an expensive operation, we use a caching mechanism. Small changes in the benchmarking profile, such as protocol version or parameters, Plutus cost models or execution budgets, would previously trigger the creation of a new cache entry. We've now factored out of cache entry resolution all those variables that do not impact staking itself, and created a mechanism to patch those changes into the genesis files after cache retrieval, when preparing them for a benchmarking run. This adds flexibility in creating profiles, and reduces the time to deploy a run to the cluster.

We also delivered a comprehensive description of our cluster to the Leios innovation team. This includes the definition of our artificially constrained topology, as well as a latency matrix for node connections in that topology, assigning a weight to all edges in the graph. The Leios team intends to use that material to implement a large-scale simulation of the algorithm, and thus gain representative timings for diffusion and propagation.

Tracing

The alignment of metrics names between the legacy and new tracing systems is now complete - which should minimize the migration effort for existing community dashboards. The only differences that remain are motivated by increasing compliance with existing standards such as OpenMetrics. Furthermore, a few metrics still missing from the new system have now been ported over, such as node.start.time or served.block.latest.

We're all set for the expected bump to typed-protocols-0.3: both forwarder packages trace-forward and ekg-forward for the new tracing system have been adapted to the new API and are passing all tests.

Last but not least, we've settled on a rollout plan for the new tracing system. The new system is set to become the default with the upcoming Node release 10.2. This is achieved by a change of configuration only - there is no need for different Node builds. The cardano-node binary will contain both tracing systems for a considerable grace period: 3 - 6 months after release. This should give the community ample time to adjust for necessary changes in downstream services or dashboards that consume trace or metrics output.

We'll provide a comprehensive hands-on migration guide summarizing those changes for the user.

· 3 min read
Michael Karg

High level summary

  • Benchmarking: Governance action / voting benchmarks on Node 10.0; performed PlutusV3 RIPEMD-160 benchmarks.
  • Development: Governance action workload fully implemented; generator-based submission is ongoing work.
  • Tracing: New tracing system production ready - cardano-tracer-0.3 released; work advancing on typed-protocols-0.3 bump and metrics naming.

Low level overview

Benchmarking

We've run and analyzed the new voting workload on Node 10.0. This workload is a stream of voting transactions submitted on top of the existing value workload from release benchmarking. The delta between the two can be taken to demonstrate the "performance cost of voting" in the Conway ledger era. The workload itself puppeteers 10,000 DReps overall, who vote on up to 5 governance actions simultaneously. We made sure these are mutually independent proposals, that vote tallying occurs, and that the actions get ratified and enacted (and hence removed from the ledger). Then, voting moves on to the next actions - keeping the number of actions requiring vote tallying stable over the benchmark. We could observe a very slight performance cost of voting; it's deemed a reasonable one given the stress we've put the system under.

The results can be found here along with those from the release benchmarks.

Furthermore, we've developed and run a new Plutus benchmark targeting the RIPEMD-160 internal. We've compared the resulting performance observations against other Plutus workloads - both memory-constrained and (like RIPEMD-160) CPU-constrained. We have concluded that there are no performance risks to that algorithm in PlutusV3, given existing execution budgets, and that it's consistently priced relative to other CPU-intensive internals.

Development

The voting workload is currently implemented using decentralized submission via cardano-cli on each of our cluster machines. It has proven reliable - and scalable, at least to some extent. We're already working on improvements that reduce the (very slight) overhead of using the CLI for submission. Additionally, we're aiming for a linear performance comparison when submitting twice the number of votes per transaction at the same TPS rate - forcing double the work for vote tallying.

Implementation of that workload using the centralized (and much better scalable) tx-generator submission service is still ongoing.

Tracing

Metrics naming is currently receiving a last round of consistency checking, so that it's aligned as closely as possible between the legacy and new tracing systems. In the process, we're addressing aspects of documentation, and incorporating feedback to define a few metrics in the new system that were previously present in the legacy one only.

For migrating to the new typed-protocols-0.3 API, two of the new tracing system's packages are affected. The work for ekg-forward-0.7 is completed and merged to master - yet to be released on CHaP. Work on the second package, trace-forward, is ongoing.

We've finally released cardano-tracer-0.3, which incorporates all features, enhancements and fixes that have been reported on here over the past months. This release marks 100% production readiness of the new tracing system. We're focusing now on making documentation, example scripts and configs yet more user-friendly for the community rollout. We're very much looking forward to receiving feedback - and have time and space reserved to address it, as well as to provide initial support for the migration away from the legacy system.

· 2 min read
Michael Karg

High level summary

  • Benchmarking: Started release benchmarks for Node 10.0.
  • Development: Governance action workload - alternative tx submission method built, passes tests.
  • Tracing: Preparing the bump to typed-protocols-0.3.

Low level overview

Benchmarking

We've started the benchmarking process for the freshly tagged, fully Chang 2 capable Node version 10.0 pre-release.

Development

Calibrating a governance action / voting workload within our submission service tx-generator is more involved than anticipated.

As measurements of the performance impact of voting are required very shortly, we have - in parallel - created a nix/bash-based solution. It uses cardano-cli for creating and submitting proposals and voting transactions, while the generator can run any other known workload simultaneously. Thus, we expect to get a clear performance delta between voting and no voting. This setup has already been deployed and is passing testing - soon to be used for the first real-world voting benchmarks.

The implementation, however, is less flexible, much less parametrizable, and by design tied to the very specific, fixed topology of the Nomad cluster. Work on the workload definition inside tx-generator will thus continue, and it will eventually be used as the standard for benchmarks targeting voting / governance.

Tracing

The new tracing system - more specifically, the components that forward metrics and traces to cardano-tracer - contains well-defined peers in the sense of the typed-protocols package. The upcoming bump to the recently released version 0.3 brings breaking changes in the package API. We've begun the necessary downstream adjustments in our packages, re-defining the aforementioned peers using the new API.