11 posts tagged with "benchmarking-reports"

· 3 min read
Michael Karg

Setup

This report compares benchmarking runs for 2 different flavours of cardano-node:

  • 10.2-regular - regular Node performance baseline from the 10.2.x release benchmarks.
  • 10.2-utxohd - the UTxO-HD build of the Node based on that same version.

For this benchmark, we're gathering various metrics under the value-only workload used in release benchmarks: Each transaction consumes 2 inputs and creates 2 outputs, changing the UTxO set. This workload produces full blocks (> 80kB) exclusively. Moreover, it's the workload that produces most stress on the UTxO set. Thus, it's the most meaningful workload when it comes to benchmarking UTxO-HD.
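The defining property of the value-only workload is that every transaction spends 2 UTxO entries and creates 2 new ones, so the UTxO set keeps its size while its contents churn completely. A minimal sketch of that churn (illustrative model only; real transactions carry addresses and values, and entry ids stand in for transaction outputs):

```python
from collections import deque

def run_value_only(utxo, n_txs):
    """Drive a value-only workload against a UTxO set (a set of entry ids).

    Each transaction consumes 2 inputs and creates 2 outputs, so the
    UTxO set size stays constant while its contents are replaced.
    """
    pending = deque(sorted(utxo))  # spend oldest entries first
    next_id = max(utxo) + 1
    for _ in range(n_txs):
        for _ in range(2):                   # consume 2 inputs
            utxo.discard(pending.popleft())
        for _ in range(2):                   # create 2 outputs
            utxo.add(next_id)
            pending.append(next_id)
            next_id += 1
    return utxo

utxo = set(range(100))
after = run_value_only(utxo, 50)
assert len(after) == 100              # size unchanged...
assert after.isdisjoint(range(100))   # ...but every entry replaced
```

This constant rewriting of the UTxO set is why the workload stresses the UTxO-HD backing store more than any other.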

We target the in-memory backing store of UTxO-HD - LedgerDB V2 in this case. The on-disk backend is not used.

Benchmarking is performed on a cluster of 52 block producing nodes spread across 3 different AWS regions, interconnected using a static, restricted topology.

Observations

Resource Usage

  1. With UTxO-HD's in-memory backend, the memory footprint increases slightly by 3%.
  2. Process CPU usage is moderately reduced by 9% with UTxO-HD.
  3. Additionally, CPU 85% spans decrease in duration by 24% (~1.1 slots).
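A CPU 85% span is a contiguous period during which CPU usage stays above 85%; since Cardano mainnet slots are 1 second long, span durations map directly to slots. The figures above imply a baseline span of roughly 1.1 / 0.24 ≈ 4.6 slots (an inferred value, not a reported measurement):

```python
SLOT_SECONDS = 1.0  # Cardano mainnet slot length

def span_delta_slots(baseline_s, relative_change):
    """Absolute change of a CPU-usage span duration, expressed in slots."""
    return baseline_s * relative_change / SLOT_SECONDS

# A 24% reduction of a ~4.6 s span is ~1.1 slots, matching the report.
assert round(span_delta_slots(4.6, 0.24), 1) == 1.1
```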

Caveat: Individual metrics can't be evaluated in isolation; the resource usage profile as a whole provides insight into the system's performance and responsiveness.

Forging Loop

  1. Block context acquisition improves by 3ms (or 11%), while Ledger ticking takes 3ms (or 10%) longer.
  2. Creating a mempool snapshot is significantly faster - by 16ms (or 21%).
  3. As a result, a UTxO-HD block producing node is able to announce a new header 17ms (or 12%) earlier into a slot.
  4. Additionally, adoption time on the forger is slightly improved - by 4ms (or 5%).
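The forging-loop stages above run sequentially, so the point at which a new header is first announced is (roughly) the cumulative sum of the stage durations after slot start. A sketch of that accounting, with stage names approximated from the metrics above and purely illustrative durations:

```python
# Forging-loop stage durations in ms (illustrative values, not measurements).
stages = {
    "block context acquisition": 24,
    "ledger ticking": 33,
    "mempool snapshot": 60,
    "forge block": 10,
}

def slot_start_to_announced(stage_ms):
    # The stages run one after another, so the header is first announced
    # at the cumulative sum of their durations after slot start.
    return sum(stage_ms.values())

assert slot_start_to_announced(stages) == 127  # ms into the slot
```

This is why a 16ms gain in mempool snapshotting, partially offset by slower ledger ticking, nets out to an earlier header announcement.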

Peer propagation

  1. Block fetch duration increases moderately by 13ms or 4%.
  2. Adoption times on the peers improve very slightly - by 2ms or 2%.

End-to-end propagation

This metric encompasses block diffusion and adoption across specific percentages of the benchmarking cluster, with 0.80 adoption meaning adoption on 80% of all cluster nodes.
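Given per-node adoption timestamps, the cluster adoption time for a fraction like 0.80 is simply the adoption time of the node at that rank. A minimal sketch (function name and sample data are hypothetical):

```python
import math

def cluster_adoption_time(adoption_s, fraction):
    """Time until `fraction` of cluster nodes have adopted the block.

    `adoption_s`: per-node adoption times in seconds after the forger's
    slot start. 0.80 adoption = time by which 80% of nodes have adopted.
    """
    times = sorted(adoption_s)
    k = max(1, math.ceil(fraction * len(times)))
    return times[k - 1]

# 10 nodes; 0.80 adoption is the 8th-fastest node's adoption time.
times = [0.12, 0.15, 0.18, 0.21, 0.25, 0.30, 0.34, 0.41, 0.55, 0.70]
assert cluster_adoption_time(times, 0.80) == 0.41
```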

  1. There is no significant difference in cluster adoption times between regular and UTxO-HD node.

Conclusion

Regarding the UTxO-HD build using the in-memory LedgerDB V2 backend, we can conclude that:

  1. it is lighter on CPU usage compared to the regular node, albeit requiring just slightly more RAM.
  2. it poses no performance risk to block producers; on the contrary, the changes in forging loop metrics seem favourable compared to the regular node.
  3. network performance would be expected to be on par with the regular node.
  4. even under stress, there is no measurable performance regression compared to the regular node.
  5. as a consequence of the above, performance-wise, it's a viable replacement for the regular in-memory solution.

Attachment

Full report for value-only workload, PDF downloadable here.

· 3 min read
Michael Karg

Setup

As part of the release benchmarking cycle, we're comparing benchmarking runs for 2 different versions of cardano-node:

  • 10.1.4 - baseline from a previous mainnet release
  • 10.2.1 - the current release

For this benchmark, we're gathering various metrics under 2 different workloads:

  1. value-only: Each transaction consumes 2 inputs and creates 2 outputs, changing the UTxO set. This workload produces full blocks (> 80kB) exclusively.
  2. Plutus: Each transaction contains a Plutus script exhausting the per-tx execution budget. This workload produces small blocks (< 3kB) exclusively.

Benchmarking is performed on a cluster of 52 block producing nodes spread across 3 different AWS regions, interconnected using a static, restricted topology. All runs were performed in the Conway era.

Observations

These benchmarks are about evaluating specific corner cases in a constrained environment that allows for reliable reproduction of results; they're not trying to directly recreate the operational conditions on Mainnet.

Resource Usage

  1. CPU usage increases moderately by 12% under value, and very slightly by 2% under Plutus workload.
  2. CPU 85% spans increase by 14% (~0.6 slots) under value workload, but decrease by 6% (~0.8 slots) under Plutus workload.
  3. Only under value workload, we observe a slight increase in Allocation rate and Minor GCs of 9% and 8%, respectively.

Caveat: Individual metrics can't be evaluated in isolation; the resource usage profile as a whole provides insight into the system's performance and responsiveness.

Forging Loop

  1. Adoption time on the forger improves by 3ms (or 4%) - and 5ms (or 9%) under Plutus workload.
  2. Block context acquisition takes 3ms (or 12%) longer under value workload.
  3. Under Plutus workload only, ledger ticking improves by 3ms (or 12%).

The metric 'Slot start to announced' (see in attachments) is cumulative, and demonstrates how far into a slot the block producing node first announces the new header.

Peer propagation

  1. Block fetch duration improves clearly by 16ms (or 4%) under value-only workload.
  2. Under Plutus workload, we can measure an improvement by 4ms (or 7%) for adoption times on the peers.

End-to-end propagation

This metric encompasses block diffusion and adoption across specific percentages of the benchmarking cluster, with 0.80 adoption meaning adoption on 80% of all cluster nodes.

As a result of the above, 10.2.1 exhibits:

  1. a slight 3% improvement in cluster adoption times in the 80th centile and above under value workload.
  2. a near-jitter 1% - 3% improvement in cluster adoption times under Plutus workload.

Conclusion

  1. We could not detect any significant regressions, or performance risks, on 10.2.1.
  2. 10.2.1 comes with slightly increased CPU usage, and no changes to RAM footprint.
  3. Diffusion metrics very slightly improve - mainly due to block fetch being more efficient for full blocks, and adoption for blocks exclusively containing Plutus transactions.
  4. This points to network performance of 10.2.1 being on par with or very slightly better than 10.1.4.

Attachments

Full report for value-only workload, PDF downloadable here.

Full report for Plutus workload, PDF downloadable here.

NB. The benchmarks for 10.2.1 were performed on tag 10.2.0. The patch version bump did not include changes relevant to performance; thus, measurements and observations performed on 10.2.0 remain valid.

· 3 min read
Michael Karg

Setup

As part of the release benchmarking cycle, we're comparing benchmarking runs for 2 different versions of cardano-node:

  • 10.1.1 - baseline from a previous mainnet release
  • 10.1.4 - the current mainnet release

For this benchmark, we're gathering various metrics under 2 different workloads:

  1. value-only: Each transaction consumes 2 inputs and creates 2 outputs, changing the UTxO set. This workload produces full blocks (> 80kB) exclusively.
  2. Plutus: Each transaction contains a Plutus script exhausting the per-tx execution budget. This workload produces small blocks (< 3kB) exclusively.

Benchmarking is performed on a cluster of 52 block producing nodes spread across 3 different AWS regions, interconnected using a static, restricted topology. All runs were performed in the Conway era.

Observations

These benchmarks are about evaluating specific corner cases in a constrained environment that allows for reliable reproduction of results; they're not trying to directly recreate the operational conditions on Mainnet.

Resource Usage

  1. CPU 85% spans slightly increase by 6% or ~0.2 slots (26% or ~2.9 slots under Plutus workload).
  2. We can observe a tiny increase in memory usage by 1-2% (132-160 MiB).

Caveat: Individual metrics can't be evaluated in isolation; the resource usage profile as a whole provides insight into the system's performance and responsiveness.

Forging Loop

  1. Under value workload, Ledger Ticking and Self Adoption exhibit a very slight increase (2ms each).
  2. Block Context Acquisition has improved by 2ms.
  3. Under Plutus workload, there are no significant changes to forger metrics.

The metric 'Slot start to announced' (see in attachments) is cumulative, and demonstrates how far into a slot the block producing node first announces the new header.

Peer propagation

  1. There's a minor increase of 1% (3ms) in Block Fetch duration under value workload only.
  2. Under Plutus workload, we can measure a small improvement by 2% for adoption times on the peers.

End-to-end propagation

This metric encompasses block diffusion and adoption across specific percentages of the benchmarking cluster, with 0.80 adoption meaning adoption on 80% of all cluster nodes.

As a result of the above, on 10.1.4 we can observe:

  1. a tiny increase in cluster adoption times of 1%-2% in the 80th centile and above under value workload.
  2. an improvement in cluster adoption times of 3%-4% in the tail end (95th centile and above) under Plutus workload.

Conclusion

  1. For 10.1.4, we could not detect any regressions or performance risks.
  2. All increases or decreases in forger and peer metrics are 3ms and less. This indicates network performance of 10.1.4 will very closely match that of 10.1.1 and subsequent patch releases.
  3. There's no significant change in the resource usage pattern. The increased CPU 85% spans tend to barely manifest when the system is under heavy load (value workload); as such, they pose no cause for concern.

Attachments

Full report for value-only workload, PDF downloadable here.

Full report for Plutus workload, PDF downloadable here.

NB. The benchmarks for 10.1.1 were performed on tag 10.0.0-pre. The minor version bump did not include changes relevant to performance; thus, measurements taken on 10.0.0-pre remain a valid baseline.

· 4 min read
Michael Karg

Setup

As part of the release benchmarking cycle, we're comparing benchmarking runs for 2 different versions of cardano-node:

  • 9.2.0 - baseline from a previous mainnet release
  • 10.1.1 - the current mainnet release

For this benchmark, we're gathering various metrics under 3 different workloads:

  1. value-only: Each transaction consumes 2 inputs and creates 2 outputs, changing the UTxO set. This workload produces full blocks (> 80kB) exclusively.
  2. Plutus: Each transaction contains a Plutus script exhausting the per-tx execution budget. This workload produces small blocks (< 3kB) exclusively.
  3. value+voting: On top of above value workload, this one has DReps vote on and ratify governance actions - forcing additional computation for vote tallying and proposal enactment.

Benchmarking is performed on a cluster of 52 block producing nodes spread across 3 different AWS regions, interconnected using a static, restricted topology. All runs were performed in the Conway era.

Observations

These benchmarks are about evaluating specific corner cases in a constrained environment that allows for reliable reproduction of results; they're not trying to directly recreate the operational conditions on Mainnet.

Resource Usage

  1. 10.1.1 shows an improvement of 4% (8% under Plutus workload) in Process CPU usage.
  2. Allocation Rate improves by 8% (11% under Plutus workload), while Heap Size remains unchanged.
  3. CPU 85% spans decrease by 18% (5% under Plutus workload).
  4. Compared to value-only workload, ongoing voting leads to a slight increase of 5% in Process CPU usage.

Caveat: Individual metrics can't be evaluated in isolation; the resource usage profile as a whole provides insight into the system's performance and responsiveness.

Forging Loop

  1. Under Plutus workload, 10.1.1 exhibits a formidable speedup of 70ms in the forging loop - due to mempool snapshots being produced much more quickly.
  2. Under value workload, there are no significant changes to forger metrics.
  3. With voting added on top of the value workload, we can observe mempool snapshots and adoption time on the block producer rise by 10ms each.

The metric 'Slot start to announced' (see in attachments) is cumulative, and demonstrates how far into a slot the block producing node first announces the new header.

Peer propagation

  1. Block Fetch duration increases slightly by 16ms (or 5%) under value workload.
  2. Under Plutus workload, there are no significant changes to peer-related metrics.
  3. With the additional voting workload, peer adoption times rise by 12ms on average - confirming the observation for adoption time on the block producer.

End-to-end propagation

This metric encompasses block diffusion and adoption across specific percentages of the benchmarking cluster, with 0.80 adoption meaning adoption on 80% of all cluster nodes.

  1. 10.1.1 exhibits a slight increase of 2% - 3% in cluster adoption times under value workload.
  2. Under Plutus workload however, we observe significant improvement of 18% up to the 50th centile, and 9% - 13% in the 80th centile and above.
  3. While the former is due to slightly increased Block Fetch duration, the latter is the consequence of much quicker mempool snapshots involving Plutus transactions.
  4. Submitting the additional voting workload, we can observe a consistent 4% - 6% increase in cluster adoption times across all centiles.

Conclusion

  • We do not detect any performance regression in 10.1.1 compared to 9.2.0.
  • On the contrary - 10.1.1 is lighter on Node process resource usage overall.
  • Improved forging and diffusion timings can be expected for blocks heavy on Plutus transactions.
  • Stressing the governance / voting capabilities of the Conway ledger lets us ascertain an (expected) performance cost of voting.
  • This cost has been demonstrated to be reasonable, and not to contain lurking performance risks to the system.
  • It is expected to manifest only during periods of heavy vote tallying / proposal enactment, slightly affecting block adoption times.

NB. The same number of DReps is registered for each workload. However, only under value+voting do they become active by submitting votes. This requires an increased UTxO set size, so it uses a baseline separate from value-only, resulting in slightly different absolute values.

Contact

In publishing these benchmarking results, we are aware that more context and detail may be needed with regard to specific metrics or benchmarking methodology.

We are still looking to gather questions, both general and specific, so that we can provide a suitable FAQ and possibly improve presentation in the future.

Attachments

Full report for value-only workload, PDF downloadable here.

Full report for Plutus workload, PDF downloadable here.

Full report for value+voting workload, PDF downloadable here.

NB. The release benchmarks for 10.1.1 were performed on tag 10.0.0-pre. The minor version bump did not include changes relevant to performance; thus, measurements taken on 10.0.0-pre remain valid.

· 3 min read
Michael Karg

Setup

As part of the release benchmarking cycle, we're comparing benchmarking runs for 3 different versions of cardano-node:

  • 8.7.2 - baseline for previous mainnet release
  • 8.8.0 - an intermediate reference point
  • 8.9.0 - the next mainnet release

For each version, we're gathering various metrics under 2 different workloads:

  1. value-only: Each transaction consumes 2 inputs and creates 2 outputs, changing the UTxO set. This workload produces full blocks (> 80kB) exclusively.
  2. Plutus: Each transaction contains a Plutus script exhausting the per-tx execution budget. This workload produces small blocks (< 3kB) exclusively.

Benchmarking is performed on a cluster of 52 block producing nodes spread across 3 different AWS regions, interconnected using a static, restricted topology. All runs were performed in the Babbage era.

Observations

These benchmarks are about evaluating specific corner cases in a constrained environment that allows for reliable reproduction of results; they're not trying to directly recreate the operational conditions on Mainnet.

The observations stated refer to the direct comparison between the 8.7.2 and 8.9.0 versions.

Resource Usage

  1. Overall CPU usage exhibits a small to moderate (5% - 8%) increase.
  2. Memory usage is very slightly decreased by 1%.

Caveat: Individual metrics can't be evaluated in isolation; the resource usage profile as a whole provides insight into the system's performance and responsiveness.

Forging Loop

  1. For full blocks, Mempool Snapshotting improves by 4% (or 3ms).
  2. For small blocks, Self Adoption times improve by 8% (or 4ms).
  3. All other forger metrics do not exhibit significant change.

The metric 'Slot start to announced' (see in attachments) is cumulative, and demonstrates how far into a slot the block producing node first announces the new header.

Peer propagation

  1. For full blocks, Block Fetch duration shows a notable improvement by 10ms (or 3%).

End-to-end propagation

This metric encompasses block diffusion and adoption across specific percentages of the benchmarking cluster, with 0.80 adoption meaning adoption on 80% of all cluster nodes.

End-to-end propagation times on 8.9.0 exhibit a small improvement by 2% across all centiles for full blocks, whereas they remain largely unchanged for small blocks.

Conclusion

  • The performance changes observed between 8.9.0 and 8.7.2 are only minor - with 8.9.0 slightly improving on 8.7.2. Therefore, we'd expect 8.9.0 Mainnet performance to be akin to 8.7.2.
  • We have demonstrated that no performance regression has been introduced in 8.9.0.

Contact

In publishing these benchmarking results, we are aware that more context and detail may be needed with regard to specific metrics or benchmarking methodology.

We are still looking to gather questions, both general and specific, so that we can provide a suitable FAQ and possibly improve presentation in the future.

Attachments

Full report for value-only workload, PDF downloadable here.

Full report for Plutus workload, PDF downloadable here.

NB. Mainnet release 8.7.3 did not include any performance-related changes; measurements taken on 8.7.2 remain valid.

· 3 min read
Michael Karg

Setup

As part of the release benchmarking cycle, we're comparing benchmarking runs for 2 different versions of cardano-node:

  • 8.9.0 - baseline for previous mainnet release
  • 8.9.1 - the next mainnet release

For each version, we're gathering various metrics under 2 different workloads:

  1. value-only: Each transaction consumes 2 inputs and creates 2 outputs, changing the UTxO set. This workload produces full blocks (> 80kB) exclusively.
  2. Plutus: Each transaction contains a Plutus script exhausting the per-tx execution budget. This workload produces small blocks (< 3kB) exclusively.

Benchmarking is performed on a cluster of 52 block producing nodes spread across 3 different AWS regions, interconnected using a static, restricted topology. All runs were performed in the Babbage era.

Observations

These benchmarks are about evaluating specific corner cases in a constrained environment that allows for reliable reproduction of results; they're not trying to directly recreate the operational conditions on Mainnet.

Resource Usage

  1. We can observe an overall decrease in CPU usage (2% - 4%); only GC CPU usage under value workload increases by 3%.
  2. Under value workload only, Allocation rate is very slightly decreased (1%) with no change to Heap Size.

Caveat: Individual metrics can't be evaluated in isolation; the resource usage profile as a whole provides insight into the system's performance and responsiveness.

Forging Loop

  1. Mempool Snapshot duration increases slightly by 2ms under value workload.
  2. Self-Adoption time increases by 3ms.
  3. All other forger metrics do not exhibit significant change.

The metric 'Slot start to announced' (see in attachments) is cumulative, and demonstrates how far into a slot the block producing node first announces the new header.

Peer propagation

  1. Under value workload only, Block Fetch duration and Fetched to Sending show a slight increase of 2ms each.

End-to-end propagation

This metric encompasses block diffusion and adoption across specific percentages of the benchmarking cluster, with 0.80 adoption meaning adoption on 80% of all cluster nodes.

End-to-end propagation times on 8.9.1 exhibit a small increase by 1% - 2% for full blocks, while remaining virtually unchanged for small blocks.

Conclusion

  • The performance changes measured between 8.9.1 and 8.9.0 are very minor. Mainnet performance of 8.9.1 is expected to be akin to 8.9.0.
  • We have not observed any performance regression being introduced in 8.9.1.

Contact

In publishing these benchmarking results, we are aware that more context and detail may be needed with regard to specific metrics or benchmarking methodology.

We are still looking to gather questions, both general and specific, so that we can provide a suitable FAQ and possibly improve presentation in the future.

Attachments

Full report for value-only workload, PDF downloadable here.

Full report for Plutus workload, PDF downloadable here.

· 3 min read
Michael Karg

Setup

As part of the release benchmarking cycle, we're comparing benchmarking runs for 2 different versions of cardano-node:

  • 8.9.1 - baseline from a previous mainnet release
  • 8.9.3 - the current mainnet release

For each version, we're gathering various metrics under 2 different workloads:

  1. value-only: Each transaction consumes 2 inputs and creates 2 outputs, changing the UTxO set. This workload produces full blocks (> 80kB) exclusively.
  2. Plutus: Each transaction contains a Plutus script exhausting the per-tx execution budget. This workload produces small blocks (< 3kB) exclusively.

Benchmarking is performed on a cluster of 52 block producing nodes spread across 3 different AWS regions, interconnected using a static, restricted topology. All runs were performed in the Babbage era.

Observations

These benchmarks are about evaluating specific corner cases in a constrained environment that allows for reliable reproduction of results; they're not trying to directly recreate the operational conditions on Mainnet.

Resource Usage

  1. Under value workload, CPU usage increases slightly on 8.9.3: 4% for Process, 3% for Mutator and 8% for GC.
  2. Additionally, Allocation rate and minor GCs increase slightly by 3% each.
  3. Under Plutus workload only, the GC live dataset increases by 10% or 318MB.
  4. CPU 85% spans increase by 14% of slot duration under value workload, whereas they shorten by 5% of slot duration under Plutus workload.

Caveat: Individual metrics can't be evaluated in isolation; the resource usage profile as a whole provides insight into the system's performance and responsiveness.

Forging Loop

  1. There are no significant changes to metrics related to block forging.

The metric 'Slot start to announced' (see in attachments) is cumulative, and demonstrates how far into a slot the block producing node first announces the new header.

Peer propagation

  1. Block Fetch duration improves by 7ms (or 2%) under value workload, and by 4ms (or 3%) under Plutus workload.
  2. Under Plutus workload, Fetched to sending improves by 2ms (or 5%).

End-to-end propagation

This metric encompasses block diffusion and adoption across specific percentages of the benchmarking cluster, with 0.80 adoption meaning adoption on 80% of all cluster nodes.

  1. Under value workload, cluster adoption times exhibit a minor improvement (1%) up to the 80th centile on 8.9.3.
  2. Under Plutus workload, we can observe a minor improvement overall (1% - 2%), whilst full adoption is unchanged.

Conclusion

  • The performance changes measured between 8.9.3 and 8.9.1 are very minor, with 8.9.3 improving slightly over 8.9.1.
  • Mainnet performance of 8.9.3 is expected to be akin to 8.9.1.
  • We have not observed any performance regression being introduced in 8.9.3.

Contact

In publishing these benchmarking results, we are aware that more context and detail may be needed with regard to specific metrics or benchmarking methodology.

We are still looking to gather questions, both general and specific, so that we can provide a suitable FAQ and possibly improve presentation in the future.

Attachments

Full report for value-only workload, PDF downloadable here.

Full report for Plutus workload, PDF downloadable here.

NB. The baseline for 8.9.1 had to be re-established due to changes in the underlying network infrastructure. This means that absolute values may differ from the previous measurements taken for that version.

· 3 min read
Michael Karg

Setup

As part of the release benchmarking cycle, we're comparing benchmarking runs for 2 different versions of cardano-node:

  • 8.9.3 - baseline from a previous mainnet release
  • 8.12.1 - the current mainnet release

For each version, we're gathering various metrics under 2 different workloads:

  1. value-only: Each transaction consumes 2 inputs and creates 2 outputs, changing the UTxO set. This workload produces full blocks (> 80kB) exclusively.
  2. Plutus: Each transaction contains a Plutus script exhausting the per-tx execution budget. This workload produces small blocks (< 3kB) exclusively.

Benchmarking is performed on a cluster of 52 block producing nodes spread across 3 different AWS regions, interconnected using a static, restricted topology. All runs were performed in the Babbage era.

Observations

These benchmarks are about evaluating specific corner cases in a constrained environment that allows for reliable reproduction of results; they're not trying to directly recreate the operational conditions on Mainnet.

Resource Usage

  1. Under value workload, CPU usage is improved by 2% - 4%, and by 14% for GCs. Under Plutus workload, CPU usage improves only slightly by 1%.
  2. Allocation Rate and Minor GCs improve by 5% and 6% - under Plutus workload, there's a slight improvement of 1%.
  3. RAM usage is reduced by 3%; reduction under Plutus workload is even larger - namely 10%.

Caveat: Individual metrics can't be evaluated in isolation; the resource usage profile as a whole provides insight into the system's performance and responsiveness.

Forging Loop

  1. Mempool snapshotting improves by 5ms or 7% (3ms or 4% under Plutus workload).
  2. Adoption time on the block producer improves by 4ms or 6% - under value workload only.

The metric 'Slot start to announced' (see in attachments) is cumulative, and demonstrates how far into a slot the block producing node first announces the new header.

Peer propagation

  1. Block Fetch duration increases slightly by 6ms or 2% (2ms under Plutus workload).
  2. Adoption times on the peers improve slightly by 2ms or 3% (1ms under Plutus workload).

End-to-end propagation

This metric encompasses block diffusion and adoption across specific percentages of the benchmarking cluster, with 0.80 adoption meaning adoption on 80% of all cluster nodes.

  1. Under value workload / full blocks there are no significant changes to cluster adoption times.
  2. Under Plutus workload / small blocks we can observe a (near-jitter) improvement of 0% - 2% in cluster adoption times.

Conclusion

  • The performance changes measured between 8.12.1 and 8.9.3 are most distinct in the resource usage footprint - with 8.12.1 improving over 8.9.3.
  • On Mainnet, 8.12.1 is expected to deliver equal or slightly better performance than 8.9.3, while somewhat lowering the Node's resource usage in doing so.
  • We have not observed any performance regression being introduced in 8.12.1.

Contact

In publishing these benchmarking results, we are aware that more context and detail may be needed with regard to specific metrics or benchmarking methodology.

We are still looking to gather questions, both general and specific, so that we can provide a suitable FAQ and possibly improve presentation in the future.

Attachments

Full report for value-only workload, PDF downloadable here.

Full report for Plutus workload, PDF downloadable here.

NB. The release benchmarks for 8.12.1 were performed on tag 8.12.0-pre. The patch version bump did not include changes relevant to performance; thus, measurements taken on 8.12.0-pre remain valid.

· 4 min read
Michael Karg

Setup

As part of the release benchmarking cycle, we're comparing benchmarking runs for 2 different versions of cardano-node:

  • 8.12.1 - baseline from a previous mainnet release
  • 9.0.0 - the current mainnet release

For each version, we're gathering various metrics under 2 different workloads:

  1. value-only: Each transaction consumes 2 inputs and creates 2 outputs, changing the UTxO set. This workload produces full blocks (> 80kB) exclusively.
  2. Plutus: Each transaction contains a Plutus script exhausting the per-tx execution budget. This workload produces small blocks (< 3kB) exclusively.

Benchmarking is performed on a cluster of 52 block producing nodes spread across 3 different AWS regions, interconnected using a static, restricted topology. All runs were performed in the Conway era.

Observations

These benchmarks are about evaluating specific corner cases in a constrained environment that allows for reliable reproduction of results; they're not trying to directly recreate the operational conditions on Mainnet.

Resource Usage

  1. Under value workload, Process and Mutator CPU usage are slightly higher on 9.0 - by 7% - 8% (4% each under Plutus workload). GC CPU usage increases by 11%, but decreases by 3% under Plutus workload.
  2. Only under value workload, Allocation Rate and Minor GCs increase by 5% and the live GC dataset grows by 3%. Heap size is constant.
  3. CPU 85% spans are 8% shorter (3% under Plutus workload).

Caveat: Individual metrics can't be evaluated in isolation; the resource usage profile as a whole provides insight into the system's performance and responsiveness.

Forging Loop

  1. Mempool Snapshotting and Self Adoption time on the block producer increase very slightly under value workload - 2ms (or 3%) each.
  2. Under Plutus workload, however, a decrease in Self Adoption time by 2ms (or 4%) is the only significant change in the forging loop.

The metric 'Slot start to announced' (see in attachments) is cumulative, and demonstrates how far into a slot the block producing node first announces the new header.

Peer propagation

  1. Block Fetch duration is 21ms faster (6%) - 7ms or 5% under Plutus workload.
  2. Fetched to Sending increases slightly by 3ms (7%) - only under value workload.
  3. Adoption times on the peers increase slightly by 4ms (5%) - under Plutus workload, however, they are 3ms (6%) faster.

End-to-end propagation

This metric encompasses block diffusion and adoption across specific percentages of the benchmarking cluster, with 0.80 adoption meaning adoption on 80% of all cluster nodes.

  1. Under value workload / full blocks on 9.0, we can observe a 4% - 5% improvement of cluster adoption times in the 80th centile and above.
  2. Under Plutus workload / small blocks, the corresponding improvement is 5% - 6%.
  3. The main contributing factor is the improvement in Block Fetch duration.

Conclusion

  • Network performance clearly improves by ~5% for 80% to full cluster adoption - independent of workload.
  • RAM usage is unchanged on 9.0. The slight rise in CPU usage is expected, given improved network performance, and does not pose cause for concern.
  • We have not observed any performance regression being introduced in 9.0.0.

NB. These benchmarks were performed in the Conway ledger era. As such, they do not cover the one-time performance cost of transitioning from Babbage and enabling the new features of the Conway ledger.

Contact

In publishing these benchmarking results, we are aware that more context and detail may be needed with regard to specific metrics or benchmarking methodology.

We are still looking to gather questions, both general and specific, so that we can provide a suitable FAQ and possibly improve presentation in the future.

Attachments

Full report for value-only workload, PDF downloadable here.

Full report for Plutus workload, PDF downloadable here.

· 3 min read
Michael Karg

Setup

As part of the release benchmarking cycle, we're comparing benchmarking runs for 2 different versions of cardano-node:

  • 9.1.1 - baseline from a previous mainnet release
  • 9.2.0 - the current mainnet release

For each version, we're gathering various metrics under 2 different workloads:

  1. value-only: Each transaction consumes 2 inputs and creates 2 outputs, changing the UTxO set. This workload produces full blocks (> 80kB) exclusively.
  2. Plutus: Each transaction contains a Plutus script exhausting the per-tx execution budget. This workload produces small blocks (< 3kB) exclusively.
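As a back-of-the-envelope sketch of why the value-only workload saturates blocks, the snippet below estimates how many 2-input/2-output transactions fit into one full block. Both the per-transaction size and the block body limit are assumptions for illustration, not figures from these benchmarks.

```python
# All sizes are hypothetical, chosen only to illustrate the arithmetic.
ASSUMED_TX_SIZE_B = 300          # assumed serialized size of a 2-in/2-out tx
ASSUMED_BLOCK_BODY_LIMIT_B = 88 * 1024  # assumed block body size limit

# Number of such transactions a full block can hold
txs_per_full_block = ASSUMED_BLOCK_BODY_LIMIT_B // ASSUMED_TX_SIZE_B
print(txs_per_full_block)  # 300
```

Under these assumptions, a steady submission rate above a few hundred such transactions per block interval keeps every forged block full, which is the regime the value-only workload targets.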

Benchmarking is performed on a cluster of 52 block producing nodes spread across 3 different AWS regions, interconnected using a static, restricted topology. All runs were performed in the Conway era.

Observations

These benchmarks are about evaluating specific corner cases in a constrained environment that allows for reliable reproduction of results; they're not trying to directly recreate the operational conditions on Mainnet.

Resource Usage

  1. Under value workload, 9.2.0 shows a 7% increase in Process CPU usage.
  2. Additionally, Allocation Rate and Minor GCs increase by 6% each, while Heap Size remains unchanged.
  3. Furthermore, CPU 85% spans increase by 10%.
  4. Under Plutus workload however, there's just one significant observation: a larger portion of the heap is considered live (6% or ~190MB) with the overall Heap Size remaining constant.

Caveat: Individual metrics can't be evaluated in isolation; the resource usage profile as a whole provides insight into the system's performance and responsiveness.

Forging Loop

  1. For the forger metrics, we can observe minor (1ms - 2ms) improvements in Ledger Ticking, Mempool Snapshotting and Self Adoption under value workload.
  2. Under Plutus workload, there are minor (1ms - 2ms) increases in Ledger Ticking and Mempool Snapshotting.

The metric 'Slot start to announced' (see in attachments) is cumulative, and demonstrates how far into a slot the block producing node first announces the new header.

Peer propagation

  1. Block Fetch duration has improved by 11ms (or 3%) under value workload.
  2. Under Plutus workload, peer Adoption times are slightly increased by 2ms (3%).

End-to-end propagation

This metric encompasses block diffusion and adoption across specific percentages of the benchmarking cluster, with 0.80 adoption meaning adoption on 80% of all cluster nodes.

  1. Under value workload / full blocks, 9.2.0 exhibits a slight improvement of 1% - 3% in cluster adoption times.
  2. Under Plutus workload / small blocks, there's a very minor increase by 1%.

Conclusion

  • We cannot detect any performance regression in 9.2.0 compared to 9.1.1.
  • Under heavy value workload, 9.2.0 seems to perform work somewhat more eagerly. This would correlate with the slightly increased CPU usage, but also with the improvements in the forging and peer related metrics.
  • The clearly increased efficiency of Block Fetch under heavy workload is the main contributing factor to the slight overall network performance improvement.

NB. These benchmarks were performed using an adjusted, post-Chang hardfork performance baseline to account for added features in the Conway ledger era. Thus, absolute measurements might differ now from those taken using the previous baseline.

Contact

As for publishing such benchmarking results, we are aware that more context and detail may be needed with regard to specific metrics or benchmarking methodology.

We are still looking to gather questions, both general and specific, so that we can provide a suitable FAQ and possibly improve presentation in the future.

Attachments

Full report for value-only workload, PDF downloadable here.

Full report for Plutus workload, PDF downloadable here.