· 3 min read
Michael Karg

Setup

As part of the release benchmarking cycle, we're comparing benchmarking runs for 2 different versions of cardano-node:

  • 8.9.3 - baseline from a previous mainnet release
  • 8.12.1 - the current mainnet release

For each version, we're gathering various metrics under 2 different workloads:

  1. value-only: Each transaction consumes 2 inputs and creates 2 outputs, changing the UTxO set. This workload produces full blocks (> 80kB) exclusively.
  2. Plutus: Each transaction contains a Plutus script exhausting the per-tx execution budget. This workload produces small blocks (< 3kB) exclusively.

Benchmarking is performed on a cluster of 52 block producing nodes spread across 3 different AWS regions, interconnected using a static, restricted topology. All runs were performed in the Babbage era.

Observations

These benchmarks are about evaluating specific corner cases in a constrained environment that allows for reliable reproduction of results; they're not trying to directly recreate the operational conditions on Mainnet.

Resource Usage

  1. Under value workload, CPU usage improves by 2% - 4%, and GC CPU usage by 14%. Under Plutus workload, CPU usage improves only slightly, by 1%.
  2. Allocation Rate and Minor GCs improve by 5% and 6%, respectively; under Plutus workload, there's a slight improvement of 1%.
  3. RAM usage is reduced by 3%; reduction under Plutus workload is even larger - namely 10%.

Caveat: Individual metrics can't be evaluated in isolation; the resource usage profile as a whole provides insight into the system's performance and responsiveness.

Forging Loop

  1. Mempool snapshotting improves by 5ms or 7% (3ms or 4% under Plutus workload).
  2. Adoption time on the block producer improves by 4ms or 6% - under value workload only.
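
Each observation pairs an absolute delta in ms with a relative delta in %, both taken against the baseline version. A minimal sketch of that arithmetic follows; the function name and the sample durations are hypothetical and not taken from the reports:

```python
def compare(baseline_ms: float, candidate_ms: float) -> tuple[float, float]:
    """Return (absolute delta in ms, relative delta in %) of candidate vs. baseline.

    Negative values mean the candidate is faster than the baseline.
    """
    delta_ms = candidate_ms - baseline_ms
    delta_pct = 100.0 * delta_ms / baseline_ms
    return delta_ms, delta_pct


# Hypothetical durations for illustration only (not measured values).
delta_ms, delta_pct = compare(baseline_ms=72.0, candidate_ms=67.0)
print(f"{delta_ms:+.0f}ms or {delta_pct:+.0f}%")  # prints "-5ms or -7%"
```

Read this way, a statement such as "5ms or 7%" also hints at the magnitude of the underlying baseline measurement.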

The metric 'Slot start to announced' (see in attachments) is cumulative, and demonstrates how far into a slot the block producing node first announces the new header.

Peer propagation

  1. Block Fetch duration increases slightly by 6ms or 2% (2ms under Plutus workload).
  2. Adoption times on the peers improve slightly by 2ms or 3% (1ms under Plutus workload).

End-to-end propagation

This metric encompasses block diffusion and adoption across specific percentages of the benchmarking cluster, with 0.80 adoption meaning adoption on 80% of all cluster nodes.

  1. Under value workload / full blocks there are no significant changes to cluster adoption times.
  2. Under Plutus workload / small blocks we can observe a (near-jitter) improvement of 0% - 2% in cluster adoption times.
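
The cluster adoption metric described above can be sketched as follows: given one adoption time per node for a block, the value for a fraction such as 0.80 is the time by which that share of nodes has adopted it. This is an illustrative sketch only. The function name and sample values are hypothetical, and it assumes that times are measured from slot start and that every cluster node counts towards the total:

```python
import math


def cluster_adoption_time(adoption_times_s: list[float], fraction: float) -> float:
    """Time by which `fraction` of all nodes have adopted a given block."""
    if not adoption_times_s:
        raise ValueError("need at least one adoption time")
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    ordered = sorted(adoption_times_s)
    # Number of nodes that must have adopted; rounding guards against
    # floating point noise such as 0.7 * 10 == 7.000000000000001.
    needed = math.ceil(round(fraction * len(ordered), 9))
    return ordered[needed - 1]


# Hypothetical adoption times (seconds after slot start) on a 5-node cluster.
times = [0.4, 0.9, 0.5, 1.3, 0.7]
print(cluster_adoption_time(times, 0.80))  # prints 0.9
print(cluster_adoption_time(times, 1.00))  # prints 1.3
```

With a definition like this, the slowest nodes dominate the higher fractions.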

Conclusion

  • The performance changes measured between 8.12.1 and 8.9.3 are most distinct in the resource usage footprint - with 8.12.1 improving over 8.9.3.
  • On Mainnet, 8.12.1 is expected to deliver equal or slightly better performance than 8.9.3, while also lowering the node's resource usage somewhat.
  • We have not observed any performance regression being introduced in 8.12.1.

Contact

As for publishing such benchmarking results, we are aware that more context and detail may be needed with regard to specific metrics or benchmarking methodology.

We are still looking to gather questions, both general and specific, so that we can provide a suitable FAQ and possibly improve presentation in the future.

Attachments

Full report for value-only workload, PDF downloadable here.

Full report for Plutus workload, PDF downloadable here.

NB. The release benchmarks for 8.12.1 were performed on tag 8.12.0-pre. The patch version bump did not include changes relevant to performance; thus, measurements taken on 8.12.0-pre remain valid.

· 4 min read
Michael Karg

Setup

As part of the release benchmarking cycle, we're comparing benchmarking runs for 2 different versions of cardano-node:

  • 8.12.1 - baseline from a previous mainnet release
  • 9.0.0 - the current mainnet release

For each version, we're gathering various metrics under 2 different workloads:

  1. value-only: Each transaction consumes 2 inputs and creates 2 outputs, changing the UTxO set. This workload produces full blocks (> 80kB) exclusively.
  2. Plutus: Each transaction contains a Plutus script exhausting the per-tx execution budget. This workload produces small blocks (< 3kB) exclusively.

Benchmarking is performed on a cluster of 52 block producing nodes spread across 3 different AWS regions, interconnected using a static, restricted topology. All runs were performed in the Conway era.

Observations

These benchmarks are about evaluating specific corner cases in a constrained environment that allows for reliable reproduction of results; they're not trying to directly recreate the operational conditions on Mainnet.

Resource Usage

  1. Under value workload, Process and Mutator CPU usage are slightly higher on 9.0, by 7% - 8% (4% each under Plutus workload). GC CPU usage increases by 11% under value workload, but decreases by 3% under Plutus workload.
  2. Only under value workload, Allocation Rate and Minor GCs increase by 5% and the live GC dataset grows by 3%. Heap size is constant.
  3. CPU 85% spans are 8% shorter (3% under Plutus workload).

Caveat: Individual metrics can't be evaluated in isolation; the resource usage profile as a whole provides insight into the system's performance and responsiveness.

Forging Loop

  1. Mempool Snapshotting and Self Adoption time on the block producer increase very slightly under value workload - 2ms (or 3%) each.
  2. Under Plutus workload, however, a decrease in Self Adoption time by 2ms (or 4%) is the only significant change in the forging loop.

The metric 'Slot start to announced' (see in attachments) is cumulative, and demonstrates how far into a slot the block producing node first announces the new header.

Peer propagation

  1. Block Fetch duration is 21ms faster (6%) - 7ms or 5% under Plutus workload.
  2. Fetched to Sending increases slightly by 3ms (7%) - only under value workload.
  3. Adoption times on the peers increase slightly by 4ms (5%) - under Plutus workload, however, they are 3ms (6%) faster.

End-to-end propagation

This metric encompasses block diffusion and adoption across specific percentages of the benchmarking cluster, with 0.80 adoption meaning adoption on 80% of all cluster nodes.

  1. Under value workload / full blocks on 9.0, we can observe a 4% - 5% improvement of cluster adoption times in the 80th centile and above.
  2. Under Plutus workload / small blocks, the corresponding improvement is 5% - 6%.
  3. The main contributing factor is the improvement in Block Fetch duration.

Conclusion

  • Network performance clearly improves by ~5% for 80% to full cluster adoption - independent of workload.
  • RAM usage is unchanged on 9.0. The slight rise in CPU usage is expected, given improved network performance, and does not pose cause for concern.
  • We have not observed any performance regression being introduced in 9.0.0.

NB. These benchmarks were performed in the Conway ledger era. As such, they do not cover the one-time performance cost of transitioning from Babbage and enabling the new features of the Conway ledger.

Contact

As for publishing such benchmarking results, we are aware that more context and detail may be needed with regard to specific metrics or benchmarking methodology.

We are still looking to gather questions, both general and specific, so that we can provide a suitable FAQ and possibly improve presentation in the future.

Attachments

Full report for value-only workload, PDF downloadable here.

Full report for Plutus workload, PDF downloadable here.

· 3 min read
Michael Karg

Setup

As part of the release benchmarking cycle, we're comparing benchmarking runs for 2 different versions of cardano-node:

  • 9.1.1 - baseline from a previous mainnet release
  • 9.2.0 - the current mainnet release

For each version, we're gathering various metrics under 2 different workloads:

  1. value-only: Each transaction consumes 2 inputs and creates 2 outputs, changing the UTxO set. This workload produces full blocks (> 80kB) exclusively.
  2. Plutus: Each transaction contains a Plutus script exhausting the per-tx execution budget. This workload produces small blocks (< 3kB) exclusively.

Benchmarking is performed on a cluster of 52 block producing nodes spread across 3 different AWS regions, interconnected using a static, restricted topology. All runs were performed in the Conway era.

Observations

These benchmarks are about evaluating specific corner cases in a constrained environment that allows for reliable reproduction of results; they're not trying to directly recreate the operational conditions on Mainnet.

Resource Usage

  1. Under value workload, 9.2.0 shows an increase by 7% in Process CPU usage.
  2. Additionally, Allocation Rate and Minor GCs increase by 6% each, while Heap Size remains unchanged.
  3. Furthermore, CPU 85% spans increase by 10%.
  4. Under Plutus workload however, there's just one significant observation: a larger portion of the heap is considered live (6% or ~190MB) with the overall Heap Size remaining constant.
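
The posts do not define the 'CPU 85% spans' metric. The sketch below assumes it denotes the length of contiguous runs of samples during which Process CPU usage is at or above 85%, averaged over a run. The function names, the one-sample-per-slot granularity and the sample data are all hypothetical:

```python
from itertools import groupby
from statistics import mean


def busy_span_lengths(cpu_percent: list[float], threshold: float = 85.0) -> list[int]:
    """Lengths of contiguous runs of samples at or above `threshold` (assumed definition)."""
    return [
        sum(1 for _ in run)
        for is_busy, run in groupby(cpu_percent, key=lambda c: c >= threshold)
        if is_busy
    ]


def mean_busy_span(cpu_percent: list[float], threshold: float = 85.0) -> float:
    """Average span length in samples; 0.0 if usage never reaches the threshold."""
    spans = busy_span_lengths(cpu_percent, threshold)
    return mean(spans) if spans else 0.0


# Hypothetical CPU usage samples, one per slot.
samples = [50, 90, 95, 60, 88, 86, 91, 40]
print(busy_span_lengths(samples))  # prints [2, 3]
print(mean_busy_span(samples))     # prints 2.5
```

Under this reading, longer spans indicate that the node stays under sustained high CPU load for more consecutive slots.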

Caveat: Individual metrics can't be evaluated in isolation; the resource usage profile as a whole provides insight into the system's performance and responsiveness.

Forging Loop

  1. For the forger metrics, we can observe minor (1ms - 2ms) improvements in Ledger Ticking, Mempool Snapshotting and Self Adoption under value workload.
  2. Under Plutus workload, there are minor (1ms - 2ms) increases in Ledger Ticking and Mempool Snapshotting.

The metric 'Slot start to announced' (see in attachments) is cumulative, and demonstrates how far into a slot the block producing node first announces the new header.

Peer propagation

  1. Block Fetch duration has improved by 11ms (or 3%) under value workload.
  2. Under Plutus workload, peer Adoption times are slightly increased by 2ms (3%).

End-to-end propagation

This metric encompasses block diffusion and adoption across specific percentages of the benchmarking cluster, with 0.80 adoption meaning adoption on 80% of all cluster nodes.

  1. Under value workload / full blocks, 9.2.0 exhibits a slight improvement of 1% - 3% in cluster adoption times.
  2. Under Plutus workload / small blocks, there's a very minor increase by 1%.

Conclusion

  • We cannot detect any performance regression in 9.2.0 compared to 9.1.1.
  • Under heavy value workload, 9.2.0 seems to perform work somewhat more eagerly. This would correlate with the slightly increased CPU usage, but also with the improvements in the forging and peer related metrics.
  • The clearly increased efficiency of Block Fetch under heavy workload is the main contributing factor to the slight overall network performance improvement.

NB. These benchmarks were performed using an adjusted, post-Chang hardfork performance baseline to account for added features in the Conway ledger era. Thus, absolute measurements might differ now from those taken using the previous baseline.

Contact

As for publishing such benchmarking results, we are aware that more context and detail may be needed with regard to specific metrics or benchmarking methodology.

We are still looking to gather questions, both general and specific, so that we can provide a suitable FAQ and possibly improve presentation in the future.

Attachments

Full report for value-only workload, PDF downloadable here.

Full report for Plutus workload, PDF downloadable here.

· 3 min read
Michael Karg

Setup

As part of the release benchmarking cycle, we're comparing benchmarking runs for 3 different versions of cardano-node:

  • 8.1.2 - the last mainnet release
  • 8.7.0-pre - as an intermediate reference point
  • 8.7.2 - the next mainnet release

For each version, we're gathering various metrics under 2 different workloads:

  1. value-only: Each transaction consumes 2 inputs and creates 2 outputs, changing the UTxO set. This workload produces full blocks (> 80kB) exclusively.
  2. Plutus: Each transaction contains a Plutus script exhausting the per-tx execution budget. This workload produces small blocks (< 3kB) exclusively.

Benchmarking is performed on a cluster of 52 block producing nodes spread across 3 different AWS regions, interconnected using a static, restricted topology. All runs were performed in the Babbage era.

Observations

These benchmarks are about evaluating specific corner cases in a constrained environment that allows for reliable reproduction of results; they're not trying to directly recreate the operational conditions on Mainnet.

The observations stated refer to the direct comparison between the 8.1.2 and 8.7.2 versions.

Resource Usage

  1. Plutus workload, having a lower overall absolute CPU load, exhibits an average increase of 27% in Process CPU usage. Value workload, having a higher overall absolute CPU load, exhibits a near-jitter increase of 1%.
  2. Allocation rates increase by ~8.9MB/s (value workload) and ~12.6MB/s (Plutus workload).
  3. Heap sizes increase by 47% - 54%.
  4. CPU 85% span duration shrinks by ~9.7 slots under value workload, and ~5.8 slots under Plutus workload.

Caveat: Individual metrics can't be evaluated in isolation; the resource usage profile as a whole provides insight into the system's performance and responsiveness.

Forging Loop

  1. Block Context Acquisition in the forging loop increases by ~10ms.
  2. Mempool snapshotting shows an increase by 16ms under value workload; under Plutus workload, it increases by 3ms.
  3. Ledger ticking improves slightly by 1-2ms.

The metric 'Slot start to announced' (see in attachments) is cumulative, and demonstrates how far into a slot the block producing node first announces the new header.
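
To illustrate what "cumulative" means here: each forging loop phase adds its duration to a running offset from slot start, so an increase in any one phase pushes back every later point, including the header announcement. The sketch below uses only the three phases named in the list above. Their ordering and their durations are assumptions for illustration, and the real forging loop contains further phases not shown:

```python
from itertools import accumulate


def offsets_from_slot_start(phases: list[tuple[str, int]]) -> dict[str, int]:
    """Map each phase to the cumulative time (ms) elapsed when it completes."""
    names = [name for name, _ in phases]
    running = accumulate(duration for _, duration in phases)
    return dict(zip(names, running))


# Hypothetical per-phase durations in ms; phase order is assumed.
phases = [
    ("Block Context Acquisition", 25),
    ("Ledger Ticking", 30),
    ("Mempool Snapshotting", 70),
]
for name, offset in offsets_from_slot_start(phases).items():
    print(f"{name}: done {offset}ms into the slot")
```

In this toy example, a 10ms increase in the first phase would shift every subsequent offset by the same 10ms.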

Peer propagation

  1. Block fetch time increases for full blocks by 9%. For small blocks, it improves by 7%.
  2. Time to resend a block after fetching increases by 8% for full blocks, whereas it improves by 2% for small blocks.
  3. Block adoption by a peer takes 12% more time for a full block, but happens faster by 4% for a small block.

End-to-end propagation

This metric encompasses block diffusion and adoption across specific percentages of the benchmarking cluster, with 0.80 adoption meaning adoption on 80% of all cluster nodes.

The metric exhibits an increase by ~10% across all centiles for full blocks, whereas it improves by 5-6% for small blocks in the higher (80th and above) centiles.

Contact

This is the first time we're publishing such benchmarking results to a wider audience. We are aware that more context and detail may be needed with regard to specific metrics or benchmarking methodology.

We are looking to gather questions, both general and specific, so that we can provide a suitable FAQ and possibly improve presentation in the future.

Attachments

Full report for value-only workload, PDF downloadable here.

Full report for Plutus workload, PDF downloadable here.

The release benchmarks for 8.7.2 were performed on tag 8.7.1-pre, which features identical cardano-node components.