
· 2 min read
Jared Corduan

Ledger Update

We have been focused nearly entirely on addressing technical debt.

  • We introduced more consistent naming across eras, this time for the auxiliary data. See 3032.
  • We made clear how the consumed functions differ between eras (a previous source of confusion), and added some related support to the fledgling ledger API. See 3016.
  • We added clarity and organizational consistency to the main ledger era type synonyms. See 3017.
  • We removed code duplication related to the input data hashes. See 3018.
  • We split up a large module into smaller components. The large module was actually causing our CI to time out. See 3020.
  • We cleaned up stale information in our cabal files and upgraded to Cabal 3.8. See 3023, 3031, and 3028.
  • We made consistent, standalone TxOut (transaction output) modules for every era. See 3024.
  • We brought consistency to a maddeningly inconsistent use of the type variables indicating the specific choice of cryptographic primitives. In particular, all uses of crypto have been renamed to c. See 3027.
  • We cleaned up the types in the Alonzo era. In particular, we switched to more parametric types that will compose better in the future and that simplify the constraints. See 3029.
  • We consolidated some existing fragmented logic regarding how we gather the scripts needed for a given transaction, a much-needed cleanup to prevent future mistakes (see the sketch after this list). See 3019.
  • We fixed a problem with our generators that was causing a fair number of our property tests to fail in CI. See 3039.
  • We have started the work to update Plutus. This will bring support for SECP in the next major protocol version, and also address a problem that we currently have evolving the cost models. See 3030.
  • We addressed a small issue that came up when integrating the conway era downstream, namely the lack of some serialization instances. See 3022.
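
The scripts-needed consolidation is easiest to see in code. Below is a hedged, self-contained sketch of the idea only; all type and function names are illustrative stand-ins rather than the ledger's actual definitions. The point is that every place a transaction can require a script is handled by one function, instead of each rule collecting its own subset.

```haskell
import Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map
import Data.Maybe (mapMaybe)
import Data.Set (Set)
import qualified Data.Set as Set

-- Illustrative stand-ins for the real ledger types.
newtype ScriptHash = ScriptHash Int deriving (Eq, Ord, Show)
newtype TxIn       = TxIn Int       deriving (Eq, Ord, Show)

-- An output is locked either by a key or by a script.
data Addr = KeyAddr Int | ScriptAddr ScriptHash

newtype UTxO = UTxO (Map TxIn Addr)

data TxBody = TxBody
  { txInputs       :: Set TxIn       -- spent outputs, possibly script-locked
  , txCertScripts  :: [ScriptHash]   -- scripts referenced by certificates
  , txWdrlScripts  :: [ScriptHash]   -- scripts behind reward withdrawals
  , txMintPolicies :: [ScriptHash]   -- minting policy scripts
  }

-- One function gathers every script the transaction needs.
scriptsNeeded :: UTxO -> TxBody -> Set ScriptHash
scriptsNeeded (UTxO utxo) tx =
  Set.fromList $
    spending ++ txCertScripts tx ++ txWdrlScripts tx ++ txMintPolicies tx
  where
    spending = mapMaybe scriptLockedBy (Set.toList (txInputs tx))
    scriptLockedBy i = case Map.lookup i utxo of
      Just (ScriptAddr h) -> Just h
      _                   -> Nothing
```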

· 2 min read
Jared Corduan

Ledger Update

Since finishing up support for the Vasil Hardfork, the ledger team has been focused on two main things: a new ledger era and technical debt.

New minimal ledger era

We have implemented a new ledger era named conway, which is nearly identical to the babbage era. This is the first time we have been able to see what a minimal ledger era looks like. We have finished this task, modulo any integration issues that might come up. The only thing the conway era does differently from the babbage era is provide support for rotating the master keys using the hard fork combinator's state translation. We may end up adding features to the conway era, but it has been a nice exercise to see what it takes to get a minimal ledger era supported in all the downstream components.

Addressing technical debt

We have been addressing technical debt, mostly in an effort to make the repository a more friendly code base to work in.

  • We have begun work on a ledger API, called cardano-ledger-api.
  • We have done a big re-design of the major type classes used in the ledger. With hindsight on our side, we now have something much more organized and easier to use.
  • We have done a lot of renaming. The names across eras are now much more uniform, avoid certain confusions that plagued us, and make it clearer where things come from.
  • We have reduced a lot of code duplication that could lead to bugs if you do not have the whole code base in your head.
  • We have added a handful of performance improvements.
  • We added type safety in a number of locations. In particular, the type of the values that can be minted in a transaction no longer allows for Lovelace, and some functions that used to handle both timelock scripts and Plutus scripts now correctly enforce at the type level that only one of them can be used (see the first sketch after this list).
  • We improved our generators so that they now produce a much richer set of valid serializations. There is room within CBOR to serialize the same data structure in multiple ways, and it is helpful to have the generators use a wide variety (see the second sketch after this list).
  • We have begun re-organizing our test suites.
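
To make the minted-value change concrete, here is a hedged sketch of the idea; the type names only approximate the real ledger definitions. Splitting the multi-asset part of a value into its own type means that a mint field of that type cannot contain Lovelace by construction.

```haskell
import Data.Map.Strict (Map)

-- Approximate stand-ins for the real ledger types.
newtype Coin      = Coin Integer     -- Lovelace
newtype PolicyID  = PolicyID Int     -- stand-in for a policy script hash
newtype AssetName = AssetName Int    -- stand-in for an asset name

-- Multi-assets only: there is no place to put Lovelace in this type.
newtype MultiAsset = MultiAsset (Map PolicyID (Map AssetName Integer))

-- A full value pairs Lovelace with multi-assets...
data Value = Value Coin MultiAsset

-- ...but a mint field of type MultiAsset makes minting Lovelace a type error.
newtype Mint = Mint MultiAsset
```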
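For the generator work, the following minimal sketch (using the cborg and QuickCheck packages; this is not the actual generator code) shows how a single data structure admits several valid CBOR serializations that a generator can pick between:

```haskell
import Codec.CBOR.Encoding
  (Encoding, encodeBreak, encodeListLen, encodeListLenIndef, encodeWord)
import Test.QuickCheck (Gen, elements)

-- CBOR allows both definite- and indefinite-length encodings of a list.
encodeDefinite, encodeIndefinite :: [Word] -> Encoding
encodeDefinite xs =
  encodeListLen (fromIntegral (length xs)) <> foldMap encodeWord xs
encodeIndefinite xs =
  encodeListLenIndef <> foldMap encodeWord xs <> encodeBreak

-- A generator that randomly picks one of the valid encodings, so that
-- decoders are exercised on the full range of well-formed inputs.
genListEncoding :: [Word] -> Gen Encoding
genListEncoding xs = elements [encodeDefinite xs, encodeIndefinite xs]
```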

· 9 min read
Damian Nadales

Executive summary

  • We did most of the heavy lifting required to integrate the Conway era.
  • We have property tests for the UTxO HD backing store API implementations. A possible bug was identified. Work is ongoing to make sure the property tests cover all the relevant cases.
  • We implemented and benchmarked the "anti-diff" prototype to speed up the UTxO HD functionality. Results show a rough speedup of 4x to 5.5x across several scenarios, though note that data reported by tasty-bench "is only of indicative and comparative significance". We are investigating additional performance improvements. The "anti-diff" prototype and benchmarks are still pending code review.
  • We elaborated a draft specification for the Genesis implementation and ChainSync jumping optimization.

Workstreams

Conway

  • Integration PR of the minimal Conway era (Issue #3963, PR #3971).
  • Discussions with Ledger revealed possible sources of confusion about which data should be changed in the Conway era. As a result, a new technical debt issue was raised, which does not block the integration of the Conway era (Issue #3976).

UTxO HD

  • Issue #3954, branch: The functionality of a backing store, which is the interface to the on-disk part of the ledger state in UTxO-HD, is tested at a high level through the OnDisk tests. However, some functionality remained untested, e.g., reads of ranges of keys. We have therefore implemented quickcheck-state-machine tests that exercise backing stores directly. The tests are reusable across backing store implementations because they are implementation-agnostic: any backing store that conforms to the backing store interface can be plugged in (a sketch of such an interface follows below). Work is still ongoing to label/monitor the tests, so that we can verify that the interesting cases are being exercised. Furthermore, a possible bug has been identified in the LMDB backing store with respect to range reads, though it has not been resolved yet.
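
To make the reuse claim concrete, the following is a minimal sketch, with assumed names rather than the actual consensus API, of the shape of interface such implementation-agnostic tests are written against:

```haskell
import Data.Map.Strict (Map)

-- Any backing store implementation (in-memory, LMDB, ...) is packaged as
-- a record of operations; the property tests only ever use this record,
-- so they can run against every implementation unchanged.
data BackingStore m k v = BackingStore
  { bsWrite     :: Map k v -> m ()         -- apply a batch of changes
  , bsRead      :: [k] -> m (Map k v)      -- point reads
  , bsReadRange :: k -> Int -> m (Map k v) -- read up to n entries from a key
  , bsClose     :: m ()
  }
```

quickcheck-state-machine then generates sequences of such operations and compares the real store's answers against a simple in-memory model, for whichever implementation is plugged in.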

  • Issue #3946, branch, PR #3882: The "anti-diff" prototype proposes an alternative approach to keeping track of sequences (more specifically, FingerTrees) of diffs. These diff sequences are a component of the in-memory part of the ledger state in UTxO-HD. Since the consensus code often requires the cumulative diff of a sequence of diffs, the current implementation caches the cumulative diff of each subtree in the diff sequence. This caching allows relatively fast reconstruction of the total cumulative diff, but it proved to incur a non-negligible cost: when we manipulate diff sequences through splits and appends, we force the recomputation of a logarithmic number of caches. This is problematic because consensus splits and appends often: we split when we flush diffs to a backing store or roll back blocks, and we append when pushing blocks. The new approach should reduce the overhead of this caching; the sketch below illustrates the legacy scheme.
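
For intuition, here is a minimal, self-contained sketch of the legacy caching scheme, written against the fingertree package with names of our own choosing (it is not the actual consensus code). The cumulative diff is the finger tree's measure, so it is cached at every internal node; appends and splits rebuild spine nodes and therefore recompute a logarithmic number of those caches.

```haskell
{-# LANGUAGE FlexibleInstances     #-}
{-# LANGUAGE MultiParamTypeClasses #-}
import Data.FingerTree (FingerTree, Measured (..), split, (|>))
import Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map

-- One block's diff: for each key, the last write wins.
newtype Diff k v = Diff (Map k v)

instance Ord k => Semigroup (Diff k v) where
  Diff a <> Diff b = Diff (Map.union b a)  -- right operand (later diff) wins

instance Ord k => Monoid (Diff k v) where
  mempty = Diff Map.empty

-- The measure caches, for every subtree, its length and cumulative diff.
data M k v = M { mLen :: Int, mDiff :: Diff k v }

instance Ord k => Semigroup (M k v) where
  M l a <> M r b = M (l + r) (a <> b)

instance Ord k => Monoid (M k v) where
  mempty = M 0 mempty

instance Ord k => Measured (M k v) (Diff k v) where
  measure = M 1

type DiffSeq k v = FingerTree (M k v) (Diff k v)

-- O(1): the total cumulative diff is just the cached root measure.
cumulative :: Ord k => DiffSeq k v -> Diff k v
cumulative = mDiff . measure

-- Appending (pushing a block) and splitting (flushing, rolling back) both
-- rebuild spine nodes, and every rebuilt node recomputes its cached diff.
push :: Ord k => DiffSeq k v -> Diff k v -> DiffSeq k v
push = (|>)

-- Split off the oldest n diffs, e.g. to flush them to the backing store.
flushN :: Ord k => Int -> DiffSeq k v -> (DiffSeq k v, DiffSeq k v)
flushN n = split (\m -> mLen m > n)
```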

    We implemented micro-benchmarks for the "anti-diff" prototype: we first generate a sequence of commands (Forward, Push, Flush, or Rollback) through a simulation, after which we measure the performance of applying those commands to a diff sequence. In this context, Forward means forwarding values through a diff, whereas Rollback means switching to a different fork by rolling back diffs/blocks and pushing new ones. We compare the performance of the two implementations: the legacy approach and the anti-diff approach. (A sketch of the command type and configuration follows the scenario list below.)

    Some preliminary results were positive, but we needed to revisit the benchmark's configuration to obtain more definitive results. After a discussion with @dcoutts and the consensus team about this configuration (e.g., the number of commands generated, the choice of the security parameter k), the benchmarks should now be closer to a realistic setting. The default configuration used in the benchmarking code is:

    • Number of commands generated: 10_000
    • Security parameter k: 2160
    • Number of initial backing values: 100
    • Number of key-value pairs deleted by a push: 50
    • Number of key-value pairs inserted by a push: 50
    • Number of key-value pairs forwarded by a forward: 50
    • Probability of a large (in the range [1000, 2000]) rollback: 0.05
    • Probability of a small (in the range [1, 10]) rollback: 0.95
    • Order of commands:
      • An equal number of forwards and pushes.
      • 1 flush every 10 pushes.
      • 1 rollback every 100 pushes.

    Moreover, we run four benchmark scenarios:

    • Default configuration
    • Without rollbacks
    • With only small rollbacks
    • Without rollbacks, larger flushes (1 flush every 100 pushes)
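
For reference, here is a hedged sketch of the command type and default configuration described above; the constructor and field names are assumed, not those of the benchmarking code.

```haskell
import Data.Map.Strict (Map)

type Diff k v = Map k v  -- stand-in for one block's diff

data Cmd k v
  = Forward [k]              -- forward reads for these keys through the diffs
  | Push (Diff k v)          -- push one block's diff
  | Flush Int                -- flush the oldest n diffs to the backing store
  | Rollback Int [Diff k v]  -- drop the newest n diffs, push replacements

data BenchConfig = BenchConfig
  { nCommands      :: Int    -- 10_000 commands per run
  , securityParamK :: Int    -- 2160
  , nInitialValues :: Int    -- 100 initial backing values
  , deletesPerPush :: Int    -- 50 key-value pairs deleted per push
  , insertsPerPush :: Int    -- 50 key-value pairs inserted per push
  , keysPerForward :: Int    -- 50 key-value pairs forwarded per forward
  , pLargeRollback :: Double -- 0.05; rollback size drawn from [1000, 2000]
  , pSmallRollback :: Double -- 0.95; rollback size drawn from [1, 10]
  }
```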

    How to read results

    Note: this section uses documentation from the tasty-bench package to explain how to read the results of running our benchmarks.

    Running a benchmark scenario gives us the following (curated) output:

```
...
AntiDiff: OK (18.27s)
2.527 s ± 47 ms, 2.1 GB allocated, 544 MB copied, 2.2 GB peak memory, 0.23x
LegacyDiff: OK (32.73s)
10.829 s ± 148 ms, 6.8 GB allocated, 2.3 GB copied, 2.2 GB peak memory
...
```

    The output says that the first benchmark, which exercises the anti-diff prototype, was repeatedly executed for 18.27 seconds (wall-clock time). Its predicted mean CPU time was 2.527 seconds, and the means of individual samples do not often diverge from it by more than ±47 milliseconds (two standard deviations). We also configure the RTS to collect GC statistics, which enables tasty-bench to estimate and report memory usage. This data is reported as per the RTSStats fields allocated_bytes, copied_bytes and max_mem_in_use_bytes. So the output of the first benchmark says that a total of 2.1 GB of memory was allocated, a total of 544 MB was copied, and peak memory in use was 2.2 GB. The output for the second benchmark reads the same way.

    Furthermore, the benchmark compares the mean CPU times of the two approaches: here, the mean CPU time of the anti-diff approach is ~0.23x that of the legacy approach. Conversely, the mean CPU time of the legacy approach is 1 / 0.23 ~= 4.35x that of the anti-diff approach. We call 0.23x the improvement factor, and 4.35x the speedup.

    Note that these improvement factors (and reported results) are subject to noise, randomness, the specific configuration parameters, and the whims of statistics. Data reported by tasty-bench is only of indicative and comparative significance.

    Results

    For each of the four scenarios, we list the results of running the anti-diff and legacy approaches 5 times; repeating the runs gives an indication of whether the results are stable. Furthermore, we calculate the accompanying ranges (where applicable) of improvement factors and speedups.

    Note also the decrease in total bytes allocated and total bytes copied for the anti-diff approach compared to the legacy approach.

    Default configuration

    | Name | Mean CPU time | 2*Stdev (CPU time) | Total bytes allocated | Total bytes copied | Peak memory |
    | --- | --- | --- | --- | --- | --- |
    | Run 1: AntiDiff | 2.533 s (0.23x) | 4.7 ms | 2.1 GB | 557 MB | 2.4 GB |
    | Run 1: LegacyDiff | 10.792 s | 162 ms | 6.8 GB | 2.3 GB | 2.4 GB |
    | Run 2: AntiDiff | 2.508 s (0.23x) | 245 ms | 2.1 GB | 515 MB | 2.2 GB |
    | Run 2: LegacyDiff | 10.850 s | 30 ms | 6.9 GB | 2.3 GB | 2.2 GB |
    | Run 3: AntiDiff | 2.562 s (0.23x) | 5.0 ms | 2.1 GB | 552 MB | 2.2 GB |
    | Run 3: LegacyDiff | 10.993 s | 149 ms | 6.9 GB | 2.3 GB | 2.2 GB |
    | Run 4: AntiDiff | 2.168 s (0.22x) | 5.3 ms | 1.8 GB | 434 MB | 2.0 GB |
    | Run 4: LegacyDiff | 9.976 s | 39 ms | 6.3 GB | 2.0 GB | 2.0 GB |
    | Run 5: AntiDiff | 2.527 s (0.23x) | 47 ms | 2.1 GB | 544 MB | 2.2 GB |
    | Run 5: LegacyDiff | 10.829 s | 148 ms | 6.8 GB | 2.3 GB | 2.2 GB |
    • Improvement factor: [0.22, 0.23]
    • Speedup: [1 / 0.23 ~= 4.35, 1 / 0.22 ~= 4.55]

    No rollbacks

    | Name | Mean CPU time | 2*Stdev (CPU time) | Total bytes allocated | Total bytes copied | Peak memory |
    | --- | --- | --- | --- | --- | --- |
    | Run 1: AntiDiff | 1.638 s (0.19x) | 36 ms | 1.4 GB | 181 MB | 2.4 GB |
    | Run 1: LegacyDiff | 8.656 s | 207 ms | 5.7 GB | 1.5 GB | 2.4 GB |
    | Run 2: AntiDiff | 1.638 s (0.19x) | 75 ms | 1.4 GB | 181 MB | 2.2 GB |
    | Run 2: LegacyDiff | 8.654 s | 322 ms | 5.7 GB | 1.5 GB | 2.2 GB |
    | Run 3: AntiDiff | 1.663 s (0.19x) | 74 ms | 1.4 GB | 181 MB | 2.2 GB |
    | Run 3: LegacyDiff | 8.799 s | 216 ms | 5.7 GB | 1.5 GB | 2.2 GB |
    | Run 4: AntiDiff | 1.645 s (0.19x) | 51 ms | 1.4 GB | 181 MB | 2.0 GB |
    | Run 4: LegacyDiff | 8.732 s | 261 ms | 5.7 GB | 1.5 GB | 2.0 GB |
    | Run 5: AntiDiff | 1.639 s (0.19x) | 19 ms | 1.4 GB | 181 MB | 2.2 GB |
    | Run 5: LegacyDiff | 8.653 s | 234 ms | 5.7 GB | 1.5 GB | 2.2 GB |
    • Improvement factor: 0.19
    • Speedup: 1 / 0.19 ~= 5.26

Only small rollbacks

| Name | Mean CPU time | 2*Stdev (CPU time) | Total bytes allocated | Total bytes copied | Peak memory |
| --- | --- | --- | --- | --- | --- |
| Run 1: AntiDiff | 1.833 s (0.18x) | 36 ms | 1.5 GB | 185 MB | 2.4 GB |
| Run 1: LegacyDiff | 10.362 s | 867 ms | 5.8 GB | 1.6 GB | 2.4 GB |
| Run 2: AntiDiff | 1.696 s (0.19x) | 30 ms | 1.5 GB | 185 MB | 2.2 GB |
| Run 2: LegacyDiff | 8.822 s | 106 ms | 5.8 GB | 1.5 GB | 2.2 GB |
| Run 3: AntiDiff | 1.702 s (0.19x) | 44 ms | 1.5 GB | 186 MB | 2.2 GB |
| Run 3: LegacyDiff | 8.906 s | 147 ms | 5.8 GB | 1.5 GB | 2.2 GB |
| Run 4: AntiDiff | 1.701 s (0.19x) | 47 ms | 1.5 GB | 185 MB | 2.0 GB |
| Run 4: LegacyDiff | 8.949 s | 197 ms | 5.8 GB | 1.5 GB | 2.0 GB |
| Run 5: AntiDiff | 1.677 s (0.19x) | 55 ms | 1.5 GB | 186 MB | 2.2 GB |
| Run 5: LegacyDiff | 8.856 s | 177 ms | 5.8 GB | 1.5 GB | 2.2 GB |
  • Improvement factor: [0.18, 0.19]
  • Speedup: [1 / 0.19 ~= 5.26, 1 / 0.18 ~= 5.56]

    No rollbacks, larger flushes (every 100 pushes)

    | Name | Mean CPU time | 2*Stdev (CPU time) | Total bytes allocated | Total bytes copied | Peak memory |
    | --- | --- | --- | --- | --- | --- |
    | Run 1: AntiDiff | 1.643 s (0.25x) | 21 ms | 1.5 GB | 196 MB | 2.4 GB |
    | Run 1: LegacyDiff | 6.591 s | 351 ms | 4.0 GB | 1.4 GB | 2.4 GB |
    | Run 2: AntiDiff | 1.616 s (0.25x) | 47 ms | 1.5 GB | 196 MB | 2.2 GB |
    | Run 2: LegacyDiff | 6.520 s | 232 ms | 4.0 GB | 1.4 GB | 2.2 GB |
    | Run 3: AntiDiff | 1.640 s (0.25x) | 34 ms | 1.5 GB | 196 MB | 2.2 GB |
    | Run 3: LegacyDiff | 6.540 s | 150 ms | 4.0 GB | 1.4 GB | 2.2 GB |
    | Run 4: AntiDiff | 1.635 s (0.25x) | 76 ms | 1.5 GB | 196 MB | 2.0 GB |
    | Run 4: LegacyDiff | 6.589 s | 131 ms | 4.0 GB | 1.4 GB | 2.0 GB |
    | Run 5: AntiDiff | 1.628 s (0.25x) | 19 ms | 1.5 GB | 196 MB | 2.2 GB |
    | Run 5: LegacyDiff | 6.490 s | 5.9 ms | 4.0 GB | 1.4 GB | 2.2 GB |
  • Improvement factor: 0.25
  • Speedup: 1 / 0.25 = 4

Genesis

  • We elaborated a draft of the specification of the Genesis implementation and the ChainSync Jumping optimization. In particular, this includes a proof sketch that the latter preserves liveness and safety in all cases (Issue 3964).
    • @nfrisby's main realization during this sprint was that he had been focusing so far on the case where the selected chain is an extension of the intersection of our peers' ChainSync candidates.
    • This is the main case, i.e., an "absorbing" state, but it is not the only case.
    • The new proof sketch begins by case splitting on that predicate, and that made the sketch quite a bit easier to follow.
  • We continued working on the "happy path" ChainSync Jumping prototype (Issue 3960).

Technical debt

  • We started working on the issues required to re-enable nightly CI runs. Nightly runs have far laxer time constraints, which gives us the option to run significantly more property tests than in our regular CI. To this end, we merged a PR that makes it easy to adjust the number of tests globally (PR #3947).

· One min read
Marcin Szamotulski

The networking team took an active part in the project iteration (PI) planning session; see the cardano-node backlog for detailed outcomes.

  • We started working on a detailed design / implementation plan for gossip.

  • We merged input-output-hk/ouroboros-network#3859, which sets up the ouroboros-network repository for the single relay release.

  • We identified a bug in the network simulator, which is fixed in input-output-hk/ouroboros-network#3852; that PR has been reviewed.

  • We set up the tracing configuration for the nodes we deploy, and identified and fixed some deployment hiccups. We also identified some bugs in RTView, which were registered with its maintainers. See input-output-hk/ouroboros-network-ops#4.

  • We fixed typos in the network-mux library: input-output-hk/ouroboros-network#3921

  • For ease of debugging, we renamed a trace point: input-output-hk/ouroboros-network#3922

  • Duncan iterated on his simulation/visualisation and was also able to identify and fix a bug in the simulator. The simulation contains 50 nodes. Dashed lines indicate an established connection, while solid lines indicate a TCP connection with a fully open TCP window.