50 posts tagged with "consensus"

· One min read
Damian Nadales

High level summary

This week the Consensus team made progress on two fronts: the question of survivable eclipse duration, which is part of our work supporting Genesis delivery, and how to improve the handling of blocks from the future. Regarding the UTxO-HD branch, we managed to run a node with legacy blocks, which is syncing with mainnet, up to and including Alonzo. We also investigated a regression in mempool snapshotting, which was ultimately resolved by a Ledger update; the fix will ship in the upcoming Node 8.6 release.

· 2 min read
Damian Nadales

High level summary

During the past two weeks the Consensus team received additional benchmark results for the UTxO-HD feature. They show that the resource usage of the in-memory backend is not satisfactory for a mainnet release, and we need to wait for new infrastructure to be implemented before we can benchmark the LMDB backend (not likely to happen before next year). While we wait on this, we are evaluating the feasibility of making the UTxO-HD feature switchable, which will enable us to release it as an experimental feature. On the Genesis front we produced the first draft for a Survivable Eclipse Duration Model. We released version 8.5.0 of Cardano node, resumed work on the subpar handling of blocks from the future, and improved our tracing system to assist in troubleshooting problems in the node.

UTxO-HD

  • The Plutus workload benchmark for the in-memory backend showed no regressions for the metrics of interest, but it does show an increase in resource usage.
  • We got additional ad-hoc measurements on UTxO-HD memory consumption. The memory usage of the in-memory backend is not satisfactory for a release. The memory usage of the LMDB backend is considerably lower, but we need to see how much lower we can bring it by running a node whose memory is constrained to 8GB.
  • We resumed work on an alternative solution that will make UTxO-HD switchable. This will enable us to keep the baseline performance by totally disabling UTxO-HD, while allowing users to experiment with the feature if they wish to do so.

Genesis

  • We produced the first draft for a Survivable Eclipse Duration Model (422).

Support

  • Esgen finished his cycle as release engineer. Node 8.5.0 has been released.
  • We resumed work on the subpar handling of blocks from the future (4251).
  • We prepared the integration of new tracing events for the next node release. These tracing events will help debug potential issues in the node (such as the previously mentioned issue).

· 2 min read
Damian Nadales

High level summary

The value-only workload benchmarks showed that the mempool forging regression observed in the UTxO-HD branch was fixed by the latest patch. In spite of the higher resource demands, for the metrics of interest (forging, peer-propagation, end-to-end propagation) we see no regression when using the UTxO-HD version of Cardano node, with the in-memory backend.

On the Genesis front the Researchers continue reviewing different aspects of the design, in particular the argument that the Genesis rule will select the Cardano historical chain. We also merged a fix for the Babbage to Conway transition, and released a new version of Consensus.

Genesis

  • We elicited review from the Researchers on a final draft of the argument that the Genesis rule will select the Cardano historical chain (392).

Support

  • We merged a minimal patch that fixes a parameter update bug during the Babbage to Conway transition (366).
  • We enabled richer tracers in cardano-node that can be useful in future debugging (384).
  • Esgen continues with his release engineer activities, and created a new Consensus release.

Fostering collaboration

  • We merged a new section into our documentation that explains the existing hard-fork combinator (HFC) interface and its complexities, which are related to why the Babbage to Conway transition surprised us in this way. This explanation is step one towards improving the HFC interface (369).

· 2 min read
Damian Nadales

High level summary

We have a proposed fix for the mempool forging regression observed in the UTxO-HD branch. We need to confirm this by running system-level benchmarks. We are still working on a fallback mechanism for keeping the baseline performance of Cardano node, in case the performance of UTxO-HD is not sufficient. On the Genesis front, we confirmed with the researchers that the proposed Genesis design is satisfactory for the historical Cardano chain. We also have a proposed fix for the wrong protocol version bug, found in SanchoNet after transitioning to Conway.

UTxO-HD

  • We optimized the mempool revalidation process, which in turn ought to solve the regression observed during system-level benchmarks in the in-memory version (349). System-level benchmark results are pending.
  • Regarding the workaround to keep the node's baseline performance if that of the in-memory backend turns out not to be enough for our stakeholders (344), we are still expanding the legacy block package such that we could at some point run the node with a legacy Cardano block. There are some loose ends to wrap up before we can begin the first test run.
  • We also brought the UTxO-HD branch up to date with node version 8.4.0.

Genesis

  • We finished the discussion with the Researchers on how to argue that the proposed Genesis design is satisfactory for the existing historical Cardano chain. We are now drafting the final self-contained argument. (4157)

Support

  • We debugged a bad parameter update on the Babbage to Conway transition in the SanchoNet testnet (339). A superficial patch is within reach and we are in the process of reviewing the PRs related to this fix (340, 354, and 355). However, we are investigating a more principled redesign of the epoch transition logic, which required us to revisit the existing interfaces of the ConsensusProtocol type class and the HardForkBlock combinator (345 and 346). This is important to prevent this kind of error in the future. It is also an overdue step in the process of taking full ownership of the HFC: reconsidering original HFC design decisions for which we now have much more context, a few years later.

· 3 min read
Damian Nadales

High level summary

We were able to successfully run the system-level benchmarks for the UTxO-HD implementation for the first time. There was an important regression in block forging performance that will have to be addressed before UTxO-HD is released. We also revisited the implementation of our query processing logic, which was needed to address the performance regression found in the query-by-address command. The preliminary results show that the performance of this query is now on par with the Cardano baseline version, but we need further confirmation. On the Genesis front, we presented the grinding-aware safety argument for the proposed historical Cardano Genesis windows to the IOG Researchers. The Consensus release engineer finished his rotation: version 8.3.0-pre of cardano-node is being released on 2023 September 5.

UTxO-HD

  • We ran the first successful system-level benchmarks for UTxO-HD (see #203) using the in-memory backend.
    • We observed a regression in the forging performance by a factor of 12, which we will have to address. There are strong indications that the regression is due to the backing store accesses that take place when taking a mempool snapshot.
    • After the mempool regression is fixed the benchmarks need to be run again.
    • System-level UTxO-HD benchmarks with the LMDB backend are still pending.
  • UTxO-HD will eventually be necessary due to the growth of the UTxO set and other ledger state structures that live in memory at the moment. However, we are trying a strategy by which we could preserve the baseline performance of the node, in case SPOs and other node users are not ready to migrate yet (see #344).
  • We implemented a new way of processing queries at the hard-fork block level, which resolves the performance regression observed in GetUTxOByAddress (see this comment). Preliminary results are promising.
  • Regarding the rollout plan, UTxO-HD requires a significant change in the Consensus codebase. Even though we might be able to hide any potential performance impact in the node by keeping all data in memory (#344), the Consensus component was significantly changed, so we might have to postpone releasing this feature to reduce the risk of conflicting with the implementation of CIP-1694 and the release of Conway.

Tech debt

  • We added tests that check that Consensus emits valid CBOR (#3099). This helped us detect a couple of serialization bugs. The tests still need to be merged into the main branch (#323).

Support

  • Nick Frisby finished his release engineer rotation; cardano-node 8.3.0-pre is being released on 2023 September 5.
  • We helped to investigate a protocol version bug in SanchoNet (see #3491).
  • We started to implement the Network interface for bootstrap peer functionality, from which Genesis will benefit as well (see #91.