SRE Team Update

John Lotoski

High level summary

The SRE team continues work on Cardano environment improvements and general maintenance.

Some notable recent changes, updates or improvements include:

  • Cardano-node 10.6.0 has been pre-released, and the corresponding long-running SRE PRs are now merged into this release! See the release notes for details.

  • The SRE team identified a ledger replay bug in the 10.6.0 release candidate in which the legacy tracing system no longer logged ledger replay update statistics. A fix was implemented before the release was tagged and pre-released.

  • Near the end of this biweekly reporting period, the preview network experienced a network partition. After an intense collaborative effort across multiple teams and the community, in which the SRE team participated from the beginning, the bug causing the partition was identified and a new cardano-node version, 10.5.2, was released to fix it. This version was deployed to the IOE preview machines immediately upon release and shortly afterwards to the rest of the IOE testnet and mainnet infrastructure.

Repository Work -- Merged

Cardano-mainnet

  • Bumped cardano-parts to release v2025-11-18

  • Updated CloudFormation terraformState.nix and opentofu/cluster.nix for corresponding tagging updates

  • Added new required flake cluster attribute declaration for new required resource tagging

  • Added a matomo nix module prototype in prep for a legacy bitte cluster matomo migration to prod

  • Fixed script breakages caused by cardano-cli breaking changes

  • Added a smash delisting

  • Rotated mainnet KES keys

  • Adjusted an alert to reduce noise from the pool 1 infrequent forging threshold

    cardano-mainnet-pr-39

Cardano-parts

  • Bumped the cardano-node pre-release to 10.6.0, mithril to 2543.1-hotfix, and blockperf to a fix branch that includes a patch for correct blockperf configuration under the new tracing system

  • Added rsync ssm help bash function and alias to the common machine profile

  • Added peer snapshot files to the ops library function generateStaticHTMLConfigs

  • Added new flakeModule cluster.nix options under infra.generic: costCenter, owner and project

  • Added zsh devShell command completion

  • Updated a number of nixosModules to support both new and legacy tracing systems as well as 10.6.0 and 10.5.1 configuration differences

  • Updated template CloudFormation terraform state and opentofu cluster resource definitions for corresponding tagging updates

  • Fixed template script breakages caused by cardano-cli breaking changes included in the 10.6.0 pre-release

  • Fixed the new tracing system configuration in the profile-blockperf.nix nixosModule

    cardano-parts-release-v2025-11-18

Cardano-playground

  • Added book config updates for 10.6.0 pre-release environments: preprod, preview

  • Added sanchonet environment configs for community disaster test participation

  • Added a "New Pool" document explainer at docs/explain/new-pool.md

  • Added new required flake cluster attribute declaration for new required resource tagging

  • Added a matomo nix module prototype in prep for a legacy bitte cluster matomo migration to prod

  • Added wireguard tunnel endpoints as temporary workarounds for R2 colo bucket http streaming/timeout issues

  • Added misc improvements to playground scripts for governance voting

  • Updated CloudFormation terraform state and opentofu cluster resource declarations for corresponding tagging updates

  • Updated CI to a smaller representative machine subset

  • Updated preview, preprod and non-prod test forgers with KES rotation

  • Fixed script breakages caused by cardano-cli breaking changes

  • Voted on a preview and preprod governance action with drep/pools and CC

    cardano-playground-pr-51

Iohk-nix

  • Merged the non-forger and forger configs; the node now handles the differences internally based on forger status (i.e. PeerSharing, TargetNumberOfKnownPeers, TargetNumberOfRootPeers)

  • Included peerSnapshotFile for all networks, now at v2

  • Allowed SRV records for bootstrap resource definitions

  • Adjusted the default networking mode to p2p without explicit declaration, since p2p is the only mode for node versions >= 10.6.0

  • Bumped minNodeVersion to 10.6.0 for the default config changes

  • Adjusted the testnet templates so that the plutus v3 cost model has 251 parameters, matching mainnet, and added a Dijkstra genesis

    iohk-nix-pr-602

Repository Work In Progress -- PRs and Branches