60 posts tagged with "sre"

· One min read
John Lotoski

High level summary

The SRE team continues work on Cardano environment improvements and general maintenance.

Some notable recent changes, updates or improvements include:

  • A sanchonet relay and pool have been spun up to participate in upcoming community-driven disaster recovery testing.

  • Cardano-submit-api configs will be updated to be compatible with both the legacy and new tracing system in the next cardano-node release.

  • A legacy Matomo deployment is being migrated to a newer stack so deprecated resources can be turned off in the near future.

Repository Work -- Merged

Capkgs

  • Adds an exclusions.json file to exclude packages that are known not to build, with compatibility code added to packages.cr and a new justfile recipe: filter-packages. The exclusions file was populated with all currently failing evals, each with a reason. Go package exclusions were re-added after the missing hydra go deps were re-cached, which required working around a failing test certificate during haskell.nix bootstrap tests. capkgs-pr-7
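The exclusion filtering described above can be pictured with a minimal Python sketch. This is not the capkgs implementation (that lives in packages.cr and the filter-packages recipe); the JSON layout used here, a mapping from package name to reason, and the package names are assumptions for illustration only.

```python
import json

def filter_packages(packages, exclusions):
    """Split package names into those to build and those skipped with a reason.

    `exclusions` maps package name -> reason (a hypothetical layout,
    not necessarily the real exclusions.json schema).
    """
    kept = [name for name in packages if name not in exclusions]
    skipped = {name: exclusions[name] for name in packages if name in exclusions}
    return kept, skipped

# Hypothetical exclusions content and package list.
exclusions = json.loads('{"pkg-b": "fails to evaluate"}')
kept, skipped = filter_packages(["pkg-a", "pkg-b", "pkg-c"], exclusions)
print(kept)     # packages that would still be built
print(skipped)  # excluded packages with their recorded reason
```

Keeping the reason next to each excluded name makes it easier to revisit an exclusion once the underlying failure is fixed.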

Repository Work In Progress -- PRs and Branches

· 2 min read
John Lotoski

High level summary

The SRE team continues work on Cardano environment improvements and general maintenance.

Some notable recent changes, updates or improvements include:

  • A substantial amount of effort went into the Hydra CI build system during this biweekly period to investigate the root cause of aborted builds, which logged both invalid store paths and missing nar cache files. Nushell scripts were written to examine and repair specific closures, and to walk all nix cache objects and proactively resolve any dangling narinfo files, which effectively resolved the aborted builds. Script repair operations were parallelized to speed up the walk across the bucket's large object count. The root cause was a cache truncation operation that purged a small percentage of objects, filtered by oldest age. It deleted narinfo and nar objects non-uniformly, even though they needed to remain paired due to self-references. A more intelligent GC approach will be used in the future.
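To illustrate what a dangling narinfo looks like, here is a minimal Python sketch of the check. It is not the team's Nushell tooling: the in-memory `bucket` dictionary stands in for the real object store, and the short hashes and package names are placeholders. It relies only on the standard narinfo fields `URL` and `References`.

```python
def parse_narinfo(text):
    """Parse 'Key: value' lines of a narinfo file into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        if sep:
            fields[key] = value
    return fields

def find_dangling(bucket):
    """Return narinfo keys whose nar object or referenced narinfo is missing.

    `bucket` maps object key -> content and is a stand-in for a real
    cache bucket listing (an assumption of this sketch).
    """
    dangling = []
    for key, body in bucket.items():
        if not key.endswith(".narinfo"):
            continue
        info = parse_narinfo(body)
        nar_missing = info.get("URL") not in bucket
        refs = info.get("References", "").split()
        # Each reference is "<hash>-<name>"; its narinfo is "<hash>.narinfo".
        ref_missing = any(
            ref.split("-", 1)[0] + ".narinfo" not in bucket for ref in refs
        )
        if nar_missing or ref_missing:
            dangling.append(key)
    return sorted(dangling)

# Placeholder data: the nar for "bbb" was purged, its narinfo was not.
bucket = {
    "aaa.narinfo": "StorePath: /nix/store/aaa-hello\nURL: nar/1.nar.xz\nReferences: aaa-hello bbb-libc\n",
    "nar/1.nar.xz": b"",
    "bbb.narinfo": "StorePath: /nix/store/bbb-libc\nURL: nar/2.nar.xz\nReferences: \n",
}
print(find_dangling(bucket))
```

The sketch only checks direct pairing. A real repair pass would also need to decide what to do with paths such as `aaa-hello` here, whose own objects are intact but whose closure depends on a broken entry.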

Repository Work -- Merged

Blockperf

  • Fixes a new tracing system blockperf implementation error for trace detail level. blockperf-pr-33

Capkgs

  • Re-adds regular hydraJob builds in addition to fetch-closure only builds to ensure the full jobset can be rebuilt from source. capkgs-commit-range

Cardano-airgap

  • Adds more boot options for better video driver support, including nouveau nomodeset fallback and open and closed Nvidia drivers. The dconf config file was updated to use the nixos modules declaration. Logout, shutdown, restart and similar gnome operations were fixed. Additional helper packages were added. See the PR header for details. cardano-airgap-pr-9

Devx-ci

  • Adds ci10, an x86_64-linux builder, to be repurposed later for Equinix metal migration. Sets narinfo-cache-positive-ttl back to its default value and raises the default user nofile limit from 1024 to 4096 to avoid occasional nofile failures. Rekeys required group secrets to include the new machine and adds ci7 and ci8 to the r2 tunnel. Adds a github-hydra-bridge-restarter service to detect when the bridge token has expired and auto-rotate it within one minute of expiration. devx-ci-pr-135
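The "within one minute of expiration" behaviour of the restarter can be reasoned about with a small Python sketch. This is not the actual github-hydra-bridge-restarter service: the polling approach, the 30 second interval and the function name are assumptions, used only to show that a fixed poll interval bounds the detection delay.

```python
from datetime import datetime, timedelta, timezone

def first_detection(expires_at, start, poll_interval):
    """Return the time of the first poll at or after token expiry.

    Polls happen at start, start + poll_interval, and so on
    (a hypothetical schedule, not the real service's logic).
    """
    tick = start
    while tick < expires_at:
        tick += poll_interval
    return tick

start = datetime(2025, 1, 1, 0, 0, 0, tzinfo=timezone.utc)
expires_at = start + timedelta(seconds=70)
detected = first_detection(expires_at, start, timedelta(seconds=30))
print(detected - expires_at)  # detection delay after expiry
```

With any poll interval shorter than one minute, the delay between expiry and detection stays under one minute, because detection always lands on the first poll after expiry.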

Repository Work In Progress -- PRs and Branches

· 2 min read
John Lotoski

High level summary

The SRE team continues work on Cardano environment improvements and general maintenance.

Some notable recent changes, updates or improvements include:

  • An ongoing, intermittent outage with our nix upstream cache storage provider has been investigated. While the issue still persists and we work with the provider to get it resolved, it appears to be isolated to traffic routing through a particular provider colocation. Installing a wireguard tunnel for our cache traffic to route around the affected colo has brought our build farm machines back to normal operation until the provider resolves the issue.

This biweekly is shorter than usual as SRE members attended NixCon 2025 to stay sharp on nix skills, relevant technical knowledge and tooling that can benefit our IOE environments and operations. Additionally, the remaining time was skewed towards internal operations rather than feature development during this period. A new team member has also joined the SRE team!

Repository Work -- Merged

Devx-ci

  • Adds an independent pin for GH runner bumps, and adds use of an r2 wg tunnel to eu-central-1 to work around the problematic CF ARN colo reads. devx-ci-pr-134

Repository Work In Progress -- PRs and Branches

· 2 min read
John Lotoski

High level summary

The SRE team continues work on Cardano environment improvements and general maintenance.

Some notable recent changes, updates or improvements include:

  • An Oakhost cloud Mac M4 machine was onboarded into the mixed Hetzner and on-prem Darwin CI environment. An internal documentation guide was written for fast Oakhost onboarding should a need arise to scale Darwin resources further.

  • Adawallet received a feature upgrade to handle native tokens in a number of subcommands for adawallet accounts. The cardano-airgap image was updated to include this capability.

  • A new IOHK signing authority key for email signing.authority@iohk.io, with GPG key fingerprint ending in 0xCD0171BC90, was placed into production for package signing. The public key can be found here.

Repository Work -- Merged

Adawallet

  • Adds multiasset support to adawallet in select contexts such as send-tx, drain-tx, bulk-drain-tx, migrate-wallet, import-utxos and export-utxos. When the multiasset option is used, native tokens are aggregated into a single UTXO with the minimum amount of lovelace required, and lovelace-only UTXOs are handled separately. A future improvement would be to handle large native token collections which may hit transaction size limits. adawallet-pr-24
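The aggregation behaviour described above can be sketched in Python as follows. This is not adawallet's code: the UTXO dictionary shape, the asset identifiers and the `min_lovelace` value are assumptions, fees are ignored, and on a real network the minimum lovelace depends on protocol parameters and output size.

```python
from collections import Counter

def split_utxos(utxos):
    """Partition UTXOs into token-bearing and lovelace-only lists."""
    with_tokens = [u for u in utxos if u["assets"]]
    lovelace_only = [u for u in utxos if not u["assets"]]
    return with_tokens, lovelace_only

def aggregate_tokens(with_tokens, min_lovelace):
    """Merge all native tokens into one output carrying min_lovelace.

    Returns (token_output, leftover_lovelace). `min_lovelace` is supplied
    by the caller; computing it is out of scope for this sketch.
    """
    assets = Counter()
    lovelace_in = 0
    for utxo in with_tokens:
        assets.update(utxo["assets"])  # sums quantities per asset id
        lovelace_in += utxo["lovelace"]
    if lovelace_in < min_lovelace:
        raise ValueError("not enough lovelace to carry the token output")
    token_output = {"lovelace": min_lovelace, "assets": dict(assets)}
    return token_output, lovelace_in - min_lovelace

# Hypothetical wallet contents.
utxos = [
    {"lovelace": 2_000_000, "assets": {"policy1.tokenA": 5}},
    {"lovelace": 1_500_000, "assets": {"policy1.tokenA": 3, "policy2.tokenB": 1}},
    {"lovelace": 10_000_000, "assets": {}},
]
with_tokens, lovelace_only = split_utxos(utxos)
token_output, leftover = aggregate_tokens(with_tokens, min_lovelace=1_200_000)
print(token_output, leftover, len(lovelace_only))
```

Because every asset ends up in one output, a wallet holding many distinct tokens could produce an output too large for a single transaction, which is the limitation the PR notes as future work.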

Cardano-airgap

Devx-ci

  • Adds the oak-m4-1 machine to the build cluster with associated monitoring, secrets, hydra ssh and build machine definition. Updates darwin.sh with an optional -b, --bindir arg to allow an easier darwin bootstrapping deployment. Bumps the opentofu grafana provider to 4.5.3, creates a common Hetzner linux module plus smaller Hetzner linux variant specific modules, and refactors the existing Hetzner linux machines under those. Moves hetzner-m1.nix to a darwin-state-version.nix module used by both Hetzner and Oakhost darwin machines. Adds a few clarification comments to the distributed builds nixosModule. devx-ci-pr-133

Ops-lib

  • Bumps devShell and machines nix to 2.29.1, bumps deployed machines to nixpkgs 25.05 with a small patch for nixops to continue operating, bumps iohk-nix to current head and niv to latest sources.nix. Cleans up the ssh key overlay and removes sentry relevant package, modules and scripts as usage is deprecated. Makes misc required nixos-25.05 nixpkgs module updates. ops-lib-pr-136

Repository Work In Progress -- PRs and Branches

· 3 min read
John Lotoski

High level summary

The SRE team continues work on Cardano environment improvements and general maintenance.

Some notable recent changes, updates or improvements include:

  • With the exception of a few canary machines still running the legacy cardano-node tracing system, the majority of IOE playground and mainnet cardano-node machines are now running the new tracing system.

  • Adawallet received a feature upgrade to sign messages, enabling it to complete glacier drop claims for adawallet accounts. The cardano-airgap image was updated to include this capability.

Repository Work -- Merged

Adawallet

  • Bumps to node 10.5.1 for cardano-cli 10.11.0.0, adds cardano-signer v1.29.0 to the devShell, adds a yarn devShell, improves the cardano-hw-cli package by switching to default nodejs pkg and greatly simplifies the build. adawallet-pr-22

  • Adds an adawallet sign-msg feature which signs with either a payment or stake key, enabling glacier drop claims on adawallet accounts. Cleans up some more legacy cardano-hw-cli packaging and adds back bash auto-completion. adawallet-pr-23

Cardano-airgap

  • Updates adawallet for sign-msg support for glacier drop and adds cardano-signer and misc support packages and services to the devShell and iso. cardano-airgap-pr-6

Cardano-mainnet

  • This PR primarily upgrades all machines to the cardano-node new tracing system. It provides alert, dashboard and nix module upgrades for compatibility with the new tracing system. This PR includes improvements from cardano-parts release v2025-08-05. Additional details can be found in the PR description. cardano-mainnet-pr-38

Cardano-node

  • Adds the snapshot-converter binary to the nix overlay and the node OCI container. Adds documentation on how to use the snapshot-converter within the image for changing ledger state type. cardano-node-pr-6299

Cardano-parts

  • This cardano-parts release changes the default tracing system from legacy to the new cardano-node tracing system for deployed machines. See the release notes for details. cardano-parts-release-v2025-08-05

  • Updates cardano-signer to v1.29.0 which allows for Byron era address claims for Midnight Glacier drop. Bumps mithril unstable and adds some flakeModule cluster options for more service granularity. cardano-parts-release-v2025-08-14

Cardano-playground

  • This PR primarily upgrades all playground testnet machines with a few canary exceptions to the cardano-node new tracing system. It provides alert, dashboard and nix module upgrades for compatibility with the new tracing system. This PR includes improvements from cardano-parts release v2025-08-05. Additional details can be found in the PR description. cardano-playground-pr-45

Cardano-signer (nix packaged)

  • The nix packaging for upstream cardano-signer was updated to release 1.29.0 for Byron address glacier drop compatibility, and a GHA for CI build testing was added. cardano-signer-pr-2

Repository Work In Progress -- PRs and Branches