51 posts tagged with "sre"

· 2 min read
John Lotoski

High level summary

The SRE team continues work on Cardano environment improvements and general maintenance.

Some notable recent changes, updates or improvements include:

  • Cardano-parts environments can now be migrated from Grafana Agent to Grafana Alloy.

  • Most cardano-parts downstream clusters have now been migrated to Grafana Alloy.

Repository Work

Cardano-parts

  • Migrates to grafana alloy from grafana agent. Drops deprecated cardano-node service features and iohk-nix legacy mainnet relay filtering. Fixes cardano-parts jobs for cardano-cli breaking change compatibility. More detail is available in the release notes: cardano-parts-release-v2024-10-22

Cardano-playground

  • Migrates to grafana alloy for metrics collection, rotates sanchonet KES, fixes nix packaging and nginx deployment for the latest govtool develop, and deploys it to sanchonet. More detail is available in the PR description: cardano-playground-pull-34

Cardano-mainnet

  • Migrates to grafana alloy for metrics collection, fixes scheduled restart service status file. More detail is available in the PR description: cardano-mainnet-pull-24

Cardano-node

  • Removes legacy mainnet relay filter for bootstrap attr generation, bumps the iohkNix pin for similarly updated topology generation, updates mainnet topology config to match updated iohk-nix topology ordering. cardano-node-pull-6011

Govtool

  • Fixes nix builds for frontend and backend components, which required a node_nixpkgs pin update and regeneration of yarn.lock. Fixes impure nix builds required by some deployers by adding an optional returnShellEnv bool to the backend package. govtool-pull-2184

Iohk-nix

  • Removes deprecated legacy relays from mainnet env and corresponding filtering for bootstrap generation, simplifies bp config generation, arranges mainnet edgeNodes in alphabetical order. iohk-nix-pull-587

· 2 min read
John Lotoski

High level summary

The SRE team continues work on Cardano environment improvements and general maintenance.

Some notable recent changes, updates or improvements include:

  • All environments have been upgraded to cardano-node 9.2.1.

  • Cardano-faucet 9.2, which is compatible with node 9.2.x and has fixed ipv6 functionality, is available and deployed.

  • All deployed machines now default to nix 2.24-maint, now that an upstream bug causing a hash miscalculation in submodules has been fixed. Nix 2.21 and later required some rework of the colmena deployment recipes, as dirty git trees now force an impure colmena deployment.

  • To ease the process of upgrading cardano-parts, releases will now be made instead of only PR merges with migration notes.
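As a rough illustration of how date-based releases can ease an upgrade, the sketch below lists which releases fall between the version a cluster currently runs and the version it is moving to, so their migration notes can be read in order. It assumes tags of the form vYYYY-MM-DD, inferred from release names such as v2024-10-07. The function names and the v2024-09-01 tag are hypothetical and are not part of cardano-parts.

```python
from datetime import date, datetime
from typing import Iterable, List


def parse_release_tag(tag: str) -> date:
    """Parse a tag such as 'v2024-10-07' (assumed format) into a date."""
    return datetime.strptime(tag, "v%Y-%m-%d").date()


def releases_to_review(available: Iterable[str], current: str, target: str) -> List[str]:
    """Return tags newer than `current`, up to and including `target`, oldest first."""
    lo, hi = parse_release_tag(current), parse_release_tag(target)
    dated = sorted((parse_release_tag(tag), tag) for tag in set(available))
    return [tag for released, tag in dated if lo < released <= hi]


if __name__ == "__main__":
    # 'v2024-09-01' is a made-up starting point for illustration only.
    tags = ["v2024-10-22", "v2024-10-07", "v2024-09-01"]
    print(releases_to_review(tags, current="v2024-09-01", target="v2024-10-22"))
```

Because the tags sort chronologically, no separate version-ordering scheme is needed to decide which notes apply.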

Repository Work

Cardano-faucet

  • Binds ipv6 interface in addition to ipv4, parses and logs all ips to a unified ipv6 format and applies hlint and fmt updates. cardano-faucet-pull-14

  • Makes required changes for cardano-api 9.2.0.0 and 9.3.0.0, removes void type sig constraints, bumps haskellNix, CHaP, cardano-api -> 9.3.0.0 for node 9.2.x compatibility. Disables mingw32 builds until alex in current haskellNix pin is updated. cardano-faucet-pull-15
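To illustrate what logging all IPs in a unified ipv6 format can look like, here is a minimal Python sketch. It assumes that "unified" means representing ipv4 addresses as IPv4-mapped ipv6 addresses (the ::ffff:0:0/96 range). That is an assumption, not something confirmed by the faucet PR. The faucet itself is written in Haskell, and the function name here is hypothetical.

```python
import ipaddress


def to_unified_ipv6(raw: str) -> str:
    """Return `raw` as an IPv6 string; IPv4 input becomes IPv4-mapped IPv6.

    Assumption: 'unified ipv6 format' means the ::ffff:0:0/96 mapped range.
    """
    addr = ipaddress.ip_address(raw.strip())
    if isinstance(addr, ipaddress.IPv4Address):
        # Embed the 32-bit IPv4 value in the low bits of the mapped prefix.
        addr = ipaddress.IPv6Address((0xFFFF << 32) | int(addr))
    return addr.compressed


if __name__ == "__main__":
    # Documentation-range addresses, used purely as examples.
    for client in ["192.0.2.1", "2001:DB8:0:0:0:0:0:1"]:
        print(client, "->", to_unified_ipv6(client))
```

With a single representation, log searches and per-client rate limiting do not need separate ipv4 and ipv6 code paths.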

Cardano-parts

  • Sets cardano-node to 9.2.1, cardano-faucet to 9.2. Bumps nix to 2.24-maint and adds ipv6 and nix versioning fixes and other improvements. Begins cardano-parts date-based releases for an improved upgrade process. Adds misc fixes and improvements. More detail is available in the release notes: cardano-parts-release-v2024-10-07

Cardano-playground

  • Deploys cardano-node to 9.2.1, cardano-faucet to 9.2. Bumps nix to 2.24-maint and adds ipv6 and nix versioning fixes and other improvements. Adds a wip node pparams api server. More detail is available in the PR description: cardano-playground-pull-33

Cardano-mainnet

  • Deploys cardano-node to 9.2.1, bumps nix to 2.24-maint and adds ipv6 and nix versioning fixes and other improvements. Converts bootstraps to a new cached-index-patch branch and upgrades CF canary sql queries. More detail is available in the PR description: cardano-mainnet-pull-23

· 2 min read
John Lotoski

High level summary

The SRE team continues work on Cardano environment improvements and general maintenance.

Some notable recent changes, updates or improvements include:

  • All environments have been upgraded to cardano-node 9.2.0.

  • All IOE-run cardano-parts clusters (i.e., the sanchonet, preview, preprod and other testnets, mainnet, and network-team clusters) have been upgraded to support ipv4/ipv6 dual stack operations. This includes each cardano network's respective public access or backbone DNS, which now offers AAAA records for ipv6 connections.
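One way to confirm that a hostname is reachable over both stacks is to check that it resolves to at least one ipv4 and one ipv6 address. The following is a minimal sketch using Python's standard resolver. The hostname relay.example.org and port 3001 are placeholders chosen for illustration, not actual cluster endpoints, and the function names are hypothetical. Note that getaddrinfo can also consult local sources such as a hosts file, so this is an approximation of a DNS A/AAAA check.

```python
import socket
from typing import Callable, Dict, List


def address_families(
    host: str,
    port: int = 3001,  # placeholder port, an assumption for this sketch
    resolver: Callable[..., list] = socket.getaddrinfo,
) -> Dict[str, List[str]]:
    """Group the addresses `host` resolves to by record type (A or AAAA)."""
    found: Dict[str, List[str]] = {"A": [], "AAAA": []}
    for family, _type, _proto, _canon, sockaddr in resolver(
        host, port, proto=socket.IPPROTO_TCP
    ):
        if family == socket.AF_INET:
            key = "A"
        elif family == socket.AF_INET6:
            key = "AAAA"
        else:
            continue
        ip = sockaddr[0]
        if ip not in found[key]:
            found[key].append(ip)
    return found


def is_dual_stack(
    host: str,
    port: int = 3001,
    resolver: Callable[..., list] = socket.getaddrinfo,
) -> bool:
    """True when `host` has at least one ipv4 and one ipv6 address."""
    found = address_families(host, port, resolver)
    return bool(found["A"]) and bool(found["AAAA"])


if __name__ == "__main__":
    host = "relay.example.org"  # hypothetical hostname
    try:
        print(host, address_families(host))
    except socket.gaierror as err:
        print(f"could not resolve {host}: {err}")
```

The resolver is passed in as a parameter so the check can be exercised without network access, for example with a stub that returns fixed records.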

Repository Work

Cardano-parts

  • Sets cardano-node to 9.2.0. Adds ipv6 tf, module and recipe support for ipv4/ipv6 dual stack operations. Updates alerts and dashboards for the new tracing system to reflect metrics name changes and legacy metric prefix normalization. Adds misc fixes and improvements. More detail is available in the PR description: cardano-parts-pull-48

Cardano-playground

  • Deploys cardano-node to 9.2.0. Converts all relevant cluster resources and machines to ipv4/6 dual-stack operations. Updates alerts and dashboards for the new tracing system to reflect metrics name changes and legacy metric prefix normalization. More detail is available in the PR description: cardano-playground-pull-32

Cardano-mainnet

  • Deploys cardano-node to 9.2.0. Converts all relevant cluster resources and machines to ipv4/6 dual-stack operations. Adds new bootstrap scaling machine startup and shutdown recipes. Updates alerts and dashboards for the new tracing system to reflect metrics name changes and legacy metric prefix normalization. More detail is available in the PR description: cardano-mainnet-pull-22

Ouroboros-network-ops

  • Deploys cardano-node to 9.2.0. Converts all relevant cluster resources and machines to ipv4/6 dual-stack operations. Updates alerts and dashboards for the new tracing system to reflect metrics name changes and legacy metric prefix normalization. More detail is available in the PR description: ouroborous-network-ops-18

· 2 min read
John Lotoski

High level summary

The SRE team continues work on Cardano environment improvements and general maintenance.

Some notable recent changes, updates or improvements include:

  • Mainnet was hard forked to the Conway era!

  • Legacy mainnet relays from the cardano-ops cluster were stopped and retired.

  • The legacy cardano-explorer hosted at explorer.cardano.org was retired, with the landing page and beta explorer services now provided by the Cardano Foundation.

  • Cardano-smash production load was cut over from the legacy cardano-world cluster to the replacement cardano-mainnet cluster. Remaining cardano-world resources will be retired in the near future.

  • Cardano-faucet was updated for cardano-node 9.1.x level compatibility.

Repository Work

Cardano Faucet

  • Brings faucet up to the cardano-api and cardano-cli level of cardano-node 9.1: bumps relevant flake pins, updates CHaP indexes, applies fixes for upstream breaking changes, removes cardano-addresses srp, adjusts ghc options, fixes mingw32 CI builds, and applies most hlint and fourmolu style and config suggestions respectively: cardano-faucet-pull-12

Cardano Parts

  • Sets cardano-node to 9.1.1, cardano-db-sync to 13.5.0.2, cardano-faucet to 9.1. Adds alerts, dashboard fixes, nixos iowait optimization, smash and blockperf nixosModule improvements. More detail is available in the PR description: cardano-parts-pull-47

Cardano-mainnet

  • Deploys cardano-node to 9.1.1, cardano-db-sync to 13.5.0.2. Improves smash deployments and backup role for production load handling. Improvements made in cardano-parts PR#47 are included in this PR. More detail is available in the PR description: cardano-mainnet-pull-21

Cardano-playground

  • Deploys cardano-node to 9.1.1, cardano-db-sync to 13.5.0.2, cardano-faucet to 9.1. Tests RTS parameter optimization and tracing system changes on preview network machines, and tests utxo-hd-9.1.1 on mainnet edge nodes. Improvements made in cardano-parts PR#47 are included in this PR. More detail is available in the PR description: cardano-playground-pull-31

Cardano-world

  • Destroys retired legacy explorer metal machines and disables alerting: commit-compare

· 3 min read
John Lotoski

High level summary

The SRE team continues work on Cardano environment improvements and general maintenance.

Some notable recent changes, updates or improvements include:

  • The preprod network was hard forked to the Conway era.

  • The nixosModule profile-blockperf in cardano-parts now includes prometheus metrics, automatically scraped with grafana-agent along with a dashboard.

  • A nixosModule profile-tcpdump in cardano-parts is now available to push ongoing pcaps to s3 for historical reference.

  • Old dev environments were cleaned up and retired after the completion of the ouroboros-network-ops cluster migration to the cardano-parts stack.

  • The causes of delayed block headers on mainnet relays, as indicated by blockperf, were investigated, and the delays were reduced with adjustments to RTS parameters and machine class.

  • Conway-era mempool log volume increase was investigated and resolved with ouroboros-network improvements.

  • Scaling capability was added to the cardano-mainnet bootstrap cluster.

Repository Work

Cardano Parts

  • Sets cardano-db-sync (release) to 13.4.0.0. Includes nixosModule improvements to cardano-db-sync snapshots module with a manual trigger, blockperf module new prom metrics, grafana-agent module with auto-blockperf scrape config and a new tcpdump module for persistent pcaps to s3. Recipe improvements for configuration consistency checking and openTofu improved AMI and DNS filtering have been made. The AWS machine reference spec has been updated and one alert tuned for better sensitivity. More detail is available in the PR description: cardano-parts-pull-46

Cardano-mainnet

  • Deploys cardano-db-sync (release) to 13.4.0.0. Deploys nixosModule improvements for cardano-db-sync snapshots module with a manual trigger, blockperf module with new prom metrics, grafana-agent module with auto-blockperf scrape config and a new tcpdump module for persistent pcaps to s3. Recipe improvements for configuration consistency checking and openTofu improved AMI and DNS filtering have been made. Makes changes to pool group relays to eliminate or reduce delayed block headers. Tests additional dev patches for missingBlock errors. Adds bootstrap cluster scaling capability and a bootstrap cluster dashboard. Improvements made in cardano-parts PR#46 are included in this PR. More detail is available in the PR description: cardano-mainnet-pull-20

Cardano-ops (Legacy Mainnet)

  • Over a two-week period, the legacy relay nodes were scaled down a further 50% from the recent machine quantity peak: commit-compare

Cardano-playground

  • Preprod was hard-forked to Conway. Deploys cardano-db-sync to 13.4.0.0. Recipe improvements for configuration consistency checking and openTofu improved AMI and DNS filtering have been made. Improvements made in cardano-parts PR#46 are included in this PR. More detail is available in the PR description: cardano-playground-pull-30

Cardano-world

  • Updates openssh to 9.8p1 on the remaining cardano-world (soon-to-be-retired) cluster machines: commit