Upgrade etherpad.wikimedia.org to the most recent version (v1.9.7).
Preparation work:
- build new etherpad-lite Debian package for 1.9.7
- build new prometheus-etherpad-exporter package
- prepare new etherpad VM (bookworm, etherpad1004)
- test etherpad-lite 1.9.7 on devtools
  - installation works (puppet run and installation for 1.9.7 + exporter)
  - mysql and proxy are missing
- apply role(etherpad) to etherpad1004 and set profile::etherpad::service_ensure: stopped for etherpad1004
- run puppet on etherpad1004 and verify successful installation
- make Grafana dashboard compatible with multiple etherpad instances
- announce the maintenance window a few days in advance
Maintenance (switch from etherpad1003 to etherpad1004):
- create a downtime for etherpad1003
- stop etherpad-lite.service on etherpad1003 (set profile::etherpad::service_ensure: stopped or stop the VM) https://gerrit.wikimedia.org/r/1005961
- wait until etherpad-lite has stopped cleanly
- set profile::etherpad::service_ensure: running for etherpad1004 https://gerrit.wikimedia.org/r/1005962
- run puppet on etherpad1004 and verify puppet agent logs
- change etherpad.wmnet CNAME to etherpad1004.eqiad.wmnet and run authdns-update https://gerrit.wikimedia.org/r/1005963
- wait 300s for DNS caches to expire
- visit https://etherpad.wikimedia.org and check that etherpad still works
  - open old pads
  - create new pads
  - check metrics
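
The DNS wait in the maintenance steps above could be scripted as a poll instead of a fixed sleep. The following is only a minimal sketch, not part of the actual procedure: the function names wait_for_cname and system_resolver are hypothetical, and it assumes that the local resolver reports the CNAME target as the primary host name. A locally cached answer may still show the old target until its TTL expires.

```python
import socket
import time
from typing import Callable


def system_resolver(name: str) -> str:
    """Return the primary (canonical) host name the system resolver reports.

    Assumption: for a CNAME record this is the CNAME target.
    """
    return socket.gethostbyname_ex(name)[0]


def wait_for_cname(
    resolve: Callable[[str], str],
    name: str,
    expected: str,
    timeout: float = 300.0,
    interval: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until `name` resolves to `expected`, or give up after `timeout` seconds."""
    # Compare names case-insensitively and ignore a trailing dot.
    target = expected.rstrip(".").lower()
    deadline = clock() + timeout
    while True:
        try:
            if resolve(name).rstrip(".").lower() == target:
                return True
        except OSError:
            # Resolution errors (socket.gaierror, socket.herror) count as "not switched yet".
            pass
        if clock() >= deadline:
            return False
        sleep(interval)
```

It would be called as wait_for_cname(system_resolver, "etherpad.wmnet", "etherpad1004.eqiad.wmnet"), and only after it returns True would the manual checks on https://etherpad.wikimedia.org start. The resolver, sleep and clock are passed in as parameters so the loop can be exercised without touching real DNS.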
After maintenance:
- stop etherpad1003 and decommission it after a grace period
- apply etherpad role to replica in codfw