After an unexpected crash or sudden stop of one of the Postgres containers, the database may no longer be able to locate a valid checkpoint.
The following log entry can be observed in the affected Postgres container:
PANIC: could not locate a valid checkpoint record
Restarting the container does not resolve the issue automatically, because Postgres keeps looking for a checkpoint record that is probably corrupted.
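To confirm that a container is stuck on this specific error, its log output can be searched for the message. The following is a minimal sketch, assuming a Kubernetes setup with `kubectl` available; the function name `has_checkpoint_panic` and the `<pod-name>` placeholder are illustrative and not part of the original procedure.

```shell
#!/bin/sh
# Hypothetical helper: reads log text on stdin and succeeds (exit 0)
# if it contains the checkpoint PANIC message.
has_checkpoint_panic() {
    grep -q 'PANIC:.*could not locate a valid checkpoint record'
}

# Example usage (assumes kubectl access; <pod-name> is a placeholder):
#   kubectl logs <pod-name> --previous | has_checkpoint_panic && echo "checkpoint record missing"
```

The `--previous` flag asks for the logs of the last terminated container instance, which is usually the one that hit the PANIC when the pod is crash-looping.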
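The fix is to reset the write-ahead log (WAL) and other control information of the PostgreSQL database cluster with pg_resetwal. The data files themselves are not modified, although transactions that were recorded only in the discarded WAL may be lost.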
Proceed with the following steps in the concerned
locate the faulty
add the fields
Save and redeploy (the faulty container should no longer restart immediately when it fails)
Open a bash shell in the pod
su postgres
pg_resetwal /var/lib/postgresql/data
Once the database is accessible again, revert the changes made in the previous steps
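The reset step above can be sketched as a small shell function, to be run as the postgres user inside the pod while the server is stopped. It is a sketch under assumptions: the function name `reset_wal` is hypothetical, the data directory defaults to the path used in this article, and the `--dry-run` preview is an extra precaution that the original steps do not include.

```shell
#!/bin/sh
# Hypothetical wrapper around pg_resetwal; run as the postgres user.
reset_wal() {
    datadir="${1:-/var/lib/postgresql/data}"

    # Preview the control values that would be written, without changing anything.
    pg_resetwal --dry-run "$datadir" || return 1

    # Perform the actual reset of the write-ahead log.
    pg_resetwal "$datadir"
}

# Example usage inside the pod, after `su postgres`:
#   reset_wal /var/lib/postgresql/data
```

If pg_resetwal refuses to proceed because the server was not shut down cleanly, its error message points to the `-f` option. Forcing the reset is a judgment call, so it is deliberately left out of this sketch.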
📎 Related articles
https://stackoverflow.com/questions/60604699/postgres-k8s-panic-could-not-locate-a-valid-checkpoint-record-crashloopbac
https://www.postgresql.r2schools.com/postgresql-panic-could-not-locate-a-valid-checkpoint-record/