Milan@discuss.tchncs.de

Well hello again. I have just learned that the host that recently had both NVMe drives fail upon drive replacement now has new problems: the filesystem reports permanent data errors affecting the databases of both the Matrix server and the Telegram bridge.

I have just rented a new machine and am about to restore the database snapshot of July 26, just in case. All the troubleshooting of the recent days was very exhausting; however, I will try to do or at least prepare this within the upcoming hours.

Update

After a rescan the errors have gone away; however, the drives logged errors too. The question now is whether the data integrity can be trusted.

Status August 1st

Well … good question … optimizations were made last night, the restore was successful and … we are back to debugging outgoing federation :(

The new hardware will also be a bit more powerful … and yes, I have not forgotten that I wanted to update that database. It's just that I was busy debugging federation problems.

References

federation issues after restore: https://github.com/matrix-org/synapse/issues/16025
why we had to restore initially: https://text.tchncs.de/tchncs/about-the-matrix-incident-on-july-26-2023

[Matrix server] Upcoming maintenance and backup restore July 31
