This afternoon, @David_s pinged me on Discord to let me know that all mod blocks had disappeared from nearby grids and the server’s top speed was the default 100 m/s. On further investigation, I discovered that the server no longer had mods configured at all, and hadn’t since at least the preceding restart half an hour earlier. I immediately turned off the server until I could dig further. The server is now back up and running, but I was unable to restore grids damaged by this incident.
Honestly, I still don’t fully understand why the game deconfigured all mods. I did read through the server log files, though, and the most recent one contained entries indicating that the server had been unable to contact Steam during mod refresh (a normal part of server startup). This is my best guess as to the smoking gun.
Why Did We Lose Data?
Space Engineers’ default backup configuration creates a new backup save every five minutes, and retains the five most recent backups. This gives us a 25-minute rollback window for any changes, including accidental changes and server configuration problems. We did not discover the problem with the server’s mod list until more than 30 minutes had passed, so none of the remaining backups contained data with the mod list intact.
Space Engineers discards individual blocks on load if the block is associated with a mod that is not loaded in the server’s configuration. This caused the game to throw away modded guns, ammunition, and other objects.
I rely primarily on these automatic backups for ongoing operational continuity. Beyond them, I do not make backups of the game on a regular schedule, although I do make as-needed backups before upgrades and other major changes. The most recent of those was more than a week old and would not have helped recover from this incident.
To reduce the risk of future data loss from the same root cause, I’ve made changes to the server configuration. Backups are now created every ten minutes instead of every five. I’ve also increased the number of retained backups from five to twelve, giving us a 120-minute rollback window in total. Given how observant y’all are, I believe this is adequate to allow us to recover if we’re affected by this problem again.
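For anyone who wants to check the arithmetic, here is a minimal sketch of the rollback-window calculation. It assumes the window is simply the backup interval multiplied by the number of retained backups, which is an upper bound (the oldest backup may be slightly newer than that at any given moment). The function names are made up for illustration and are not part of the game or its configuration.

```python
def rollback_window_minutes(interval_minutes: int, backups_kept: int) -> int:
    """Approximate how far back (in minutes) the oldest retained backup reaches.

    Assumption: backups are taken at a fixed interval and only the newest
    `backups_kept` are retained, so the window is interval * count.
    """
    return interval_minutes * backups_kept


def recoverable(window_minutes: int, detection_delay_minutes: int) -> bool:
    """A problem is recoverable only if it is noticed before the window closes."""
    return detection_delay_minutes < window_minutes


old_window = rollback_window_minutes(5, 5)     # old settings: 5 min x 5 backups
new_window = rollback_window_minutes(10, 12)   # new settings: 10 min x 12 backups

# The incident went unnoticed for more than 30 minutes; 30 is used as a floor.
detection_delay = 30

print(old_window, recoverable(old_window, detection_delay))
print(new_window, recoverable(new_window, detection_delay))
```

With the old settings, a 30-minute delay already falls outside the window; with the new ones, it fits with plenty of room to spare.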
The increased time between backups may make timewarps worse when the server crashes. I believe this happens infrequently enough that it won’t be an issue, but the server did crash last night. It happens. If this turns out to be a problem, we can revisit it and make other choices.
I took the opportunity while I had the server offline to do some chores:
I’ve updated the MOTD, as outlined in the MOTD Changes thread.
I’ve updated the Trash Removal settings. Grids are now only protected out to three kilometers from the nearest player, down from ten. This should help reduce the sim-speed drops we’ve been seeing lately. (Other trash-removal settings have not changed: stations, powered grids, respawn points, and grids with production facilities are still protected regardless of distance, as are grids with more than 20 blocks.)
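If it helps to see the trash-removal rules in one place, here is a rough sketch of how I read them. This is not the game’s actual implementation: the Grid class, its field names, and the is_protected function are hypothetical, and treating a grid at exactly three kilometers as protected is my assumption.

```python
from dataclasses import dataclass

# Values taken from the settings described above.
PLAYER_PROTECTION_RADIUS_M = 3_000   # down from 10_000
BLOCK_COUNT_THRESHOLD = 20           # grids with MORE than this are protected


@dataclass
class Grid:
    """Hypothetical stand-in for a grid; not a real game type."""
    distance_to_nearest_player_m: float
    block_count: int
    is_station: bool = False
    is_powered: bool = False
    has_respawn_point: bool = False
    has_production: bool = False


def is_protected(grid: Grid) -> bool:
    """Return True if any of the listed protections applies to the grid."""
    return (
        grid.distance_to_nearest_player_m <= PLAYER_PROTECTION_RADIUS_M
        or grid.is_station
        or grid.is_powered
        or grid.has_respawn_point
        or grid.has_production
        or grid.block_count > BLOCK_COUNT_THRESHOLD
    )


# A small, unpowered grid 5 km from anyone used to be protected and no longer is.
debris = Grid(distance_to_nearest_player_m=5_000, block_count=8)
# A powered ship parked far away stays protected regardless of distance.
parked_ship = Grid(distance_to_nearest_player_m=50_000, block_count=8, is_powered=True)

print(is_protected(debris), is_protected(parked_ship))
```

In practice, the only grids affected by this change are small, unpowered ones without production or respawn blocks sitting between three and ten kilometers from the nearest player.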