BTRFS error (lots of them)

Hi all,

Sorry I’ve been very quiet the last 6 months or so, not been very involved. :pensive:

I had lots of console errors last night on my system drive (nvme - Rockstor 3.92-57 still, 213G free on / ).

It was fine the day before, but last night, when I went to shut down from the UI, I couldn’t connect. I couldn’t ssh in either, and it wasn’t showing up in the DHCP leases on the router, so I plugged in a monitor and keyboard to see what was going on.
It happened last week, too (not being able to connect to the UI or by ssh; I didn’t connect a monitor and keyboard on that occasion).

Also on the console there was:

systemd-journald Failed to send WATCHDOG=1 notification message: Transport endpoint is not connected

I guess if the system disk is having problems, logging to it is going to be compromised.

After a hard shutdown it restarted successfully, but there doesn’t seem to be anything in the logs about it… again, I guess if the disk/filesystem completely went haywire you wouldn’t see anything.

I’ll have a closer look today. Any advice on things to check/maintenance to do?

Many thanks again, and best wishes,
Matthew

@HarryHUK Hello again.
Re:

The two most common hardware failure points are PSUs and RAM. You could try a memtest86+ boot disk such as we link to in our “Pre-Install Best Practice (PBP)” doc section:
https://rockstor.com/docs/installation/pre-install-howto.html#pre-install
Failing memory will of course corrupt ‘in-flight’ filesystem transactions, and this in turn can lead to filesystem corruption, or at least checksum errors. And with only a single disk, btrfs has no second copy to source its repair from. Unfortunately, our system disk also defaults to single metadata.
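One way to see whether btrfs has already recorded such errors is its per-device error counters, as printed by `btrfs device stats`. Below is a minimal sketch of reading them, assuming Python 3.7+ is available and that the command prints lines of the form `[/dev/...].corruption_errs 0`. The function names and the default mount point `/` are my own illustration, not anything Rockstor ships.

```python
import re
import subprocess

# Matches one counter line, e.g. "[/dev/nvme0n1p3].corruption_errs   0"
STAT_LINE = re.compile(r"^\[(?P<dev>[^\]]+)\]\.(?P<counter>\w+)\s+(?P<value>\d+)$")


def parse_device_stats(text):
    """Return {(device, counter): value} parsed from `btrfs device stats` output."""
    stats = {}
    for line in text.splitlines():
        match = STAT_LINE.match(line.strip())
        if match:
            stats[(match["dev"], match["counter"])] = int(match["value"])
    return stats


def nonzero_counters(stats):
    """Keep only the counters that have recorded at least one error."""
    return {key: value for key, value in stats.items() if value > 0}


def read_device_stats(mount_point="/"):
    """Run `btrfs device stats <mount_point>` (usually needs root) and return its text."""
    result = subprocess.run(
        ["btrfs", "device", "stats", mount_point],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def report(mount_point="/"):
    """Print every non-zero error counter for the filesystem at mount_point."""
    errors = nonzero_counters(parse_device_stats(read_device_stats(mount_point)))
    if not errors:
        print("No btrfs device errors recorded for", mount_point)
    for (device, counter), value in sorted(errors.items()):
        print(f"{device} {counter} = {value}")
```

Calling `report("/")` as root would list any counters (write_io_errs, read_io_errs, flush_io_errs, corruption_errs, generation_errs) that are above zero. These counters persist across reboots, so a non-zero value may date from an earlier incident.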

Another thing to check is the cables. They can sometimes work loose via thermal cycling / vibration etc.

Hope that helps and let us know how you get on.


Cheers, @phillxnet .

I built the system just before Christmas. I’ll run the memtest86+ and check the cables.

The system disk is an m.2 nvme disk (a 2280, but the MB only has a hole to screw down a 2242, so it’s just held in place by friction in the slot… I could not find an m.2 nvme disk in 2242 for love nor money).

I have a 2242 m.2 SATA disk, but that would take up a SATA place in the MB SATA controller, which I wanted to keep for data disk swapping/upgrading. Might get a SATA PCI controller to compensate if I replace the system disk with the m.2 sata. In Sept I really must think of installing 4.0.x, and using a fresh disk might be an idea.

EDIT: I’ve just found that 2242 m.2 nvme drives are available again!
