BTRFS scrubbing, why is it this slow?

Nearly every article I have read about BTRFS maintenance recommends that you scrub regularly.

Some say do it once every week, some say once every month. I guess the frequency is not important as long as you do it on some regular basis.

So I do that, although at a frequency more like once every 2-3 months. BTRFS has never reported errors.

But I wonder why it has to be so slow.

If I use iostat to see what is happening on the disks during a scrub, I see that data is being read at a rate of approximately 22 megabytes per second (MB/s) from each disk. This is far from the speed of the disks (all of them can read at 75+ MB/s), but no matter. I have 7 disks, which means about 154 MB/s is being read in total.

When I monitor the scrub with “btrfs scrub status” I can see the progress. If I divide the progress by the elapsed time to get MB/s, I get a figure of about 40 MB/s scrubbing speed. Where does the rest of the 154 MB/s go? Why this overhead?

Since I run RAID6, I could understand if scrubbing was something like 110 MB/s (22x5), since data is only on 5 disks, but 40 MB/s seems a little slow.
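The numbers above can be laid out as a small back-of-the-envelope sketch. It only restates the figures quoted in this post (approximate readings from iostat and "btrfs scrub status"); the variable names are made up for illustration, and it assumes the simple model that a 7-disk RAID6 stripe holds 5 disks' worth of data and 2 of parity.

```python
# Rough throughput arithmetic for the scrub described above.
# All figures are approximate readings quoted in the post, not measurements
# made by this script.
PER_DISK_READ_MBS = 22    # read rate per disk, as seen in iostat
DISKS = 7                 # disks in the pool
PARITY_DISKS = 2          # RAID6: two disks' worth of parity per stripe (assumed model)
SCRUB_PROGRESS_MBS = 40   # rate derived from "btrfs scrub status" progress

raw_read = PER_DISK_READ_MBS * DISKS                         # total read from all disks
expected = PER_DISK_READ_MBS * (DISKS - PARITY_DISKS)        # data-only share of that
read_amplification = raw_read / SCRUB_PROGRESS_MBS           # MB read per MB of progress
useful_fraction = SCRUB_PROGRESS_MBS / raw_read              # share of reads that count

print(f"raw read rate:       {raw_read} MB/s")
print(f"expected scrub rate: {expected} MB/s")
print(f"read amplification:  {read_amplification:.2f}x")
print(f"useful fraction:     {useful_fraction:.0%}")
```

With these inputs the raw read rate comes out at 154 MB/s, the expected data-only rate at 110 MB/s, and the disks read 3.85 times as much as the scrub reports as progress.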

I have been looking at processor usage, but that’s not a problem; all 4 cores are at 15-20% all the time, so I don’t think the limitation is the processor.

The system has 8 GB of memory and reports 6 GB available, so it is not low on memory.

I must admit that I wonder about this.
A scrub is an essential job, but it puts wear on the disks. That’s why I do it less often.
But it would be better if the system didn’t have to read 3.85 (154/40) times as much data as is actually being processed. I find it very wasteful. Not to mention that the current scrub, a 52-hour job on my system, could be done in perhaps 18 hours or less. That would probably have me do it a little more often.
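As a rough sanity check on that estimate, here is a minimal sketch that scales the 52-hour figure by throughput. It assumes the scrub time is inversely proportional to the effective scrub rate, which is a simplification; the function name `scrub_hours` is hypothetical, and the rates are the ones quoted in this post.

```python
# Estimate how long the same scrub would take at a different effective rate,
# assuming duration scales inversely with throughput (a simplification).
CURRENT_HOURS = 52   # duration of the current scrub
CURRENT_MBS = 40     # effective rate derived from scrub status


def scrub_hours(rate_mbs: float) -> float:
    """Hypothetical helper: projected scrub duration in hours at rate_mbs."""
    return CURRENT_HOURS * CURRENT_MBS / rate_mbs


# 110 MB/s: only the data share of a 7-disk RAID6 read at 22 MB/s per disk.
# 154 MB/s: everything currently read from all 7 disks counted as progress.
for rate in (110, 154):
    print(f"{rate} MB/s -> about {scrub_hours(rate):.1f} hours")
```

Under this assumption, 110 MB/s gives roughly 19 hours and 154 MB/s roughly 13.5 hours, which is in the same range as the estimate above.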

There is definitely room for improvement on the BTRFS front here.

Read faster from the disks, use all the data read from them instead of about 25% of it, and use the processor harder, at least when nothing else is reading from or writing to the pool.

Does anybody have any thoughts on this? Others must have noticed this too?

Running RAID 5 on 5x3TB 7200 rpm disks, my scrubs take around 15 hours, starting at 3 am and normally finishing around 6 pm. I’d run a disk utilization comparison, but I’m moving some files back at the moment.

I saw on the mailing list back in December (can’t seem to find the post again, though) that this is a known issue, along with the inability to properly check free space on a RAID 56 array.

Basically they said that with kernel 4.4, RAID 56 will likely be considered stable but not production ready, because:

1) Free space still doesn’t work right for RAID 56
2) Scrubs on RAID 56 are still way slower than they should be
3) There still won’t be a way to monitor disks

Mine is just as slow even when I tried loading it up with 10 spindles for testing.

Thanks for the reply :smile:

So it’s a bug, kind of. Hope they get it fixed / optimized, so it’s not so incredibly slow…