5.0.11 SCRUB not working?

I’m on Testing updates: I installed the latest 5.0.10 first, then switched to the testing channel and updated to 5.0.11.

Now, with no Rock-on installed, enabled, or running, the system does not record scrub data at all. The scrub also seems to terminate way too early.

Before I uninstalled the F@H Rock-on, I got the same result.

Should I do another clean 5.0.10 install and try it?

:sunglasses:

1 Like

Hi @Tex1954,

Thanks a lot for the report. I can confirm I see the same thing as you on a source build of the latest testing branch.
After a quick look, I think this is a parsing issue rather than a failure to run the scrub itself; see below.

I tested as follows:

  • Leap 15.6 base, same as you: this means kernel 6.4 with the corresponding btrfs-progs v6.5.1
  • 1 pool with just one disk (single raid level)
  • Manually clicked to start a scrub. I then see the same as in your screenshot: a status of “unknown”.

At the command line, it looks as if it ran without a problem:

rockdev:~ # btrfs scrub status /mnt2/main_pool
UUID:             900e4fca-a18a-4dca-8350-4bb90e88c466
Scrub started:    Sun Jul  7 17:16:56 2024
Status:           finished
Duration:         0:00:00
Total to scrub:   192.00KiB
Rate:             0.00B/s
Error summary:    no errors found

@phillxnet, let me know if I’m not looking at that the right way, as this is your area of expertise. I’ll try to dig into how we parse scrub results to see if/what differs from what we expect.
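
To make that hypothesis concrete, here is a minimal sketch (plain Python, not our actual parser; the function name and the list of expected states are just for illustration) of the kind of status extraction where any mismatch with the expected line layout would leave the status reported as “unknown”:

import re
import subprocess

# Minimal illustration only (not the real parser): read the "Status:" line
# from `btrfs scrub status` and fall back to "unknown" when it does not
# match one of the states we expect.
KNOWN_STATES = ("running", "finished", "aborted", "interrupted", "cancelled")

def scrub_state(mnt_pt: str) -> str:
    out = subprocess.run(
        ["btrfs", "scrub", "status", mnt_pt],
        capture_output=True, text=True, check=True,
    ).stdout
    match = re.search(r"^Status:\s+(\S+)", out, re.MULTILINE)
    if match and match.group(1) in KNOWN_STATES:
        return match.group(1)
    return "unknown"  # anything unexpected ends up reported as "unknown"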

3 Likes

Further information:

>>> PoolScrub.objects.all().values()
<QuerySet [{'id': 1, 'pool_id': 1, 'status': 'unknown', 'pid': 15628, 'start_time': datetime.datetime(2024, 7, 7, 21, 16, 56, 266612, tzinfo=datetime.timezone.utc), 'end_time': None, 'time_left': 0, 'eta': None, 'rate': '', 'kb_scrubbed': None, 'data_extents_scrubbed': 0, 'tree_extents_scrubbed': 0, 'tree_bytes_scrubbed': 0, 'read_errors': 0, 'csum_errors': 0, 'verify_errors': 0, 'no_csum': 0, 'csum_discards': 0, 'super_errors': 0, 'malloc_errors': 0, 'uncorrectable_errors': 0, 'unverified_errors': 0, 'corrected_errors': 0, 'last_physical': 0}]>

>>> returned = scrub_status_raw(mnt_pt="/mnt2/main_pool")
>>> returned
{'status': 'finished', 'duration': 0, 'data_extents_scrubbed': 0, 'tree_extents_scrubbed': 12, 'kb_scrubbed': 0.0, 'tree_bytes_scrubbed': 196608, 'read_errors': 0, 'csum_errors': 0, 'verify_errors': 0, 'no_csum': 0, 'csum_discards': 0, 'super_errors': 0, 'malloc_errors': 0, 'uncorrectable_errors': 0, 'unverified_errors': 0, 'corrected_errors': 0, 'last_physical': 22020096}

and:

rockdev:~ # btrfs scrub status -R /mnt2/main_pool
UUID:             900e4fca-a18a-4dca-8350-4bb90e88c466
Scrub started:    Sun Jul  7 17:16:56 2024
Status:           finished
Duration:         0:00:00
        data_extents_scrubbed: 0
        tree_extents_scrubbed: 12
        data_bytes_scrubbed: 0
        tree_bytes_scrubbed: 196608
        read_errors: 0
        csum_errors: 0
        verify_errors: 0
        no_csum: 0
        csum_discards: 0
        super_errors: 0
        malloc_errors: 0
        uncorrectable_errors: 0
        unverified_errors: 0
        corrected_errors: 0
        last_physical: 22020096

It does seem to parse the scrub “details” OK, but the status remains “unknown”.
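
For reference, the cross-check above amounts to something like the following from the Django shell; a minimal sketch, and the import paths are my assumption of where these currently live:

from storageadmin.models import PoolScrub   # assumed module path
from fs.btrfs import scrub_status_raw       # assumed module path

db_scrub = PoolScrub.objects.get(id=1)
live = scrub_status_raw(mnt_pt="/mnt2/main_pool")

# The live parse reports "finished" while the DB row never left "unknown",
# which points at the status-update path rather than the scrub itself.
print(db_scrub.status, "vs", live["status"])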

2 Likes

I just repeated the test with a pool that has some actual data in it (the output from the test above was on an empty pool).
I now see:

rockdev:/mnt2/test_scrub # btrfs scrub status -R /mnt2/main_pool
UUID:             900e4fca-a18a-4dca-8350-4bb90e88c466
Scrub started:    Mon Jul  8 08:37:37 2024
Status:           finished
Duration:         0:00:01
        data_extents_scrubbed: 65999
        tree_extents_scrubbed: 9964
        data_bytes_scrubbed: 1602797568
        tree_bytes_scrubbed: 163250176
        read_errors: 0
        csum_errors: 0
        verify_errors: 0
        no_csum: 0
        csum_discards: 0
        super_errors: 0
        malloc_errors: 0
        uncorrectable_errors: 0
        unverified_errors: 0
        corrected_errors: 0
        last_physical: 3075473408

…but the webUI still surfaces a status of “unknown” and no data scrubbed (see the attached screenshot).

It’s almost as if the status was captured very early on, before any data had been recorded as scrubbed.
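
To illustrate what I mean, here is a rough sketch (hypothetical helper names, not our actual code) of the difference between sampling the status once right after starting and polling until the scrub reaches a terminal state:

import time

# Hypothesis sketch only: if the status is read just once, immediately after
# the scrub is kicked off, a fast scrub can be caught before any progress is
# visible, and the record is never refreshed. Polling until a terminal state
# avoids that.
def record_scrub(mnt_pt, read_status, save_to_db, interval=2.0):
    while True:
        stats = read_status(mnt_pt)   # e.g. a wrapper around `btrfs scrub status`
        save_to_db(stats)             # keep the stored record current
        if stats.get("status") in ("finished", "aborted", "cancelled"):
            return stats              # stop once the scrub is done
        time.sleep(interval)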

2 Likes

I just tested following the same steps as above but using a Leap 15.5 base:

  • Install the Rockstor-5.0.9 Leap 15.5 ISO from the downloads page
  • Create a pool (single raid level)
  • Manually start a scrub: it finishes instantly (the pool was empty) and the status is correctly detected as “finished”.

I then activated the testing update channel and updated to the latest 5.0.12 RPM. A manual scrub is still correctly detected as “finished”.

@phillxnet, it thus seems to be specific to Leap 15.6 so far.

2 Likes

I am finding that the F@H Docker container no longer works with 15.6 either…

Go figure…

:sunglasses:

Might not be the correct test, but I ran a scrub on 5.0.12-0 on a Leap 15.6 base against the root “pool” … it finished quickly, too, and this is what it reported:

1 Like

This could probably be a separate thread, but, quickly: does it not install/run, or is it missing from the standard catalog (since it was deprecated a few months ago)?

Oh, and I just noticed that a scheduled scrub ran while the system was still on 5.0.9-0 on Leap 15.6; it finished and seemed to have reported the status correctly there, too. Note that I upgraded directly to 5.0.12-0 without going through 5.0.10 or 5.0.11.

LOL!

Seems like a lot of things are going on. I’m trying to install from the latest release (5.0.9) and then upgrade to 5.0.12, and it isn’t working right yet…

Thanks for testing… curious that it looks fine for you. It is consistently showing “unknown” for me on any pool.
We’ll need to keep investigating, unfortunately.

1 Like

@Hooverdan, quick question on an important point brought up by @phillxnet: what kernel version are you using on the machine you used for that test? Maybe we’re seeing the consequence of a different kernel version, in case you are using the kernel backports.
I tested using a base Rockstor Leap 15.6 install.
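
In case it helps for comparing setups, a small snippet like this (stock Python modules only, nothing Rockstor-specific) reports both the running kernel and the btrfs-progs version:

import platform
import subprocess

print("kernel:", platform.release())  # e.g. stock 6.4.x vs a 6.9.x backport
print(subprocess.run(["btrfs", "--version"],
                     capture_output=True, text=True).stdout.strip())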

Shoot, yes, I should have mentioned that. I am using the kernel backport, so as of yesterday that was:
6.9.8-lp155.4.gc3f5592-default

1 Like

Excellent!
We do seem to be looking at a sensitivity to the kernel version, then.

1 Like