Changed RAID from 10 but cannot remove disk from pool

I am trying to move a pair of disks out of a pool to be used in a new pool. I have 6 disks: a pair of 2 TB, a pair of 3 TB, and a pair of 14 TB. They were all in a RAID10 pool that was presented as 3 different shares on the network. The available size reported for the pool was much lower than I expected, so I decided to break up the pool and have 3 RAID1 pools instead. I removed the 2 TB disks from the pool with no problem, but when attempting to remove the 3 TB disks I get the following error:

Traceback (most recent call last):
  File "/opt/rockstor/.venv/lib/python2.7/site-packages/huey/api.py", line 360, in _execute
    task_value = task.execute()
  File "/opt/rockstor/.venv/lib/python2.7/site-packages/huey/api.py", line 724, in execute
    return func(*args, **kwargs)
  File "/opt/rockstor/src/rockstor/fs/btrfs.py", line 2059, in start_resize_pool
    raise e
CommandException: Error running a command. cmd = /usr/sbin/btrfs device delete /dev/disk/by-id/ata-ST33000651NS_Z2931VBG /dev/disk/by-id/ata-ST33000651NS_Z2936049 /mnt2/Share1. rc = 1. stdout = ['']. stderr = ["ERROR: error removing device '/dev/disk/by-id/ata-ST33000651NS_Z2931VBG': unable to go below four devices on raid10", "ERROR: error removing device '/dev/disk/by-id/ata-ST33000651NS_Z2936049': unable to go below four devices on raid10", '']

What stood out to me was the "unable to go below four devices on raid10", so I changed the pool to RAID1 and got the same error. So I changed the pool's RAID again, to "single" this time. I still get the "unable to go below four devices on raid10" error when I try to remove a disk from the pool.

Is there a way to remove a disk from a pool while ignoring whatever label is stuck on RAID10? Or is there a way to correct whatever thinks the pool is still RAID10?

The UI and the available storage would indicate that the pool was converted successfully.

@its_me welcome to the Rockstor community forum.

Unfortunately, I am not sure what the root cause of this behavior is. But after changing to RAID1 or "single" and waiting for the activity to complete (I don't remember whether the UI shows progress after executing the command; it has been a long time since I did a RAID change), did you then also perform a reboot before moving on to the next step?

If you can use the command line, what does this command show in terms of RAID level:

 btrfs fi df /mnt2/<pool name>

e.g. from your screenshot

btrfs fi df /mnt2/1_8_raid1

since that will be the "ultimate" info, showing you what RAID level the pool you want to remove disks from is actually on.


Hooverdan,
Thank you for responding to my questions. Yes, the Balances tab under the share does show a percent finished when you refresh, and I did wait until the status of the balance was "finished", for both the change to RAID1 and to single.
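
(As an aside, I believe the same progress can also be checked from the command line with something like the following, though I mostly watched the UI:)

btrfs balance status /mnt2/Share1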
Not sure if it matters, but the order of operations I followed was:
1. Remove the 2 TB drives from the Share1 pool and wait for the rebalance.
2. Create the 1_8_raid1 pool as RAID1 with the 2 TB drives.
3. Create a share on the 1_8_raid1 pool.
4. Transfer the files destined for that share.
5. Attempt removal of the 3 TB drives from the Share1 pool (failed).
6. Change the RAID of the Share1 pool to RAID1, wait for the UI (Balances tab) to indicate "finished".
7. Attempt removal of the 3 TB drives from the Share1 pool (failed).
8. Change the RAID of the Share1 pool to single, wait for the UI (Balances tab) to indicate "finished".
9. Attempt removal of the 3 TB drives from the Share1 pool (failed).
10. Reboot the system.
11. Attempt removal of the 3 TB drives from the Share1 pool (failed).
12. Attempt removal of a single 3 TB drive from the Share1 pool (failed).

sudo btrfs fi df /mnt2/Share1

Data, single: total=3.09TiB, used=3.09TiB
System, RAID10: total=64.00MiB, used=368.00KiB
Metadata, single: total=5.00GiB, used=4.01GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

"System, RAID10:" would apparently be my problem. I have no idea where to go from here.

OK. Interesting that the System chunks are still considered RAID10, whereas you have converted the data portion to single.

Let’s see: can you post:

btrfs filesystem usage -T /mnt2/Share1

That will probably make it even clearer what's what.


I think the problem is that it converted the data but not the metadata, and that therefore prevents you from removing more disks.


I see you're on version 4.6; I thought that might be the issue with the lack of a complete conversion, though I checked, and I believe the additional parameter was already in place by 4.5.9-1.

The only other thing I could suggest is that you manually trigger a conversion using the command line, e.g.:

btrfs balance start -mconvert=dup -dconvert=single /mnt2/Share1

I am assuming that it will ignore the dconvert parameter since that already seems done, but just in case.
Then use the remove process on the device you want outta there. Once you’ve reshuffled your hardware, you could then go back to the relevant RAID levels you need.
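
Purely as a rough sketch of that overall sequence from the command line (the disk ID is a placeholder for whichever device you want out, and the final profiles are just an example):

# convert metadata to dup and data to single first (as above)
btrfs balance start -mconvert=dup -dconvert=single /mnt2/Share1
# then remove the device(s) you no longer want in the pool
btrfs device remove /dev/disk/by-id/<disk-id> /mnt2/Share1
# once the hardware is reshuffled, convert back to the desired RAID level
btrfs balance start -mconvert=raid1 -dconvert=raid1 /mnt2/Share1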

I have not done it this way, because I did not have a scenario like this, so YMMV.


So I started a convert back to RAID1, which is about 35% complete. I suspect that is why the table has both single and RAID1 columns.

btrfs filesystem usage -T /mnt2/Share1 

Overall:                                                                                              
    Device size:                  30.92TiB                                                            
    Device allocated:              4.21TiB                                                            
    Device unallocated:           26.71TiB                                                            
    Device missing:                  0.00B                                                            
    Used:                          4.21TiB                                                            
    Free (estimated):             19.68TiB      (min: 13.36TiB)                                       
    Data ratio:                       1.36                                                            
    Metadata ratio:                   1.60                                                            
    Global reserve:              512.00MiB      (used: 48.00KiB)                                      
                                                                                                      
            Data       Data    Metadata Metadata System                                               
Id Path     single     RAID1   single   RAID1    RAID10    Unallocated                                
-- -------- ---------- ------- -------- -------- --------- -----------                                
 5 /dev/sdc 1018.00GiB 1.11TiB  1.00GiB  3.00GiB  32.00MiB    10.63TiB                                
 6 /dev/sdd 1018.00GiB 1.11TiB  1.00GiB  3.00GiB  32.00MiB    10.63TiB                                
 2 /dev/sdf          -       -        -        -  32.00MiB     2.73TiB                                
 1 /dev/sdg          -       -        -        -  32.00MiB     2.73TiB                                
-- -------- ---------- ------- -------- -------- --------- -----------                                
   Total       1.99TiB 1.11TiB  2.00GiB  3.00GiB  64.00MiB    26.71TiB                                
   Used        1.99TiB 1.11TiB  1.56GiB  2.44GiB 416.00KiB

Okay, the format on that didn't paste well.

Id, Path, Data single, Data RAID1, Metadata single, Metadata RAID1, System RAID10, Unallocated


5, /dev/sdc, 1018.00GiB, 1.11TiB, 1.00GiB, 3.00GiB, 32.00MiB, 10.63TiB
6, /dev/sdd, 1018.00GiB, 1.11TiB, 1.00GiB, 3.00GiB, 32.00MiB, 10.63TiB
2, /dev/sdf, - , - , -, -, 32.00MiB, 2.73TiB
1, /dev/sdg, - , -, - , -, 32.00MiB , 2.73TiB


Total, 1.99TiB, 1.11TiB, 2.00GiB, 3.00GiB, 64.00MiB, 26.71TiB,
Used, 1.99TiB, 1.11TiB, 1.56GiB, 2.44GiB, 416.00KiB,

Thanks for the update.

Just a tip: if you encase the entire output with three ``` (on their own line before and after; in this case you could also add "bash" directly after the opening fence to indicate the language), it will put it into the "terminal" view without the proportional spacing and line breaks where they don't belong (but sometimes weird coloring :slight_smile:).

Like so:

# btrfs filesystem usage -T /mnt2/SamplePool
Overall:
    Device size:                  50.93TiB
    Device allocated:             32.81TiB
    Device unallocated:           18.12TiB
    Device missing:                  0.00B
    Device slack:                    0.00B
    Used:                         32.64TiB
    Free (estimated):              9.14TiB      (min: 6.12TiB)
    Free (statfs, df):             9.14TiB
    Data ratio:                       2.00
    Metadata ratio:                   3.00
    Global reserve:              512.00MiB      (used: 0.00B)
    Multiple profiles:                  no
            Data     Metadata System
Id Path     RAID10   RAID1C3  RAID1C3  Unallocated Total    Slack
-- -------- -------- -------- -------- ----------- -------- -----
 5 /dev/sdc  8.19TiB 18.00GiB        -     4.53TiB 12.73TiB     -
 6 /dev/sdb  8.19TiB 18.00GiB 32.00MiB     4.53TiB 12.73TiB     -
 7 /dev/sdd  8.19TiB 18.00GiB 32.00MiB     4.53TiB 12.73TiB     -
 8 /dev/sde  8.19TiB 18.00GiB 32.00MiB     4.53TiB 12.73TiB     -
-- -------- -------- -------- -------- ----------- -------- -----
   Total    16.37TiB 24.00GiB 32.00MiB    18.12TiB 50.93TiB 0.00B
   Used     16.29TiB 20.82GiB  1.77MiB

Thank you for the tip.

The

btrfs balance start -mconvert=dup -dconvert=single /mnt2/Share1

finally finished, with no change to the "System" chunks still being RAID10, and still no ability to remove disks.

I then ran

btrfs balance start -sconvert=single /mnt2/Share1 --force

I now have:

file-server:/ # btrfs filesystem usage -T /mnt2/Share1                                                     
Overall:                                                                                                   
    Device size:                  30.92TiB                                                                 
    Device allocated:              6.20TiB                                                                 
    Device unallocated:           24.72TiB                                                                 
    Device missing:                  0.00B                                                                 
    Used:                          6.20TiB                                                                 
    Free (estimated):             12.36TiB      (min: 12.36TiB)                                            
    Data ratio:                       2.00                                                                 
    Metadata ratio:                   2.00                                                                 
    Global reserve:              512.00MiB      (used: 0.00B)                                              
                                                                                                           
            Data    Metadata Metadata System                                                               
Id Path     RAID1   RAID1    DUP      RAID10    Unallocated                                                
-- -------- ------- -------- -------- --------- -----------                                                
 5 /dev/sdc 3.09TiB  4.00GiB  2.00GiB  32.00MiB     9.63TiB                                                
 6 /dev/sdd 3.09TiB  4.00GiB  2.00GiB  32.00MiB     9.63TiB                                                
 2 /dev/sdf       -        -        -  32.00MiB     2.73TiB                                                
 1 /dev/sdg       -        -        -  32.00MiB     2.73TiB                                                
-- -------- ------- -------- -------- --------- -----------                                                
   Total    3.09TiB  4.00GiB  2.00GiB  64.00MiB    24.72TiB                                                
   Used     3.09TiB  2.65GiB  1.34GiB 480.00KiB                                                            

Rebooted, and I still cannot remove disks from the pool.

It also looks like the initial balance did not move the metadata to DUP or the data to single, but to RAID1 for data and a mix of RAID1/DUP for metadata.

Do you have any snapshots for this pool?

I would try the balance one more time, though I can't justify why.


btrfs balance start -sconvert=raid1 /mnt2/Share1 --force

Not sure whether @phillxnet or @Flox have some additional insights. Considering the amount of time you’ve already spent on splitting your pools, would it make more sense to back up the existing data (which I’m sure you’ve already done :slight_smile: ), dismantle all pools and re-assemble?

Unless, of course, you want to continue to slog through it for the greater good :smile_cat:

Finally, it might make sense to upgrade to the latest stable 5.1.0-0, since that will also bring a newer kernel and a newer version of btrfs-progs, which might contain some fixes surrounding conversion filters (I haven't had time to take a look at the various versions and their updates, though).


Maybe add the "soft" modifier to the conversion for the metadata, if you're so inclined to also get that aligned (since it's currently a mix of RAID1 and DUP), so it only touches the chunks that have not been converted yet.
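
A minimal sketch of what that could look like, assuming the pool is still mounted at /mnt2/Share1 (the "soft" modifier only rewrites chunks that are not already at the target profile):

btrfs balance start -mconvert=raid1,soft /mnt2/Share1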


Of course, if you want to take it to the next step, you could use the btrfs mailing list as described here:

https://rockstor.com/docs/data_loss.html#asking-for-help

Just finished

btrfs balance start -mconvert=single -dconvert=single -sconvert=single /mnt2/Share1 --force

and now have:

            Data    Metadata System                                                                                                                                                                    
Id Path     single  single   RAID10    Unallocated                                                                                                                                                     
-- -------- ------- -------- --------- -----------                                                                                                                                                     
 5 /dev/sdc 1.55TiB  4.00GiB  32.00MiB    11.18TiB                                                                                                                                                     
 6 /dev/sdd 1.55TiB  2.00GiB  32.00MiB    11.18TiB                                                                                                                                                     
 2 /dev/sdf       -        -  32.00MiB     2.73TiB                                                                                                                                                     
 1 /dev/sdg       -        -  32.00MiB     2.73TiB                                                                                                                                                     
-- -------- ------- -------- --------- -----------                                                                                                                                                     
   Total    3.10TiB  6.00GiB  64.00MiB    27.82TiB                                                                                                                                                     
   Used     3.09TiB  4.01GiB 368.00KiB

No luck removing disks.

I do have snapshots, but from before I added the 14 TB drives and removed the 2 TB drives. I'm not sure they would be much help now.

My backups are fragmented across several external drives; I was hoping I would not have to sort through all of that.
The UI didn't indicate that there were updates, so I hadn't thought to look in that direction. Since there are, I think I will gather up my backups, wipe the system, and do a fresh install with 5.1.0-0.

Should I create a RAID10 pool with no data on it and attempt to convert it to RAID1, just to see if I can remove disks from a converted pool on 5.1.0?
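
(Roughly what I have in mind, purely from the command line and with placeholder device names, would be something like:)

# create an empty 4-disk RAID10 test pool (device names are placeholders)
mkfs.btrfs -f -d raid10 -m raid10 /dev/sdw /dev/sdx /dev/sdy /dev/sdz
mkdir -p /mnt/testpool
mount /dev/sdw /mnt/testpool
# convert data, metadata, and system chunks down to RAID1
btrfs balance start -dconvert=raid1 -mconvert=raid1 -sconvert=raid1 --force /mnt/testpool
# then see whether a device can now be removed
btrfs device remove /dev/sdz /mnt/testpool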

Sorry to see that it doesn’t seem to change the RAID10 on the system side for some reason.

My comment on the snapshots was that maybe they are impacting your ability to convert everything, so removing them could address it, but that's just a wild guess on my part.
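
If you want to check from the command line first, something like this should list just the snapshot subvolumes on that pool (assuming it is mounted at /mnt2/Share1):

btrfs subvolume list -s /mnt2/Share1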

If you were to go the route of wipe and reinstall (you could at least do a config backup so you get back the accounts you might have created and the Rock-ons you've installed), you could leave everything as it is and only wipe the Rockstor OS disk, install the new version, then reimport the pool(s) you currently have and try the conversion one more time. If that doesn't work, only then destroy the pool(s), i.e. delete them via the WebUI or just take the disks and wipe them all. Then create the pools you originally wanted and restore from backup, or, since you already have a couple of disks extricated from the one pool, copy your data over as a temporary backup, then proceed to destroy the pool you want to break up.



@Hooverdan
Took your advice, one step at a time: clean install of 5.1, restore config, restore pools, delete the 3 TB drives from the Share1 pool, no problem. Didn't even have to balance-convert the system blocks first.

Thank you for your time and attentiveness.
