Bootstrap fails if a snapshot is created and deleted on ROOT pool

Brief description of the problem

[ROCKSTOR 5.0.3-0]

I’ve created a snapshot on the root pool like this:

btrfs subvolume snapshot /mnt2/ROOT /mnt2/ROOT/.backup

After a reboot, Rockstor shows this snapshot as a share.
I can delete it via the UI, but after the next reboot the bootstrap service can’t start.

Detailed step by step instructions to reproduce the problem

  1. Create a snapshot on the ROOT pool.
  2. Reboot
  3. Delete the resulting share via the UI
  4. Reboot

Rockstor.log during bootstrap load:

[05/Sep/2023 19:20:57] ERROR [storageadmin.middleware:34] Error running a command. cmd = /usr/sbin/btrfs property get /mnt2/ROOT/.backup ro. rc = 1. stdout = ['']. stderr = ['ERROR: failed to open /mnt2/ROOT/.backup: No such file or directory', 'ERROR: failed to detect object type: No such file or directory', '']
Traceback (most recent call last):
  File "/opt/rockstor/.venv/lib/python3.6/site-packages/django/core/handlers/base.py", line 185, in _get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "/opt/rockstor/.venv/lib/python3.6/site-packages/django/views/decorators/csrf.py", line 58, in wrapped_view
    return view_func(*args, **kwargs)
  File "/opt/rockstor/.venv/lib/python3.6/site-packages/django/views/generic/base.py", line 68, in view
    return self.dispatch(request, *args, **kwargs)
  File "/opt/rockstor/.venv/lib/python3.6/site-packages/rest_framework/views.py", line 495, in dispatch
    response = self.handle_exception(exc)
  File "/opt/rockstor/.venv/lib/python3.6/site-packages/rest_framework/views.py", line 455, in handle_exception
    self.raise_uncaught_exception(exc)
  File "/opt/rockstor/.venv/lib/python3.6/site-packages/rest_framework/views.py", line 466, in raise_uncaught_exception
    raise exc
  File "/opt/rockstor/.venv/lib/python3.6/site-packages/rest_framework/views.py", line 492, in dispatch
    response = handler(request, *args, **kwargs)
  File "/usr/lib64/python3.6/contextlib.py", line 52, in inner
    return func(*args, **kwds)
  File "/opt/rockstor/src/rockstor/storageadmin/views/command.py", line 395, in post
    import_shares(p, request)
  File "/opt/rockstor/src/rockstor/storageadmin/views/share_helpers.py", line 96, in import_shares
    shares_in_pool = shares_info(pool)
  File "/opt/rockstor/src/rockstor/fs/btrfs.py", line 1011, in shares_info
    pool_mnt_pt, snap_idmap[vol_id]
  File "/opt/rockstor/src/rockstor/fs/btrfs.py", line 1055, in parse_snap_details
    writable = not get_property(full_snap_path, "ro")
  File "/opt/rockstor/src/rockstor/fs/btrfs.py", line 2346, in get_property
    o, e, rc = run_command(cmd)
  File "/opt/rockstor/src/rockstor/system/osi.py", line 251, in run_command
    raise CommandException(cmd, out, err, rc)
system.exceptions.CommandException: Error running a command. cmd = /usr/sbin/btrfs property get /mnt2/ROOT/.backup ro. rc = 1. stdout = ['']. stderr = ['ERROR: failed to open /mnt2/ROOT/.backup: No such file or directory', 'ERROR: failed to detect object type: No such file or directory', '']
[05/Sep/2023 19:20:57] ERROR [smart_manager.data_collector:1015] Failed to update share state.. exception: ['Internal Server Error: Expecting value: line 1 column 1 (char 0)']

I think we will need @phillxnet’s help here. If I read this correctly (and that’s a big “if”), root snapshots should not be listed at all during the call stack you posted. In your case, the error seems to arise because the code tries to read the ro property of the share you have already deleted, which then has the knock-on effect of making the bootstrap fail.

I am wondering whether a previously deleted snapshot can still be listed by the

btrfs subvol list -s /mnt2/<mnt_point> or btrfs subvol list -p /mnt2/<mnt_point>

calls that are executed in the above code.

Not sure whether that helps track down a special scenario that the Rockstor code does not address, but maybe executing the subvol list command with the -d option would give us further insights:

btrfs subvol list -d /mnt2/<mnt_point> (-d lists deleted subvolumes that have not yet been cleaned)

If .backup shows up with that command, then maybe we have to add a condition to the shares_info method (or some subsequent place) that handles it.

But … with my limited knowledge this might not at all be the root cause.
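
To make the idea of "adding a condition" a bit more concrete, here is a rough sketch of a guard around the get_property(full_snap_path, "ro") call seen in the traceback. This is not the actual Rockstor code: snapshot_is_writable is a hypothetical helper, and get_property is passed in as a stand-in for the real function in fs/btrfs.py, whose exact signature and return value I am only assuming from the traceback line. The sketch simply skips a snapshot whose path no longer exists instead of letting the btrfs command fail.

```python
import os
from typing import Callable, Optional


def snapshot_is_writable(
    full_snap_path: str,
    get_property: Callable[[str, str], bool],
) -> Optional[bool]:
    """Return the snapshot's writable state, or None if its path is gone.

    Hypothetical helper: get_property stands in for Rockstor's own
    function and is assumed to return a truthy value when "ro" is set.
    """
    if not os.path.exists(full_snap_path):
        # Stale entry (e.g. deleted but still listed): report "unknown"
        # rather than running btrfs against a missing path and raising.
        return None
    # Same expression as in the traceback: writable means "not read-only".
    return not get_property(full_snap_path, "ro")
```

A caller would then skip any snapshot for which this returns None. Whether silently skipping is the right behaviour, or whether the stale entry should rather be cleaned up elsewhere, is something the maintainers would have to decide.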
