Unresponsive system

Hi @shocker,

Allow me to chip in on this one:

I would personally try Leap 15.1 first, as I see it as a lot safer than Tumbleweed, and its kernel has received a ton of backports from newer kernels. As a result, from a Btrfs perspective, it is far ahead of a generic 4.12 kernel. See below for examples:

On a freshly-updated Leap 15.1 system, for instance, you can see that the kernel package was updated just a month ago:

rockdev:~ # zypper info kernel-default
Loading repository data...
Reading installed packages...


Information for package kernel-default:
---------------------------------------
Repository     : Main Update Repository
Name           : kernel-default
Version        : 4.12.14-lp151.28.36.1
Arch           : x86_64
Vendor         : openSUSE
Installed Size : 311.1 MiB
Installed      : Yes
Status         : up-to-date
Source package : kernel-default-4.12.14-lp151.28.36.1.nosrc
Summary        : The Standard Kernel
Description    :
    The standard kernel for both uniprocessor and multiprocessor systems.


    Source Timestamp: 2019-12-06 13:50:27 +0000
    GIT Revision: 8f4a495fffe8cb92ade0b0afc0abe10de21a1d4a
    GIT Branch: openSUSE-15.1

If we look at the Btrfs-related changes, we have (listing only the 40 most recent ones):

rockdev:~ # rpm -q --changelog kernel-default | grep btrfs | head -n 40
- btrfs: tracepoints: Fix bad entry members of qgroup events
- btrfs: tracepoints: Fix wrong parameter order for qgroup  events
- btrfs: qgroup: Always free PREALLOC META reserve in
  btrfs_delalloc_release_extents() (bsc#1155179).
- btrfs: block-group: Fix a memory leak due to missing
  btrfs_put_block_group() (bsc#1155178).
- btrfs: Ensure btrfs_init_dev_replace_tgtdev sees up to date
- btrfs: remove wrong use of volume_mutex from
  btrfs_dev_replace_start (bsc#1154651).
- btrfs: Ensure replaced device doesn't have pending chunk allocation (bsc#1154607).
- btrfs: qgroup: Fix reserved data space leak if we have  multiple
- btrfs: qgroup: Fix the wrong target io_tree when freeing
- btrfs: relocation: fix use-after-free on dead relocation  roots
- Btrfs: do not abort transaction at btrfs_update_root() after
- Btrfs: remove BUG() in btrfs_extent_inline_ref_size
- Btrfs: convert to use btrfs_get_extent_inline_ref_type
- blacklist.conf: Add invalid btrfs commits
  patches.suse/btrfs-add-missing-inode-version-ctime-and-mtime-upda.patch.
- btrfs: start readahead also in seed devices (bsc#1144886).
- btrfs: clean up pending block groups when transaction commit
- btrfs: handle delayed ref head accounting cleanup in abort
- btrfs: add cleanup_ref_head_accounting helper (bsc#1050911).
- btrfs: fix pinned underflow after transaction aborted
- btrfs: Fix delalloc inodes invalidation during transaction abort
- btrfs: Split btrfs_del_delalloc_inode into 2 functions
- btrfs: track running balance in a simpler way (bsc#1145059).
- btrfs: use GFP_KERNEL in init_ipath (bsc#1086103).
- btrfs: scrub: add memalloc_nofs protection around init_ipath
- patches.suse/Btrfs-kill-btrfs_clear_path_blocking.patch:
  patches.suse/btrfs-reloc-also-queue-orphan-reloc-tree-for-cleanup-to-avoid-bug_on.patch.
  patches.suse/btrfs-fix-wrong-ctime-and-mtime-of-a-directory-after.patch.
- btrfs: qgroup: Check bg while resuming relocation to  avoid
  patches.suse/btrfs-reloc-also-queue-orphan-reloc-tree-for-cleanup-to-avoid-bug_on.patch.
- btrfs: don't double unlock on error in btrfs_punch_hole
- btrfs: reloc: Also queue orphan reloc tree for cleanup to avoid  BUG_ON() (bsc#1133612).
  patches.suse/0001-btrfs-extent-tree-Fix-a-bug-that-btrfs-is-unable-to-.patch.
- btrfs: qgroup: Don't scan leaf if we're modifying reloc  tree
- btrfs: extent-tree: Use btrfs_ref to refactor
  btrfs_free_extent() (bsc#1063638 bsc#1128052 bsc#1108838).
- btrfs: extent-tree: Use btrfs_ref to refactor

Now, looking at the most recent change listed there…

- btrfs: tracepoints: Fix bad entry members of qgroup events

… we can see it was actually submitted last October for inclusion in kernel 5.4-rc5 (unless I misunderstand this, of course):
https://lkml.org/lkml/2019/10/22/440
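
For anyone wanting to double-check this sort of thing, a rough sketch against a local kernel git clone (the path is just an example; this assumes the commit exists in your clone):

# Find the upstream commit by its subject line:
cd ~/src/linux
git log --oneline --grep='tracepoints: Fix bad entry members of qgroup events'
# Then see which release tag first contains it:
git tag --contains "$(git log --format=%H -1 --grep='tracepoints: Fix bad entry members of qgroup events')" | head -n 1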

Now, looking at a freshly-updated Tumbleweed:

rockdev:~ # zypper info kernel-default
Loading repository data...
Reading installed packages...


Information for package kernel-default:
---------------------------------------
Repository     : Main Repository (OSS)
Name           : kernel-default
Version        : 5.3.12-2.2
Arch           : x86_64
Vendor         : openSUSE
Installed Size : 163.1 MiB
Installed      : Yes
Status         : up-to-date
Source package : kernel-default-5.3.12-2.2.nosrc
Summary        : The Standard Kernel
Description    :
    The standard kernel for both uniprocessor and multiprocessor systems.


    Source Timestamp: 2019-11-21 07:21:43 +0000
    GIT Revision: a6f60814d3dbf81b05caf84e6143251ca14f5f37
    GIT Branch: stable

and most importantly:

rockdev:~ #  rpm -q --changelog kernel-default | grep btrfs | head -n 40
- btrfs: block-group: Fix a memory leak due to missing
  btrfs_put_block_group() (bnc#1151927).
- btrfs: don't needlessly create extent-refs kernel thread
- btrfs: tracepoints: Fix wrong parameter order for qgroup events
- btrfs: tracepoints: Fix bad entry members of qgroup events
- btrfs: fix uninitialized ret in ref-verify (bnc#1151927).
- btrfs: fix incorrect updating of log root tree (bnc#1151927).
- btrfs: fix balance convert to single on 32-bit host CPUs
- btrfs: allocate new inode in NOFS context (bnc#1151927).
- btrfs: relocation: fix use-after-free on dead relocation roots
- btrfs: delayed-inode: Kill the BUG_ON() in
  btrfs_delete_delayed_dir_index() (bnc#1151927).
- btrfs: extent-tree: Make sure we only allocate extents from
- btrfs: tree-checker: Add ROOT_ITEM check (bnc#1151927).
- btrfs: Detect unbalanced tree with empty leaf before crashing
- btrfs: fix allocation of free space cache v1 bitmap pages
- btrfs: Relinquish CPUs in btrfs_compare_trees (bnc#1151927).
- btrfs: adjust dirty_metadata_bytes after writeback failure of
- btrfs: qgroup: Fix the wrong target io_tree when freeing
- btrfs: qgroup: Fix reserved data space leak if we have multiple
- btrfs: Fix a regression which we can't convert to SINGLE profile
  writeback attempts (btrfs hangup).
- btrfs: trim: Check the range passed into to prevent overflow
- btrfs: qgroup: Don't hold qgroup_ioctl_lock in
  btrfs_qgroup_inherit() (bnc#1012628).
- btrfs: Flush before reflinking any extent to prevent NOCOW write
- btrfs: fix minimum number of chunk errors for DUP (bnc#1012628).
- btrfs: tree-checker: Check if the file extent end overflows
- btrfs: inode: Don't compress if NODATASUM or NODATACOW set
- btrfs: shut up bogus -Wmaybe-uninitialized warning
- btrfs: correctly validate compression type (bnc#1012628).
  patches.suse/btrfs-8447-serialize-subvolume-mounts-with-potentially-mi.patch
- btrfs: start readahead also in seed devices (bnc#1012628).
  patches.kernel.org/5.1.8-025-btrfs-qgroup-Check-bg-while-resuming-relocation.patch
- btrfs: correct zstd workspace manager lock to use spin_lock_bh()
- btrfs: qgroup: Check bg while resuming relocation to avoid
- btrfs: reloc: Also queue orphan reloc tree for cleanup to
- btrfs: don't double unlock on error in btrfs_punch_hole
- btrfs: Check the compression level before getting a workspace
- Btrfs: do not abort transaction at btrfs_update_root() after

As you can see by looking only at the most recent changes, the Leap 15.1 kernel is missing just 3 Btrfs-related changes compared to the Tumbleweed one.
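
If you want to make that comparison exhaustively rather than eyeballing the first 40 entries, something along these lines would work (file names are just placeholders; grep -i also catches the capitalised "Btrfs:" entries):

# On the Leap box:
rpm -q --changelog kernel-default | grep -i btrfs | sort -u > leap-btrfs.txt
# On the Tumbleweed box:
rpm -q --changelog kernel-default | grep -i btrfs | sort -u > tw-btrfs.txt
# With both files on one machine, list the entries unique to either kernel:
comm -3 leap-btrfs.txt tw-btrfs.txt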

Hope this helps,

Thank you @Flox for making time to share those!
My only concern is that the system force-restarts during recovery :slight_smile:
For example, the delete process cannot be stopped, and not even killed. The only way to recover and have a working FS is a forced restart :slight_smile:
To be honest, I'm now searching the internet for cloud backup solutions to drop everything there before playing with the delete command again.
Anyway, I'll join the alpha testing and share my feedback. It's OK if there are bugs in the GUI, as most of the time I'm a CLI guy :slight_smile:

Cheers!

@Flox Thanks for that excellent breakdown of the Leap 15.1 vs Tumbleweed kernel Btrfs backports.

I fully concur; plus, with the Leap 15.1 / SLES kernel we have the benefit of better 3rd-party driver compatibility via the ‘qualification’ programs that often accompany enterprise Linux kernels.

Thanks again for publishing your findings on this very pertinent issue.

Well, sharing is caring :smile: It's always great to share. This is how you improve the product: others can more easily find answers, and the devs can identify issues. :slight_smile:

I'll switch to beta for sure in the upcoming days; right now I've discovered Dropbox for Business with unlimited storage, and it's great! I'm doing a full backup to Dropbox just to feel safe enough to continue troubleshooting :slight_smile:
As of now my upload is 600Mbps; tomorrow I'll try to balance the AS path through some 10G peerings, or split uploads via different upstreams in BGP, to speed things up :slight_smile:

Thanks for the feedback. Actually it's a raid6, so I still have one more drive of redundancy. I'll switch to openSUSE in the near future, but plans changed with this new isolation situation :slight_smile:
I'll try to do a scrub, but only after I switch to openSUSE, to ensure that I have most of the bug fixes in place.
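
For the record, the scrub run I have in mind is just the stock commands against the pool mount point (a sketch; the mount point is the one used later in this thread):

btrfs scrub start /mnt2/mount    # runs in the background by default
btrfs scrub status /mnt2/mount   # check progress and error counts
btrfs scrub cancel /mnt2/mount   # bail out if the load gets too high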

Installed openSUSE Leap 15.1 and Rockstor.

When I try to restore from backup, I get this:

[13/Apr/2020 12:53:50] ERROR [storageadmin.views.disk:418] Error running a command. cmd = /usr/sbin/smartctl --info /dev/disk/by-id/usb-Samsung_Flash_Drive_0363015100009329-0:0. rc = 1. stdout = ['smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.12.14-lp151.28.44-default] (SUSE RPM)', 'Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org', '', '/dev/disk/by-id/usb-Samsung_Flash_Drive_0363015100009329-0:0: Unknown USB bridge [0x090c:0x1000 (0x1100)]', 'Please specify device type with the -d option.', '', 'Use smartctl -h to get a usage summary', '', '']. stderr = ['']
Traceback (most recent call last):
  File "/opt/rockstor/src/rockstor/storageadmin/views/disk.py", line 416, in _update_disk_state
    do.name, do.smart_options)
  File "/opt/rockstor/src/rockstor/system/smart.py", line 338, in available
    [SMART, "--info"] + get_dev_options(device, custom_options)
  File "/opt/rockstor/src/rockstor/system/osi.py", line 176, in run_command
    raise CommandException(cmd, out, err, rc)
CommandException: Error running a command. cmd = /usr/sbin/smartctl --info /dev/disk/by-id/usb-Samsung_Flash_Drive_0363015100009329-0:0. rc = 1. stdout = ['smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.12.14-lp151.28.44-default] (SUSE RPM)', 'Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org', '', '/dev/disk/by-id/usb-Samsung_Flash_Drive_0363015100009329-0:0: Unknown USB bridge [0x090c:0x1000 (0x1100)]', 'Please specify device type with the -d option.', '', 'Use smartctl -h to get a usage summary', '', '']. stderr = ['']
^C

P.S. It's not an issue I'm experiencing (the USB stick is unplugged now); I just wanted to mention this bug :slight_smile:

Another bug is that the NFS service appears off in the web GUI, but it's running fine on the system:
Warning! NFS Service is not running. Clients won't be able to mount unless NFS is running.

@shocker Hello again, and thanks for the reports.

The following:

Just means that the USB stick was not known to smartmontools, so this is essentially independent of Rockstor. It just means there will be log spamming, as we use the smartctl program to assess whether S.M.A.R.T is supported, which informs the enablement of the S.M.A.R.T switch at the end of each disk entry on the disk overview page.

A newer version of smartmontools could fix this if anyone reports that device to them. I’ve done one or two of these reports myself over the years. Another option is to try and configure a custom S.M.A.R.T option for that device. This is covered in the following Rockstor documentation section:

Disk Custom S.M.A.R.T Options

Which uses an almost identical error message as an example of how these custom options can help smartctl retrieve useful info from devices that are not currently in its database of supported devices. But with many USB keys there is simply no S.M.A.R.T support to be had, irrespective of which option is tried.
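
By way of illustration only (the device type is a guess; many USB bridges speak SAT, and the smartctl man page lists several USB-specific -d types):

# Try SCSI-to-ATA translation first, the most common USB bridge type:
smartctl -d sat --info /dev/disk/by-id/usb-Samsung_Flash_Drive_0363015100009329-0:0
# Other -d values worth trying per the smartctl man page:
#   usbjmicron, usbcypress, usbsunplus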

So in short it’s a smartmontools thing in this case.

Re:

This one is worth a new thread, and it would help to specify that it's openSUSE-testing related. Also, if you could include in that thread some investigative info, say from the logs, it might help to track this down. I've not looked at the NFS stuff on the openSUSE side just yet, so it may help to kick off whatever individual bugs we can tease out of that area before our relaunch on openSUSE.

Thanks again for the reports, and well done on moving over to the ‘Built on openSUSE’ variant. Hopefully it's serving you well, bar the outstanding differences from our stable offering on CentOS. And if you could start a new thread on the inability to detect NFS service status: it's likely just down to a difference in service names or the like, so it may be an easy fix once we have enough info on why it's failing. Try to include as much info as possible, i.e. how you set up NFS etc., so your observed failure can be reproduced. We can then make an issue of it and hopefully get it sorted quickly.
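
As a starting point for that thread, output along these lines would help (the unit names here are my guess; they differ between distros, which may itself be the cause):

# Check both common unit names; SUSE has historically used 'nfsserver':
systemctl status nfs-server.service nfsserver.service
# Confirm the server is actually exporting:
showmount -e localhost
exportfs -v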

Done :slight_smile:
One more thing: I'm trying to move my license to the new server. I have updated the Appliance ID in appman.rockstor.com, but it's not applied. Do I need to do something else?

I'll now continue the troubleshooting on this topic: balance, and try to recover my degraded raid6 :wink:

Cheers!

Tried to balance; it seems the system is doing 1 chunk per minute, way too slow, but at least it's not freezing anymore :slight_smile:

btrfs balance status -v /mnt2/mount/

Balance on '/mnt2/mount/' is running
3 out of about 20965 chunks balanced (4 considered), 100% left
Dumping filters: flags 0x7, state 0x1, force is off
DATA (flags 0x0): balancing
METADATA (flags 0x0): balancing
SYSTEM (flags 0x0): balancing

Now the broken HDD is removed and my FS looks like:

devid 19 size 12.73TiB used 12.01TiB path /dev/sda
devid 20 size 12.73TiB used 12.01TiB path /dev/sdf
*** Some devices missing

Should I try btrfs device delete missing /mnt2/mount ?
The faulty device wasn't replaced; I just removed it from the pool, and the disk is also deleted from “disks”.

The FS usage looks like:

btrfs filesystem usage /mnt2/mount

WARNING: RAID56 detected, not implemented
WARNING: RAID56 detected, not implemented
WARNING: RAID56 detected, not implemented
Overall:
Device size: 172.83TiB
Device allocated: 0.00B
Device unallocated: 172.83TiB
Device missing: 7.28TiB
Used: 0.00B
Free (estimated): 0.00B (min: 8.00EiB)
Data ratio: 0.00
Metadata ratio: 0.00
Global reserve: 512.00MiB (used: 0.00B)

Data,RAID6: Size:137.86TiB, Used:135.36TiB
/dev/sda 12.00TiB
/dev/sdb 12.00TiB
/dev/sdd 12.00TiB
/dev/sde 12.00TiB
/dev/sdf 12.00TiB
/dev/sdg 7.26TiB
/dev/sdh 7.26TiB
/dev/sdi 7.26TiB
/dev/sdj 7.26TiB
/dev/sdk 7.26TiB
/dev/sdl 7.26TiB
/dev/sdm 7.26TiB
/dev/sdn 7.26TiB
/dev/sdo 7.26TiB
/dev/sdp 7.26TiB
/dev/sdq 7.26TiB
/dev/sdr 7.26TiB
/dev/sds 7.26TiB
/dev/sdt 7.26TiB
missing 6.67TiB

Metadata,RAID6: Size:153.79GiB, Used:149.60GiB
/dev/sda 9.97GiB
/dev/sdb 9.97GiB
/dev/sdd 9.97GiB
/dev/sde 9.97GiB
/dev/sdf 9.97GiB
/dev/sdg 9.38GiB
/dev/sdh 9.38GiB
/dev/sdi 9.38GiB
/dev/sdj 9.38GiB
/dev/sdk 9.38GiB
/dev/sdl 9.38GiB
/dev/sdm 9.38GiB
/dev/sdn 9.38GiB
/dev/sdo 9.38GiB
/dev/sdp 9.38GiB
/dev/sdq 9.38GiB
/dev/sdr 9.38GiB
/dev/sds 9.38GiB
/dev/sdt 9.38GiB
missing 8.63GiB

System,RAID6: Size:13.81MiB, Used:10.97MiB
/dev/sdg 1.06MiB
/dev/sdh 1.06MiB
/dev/sdi 1.06MiB
/dev/sdj 1.06MiB
/dev/sdk 1.06MiB
/dev/sdl 1.06MiB
/dev/sdm 1.06MiB
/dev/sdn 1.06MiB
/dev/sdo 1.06MiB
/dev/sdp 1.06MiB
/dev/sdq 1.06MiB
/dev/sdr 1.06MiB
/dev/sds 1.06MiB
/dev/sdt 1.06MiB
missing 1.06MiB

Unallocated:
/dev/sda 736.16GiB
/dev/sdb 736.16GiB
/dev/sdd 736.16GiB
/dev/sde 736.16GiB
/dev/sdf 736.16GiB
/dev/sdg 8.31GiB
/dev/sdh 8.31GiB
/dev/sdi 8.31GiB
/dev/sdj 8.31GiB
/dev/sdk 8.31GiB
/dev/sdl 8.31GiB
/dev/sdm 8.31GiB
/dev/sdn 8.31GiB
/dev/sdo 8.31GiB
/dev/sdp 8.31GiB
/dev/sdq 8.31GiB
/dev/sdr 8.31GiB
/dev/sds 8.31GiB
/dev/sdt 8.31GiB
missing 616.87GiB

@shocker Re:

So that's good to know. Have you disabled quotas again? That's usually the main slowdown. Also note that the parity raids are rather slow, especially on the repair side of things.

Given your report indicates missing disks, I'm assuming you must have mounted degraded. The Web-UI should be giving you pointers on how to do the delete missing. But yes, you really do need to get the pool into a no-missing-devices state, either through the Web-UI or the command line. This would have been best done prior to initiating a balance, however. And given the pool is potentially poorly until there are no missing devs, and you are likely mounted degraded, you need to stick to one thing at a time.

Let us know how it goes. I'd be tempted to cancel the balance, wait until it's confirmed as not running, then disable quotas, and then remove the missing device from the command line or Web-UI. This in turn will kick off a kind of internal balance which does not show (or at least didn't use to) in a balance status request. In some situations the Web-UI can indicate this, but btrfs balance status didn't use to. I haven't tried it recently.
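
At the command line that sequence would look roughly like this (a sketch only, using the mount point from earlier in the thread; the Web-UI route is equivalent):

btrfs balance cancel /mnt2/mount
btrfs balance status /mnt2/mount   # confirm no balance is running
btrfs quota disable /mnt2/mount
btrfs device delete missing /mnt2/mount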

Let us know how you get on. But do remember that such things as the following output at the command line:

are further indicators that the btrfs parity levels have a little way to go. Although I did see a proposed fix for this particular shortfall on the btrfs mailing list more recently, so maybe that will arrive in good time via our upstream kernel updates.

So in short: one thing at a time, and get quotas turned off if they aren't off already. And give each thing a chance to properly settle before moving on. But the prime directive is to get to no missing devices, as otherwise you are dependent on a degraded mount, which is exclusively for repair procedures; and balance is not that, unless initiated internally via a dev delete. Again, that type of balance, at least fairly recently, didn't use to show up at all, other than by watching space move around and seeing a device indicated as zero size: that's how our ‘internal’ balance indicator works within the Web-UI.

Hope that helps.

Indeed, the quotas are enabled again :slight_smile:
I have already cancelled the balance, and I'll give the delete a try to see if it no longer freezes the system. I'm waiting for the maintenance hours to play with it, and I'll use the GUI, as this way we can also test the functionality :slight_smile:

Thanks!

Quick question: any idea about the activation via the appman? I have changed it on the appman, but the GUI keeps asking me to activate. Do I need to do something else?

@shocker Re:

Yes, we default to enabling them on import.

This is a temporary problem while we are in this transition phase. The Stable Channel for our “Built on openSUSE” variant is as-yet empty. Once we have feature parity, or near enough, we are to transition the nominated testing channel rpm over to the openSUSE Stable Channel, so we can get on with addressing our rather significant technical debt (old Django, old Python, etc.) via the openSUSE-based testing channel. This will inevitably cause breakage, but that is what the testing channel is for. In the interim transition period we are publishing only openSUSE-based rpms to the testing channel, and CentOS-based rpms to the Stable Channel. Once we are ready for the “Built on openSUSE” re-launch of sorts, we will then maintain both Stable and Testing as per the docs.

So in short, there are only testing rpms at the moment for our ‘Built on openSUSE’ variant. But this is hopefully to change shortly, once we get the last few niggles sorted. I'm also keen to address one or two long-standing bugs in both our variants before the re-launch, but that very much depends on available time and one or two looming deadlines.

Hopefully that answers that one. And thanks again for testing our experimental edge. And do keep an eye on the Leap 15.2 release schedule, by the way; just saying :slight_smile:

Thanks for the feedback. It seems it will be released on the 7th of May. Let's see what Btrfs improvements it will bring :slight_smile:

Started the delete process. It's been running for ~9h now:

The ‘missing’ figure for unallocated decreased from -6.67TiB to -6.49TiB. No idea whether it's normal for this to be so slow, but at least I can see some progress :slight_smile:
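
In case it's useful to anyone, I'm tracking it roughly like this (the interval is arbitrary):

# Re-check the usage summary every 5 minutes without hammering the pool:
watch -n 300 'btrfs filesystem usage /mnt2/mount | grep -i missing'
# Per-device view as an alternative:
btrfs device usage /mnt2/mount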

And the disks are busy:

-6.18TiB; it's running very slowly :slight_smile:
Is there a way to fine-tune the speed? I would like to limit it so as not to degrade the system, even if that means it keeps running for a month or so :slight_smile:

@shocker Hello again.

Re:

No, I don't think so. And the parity raids are particularly bad speed-wise in these repair scenarios. The only thing you can do is make sure quotas are off, and you have already done that.

Incidentally, I found the linux-btrfs mailing list post regarding a proposed fix for the previously mentioned poor support in the parity raids for the ‘btrfs fi usage’ command:

https://lore.kernel.org/linux-btrfs/91df2f66-ea8a-c369-9f03-b826dbbe1481@libero.it/

Pretty sure you can’t even play with nice values for this ‘level’ of stuff.

You may be able to play with the nice level of other services, however. But again, my suspicion is that it's likely not to have much effect.
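
If you do want to experiment, this is the kind of long shot I mean (a sketch only; I'd expect little effect, since the heavy lifting happens in kernel worker threads):

# Drop any btrfs worker processes to idle I/O priority:
pgrep btrfs | xargs -r -n 1 ionice -c 3 -p
# And/or lower their CPU priority:
pgrep btrfs | xargs -r -n 1 renice 19 -p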

Thanks @phillxnet, I already tried to nice the processes, but without any effect. Somehow this runs on top of everything :slight_smile:
Too bad there's no way to play with this like you can with ZFS.

The only difference here is the warnings, except that nothing is missing and I can't see any empty results for used. Or am I getting it wrong? :slight_smile:

btrfs fi us /

Overall:
Device size: 109.30GiB
Device allocated: 7.02GiB
Device unallocated: 102.28GiB
Device missing: 0.00B
Used: 4.16GiB
Free (estimated): 104.25GiB (min: 104.25GiB)
Data ratio: 1.00
Metadata ratio: 1.00
Global reserve: 16.00MiB (used: 0.00B)

Data,single: Size:6.01GiB, Used:4.04GiB
/dev/sdc3 6.01GiB

Metadata,single: Size:1.01GiB, Used:121.61MiB
/dev/sdc3 1.01GiB

System,single: Size:4.00MiB, Used:16.00KiB
/dev/sdc3 4.00MiB

Unallocated:
/dev/sdc3 102.28GiB

@shocker Re:

I think that's the core of it. It will, in time, hopefully gain the same support that we have seen for the other btrfs raid levels. This has a knock-on effect for Rockstor as it goes, since we use this command in a few places, and so we end up with size-reporting issues when these younger (in btrfs) raid levels are used. So it will be nice to have this fixed upstream.

I haven't looked at it in detail, but it's got to ease our task of reading state if that state is reported equally well across the various raid levels.
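
For the curious, a sketch of how those numbers could be read a little more robustly in the meantime (the awk field index is an assumption about the output layout):

# Raw byte values avoid unit-formatting ambiguity:
btrfs filesystem usage -b /mnt2/mount
# e.g. pull just the missing-device byte count from the Overall block:
btrfs filesystem usage -b /mnt2/mount | awk '/Device missing:/ {print $3}'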