Scheduled Scrub not running

Hi!

So here is the output:

[root@nitrogen ~]# psql -U rocky storageadmin
Password for user rocky:
psql (9.2.18)
Type "help" for help.

storageadmin=# SELECT pool_id,pid,count(*) FROM storageadmin_poolscrub GROUP BY pool_id,pid HAVING count(*) > 1;
 pool_id | pid | count
---------+-----+-------
(0 rows)

I’ll be back later with more, after rebooting and starting a scrub job.

And with a scrub job running:

storageadmin=# SELECT pool_id,pid,count(*) FROM storageadmin_poolscrub GROUP BY pool_id,pid HAVING count(*) > 1;
 pool_id | pid | count
---------+-----+-------
(0 rows)

Hmmm?

I noticed a zombie gunicorn process that was started at the same time as the scrub job.

Tasks: 224 total, 2 running, 221 sleeping, 0 stopped, 1 zombie
%Cpu(s): 0.2 us, 12.3 sy, 0.0 ni, 70.5 id, 16.9 wa, 0.0 hi, 0.1 si, 0.0 st
KiB Mem : 7084448 total, 3255612 free, 393192 used, 3435644 buff/cache
KiB Swap: 5862396 total, 5862396 free, 0 used. 6342896 avail Mem

root 14792 0.0 0.0 0 0 ? Z 21:52 0:00 [gunicorn] defunct

[root@nitrogen ~]# pstree
systemd─┬─NetworkManager─┬─dhclient
│ └─2*[{NetworkManager}]
├─agetty
├─atd
├─auditd───{auditd}
├─avahi-daemon───avahi-daemon
├─btrfs───4*[{btrfs}]
├─chronyd
├─crond
├─dbus-daemon
├─docker-wrapper─┬─docker───5*[{docker}]
│ └─2*[{docker-wrapper}]
├─gssproxy───5*[{gssproxy}]
├─irqbalance
├─master─┬─pickup
│ └─qmgr
├─netatalk─┬─afpd
│ ├─cnid_metad
│ └─{netatalk}
├─polkitd───5*[{polkitd}]
├─postgres───15*[postgres]
├─rsyslogd───2*[{rsyslogd}]
├─shellinaboxd───shellinaboxd
├─smartd
├─sshd───sshd───bash───pstree
├─supervisord─┬─data-collector───5*[{data-collector}]
│ ├─django───2*[{django}]
│ ├─gunicorn─┬─gunicorn───2*[{gunicorn}]
│ │ └─gunicorn─┬─gunicorn
│ │ └─2*[{gunicorn}]
│ └─nginx───2*[nginx]
├─systemd-journal
├─systemd-logind
└─systemd-udevd

storageadmin=# SELECT pool_id,pid,count(*) FROM storageadmin_poolscrub GROUP BY pool_id,pid HAVING count(*) >1;
 pool_id | pid | count 
---------+-----+-------
(0 rows)

FYI

storageadmin=# SELECT * FROM storageadmin_poolscrub;
 id | pool_id |  status  |  pid  |          start_time           |           end_time            | kb_scrubbed | data_extents_scrubbed |
 tree_extents_scrubbed | tree_bytes_scrubbed | read_errors | csum_errors | verify_errors | no_csum | csum_discards | super_errors | mall
oc_errors | uncorrectable_errors | unverified_errors | corrected_errors | last_physical 
----+---------+----------+-------+-------------------------------+-------------------------------+-------------+-----------------------+
-----------------------+---------------------+-------------+-------------+---------------+---------+---------------+--------------+-----
----------+----------------------+-------------------+------------------+---------------
  1 |       3 | finished | 26203 | 2017-07-15 10:14:53.958678-07 | 2017-07-15 14:07:04.958678-07 |  7778964796 |             134997767 |
                754941 |         12368953344 |           0 |           0 |             0 |  519680 |             0 |            0 |     
        0 |                    0 |                 0 |                0 | 8530047270912
(1 row)

OK, my manually started scrub job finished successfully:

scrub status for 2815c029-9214-4217-9571-2d7fb501da0c
scrub started at Sat Aug 19 21:52:11 2017 and finished after 02:54:54
data_extents_scrubbed: 58687013
tree_extents_scrubbed: 322899
data_bytes_scrubbed: 3732788396032
tree_bytes_scrubbed: 5290377216
read_errors: 0
csum_errors: 0
verify_errors: 0
no_csum: 225408
csum_discards: 0
super_errors: 0
malloc_errors: 0
uncorrectable_errors: 0
unverified_errors: 0
corrected_errors: 0
last_physical: 3778576121856

storageadmin=# SELECT pool_id,pid,count(*) FROM storageadmin_poolscrub GROUP BY pool_id,pid HAVING count(*) > 1;
 pool_id | pid | count
---------+-----+-------
(0 rows)

Gunicorn has lots of temp files open which are already deleted, according to lsof.

Create a scheduled scrub job:

  • Terminated with status error after 2 sec (according to the Web UI)

storageadmin=# SELECT pool_id,pid,count(*) FROM storageadmin_poolscrub GROUP BY pool_id,pid HAVING count(*) > 1;
 pool_id | pid | count
---------+-----+-------
(0 rows)

No zombie process this time

@peter and @magicalyak Thanks for this folks; looks like my db suspicion in that area was indeed the wrong tree. Oh well, I’ll have another look, but I still haven’t seen the exact same thing here yet.

So I’m assuming that you are still experiencing the same log entries / email on that failed scheduled scrub? I.e. the

... returned non-string (type list) ...

one.

Also, could you try running a manually initiated (via Web-UI) scrub (when no others are running or scheduled) but adding the force option, as when used there is a little bit of extra checking that is done. I don’t think it’s going to help, as all it does is turn existing running or started jobs into the Terminated state, and I don’t think you have any of those anyway.

[screenshot: terminated-scrub-task]

I’ll ponder some more and see if anything comes up.

Looks like my mid-month scrub job for the Root (Rockstor drive) ran OK; then again, there’s not a lot on there other than the OS itself, so there isn’t much to scrub.

It will be more interesting to see what happens at the start of the month when it does the main pool.

@Dragon2611 Thanks for the update and do let us know how your main pool schedule scrub goes. Still unable to reproduce the “non-string (type list)” issue but at least we have the GitHub issue and this thread to keep an eye on it.

Hi!

Correct;

The manual/web UI one:

  • ran successfully and with no errors
  • gave me a zombie process (might have been a coincidence)
  • shows up in the web UI/Storage>Pools>Scrubs table
  • gave me no records in the db as above

The scheduled one:

  • did not run successfully, seems to have terminated within 2 sec (however that may or may not be correct)
  • did not start a zombie process
  • not listed in the web UI/Storage>Pools>Scrubs table
  • resulted in the e-mail notification you mentioned

@peter Thanks for the detailed report / update.

No worries on that one; you should see it if you execute the SQL command indicated by @magicalyak earlier re “FYI”, which shows all entries. My command was looking for repeat instances of a pid (the wrong tree to bark up).
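Since the thread keeps coming back to this duplicate-pid query, here is a self-contained sketch of what it checks, using an in-memory sqlite3 stand-in for the storageadmin_poolscrub table (column subset assumed, not the real schema):

```python
import sqlite3

# In-memory stand-in for a subset of storageadmin_poolscrub (assumed columns).
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE storageadmin_poolscrub (id INTEGER, pool_id INTEGER, pid INTEGER)"
)
con.executemany(
    "INSERT INTO storageadmin_poolscrub VALUES (?, ?, ?)",
    [(1, 3, 26203), (2, 3, 26203), (3, 1, 111)],  # pid 26203 recorded twice
)

# Same shape as the psql query above: flag any (pool_id, pid) pair seen twice.
dupes = con.execute(
    "SELECT pool_id, pid, count(*) FROM storageadmin_poolscrub "
    "GROUP BY pool_id, pid HAVING count(*) > 1"
).fetchall()
print(dupes)  # [(3, 26203, 2)] here; a healthy table gives [] (0 rows)
```

An empty result, as in all the psql outputs above, means no pid was double-recorded for a pool.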

By way of an update on my part, I have opened a new issue that describes a related but different bug. I will tend to this latter issue first, as it also results in blocked scheduled tasks, but I can reproduce it and it should help make the scheduled scrub code a little more robust:

I don’t think it shares the same cause but is an improvement in the same area.

If you are game there might be a clue in the following db table which holds the current task status (largest id) and history:

psql -U rocky smartdb

pass as before

select * from smart_manager_task;

and the task_def_id field there points to the contents of the following table:

select * from smart_manager_taskdefinition;

i.e. the figures in the task_def_id column of the task table indicate the id entries of the taskdefinition table.
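To make the relationship concrete, here is a minimal sqlite3 mock-up of the two tables (columns and values are illustrative, not a dump of the real schema):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE smart_manager_taskdefinition (id INTEGER, name TEXT)")
con.execute(
    "CREATE TABLE smart_manager_task (id INTEGER, task_def_id INTEGER, state TEXT)"
)
con.execute("INSERT INTO smart_manager_taskdefinition VALUES (5, 'bigdata-scrub')")
con.execute("INSERT INTO smart_manager_task VALUES (42, 5, 'error')")

# task.task_def_id references taskdefinition.id, so a task row can be
# resolved to its definition with a join:
rows = con.execute(
    "SELECT t.id, d.name, t.state FROM smart_manager_task t "
    "JOIN smart_manager_taskdefinition d ON t.task_def_id = d.id"
).fetchall()
print(rows)  # [(42, 'bigdata-scrub', 'error')]
```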

We will get there soon, and I have mostly sorted the cause of this latest issue, so bit by bit.

Thanks for your help in this area, much appreciated.

Hi

One thought: since we now know that there are workarounds to both the login issue (schedule a reboot each night) and to the scrub thing (run them manually), I have added a note about that in my original post. To highlight that further for other users, it could be useful to mark the title of the post as [Workaround].

I’ll have a look at the other issue later on


@peter

That’s an idea. That open-files thing is a bit of a bad one, and again it only seems to affect some users. Oh well, just need more info and time really.

Cheers.

I know the manual one will run, but I wanted to add that scheduled scrubs did work in the past. I rebuilt the OS a few months ago and only recently re-added them, noticing the error.
How are we submitting the scrub job? Just in cron? If so, shouldn’t we see the entries there?

If we’re not using cron, I can figure out the commands we need to enter. If the call could write to cron then problem solved :wink: Let me know if you want me to contribute, if that’s the path.

@magicalyak Hello again.

Yes, and we do (sort of), e.g.:

cat /etc/cron.d/rockstortab 

SHELL=/bin/bash
PATH=/sbin:/bin:/usr/sbin:/usr/bin
MAILTO=root
# These entries are auto generated by Rockstor. Do not edit.
*/5 * * * * root /opt/rockstor-dev//bin/st-pool-scrub 7 \*-*-*-*-*-*

And the indicated script is what I’m currently making some bug-fixing changes to. At least that’s the hope.
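For reference, the auto-generated entry above splits into the standard cron.d fields plus two Rockstor arguments; a rough sketch (field meanings inferred from this thread):

```python
# One line from /etc/cron.d/rockstortab (copied from above).
line = r"*/5 * * * * root /opt/rockstor-dev//bin/st-pool-scrub 7 \*-*-*-*-*-*"

fields = line.split()
schedule = " ".join(fields[:5])  # standard five-field cron schedule
user = fields[5]                 # cron.d files carry a user column
command = fields[6]              # the st-pool-scrub wrapper script
task_id = fields[7]              # the taskdefinition id (not the pool id)
window = fields[8]               # the crontabwindow argument

print(schedule, user, task_id)  # */5 * * * * root 7
```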

Should have a pr ready soon, just need to do some more testing. But I haven’t yet tracked down the “… returned non-string (type list)…” thing. Tricky without a reproducer, so fixing what I can for now in that area.

Were it not for the ever-present details :smiley:

Thanks for the offer to help; if you still have a reproducer of the “list when string expected” scheduled scrub failure, it would help to see some additional logging to know what the contents of the list elements were.
I was thinking something along the lines of:

logger.debug('joined presentation of passed pid=%s' % '\n'.join(pid))

in the try clause at:

Your and @peter’s reported emails seem to indicate that something is calling _validate_pool with the pid as a list, such as is the case when more than a single item is returned by a queryset where only a single match is expected (a single instance would then be a string, not a list). That’s my best guess so far. That way we would have more to work with, as we could see where the multiple pids (pool ids) are coming from.
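The reported TypeError is consistent with that guess. A minimal reproducer of the failure mode (my assumption, not Rockstor code): logging calls str() on the message, and str() on any object whose __str__ returns a list raises exactly the message seen in the tracebacks:

```python
# Hypothetical stand-in: a log message object whose __str__ wrongly
# returns a list (e.g. a raw queryset result) instead of a string.
class ListMessage(object):
    def __str__(self):
        return ["pid-one", "pid-two"]  # a list, not a string

try:
    str(ListMessage())  # what logging's getMessage() effectively does
    err = None
except TypeError as exc:
    err = str(exc)

print(err)  # __str__ returned non-string (type list)
```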

Hope that helps, but for the time being I’m looking to the more recent and reproducible scheduled scrub blocker I found while investigating this one. Hope to circle back around to this one though.

Cheers.

I think there may be two issues.
Right now I think it’s the crontab for Rockstor, but this is just me guessing. Note I have two scrubs; does this crontab look right to you? Note the double slashes?

[root@rocky ~]# cat /etc/cron.d/rockstortab 
SHELL=/bin/bash
PATH=/sbin:/bin:/usr/sbin:/usr/bin
MAILTO=root
MAILFROM=rocky@gamull.com
# These entries are auto generated by Rockstor. Do not edit.
0 1 * * * root /opt/rockstor//bin/st-system-power 3 \*-*-*-*-*-*
0 7 * * 3 root /opt/rockstor//bin/st-pool-scrub 4 \*-*-*-*-*-*
30 18 * * 0 root /opt/rockstor//bin/st-pool-scrub 5 \*-*-*-*-*-*

Log only shows

Aug 20 18:30:01 rocky.gamull.com CROND[16439]: (root) CMD (/opt/rockstor//bin/st-pool-scrub 5 \*-*-*-*-*-*)

Manual Run

[root@rocky log]# /opt/rockstor//bin/st-pool-scrub 5 \*-*-*-*-*-*
Traceback (most recent call last):
  File "/usr/lib64/python2.7/logging/handlers.py", line 76, in emit
    if self.shouldRollover(record):
  File "/usr/lib64/python2.7/logging/handlers.py", line 154, in shouldRollover
    msg = "%s\n" % self.format(record)
  File "/usr/lib64/python2.7/logging/__init__.py", line 724, in format
    return fmt.format(record)
  File "/usr/lib64/python2.7/logging/__init__.py", line 464, in format
    record.message = record.getMessage()
  File "/usr/lib64/python2.7/logging/__init__.py", line 324, in getMessage
    msg = str(self.msg)
TypeError: __str__ returned non-string (type list)
Logged from file pool_scrub.py, line 78
Traceback (most recent call last):
  File "/usr/lib64/python2.7/logging/handlers.py", line 76, in emit
    if self.shouldRollover(record):
  File "/usr/lib64/python2.7/logging/handlers.py", line 154, in shouldRollover
    msg = "%s\n" % self.format(record)
  File "/usr/lib64/python2.7/logging/__init__.py", line 724, in format
    return fmt.format(record)
  File "/usr/lib64/python2.7/logging/__init__.py", line 464, in format
    record.message = record.getMessage()
  File "/usr/lib64/python2.7/logging/__init__.py", line 324, in getMessage
    msg = str(self.msg)
TypeError: __str__ returned non-string (type list)
Logged from file pool_scrub.py, line 39

@magicalyak

Yes, that is fine; not elegant, but it does work that way. I have them working here just fine with that arrangement, so no worries there, at least on the functional side.

Could you try turning on debug log level:

/opt/rockstor/bin/debug-mode ON

Note that this command will restart your rockstor service though.
Run it without the “ON” to find out the options.
I don’t think that’s going to help, but it’s worth a try.

It may help, at least to get more info, if you could add that logging line I detailed in my last post inside the try clause, at the same indent (with only spaces) as the return command and just before it. That might tell us what the contents of this mystery list are. You would then have to restart the rockstor service for this to take effect.

The problem with this issue is that only you and @peter have reported it and I can’t reproduce or track down (yet) what is causing it so still at the ‘more info’ stage really.

See how that goes if you are game, but no worries otherwise, as it’s not good to play with code-file editing ‘in production’; but given your history in the rockon-registry repo I figured you might be up for it. Quite frustrating at this end not having a reproducer. But at least it’s leading to other fixes, which I can hopefully get in soon by way of my next (pending) pull request.

Nice on the manual run by the way. But here I simply get a long pause as it does its bit (or waits for my every-5-min job to complete) and then nothing:

[root@rockdev ~]# /opt/rockstor-dev//bin/st-pool-scrub 7 \*-*-*-*-*-*
[root@rockdev ~]# 

with the Web-UI pool details noting its occurrence:

and another in the task history:

[screenshot: task-history-scrub]

The zeroing of the start-time seconds in the latter is intentional, but I’m not sure why yet.

We should also normalise those tables at some point.

Cheers.

Note the top errors are pretty old and likely unrelated, but I included them “just in case”; the “failed to get status” errors (note the dates) are the relevant ones.

[25/Jul/2017 16:09:28] ERROR [storageadmin.middleware:32] Exception occured while processing a request. Path: /api/commands/refresh-snapshot-state method: POST
[25/Jul/2017 16:09:28] ERROR [storageadmin.middleware:33] Error running a command. cmd = /sbin/btrfs property get /mnt2/data/rockon-root/btrfs/subvolumes/e036e54af38ee90556073ed98d90ec8bf1514d919c733f4ac7a455c17e0e1a7a. rc = 1. stdout = ['']. stderr = ['ERROR: failed to open /mnt2/data/rockon-root/btrfs/subvolumes/e036e54af38ee90556073ed98d90ec8bf1514d919c733f4ac7a455c17e0e1a7a: No such file or directory', 'ERROR: failed to detect object type: No such file or directory', 'usage: btrfs property get [-t <type>] <object> [<name>]', '', '    Gets a property from a btrfs object.', '', '    If no name is specified, all properties for the given object are', '    printed.', '    A filesystem object can be a the filesystem itself, a subvolume,', "    an inode or a device. The '-t <type>' option can be used to explicitly", '    specify what type of object you meant. This is only needed when a', '    property could be set for more then one object type. Possible types', '    are s[ubvol], f[ilesystem], i[node] and d[evice].', '', '']
Traceback (most recent call last):
  File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/core/handlers/base.py", line 132, in get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/views/decorators/csrf.py", line 58, in wrapped_view
    return view_func(*args, **kwargs)
  File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/views/generic/base.py", line 71, in view
    return self.dispatch(request, *args, **kwargs)
  File "/opt/rockstor/eggs/djangorestframework-3.1.1-py2.7.egg/rest_framework/views.py", line 452, in dispatch
    response = self.handle_exception(exc)
  File "/opt/rockstor/eggs/djangorestframework-3.1.1-py2.7.egg/rest_framework/views.py", line 449, in dispatch
    response = handler(request, *args, **kwargs)
  File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/utils/decorators.py", line 145, in inner
    return func(*args, **kwargs)
  File "/opt/rockstor/src/rockstor/storageadmin/views/command.py", line 323, in post
    import_snapshots(share)
  File "/opt/rockstor/src/rockstor/storageadmin/views/share_helpers.py", line 139, in import_snapshots
    share.name)
  File "/opt/rockstor/src/rockstor/fs/btrfs.py", line 507, in snaps_info
    snap_name, writable = parse_snap_details(mnt_pt, fields)
  File "/opt/rockstor/src/rockstor/fs/btrfs.py", line 469, in parse_snap_details
    '%s/%s' % (mnt_pt, fields[-1])])
  File "/opt/rockstor/src/rockstor/system/osi.py", line 115, in run_command
    raise CommandException(cmd, out, err, rc)
CommandException: Error running a command. cmd = /sbin/btrfs property get /mnt2/data/rockon-root/btrfs/subvolumes/e036e54af38ee90556073ed98d90ec8bf1514d919c733f4ac7a455c17e0e1a7a. rc = 1. stdout = ['']. stderr = ['ERROR: failed to open /mnt2/data/rockon-root/btrfs/subvolumes/e036e54af38ee90556073ed98d90ec8bf1514d919c733f4ac7a455c17e0e1a7a: No such file or directory', 'ERROR: failed to detect object type: No such file or directory', 'usage: btrfs property get [-t <type>] <object> [<name>]', '', '    Gets a property from a btrfs object.', '', '    If no name is specified, all properties for the given object are', '    printed.', '    A filesystem object can be a the filesystem itself, a subvolume,', "    an inode or a device. The '-t <type>' option can be used to explicitly", '    specify what type of object you meant. This is only needed when a', '    property could be set for more then one object type. Possible types', '    are s[ubvol], f[ilesystem], i[node] and d[evice].', '', '']
[26/Jul/2017 07:00:02] ERROR [scripts.scheduled_tasks.pool_scrub:76] Failed to start scrub at pools/data/scrub
[26/Jul/2017 07:00:02] ERROR [scripts.scheduled_tasks.pool_scrub:37] Failed to get scrub status at pools/data/scrub/status
[31/Jul/2017 07:00:02] ERROR [scripts.scheduled_tasks.pool_scrub:76] Failed to start scrub at pools/bigdata/scrub
[31/Jul/2017 07:00:02] ERROR [scripts.scheduled_tasks.pool_scrub:37] Failed to get scrub status at pools/bigdata/scrub/status
[02/Aug/2017 07:00:03] ERROR [scripts.scheduled_tasks.pool_scrub:76] Failed to start scrub at pools/data/scrub
[02/Aug/2017 07:00:03] ERROR [scripts.scheduled_tasks.pool_scrub:37] Failed to get scrub status at pools/data/scrub/status
[07/Aug/2017 07:00:03] ERROR [scripts.scheduled_tasks.pool_scrub:76] Failed to start scrub at pools/bigdata/scrub
[07/Aug/2017 07:00:03] ERROR [scripts.scheduled_tasks.pool_scrub:37] Failed to get scrub status at pools/bigdata/scrub/status
[09/Aug/2017 07:00:02] ERROR [scripts.scheduled_tasks.pool_scrub:76] Failed to start scrub at pools/data/scrub
[09/Aug/2017 07:00:02] ERROR [scripts.scheduled_tasks.pool_scrub:37] Failed to get scrub status at pools/data/scrub/status
[14/Aug/2017 07:00:03] ERROR [scripts.scheduled_tasks.pool_scrub:76] Failed to start scrub at pools/bigdata/scrub
[14/Aug/2017 07:00:03] ERROR [scripts.scheduled_tasks.pool_scrub:37] Failed to get scrub status at pools/bigdata/scrub/status
[16/Aug/2017 07:00:02] ERROR [scripts.scheduled_tasks.pool_scrub:76] Failed to start scrub at pools/data/scrub
[16/Aug/2017 07:00:02] ERROR [scripts.scheduled_tasks.pool_scrub:37] Failed to get scrub status at pools/data/scrub/status
[20/Aug/2017 18:30:04] ERROR [scripts.scheduled_tasks.pool_scrub:76] Failed to start scrub at pools/bigdata/scrub
[20/Aug/2017 18:30:04] ERROR [scripts.scheduled_tasks.pool_scrub:37] Failed to get scrub status at pools/bigdata/scrub/status
[20/Aug/2017 19:33:52] ERROR [scripts.scheduled_tasks.pool_scrub:76] Failed to start scrub at pools/bigdata/scrub
[20/Aug/2017 19:33:52] ERROR [scripts.scheduled_tasks.pool_scrub:37] Failed to get scrub status at pools/bigdata/scrub/status
[20/Aug/2017 19:34:44] ERROR [scripts.scheduled_tasks.pool_scrub:76] Failed to start scrub at pools/bigdata/scrub
[20/Aug/2017 19:34:44] ERROR [scripts.scheduled_tasks.pool_scrub:37] Failed to get scrub status at pools/bigdata/scrub/status
[20/Aug/2017 19:35:41] ERROR [scripts.scheduled_tasks.pool_scrub:76] Failed to start scrub at pools/bigdata/scrub
[20/Aug/2017 19:35:41] ERROR [scripts.scheduled_tasks.pool_scrub:37] Failed to get scrub status at pools/bigdata/scrub/status
[21/Aug/2017 07:30:03] ERROR [scripts.scheduled_tasks.pool_scrub:76] Failed to start scrub at pools/bigdata/scrub
[21/Aug/2017 07:30:03] ERROR [scripts.scheduled_tasks.pool_scrub:37] Failed to get scrub status at pools/bigdata/scrub/status
[21/Aug/2017 09:55:57] ERROR [scripts.scheduled_tasks.pool_scrub:76] Failed to start scrub at pools/bigdata/scrub
[21/Aug/2017 09:55:57] ERROR [scripts.scheduled_tasks.pool_scrub:37] Failed to get scrub status at pools/bigdata/scrub/status
[21/Aug/2017 09:55:57] DEBUG [scripts.scheduled_tasks.pool_scrub:87] task(5) finished with state(error).
[21/Aug/2017 10:02:02] ERROR [scripts.scheduled_tasks.pool_scrub:76] Failed to start scrub at pools/bigdata/scrub
[21/Aug/2017 10:02:02] ERROR [scripts.scheduled_tasks.pool_scrub:37] Failed to get scrub status at pools/bigdata/scrub/status
[21/Aug/2017 10:02:02] DEBUG [scripts.scheduled_tasks.pool_scrub:87] task(5) finished with state(error).

root mail

From rocky@gamull.com  Mon Aug 21 10:02:02 2017
Return-Path: <rocky@gamull.com>
X-Original-To: root
Delivered-To: root@rocky.gamull.com
Received: by rocky.gamull.com (Postfix, from userid 0)
	id DD5332858A8; Mon, 21 Aug 2017 10:02:02 -0400 (EDT)
From: "(Cron Daemon)" <rocky@gamull.com>
To: root@rocky.gamull.com
Subject: Cron <root@rocky> /opt/rockstor//bin/st-pool-scrub 5 \*-*-*-*-*-*
Content-Type: text/plain; charset=UTF-8
Auto-Submitted: auto-generated
Precedence: bulk
X-Cron-Env: <XDG_SESSION_ID=8>
X-Cron-Env: <XDG_RUNTIME_DIR=/run/user/0>
X-Cron-Env: <LANG=en_US.UTF-8>
X-Cron-Env: <SHELL=/bin/bash>
X-Cron-Env: <PATH=/sbin:/bin:/usr/sbin:/usr/bin>
X-Cron-Env: <MAILTO=root>
X-Cron-Env: <MAILFROM=rocky@gamull.com>
X-Cron-Env: <HOME=/root>
X-Cron-Env: <LOGNAME=root>
X-Cron-Env: <USER=root>
Message-Id: <20170821140202.DD5332858A8@rocky.gamull.com>
Date: Mon, 21 Aug 2017 10:02:02 -0400 (EDT)

Traceback (most recent call last):
  File "/usr/lib64/python2.7/logging/handlers.py", line 76, in emit
    if self.shouldRollover(record):
  File "/usr/lib64/python2.7/logging/handlers.py", line 154, in shouldRollover
    msg = "%s\n" % self.format(record)
  File "/usr/lib64/python2.7/logging/__init__.py", line 724, in format
    return fmt.format(record)
  File "/usr/lib64/python2.7/logging/__init__.py", line 464, in format
    record.message = record.getMessage()
  File "/usr/lib64/python2.7/logging/__init__.py", line 324, in getMessage
    msg = str(self.msg)
TypeError: __str__ returned non-string (type list)
Logged from file pool_scrub.py, line 78
Traceback (most recent call last):
  File "/usr/lib64/python2.7/logging/handlers.py", line 76, in emit
    if self.shouldRollover(record):
  File "/usr/lib64/python2.7/logging/handlers.py", line 154, in shouldRollover
    msg = "%s\n" % self.format(record)
  File "/usr/lib64/python2.7/logging/__init__.py", line 724, in format
    return fmt.format(record)
  File "/usr/lib64/python2.7/logging/__init__.py", line 464, in format
    record.message = record.getMessage()
  File "/usr/lib64/python2.7/logging/__init__.py", line 324, in getMessage
    msg = str(self.msg)
TypeError: __str__ returned non-string (type list)
Logged from file pool_scrub.py, line 39

@magicalyak it would appear from your logs:

that you have not re-created your scrub tasks post the 3.9.1-2 release (see later), i.e. from my first comment in this thread and my first to yourself in this thread :face_with_raised_eyebrow::

I.e. the URL for your logged tasks is using the pool name, whereas from the end of the last testing release (last stable), and all throughout the current testing channel release cycle, we have been using the db id of a pool, not its name, in the API calls. And more specifically, since the second update released in the current testing channel, 3.9.1-2:

i.e.:

scripts.scheduled_tasks.pool_scrub - update_state constructs url=pools/2/scrub/status

and then uses that to query the scrub state.
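A sketch of the difference (the function name build_scrub_status_url is hypothetical, purely to illustrate; the path shapes come from the logs above):

```python
def build_scrub_status_url(pool_ident):
    """Return the API path used to query a pool's scrub status.

    3.9.1-2 onward passes the pool's db id; stale scheduled tasks
    created before that still pass the pool name, which no longer works.
    """
    return "pools/%s/scrub/status" % pool_ident


print(build_scrub_status_url(2))          # pools/2/scrub/status        (current)
print(build_scrub_status_url("bigdata"))  # pools/bigdata/scrub/status  (stale task)
```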

Sorry, I should have double-checked that you were on a relevant release; I obviously made a misleading mistake in my post:

Silly me.

I was thinking that the number indicated the pool but, of course, as per my post to @peter:

So for my example here:

cat /etc/cron.d/rockstortab 

SHELL=/bin/bash
PATH=/sbin:/bin:/usr/sbin:/usr/bin
MAILTO=root
# These entries are auto generated by Rockstor. Do not edit.
*/5 * * * * root /opt/rockstor-dev//bin/st-pool-scrub 7 \*-*-*-*-*-*

that “7” points to:

psql -U rocky smartdb

select * from smart_manager_taskdefinition;
 id |         name         | task_type |                    json_meta                    | enabled |   crontab   | crontabwindow 
----+----------------------+-----------+-------------------------------------------------+---------+-------------+---------------
  7 | luks-on-bcache-scrub | scrub     | {"pool_name": "pool-on-luks-dev", "pool": "2"}  | t       | */5 * * * * | *-*-*-*-*-*
  6 | root-scrub3          | scrub     | {"pool_name": "rockstor_rockstor", "pool": "1"} | f       | */5 * * * * | *-*-*-*-*-*
(2 rows)
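The json_meta column above is where the pool reference lives; reading it back is straightforward (contents copied from the query output above):

```python
import json

# json_meta for taskdefinition id 7, copied from the table above.
json_meta = '{"pool_name": "pool-on-luks-dev", "pool": "2"}'
meta = json.loads(json_meta)

print(meta["pool_name"])  # pool-on-luks-dev
print(meta["pool"])       # 2  <- the pool db id used in the API calls
```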

Oh well, my mistake; see how you get on after updating to the latest testing release and then deleting and re-creating your scheduled scrubs.

On my side of the updates wagon we now have the following pending pull request awaiting review:

https://github.com/rockstor/rockstor-core/pull/1801

Thanks @magicalyak and I hope the above has sufficiently explained my misunderstanding.