Scheduled scrub throws an error

This morning a mail from Rockstor was in my inbox.
Not the usual kind, but containing this:

Subject:
Cron root@rockstornas /opt/rockstor//bin/st-pool-scrub 2 *-----*

Content:
Traceback (most recent call last):
  File "/usr/lib64/python2.7/logging/handlers.py", line 76, in emit
    if self.shouldRollover(record):
  File "/usr/lib64/python2.7/logging/handlers.py", line 154, in shouldRollover
    msg = "%s\n" % self.format(record)
  File "/usr/lib64/python2.7/logging/__init__.py", line 724, in format
    return fmt.format(record)
  File "/usr/lib64/python2.7/logging/__init__.py", line 464, in format
    record.message = record.getMessage()
  File "/usr/lib64/python2.7/logging/__init__.py", line 324, in getMessage
    msg = str(self.msg)
TypeError: __str__ returned non-string (type list)
Logged from file pool_scrub.py, line 78
Traceback (most recent call last):
  File "/usr/lib64/python2.7/logging/handlers.py", line 76, in emit
    if self.shouldRollover(record):
  File "/usr/lib64/python2.7/logging/handlers.py", line 154, in shouldRollover
    msg = "%s\n" % self.format(record)
  File "/usr/lib64/python2.7/logging/__init__.py", line 724, in format
    return fmt.format(record)
  File "/usr/lib64/python2.7/logging/__init__.py", line 464, in format
    record.message = record.getMessage()
  File "/usr/lib64/python2.7/logging/__init__.py", line 324, in getMessage
    msg = str(self.msg)
TypeError: __str__ returned non-string (type list)
Logged from file pool_scrub.py, line 39

Something seems to have gone wrong, and the scrub is not running.

This has been working perfectly for as long as I have used scheduled scrubs, so something must have changed with one of the later updates.

Hi @KarstenV,
can you schedule a new scrub task and test it?
Thanks
Mirko

I just set up a scheduled scrub to test whether it would work.

It ended up like this:

Immediately after this failed I got a mail from Rockstor containing this:

Subj: Cron root@rockstornas /opt/rockstor//bin/st-pool-scrub 3 *-----*

Contents:
Traceback (most recent call last):
  File "/usr/lib64/python2.7/logging/handlers.py", line 76, in emit
    if self.shouldRollover(record):
  File "/usr/lib64/python2.7/logging/handlers.py", line 154, in shouldRollover
    msg = "%s\n" % self.format(record)
  File "/usr/lib64/python2.7/logging/__init__.py", line 724, in format
    return fmt.format(record)
  File "/usr/lib64/python2.7/logging/__init__.py", line 464, in format
    record.message = record.getMessage()
  File "/usr/lib64/python2.7/logging/__init__.py", line 324, in getMessage
    msg = str(self.msg)
TypeError: __str__ returned non-string (type list)
Logged from file pool_scrub.py, line 78
Traceback (most recent call last):
  File "/usr/lib64/python2.7/logging/handlers.py", line 76, in emit
    if self.shouldRollover(record):
  File "/usr/lib64/python2.7/logging/handlers.py", line 154, in shouldRollover
    msg = "%s\n" % self.format(record)
  File "/usr/lib64/python2.7/logging/__init__.py", line 724, in format
    return fmt.format(record)
  File "/usr/lib64/python2.7/logging/__init__.py", line 464, in format
    record.message = record.getMessage()
  File "/usr/lib64/python2.7/logging/__init__.py", line 324, in getMessage
    msg = str(self.msg)
TypeError: __str__ returned non-string (type list)
Logged from file pool_scrub.py, line 39

Oh, I should probably mention that I’m on the latest development build, 3.9.0-16.

Does anybody have any ideas?

Is anybody else seeing this?

Hi @KarstenV,
that’s quite strange, your crontab string does not seem to be getting escaped. I will check it.

M.

Looking into this, the error is caused by pool_scrub.py trying to log an APIException. Changing lines 39 and 78 to:

logger.exception(e.detail)

allows the script to complete without error. Afterwards, I see this in my log:

[u'Invalid api end point: http://127.0.0.1:8000/api/pools/Primary/scrub']
[u'Invalid api end point: http://127.0.0.1:8000/api/pools/Primary/scrub/status']

Did someone change the REST API recently?
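
For anyone following along, here is a minimal sketch of why handing the exception object straight to the logger fails while logging its detail works. FakeAPIException is a hypothetical stand-in, not Rockstor's real class, and I am assuming its string conversion hands back the list of messages, which is what the "returned non-string (type list)" line in the traceback suggests. The sketch is written for Python 3 and uses logger.error to keep the output short.

```python
import logging


class FakeAPIException(Exception):
    """Hypothetical stand-in for the APIException caught in pool_scrub.py."""

    def __init__(self, detail):
        super().__init__(detail)
        self.detail = detail  # a list of messages, like the log lines above

    def __str__(self):
        # Assumed behaviour: returns the list itself, which str() rejects.
        return self.detail


def str_fails(exc):
    """Return True if str(exc) raises TypeError, as in the mailed traceback."""
    try:
        str(exc)
    except TypeError:
        return True
    return False


logging.basicConfig(level=logging.ERROR)
logger = logging.getLogger("pool_scrub_sketch")

try:
    raise FakeAPIException(
        ["Invalid api end point: http://127.0.0.1:8000/api/pools/Primary/scrub"]
    )
except FakeAPIException as e:
    broken = str_fails(e)   # what the logging module runs into when given e
    logger.error(e.detail)  # a plain list converts to text without trouble
```

The logging module calls str() on whatever message object it is given, so the failure only shows up inside the handler and ends up in the cron mail rather than stopping the script outright.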

@kbogert Hello again and nice find.

We have recently changed the pools and disks APIs from using their name field to using their db id. Looks like this got missed; oops.

Ie: For the pools #1741, and for the disks #1746 with a small associated follow up #1751
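
To illustrate the kind of change involved, here is a rough sketch of building the scrub endpoints from a pool's db id instead of its name. The base URL comes from the log lines earlier in the thread; the exact id-based URL shape, the scrub_urls helper, and the example id are assumptions for illustration only, not the actual fix.

```python
BASE_URL = "http://127.0.0.1:8000/api"  # base taken from the log lines above


def scrub_urls(pool_id):
    """Build the scrub and scrub-status endpoints for a pool db id.

    Hypothetical helper: the URL layout is assumed to mirror the old
    name-based one, with the id in place of the pool name.
    """
    pool_id = int(pool_id)  # reject names such as "Primary" early
    scrub = "%s/pools/%d/scrub" % (BASE_URL, pool_id)
    return scrub, scrub + "/status"


# Example id only; the real value would come from the pool's db record.
start_url, status_url = scrub_urls(1)
print(start_url)
print(status_url)
```

Passing a pool name to this helper raises ValueError instead of quietly producing an endpoint the API no longer recognises.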

Do you fancy opening a GitHub issue on this in our rockstor-core repo, as it would be a great one to get sorted? And of course, if you fancy having a go at a fix by way of a pull request, that would be great; just note your intention to do so in the issue, as that should help avoid duplication of effort, and this should be a pretty high priority to fix. Now we know what's going wrong. :slight_smile: Otherwise I can hopefully have a look shortly.

Thanks @kbogert, @KarstenV, and @Flyer, looks like you’ve rounded this one out in the end.

@KarstenV, @kbogert, and @Flyer
I’ve just submitted a pull request to address this issue. It is awaiting review and further testing but looks good so far. It turned out to be a little more complicated than I had at first hoped. Oh well.

Linking to @kbogert's issue:

and the associated pull request:

Let's hope that helps.

Looks good @phillxnet.

I look forward to testing the fixes when they get released :slight_smile:

Thanks to @phillxnet for pitching in on this :slight_smile:

Mirko

@KarstenV @kbogert and @Flyer

Notification post: the fix for this issue has now been merged and released via the testing channel updates:

Thanks all for helping to resolve this one.

Looks good. I had to delete all of my scheduled scrubs and re-create them, which was a minor annoyance.

Thanks all!

@kbogert Thanks for the feedback / fix confirmation and glad you are now sorted:

Yes, that is a bit of a pain point. Sorry, I should have emphasised that element of the fix.

Thanks again for helping to diagnose this one.