Scheduled scrub throws an error

KarstenV · July 2, 2017, 6:19am

This morning a mail from Rockstor was in my inbox.
Not the usual kind, but containing this:

Subject:
Cron root@rockstornas /opt/rockstor//bin/st-pool-scrub 2 *-----*

Content:
Traceback (most recent call last):
  File "/usr/lib64/python2.7/logging/handlers.py", line 76, in emit
    if self.shouldRollover(record):
  File "/usr/lib64/python2.7/logging/handlers.py", line 154, in shouldRollover
    msg = "%s\n" % self.format(record)
  File "/usr/lib64/python2.7/logging/__init__.py", line 724, in format
    return fmt.format(record)
  File "/usr/lib64/python2.7/logging/__init__.py", line 464, in format
    record.message = record.getMessage()
  File "/usr/lib64/python2.7/logging/__init__.py", line 324, in getMessage
    msg = str(self.msg)
TypeError: __str__ returned non-string (type list)
Logged from file pool_scrub.py, line 78
Traceback (most recent call last):
  File "/usr/lib64/python2.7/logging/handlers.py", line 76, in emit
    if self.shouldRollover(record):
  File "/usr/lib64/python2.7/logging/handlers.py", line 154, in shouldRollover
    msg = "%s\n" % self.format(record)
  File "/usr/lib64/python2.7/logging/__init__.py", line 724, in format
    return fmt.format(record)
  File "/usr/lib64/python2.7/logging/__init__.py", line 464, in format
    record.message = record.getMessage()
  File "/usr/lib64/python2.7/logging/__init__.py", line 324, in getMessage
    msg = str(self.msg)
TypeError: __str__ returned non-string (type list)
Logged from file pool_scrub.py, line 39

Something seems to have gone wrong, and the scrub is not running.

This has been working perfectly for as long as I have used scheduled scrubs, so something must have changed with one of the later updates.

Flyer · July 2, 2017, 9:44am

Hi @KarstenV,
can you schedule a new scrub task and test it?
Thanks
Mirko

KarstenV · July 4, 2017, 5:53pm

I just set up a scheduled scrub to test whether it would work.

It ended up like this:

Immediately after this failed I got a mail from Rockstor containing this:

Subj: Cron root@rockstornas /opt/rockstor//bin/st-pool-scrub 3 *-----*

Contents:
Traceback (most recent call last):
  File "/usr/lib64/python2.7/logging/handlers.py", line 76, in emit
    if self.shouldRollover(record):
  File "/usr/lib64/python2.7/logging/handlers.py", line 154, in shouldRollover
    msg = "%s\n" % self.format(record)
  File "/usr/lib64/python2.7/logging/__init__.py", line 724, in format
    return fmt.format(record)
  File "/usr/lib64/python2.7/logging/__init__.py", line 464, in format
    record.message = record.getMessage()
  File "/usr/lib64/python2.7/logging/__init__.py", line 324, in getMessage
    msg = str(self.msg)
TypeError: __str__ returned non-string (type list)
Logged from file pool_scrub.py, line 78
Traceback (most recent call last):
  File "/usr/lib64/python2.7/logging/handlers.py", line 76, in emit
    if self.shouldRollover(record):
  File "/usr/lib64/python2.7/logging/handlers.py", line 154, in shouldRollover
    msg = "%s\n" % self.format(record)
  File "/usr/lib64/python2.7/logging/__init__.py", line 724, in format
    return fmt.format(record)
  File "/usr/lib64/python2.7/logging/__init__.py", line 464, in format
    record.message = record.getMessage()
  File "/usr/lib64/python2.7/logging/__init__.py", line 324, in getMessage
    msg = str(self.msg)
TypeError: __str__ returned non-string (type list)
Logged from file pool_scrub.py, line 39

Oh, I should probably mention that I’m on the latest development build, 3.9.0-16.

KarstenV · July 10, 2017, 9:12am

Does anybody have any ideas?

Is anybody else seeing this?

Flyer · July 11, 2017, 8:43am

Hi @KarstenV,
that’s quite strange: your crontab string does not seem to be getting escaped. I will check it.

M.

kbogert · July 13, 2017, 6:42pm

Looking into this, the error is caused by pool_scrub.py trying to log an APIException. Changing lines 39 and 78 to:

logger.exception(e.detail)

allows the script to complete without error. Afterwards, in my log, I see this:

[u'Invalid api end point: http://127.0.0.1:8000/api/pools/Primary/scrub']
[u'Invalid api end point: http://127.0.0.1:8000/api/pools/Primary/scrub/status']

Did someone change the REST API recently?
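To illustrate why logging the exception object itself fails while logging its detail works, here is a minimal sketch. It assumes the exception's __str__ hands back its detail list unchanged, which is what the "type list" in the TypeError suggests; ListDetailError and loggable are hypothetical names made up for this example, not Rockstor code.

```python
import logging


class ListDetailError(Exception):
    """Hypothetical stand-in for an API exception whose detail is a list."""

    def __init__(self, detail):
        Exception.__init__(self, detail)
        self.detail = detail  # assumed shape: a list holding the message(s)

    def __str__(self):
        # Returning a list here is what makes str(e) raise
        # "TypeError: __str__ returned non-string (type list)".
        return self.detail


def loggable(exc):
    """Return a plain string for logging, whatever shape exc.detail has."""
    detail = getattr(exc, "detail", None)
    if isinstance(detail, (list, tuple)):
        return "; ".join(str(item) for item in detail)
    if detail is not None:
        return str(detail)
    return repr(exc)


logging.basicConfig(level=logging.ERROR)
logger = logging.getLogger("pool_scrub_sketch")

try:
    raise ListDetailError(
        ["Invalid api end point: http://127.0.0.1:8000/api/pools/Primary/scrub"]
    )
except ListDetailError as e:
    # logger.error(e) would make the logging module call str(e) and fail;
    # passing an already-built string avoids that.
    logger.error(loggable(e))
```

Joining the list into one string also keeps the u'...' list formatting out of the log, which logging e.detail directly does not.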

phillxnet · July 13, 2017, 7:49pm

@kbogert Hello again and nice find.

We have recently changed the pools and disks APIs from using their name field to using their db id. Looks like this got missed; oops.

Ie: For the pools #1741, and for the disks #1746 with a small associated follow up #1751
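As a rough sketch of what that change means for a caller such as the scrub script: the URL would be keyed by the pool's database id rather than its name. The exact path layout after the change is an assumption here (it mirrors the old path from the log with an id in place of the name), and scrub_url and BASE_URL are hypothetical names for illustration only.

```python
# Host and port are taken from the log lines quoted earlier in the thread.
BASE_URL = "http://127.0.0.1:8000/api"


def scrub_url(pool_id, status=False):
    """Build a scrub endpoint URL keyed by pool db id (assumed layout)."""
    # int() rejects a pool name such as "Primary" up front instead of
    # sending a request the server will refuse as an invalid end point.
    pool_id = int(pool_id)
    url = "{}/pools/{}/scrub".format(BASE_URL, pool_id)
    return url + "/status" if status else url


print(scrub_url(2))
print(scrub_url(2, status=True))
```

Anything that stored a pool name when it was created would then need to be updated to hold the id instead.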

Do you fancy opening a GitHub issue on this in our rockstor-core repo? It would be a great one to get sorted. And of course, if you fancy having a go at a fix by way of a pull request, that would be great; just note your intention to do so in the issue, as that should help avoid duplication of effort. This should be a pretty high priority to fix now that we know what’s going wrong. Otherwise, I can hopefully have a look shortly.

Thanks @kbogert, @KarstenV, and @Flyer, looks like you’ve rounded this one out in the end.

phillxnet · July 14, 2017, 8:18pm

@KarstenV, @kbogert, and @Flyer
I’ve just submitted a pull request to address this issue. It is awaiting review and further testing but looks good so far. It turned out to be a little more complicated than I had at first hoped. Oh well.

Linking to @kbogert’s issue:

and the associated pull request:

Let’s hope that helps.

KarstenV · July 14, 2017, 8:44pm

Looks good @phillxnet.

I look forward to testing the fixes when they get released.

Flyer · July 17, 2017, 7:30am

Thanks to @phillxnet for pitching in on this.

Mirko

phillxnet · July 29, 2017, 6:18pm

@KarstenV @kbogert and @Flyer

Notification post: the fix for this issue has now been merged and released via the testing channel updates:

Thanks all for helping to resolve this one.

kbogert · August 2, 2017, 4:49pm

Looks good. I had to delete all of my scheduled scrubs and re-create them, which was a minor annoyance.

Thanks all!

phillxnet · August 2, 2017, 5:03pm

@kbogert Thanks for the feedback / fix confirmation and glad you are now sorted:

Yes, that is a bit of a pain point. Sorry, I should have emphasised that element of the fix.

Thanks again for helping to diagnose this one.