phillxnet
(Philip Guyton)
August 21, 2017, 4:14pm
@magicalyak from your logs it would appear that you have not re-created your scrub tasks post the 3.9.1-2 release (see later), ie as per my first comment in this thread and my first reply to yourself in this thread:
phillxnet:
@magicalyak Hello again:
magicalyak:
I’ll see if it’s the same issue but it’s scrubs not starting.
Cheers that would be good.
Just try re-creating those tasks as then they adopt the new pool API, assuming you have all the recent testing channel updates in place of course.
ie the url for your logged tasks is using the pool name, whereas from the end of the last testing release (last stable), and throughout the current testing channel release cycle, we have been using the db id of a pool, not its name, in the API calls; more specifically, since the second update of the current testing channel, 3.9.1-2:
Happy Saturday!
3.9.1-2 is now available! Thanks to @phillxnet for closing two issues for this update. One is a bugfix in scrub scheduling and the other is an enhancement in disk management. Enjoy the update!
ie:
scripts.scheduled_tasks.pool_scrub - update_state constructs url=pools/2/scrub/status
and then uses that to query the scrub state.
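To make the difference concrete, here is a minimal Python sketch (illustration only, not Rockstor's actual code), using a json_meta like the one shown in the table further down; the old style built the url from the pool name, whereas the current API expects the pool's db id:
# Illustration only: contrasting the old (pool name) and current (pool db id)
# url forms used to query a pool's scrub status.
json_meta = {"pool_name": "pool-on-luks-dev", "pool": "2"}  # per task definition, see table below
old_url = 'pools/{}/scrub/status'.format(json_meta['pool_name'])  # name based, pre 3.9.1-2
new_url = 'pools/{}/scrub/status'.format(json_meta['pool'])       # db id based, post 3.9.1-2
print(new_url)  # pools/2/scrub/status - matches the url logged by update_state above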
Sorry, I should have double-checked that you were on a relevant release; I obviously made a misleading mistake in my post:
Silly me.
I was thinking that the number indicated the pool, but of course, as per my post to @peter:
phillxnet:
in the following db table which holds the current task status (largest id) and history:
psql -U rocky smartdb
pass as before
select * from smart_manager_task;
and the task_def_id field there points to the contents of the following table:
select * from smart_manager_taskdefinition;
ie the figures in the task_def_id column of the task table indicate the id entries of the taskdefinition table.
So for my example here:
cat /etc/cron.d/rockstortab
SHELL=/bin/bash
PATH=/sbin:/bin:/usr/sbin:/usr/bin
MAILTO=root
# These entries are auto generated by Rockstor. Do not edit.
*/5 * * * * root /opt/rockstor-dev//bin/st-pool-scrub 7 \*-*-*-*-*-*
that “7” points to:
psql -U rocky smartdb
select * from smart_manager_taskdefinition;
id | name | task_type | json_meta | enabled | crontab | crontabwindow
----+----------------------+-----------+-------------------------------------------------+---------+-------------+---------------
7 | luks-on-bcache-scrub | scrub | {"pool_name": "pool-on-luks-dev", "pool": "2"} | t | */5 * * * * | *-*-*-*-*-*
6 | root-scrub3 | scrub | {"pool_name": "rockstor_rockstor", "pool": "1"} | f | */5 * * * * | *-*-*-*-*-*
(2 rows)
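For reference, the two tables can also be read together in one go by joining on task_def_id; a minimal Python/psycopg2 sketch of that lookup (not part of Rockstor, and assuming the same local smartdb / rocky credentials as the psql sessions above):
# Sketch only: join the task history to its definition via task_def_id,
# using the same smartdb / rocky credentials as the psql examples above.
import getpass
import psycopg2

conn = psycopg2.connect(dbname='smartdb', user='rocky',
                        host='localhost', password=getpass.getpass('smartdb password: '))
with conn, conn.cursor() as cur:
    cur.execute("""
        SELECT t.id, t.task_def_id, td.name, td.json_meta
        FROM smart_manager_task t
        JOIN smart_manager_taskdefinition td ON td.id = t.task_def_id
        ORDER BY t.id DESC;
    """)
    for row in cur.fetchall():
        print(row)
conn.close()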
Oh well, my mistake; see how you get on after updating to the latest testing release and then deleting and re-creating your scheduled scrubs.
On my side of the updates wagon we now have the following pending pull request awaiting review:
rockstor:master
← phillxnet:1800_scheduled_scrub_blocked_by_halted_conn-reset_and_cancelled_states
opened 02:14PM - 21 Aug 17 UTC
Please see the issue text for context. Essentially this pr extends the known terminal states of a scrub, from the scrub scheduler's point of view, from the previous states of 'error' and 'finished' to include the additional non-running states of 'halted', 'cancelled', and 'conn-reset'. Pre pr, and pre scrub status enhancement pr #1791, a prior scheduled or manually initiated (via Web-UI) scrub that was in state 'halted' (interrupted, eg by a reboot) would cause all future scheduled scrubs of the associated pool to not execute. This was a result of the 'running' (pre pr #1791) or 'halted' (post pr #1791) state, as both were considered non-terminal, ie not 'finished' and not 'error', and hence all future scrub tasks would be blocked. By extending the terminal states to include the now correctly identified, but equally non-running, additional states, existing scheduled scrub jobs are initiated successfully.
Tested by placing an ongoing Web-UI initiated scrub into both 'cancelled' (via 'btrfs scrub cancel') and 'halted' / interrupted (by rebooting mid scrub) and then ensuring that pre pr the existing scheduled scrub would fail:
DEBUG [scripts.scheduled_tasks.pool_scrub:61] Non terminal state(cancelled) for task(2). Checking again.
DEBUG [scripts.scheduled_tasks.pool_scrub:66] Non terminal state(cancelled) for task(2). A new task will not be run.
and then, post pr, that the same existing scheduled scrub was initiated successfully.
Fixes #1800
Ready for review.
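For anyone following along, the gist of the change can be sketched as follows (illustration only, not the actual Rockstor source): the scheduler only launches a new scrub once the previous task's state is terminal, and the set of terminal states now includes the other non-running states.
# Illustration of the extended terminal-state check described in the pr above.
TERMINAL_SCRUB_STATES = {
    'finished', 'error',                  # recognised as terminal before the pr
    'halted', 'cancelled', 'conn-reset',  # added as terminal by the pr
}

def can_start_new_scrub(previous_state):
    # A new scheduled scrub is only launched once the prior task is terminal.
    return previous_state in TERMINAL_SCRUB_STATES

print(can_start_new_scrub('halted'))   # True post pr; such a task blocked scheduling before
print(can_start_new_scrub('running'))  # False - a scrub is still in progress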
Thanks @magicalyak and I hope the above has sufficiently explained my misunderstanding.