KarstenV
(Karsten Vinding)
February 2, 2016, 7:03pm
1
Today, when I got home, my Rockstor box had crashed. This never happens, so I was quite surprised.
I rebooted it but it ended up with this:
The first errors are normal; I see them at every boot. The rest is not.
Is this recoverable, or will I have to reinstall?
I really hope I don’t have to reinstall, and I hope even more that I haven’t lost any data.
KarstenV
(Karsten Vinding)
February 2, 2016, 7:15pm
3
It boots off a 16GB mSATA SSD.
The pool resides on 7 disks running RAID6.
roweryan
(roweryan)
February 2, 2016, 7:32pm
4
Can you try with kernel-ml-4.2.5-1.el7.elrepo.x86_64? Your error looks like mine did: “appeared twice”. See https://github.com/rockstor/rockstor-core/issues/1069
KarstenV
(Karsten Vinding)
February 2, 2016, 7:44pm
5
Trying kernel-ml-4.2.5-1.el7.elrepo.x86_64 doesn’t work either.
The system seems to stop at the same place during bootup, but without any errors. Even with debugging turned on.
A little frustrating. But I trust that a reinstall will fix it if all else fails.
roweryan
(roweryan)
February 2, 2016, 7:53pm
6
“Device not ready” looks bad. Are the cables okay? Are the SATA cables connected to an add-in card rather than the motherboard? If so, check that it’s seated properly.
You could also try a newer kernel: http://elrepo.org/people/ajb/devel/kernel-ml/el7/x86_64/RPMS/kernel-ml-4.5.0-0.rc2.el7.elrepo.x86_64.rpm
KarstenV
(Karsten Vinding)
February 2, 2016, 8:01pm
7
The “device not ready” error is not important; it’s a known bug with the onboard SATA controller, and there is a workaround in place.
The system has worked with this error since the first day, and has booted every time. That’s not the problem.
I have tried disconnecting all my hard drives, so that only the boot drive is attached, and the system still freezes during bootup, but without any errors.
It seems I will have to reinstall.
roweryan
(roweryan)
February 2, 2016, 8:02pm
8
I think you want to reinstall.
KarstenV
(Karsten Vinding)
February 2, 2016, 8:04pm
9
I really don’t.
I will have to go through all the hassle of setting up shares, NFS, Samba, Rock-ons and so on.
But I will get the latest ISO and start the reinstall.
Probably not until tomorrow, though.
roweryan
(roweryan)
February 2, 2016, 8:07pm
10
Okay, then if it were my Rockstor box, I would reboot with debugging enabled (again), then wait 10 minutes to see if that error changes.
KarstenV
(Karsten Vinding)
February 2, 2016, 8:11pm
11
What would happen after 10 minutes?
roweryan
(roweryan)
February 2, 2016, 8:13pm
12
You haven’t had a crash, it looks like a hang. If you wait you may get a timeout and a better error message.
KarstenV
(Karsten Vinding)
February 2, 2016, 8:16pm
13
OK.
I have restarted the system with debugging enabled. Will wait and see if anything happens.
Edit:
16 minutes later no change.
I will let it sit through the night, and see if anything has shown up tomorrow.
KarstenV
(Karsten Vinding)
February 3, 2016, 3:14pm
14
Nothing happened overnight, so now the system is reinstalled, and up and running again.
The pool was OK, and I have access to all shares.
It would have been nice to know what went wrong, but I guess I’ll have to accept it as a random event.
suman
(suman chakravartula)
February 3, 2016, 4:47pm
15
Glad you got it all back, @KarstenV. Are you running 3.8-11 with the 4.3.3 kernel?
KarstenV
(Karsten Vinding)
February 3, 2016, 6:21pm
16
Thanks.
Yes, running 3.8-11.06 with kernel 4.3.3-1.el7.elrepo.x86_64.
The reinstall even fixed the “permission denied” problem I had with SMART on one of my disks.
The only thing I still need to get running is Plex. If I reuse the same shares as on the last install, will Plex pick up the settings stored in them?
I’m glad to hear you didn’t lose data.