Proxmox 4.x Hang

Sorry for the lack of detail here, but I’ve run into this a couple of times lately: the Proxmox VM seems to just hang, to the point where not even the console responds.

It seems to be caused by either I/O load or possibly refreshing the web UI.

Once I’ve rebooted it I’ll see if anything useful got logged.

Edit: Nothing I can see in the logs; it looks like the VM literally hung.
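For anyone else combing through logs after a hang like this, here is a rough Python sketch of what to search for. It assumes the kernel messages from before the reboot were actually kept somewhere (for example /var/log/messages or a persistent journal), which may not be the case on every install. The pattern list is an illustrative guess at common lockup and panic messages, not a complete set, and `find_hang_lines` is just a made-up helper name.

```python
import re

# Substrings the Linux kernel commonly prints around lockups and panics.
# Illustrative only: a hard hang may leave none of these behind.
HANG_PATTERNS = [
    r"Kernel panic",
    r"soft lockup",
    r"blocked for more than \d+ seconds",
    r"Oops",
    r"Call Trace",
]

HANG_RE = re.compile("|".join(HANG_PATTERNS))


def find_hang_lines(log_text):
    """Return (line_number, line) pairs that match any pattern above."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if HANG_RE.search(line):
            hits.append((number, line.strip()))
    return hits
```

Feeding it the contents of a saved log file, e.g. `find_hang_lines(open("/var/log/messages").read())`, gives back the matching lines with their line numbers. An empty result is consistent with the VM freezing before anything could be written.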

I’ve run into this same exact issue. I was beginning to wonder if I was the only one who ran RockStor on top of Proxmox. Just like you, I’m stumped. For me, it does it occasionally. No rhyme or reason. I can push tons of traffic to/from RockStor with no issues. Then I’ll update some Samba shares and poof, hang.

You can turn off Rockstor with systemctl stop rockstor. Then see if it can be reproduced while Rockstor services are off. If not, the cause could be Rockstor.

@suman

I actually set a kernel build for 4.3RC6 going last night, but I forgot to assign more cores to the VM before doing so and it hadn’t finished before I went to bed. I’ll check to see whether it built OK.

I did have some kernel panics previously on Proxmox 3.4 with Rockstor running the older 4.1x kernel, so I think it’s more a kernel issue than a Rockstor issue. An application lockup usually doesn’t freeze the whole system on a Linux box (it CAN, but it’s rare).

Edit:

Other than being somewhat large (I think I missed an option that compresses it), the kernel boots OK and Rockstor is running. It’s mostly based on the config from the elrepo kernel you use, although I stripped out multimedia support and, I think, a couple of drivers for SCSI tape drives and controller cards I knew I wasn’t going to be using.

Also I changed the kernel config so BTRFS was compiled straight in rather than being a module.
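To double-check that kind of change, a small Python sketch like the one below can read a kernel .config and report whether an option is built in (`=y`) or a module (`=m`). `config_state` is a hypothetical helper name, and `CONFIG_BTRFS_FS` is assumed to be the relevant option here; the exact config that was used for this build isn't shown in the thread.

```python
def config_state(config_text, option):
    """Report how a kernel option is set in the text of a .config file.

    Returns 'built-in' for =y, 'module' for =m, 'not set' for the
    commented-out form, 'absent' if the option never appears, and the
    raw value for anything else (strings, numbers).
    """
    for raw in config_text.splitlines():
        line = raw.strip()
        if line == f"# {option} is not set":
            return "not set"
        # Require '=' right after the name so CONFIG_BTRFS_FS does not
        # accidentally match CONFIG_BTRFS_FS_POSIX_ACL.
        if line.startswith(option + "="):
            value = line.split("=", 1)[1]
            if value == "y":
                return "built-in"
            if value == "m":
                return "module"
            return value
    return "absent"
```

Called as `config_state(open(".config").read(), "CONFIG_BTRFS_FS")`, it should say "built-in" for a kernel configured as described and "module" for a stock config that ships BTRFS as a module.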

There are some BTRFS fixes for RAID5/6 in 4.3x and, I think from reading the changelog from 4.2, some changes to VirtIO (although that said, the host is still running 4.2, so I’m not sure the VirtIO changes will be of much benefit).

There’s still a lot of crap in there that I think is unneeded, but as I didn’t fully understand what some of it did, I just left it well alone (i.e. it was built into the elrepo kernel I took the .config from, so I left the settings the same).

Also @baggar11, I’ve PM’d you a link to the RPM I built if you’re feeling brave and want to try it.

Probably wouldn’t trust it with any super critical data, just in case.

If others want it I guess I can upload it somewhere else, but I’d rather not have people blindly installing it unless they really need it, in case there’s something wrong with it. I’m not exactly an expert when it comes to compiling the kernel from source.

Thanks Dragon, I’ll probably hold off on the 4.3 kernel for now. Oddly enough, I was thinking that the issue was probably kernel related too. There has to be some virtualization regression in the v4+ kernels.

Suman, like Dragon said, it locks up the whole vm to the point that the console doesn’t work, so even trying to stop the RockStor service wouldn’t help. It kills filesharing, samba/nfs, webgui and console. No errors from the Proxmox side of things that I can tell either.

I’m not really sure how to trigger it in my environment to test out what’s going on. For now, I’m just hoping that newer kernels fix whatever is broken, on the Proxmox and RockStor side of things. My environment is mostly stable, so it hasn’t been a big deal.

Don’t think it’s Rockstor related, but my Proxmox host decided to go reboot itself this morning for some unknown reason.

Have you seen anything like that happen?

No, nothing like that here. That doesn’t sound good. Maybe a power supply issue? Have you looked at the logs yet?

BMC didn’t log anything interesting nor did proxmox itself as far as I can see.

What’s your system setup look like? How many drives are you running? PSU power? Any motherboard bios/bmc updates?

C7250-D4I

5 3.5" drives + 2x 40GB ssds (Proxmox boot)
Silverstone DS380 case

I believe the PSU is 450W but I don’t remember the brand. It wasn’t one of the cheapo ones, that much I do remember.

The device is actually sitting in co-locker (www.colocker.com), so ambient temperature should be fine and power should be nice and stable (UPS backed).

I’ll see if I can find the invoice from the supplier to see what PSU I ordered.

450W Silverstone Strider ST45SF 80PLUS Bronze 85% Eff’ 80mm Quiet Fan, SFX, EPS 12V

I seem to remember there being a somewhat limited choice of SFF PSUs at the time, and this was one of the better models they had in stock.

Power supplies do fail. Something to keep an eye on if it happens again. Depending on when you purchased that board, you may not have all the bios/bmc/marvell updates applied. Did you check on those?

Voltages all look in spec.
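For reference, "in spec" here can be checked roughly like the Python sketch below, assuming the usual ATX tolerance of ±5% on the +12V, +5V and +3.3V rails. The rail names and the `out_of_spec` helper are made up for illustration, and no actual readings from this board were posted, so any numbers fed in are hypothetical.

```python
# Nominal positive ATX rails; +/-5% is the commonly quoted tolerance.
NOMINAL_RAILS = {"+12V": 12.0, "+5V": 5.0, "+3.3V": 3.3}
TOLERANCE = 0.05


def out_of_spec(readings, tolerance=TOLERANCE):
    """Return the subset of readings that fall outside nominal +/- tolerance.

    readings maps a rail name (as in NOMINAL_RAILS) to a measured voltage.
    Rails with no known nominal value are ignored.
    """
    bad = {}
    for rail, measured in readings.items():
        nominal = NOMINAL_RAILS.get(rail)
        if nominal is None:
            continue
        if abs(measured - nominal) > nominal * tolerance:
            bad[rail] = measured
    return bad
```

An empty result means every known rail is within tolerance at the moment it was sampled. It says nothing about brief dips under load, which BMC sensors polled every few seconds can easily miss.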

Unless they’ve sneaked a BIOS/firmware update out within the last month or two, it’s up to date (will double check later). I was talking to them fairly recently about KVMoIP/Java issues (found it tends to get blocked unless using HTTPS).

I suspect something caused a kernel panic but I can’t be 100% sure as it didn’t log anything productive.