Rockon share required for config? By design? discuss.. ;-)

DrC · July 21, 2021, 12:15pm

So I’ve been playing with Rock-ons and I find it curious that each docker volume has to be a share, rather than a generic and more flexible filesystem path.

E.g. Syncthing: I need a share for config and a share for the shared folder.

That stops me having:

a single share for syncthing with ‘config/’ and ‘shared/’ inside.
a share for all rockons called ‘all-rockon-data’ and having ‘syncthing/config’ and ‘syncthing/shared’ inside as well as all my other rockon config directories.
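
To make the two layouts concrete, here is a minimal sketch of the host-to-container volume mappings each one would produce. It assumes shares are mounted under /mnt2 (as mentioned later in this thread); the container-side targets `/config` and `/shared` and both function names are hypothetical, and this is not how Rockstor itself builds its docker arguments.

```python
from pathlib import PurePosixPath
from typing import Dict, List

# Assumption: shares are mounted under /mnt2, as described in this thread.
MNT_ROOT = PurePosixPath("/mnt2")


def share_per_volume(rockon: str, volumes: Dict[str, str]) -> List[str]:
    """Current model: every container volume is backed by its own share."""
    mappings = []
    for name, target in volumes.items():
        host_path = MNT_ROOT / f"{rockon}-{name}"
        mappings.append(f"{host_path}:{target}")
    return mappings


def single_share(share: str, rockon: str, volumes: Dict[str, str]) -> List[str]:
    """Proposed model: one share, with a sub-directory per Rock-on and volume."""
    mappings = []
    for name, target in volumes.items():
        host_path = MNT_ROOT / share / rockon / name
        mappings.append(f"{host_path}:{target}")
    return mappings


if __name__ == "__main__":
    # Hypothetical container-side mount points for a Syncthing container.
    syncthing = {"config": "/config", "shared": "/shared"}
    print(share_per_volume("syncthing", syncthing))
    print(single_share("all-rockon-data", "syncthing", syncthing))
```

Either way the container sees the same two mount points; only the host side differs, which is the crux of the question.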

If I have 10 Rock-ons installed, I probably have 12-15 shares that I need to make? That seems annoying to back up: rather than just backing up ‘syncthing/’ or ‘all-rockon-data/’, I now have to back up both the syncthing-config and syncthing-shared shares.

If you are going to go the force-everything-to-be-a-share route, why don’t you auto create them when the Rock-on is installed? These config shares can’t be shared with other Rock-ons or anything else, so by design you seem to be forcing people to go through a 2 step process, with no choice but to make a new share and then link it to config/ (or whatever). So why not just auto create them?

(I get that syncthing-shared might be different, but I guess I’m talking more about the individual config share for every single Rock-on that needs to save state.)

phillxnet · July 21, 2021, 7:51pm

Funny you should say that; it’s on the cards. But we have bigger fish to fry and are a small team, at least of regular committers.

But it’s still important to keep separate shares, as otherwise one Rock-on may interfere with another in unpredictable ways, especially now we have dozens of them. Plus each docker image rather assumes it’s not sharing its data/config with any other docker container.

However the core point is approaching/improving the usability. See the following forum thread/post:

from September 2015 !!

Which was issued on GitHub, prior to completion of discussion (and initially without attribution) here:

github.com/rockstor/rockstor-core

Rock-On improvements (required inputs, defaults)

opened 08:03PM - 01 Feb 16 UTC

henfri

enhancement

From http://forum.rockstor.com/t/next-phase-of-rock-on-framework-improvements/41…7/4 (@phillxnet)

> Currently we have the explanation header and the selection of the "Root Share".

> In the circumstance where there is only one pool (ie most less advanced installs) offer to create the minimum 5GB Share (it can always be expanded). Named for example "auto_rock_ons_root".

> If more than one user created pool exists then flag the auto option as "only available on single pool configurations, please select an existing Share or create one on your desired pool"

> So to summarize the "Configure Rock-on Service" screen would have an "Auto create and configure" button that would create a "auto_rock_ons_root" share and set it in the existing "Root Share" selector once pressed. The button would be greyed out / disabled on multi pool installs and have the above explanation as to why. We can't know which pool to make a rock-ons-root on so the user will have to do it. But if there was only one user created pool then we can auto make a minimum share for them on the spot. There might also be an opportunity here to sense an existing auto named pool and offer to use it if found. ie the Wizard notices an existing auto_rock_ons_root named share and states its existence and if the user would like to use it, we have to check for it anyway before creating it to make sure we don't have a share name clash, so we might as well offer up setting a new config to inherit a previous install's auto_rock_ons_root if found (not sure how difficult / feasible this would be though).

> As for the individual Rock-on setup wizards we could offer a similar "Auto create and configure" button (big button with that title). Likewise it would in all instances just create a set of minimal shares appropriate for the particular Rock-on. I know this is a little tricky but a basic config size is pretty easy (eg 1GB) and a data size can just start at say 20G (or say 10% of total storage); again all can be resized on the fly anyway and we have nice config areas for all of that.

> With this scenario we train the user on the "Auto create and configure" button in the initial Rock-ons configure screen and so when they come across it again they know what to expect; ie it will make and fill out all those storage / config / data share question things that have to be done somehow. Given the button suggestion will essentially auto create shares and auto populate the existing config / data questions with those shares it is also clearer to the user what is happening, so they will know where all of these auto_whatever named shares come from when they visit the shares page.

> I think this qualifies as fewer inputs as the button is a one click fix for creating and selecting the config & data shares that might be needed. There will however be more actual inputs shown though. Work around if this is not desired is an advanced tick box that shows the existing config / data selectors which might otherwise be hidden.

> Of course in all of the above we are assuming a user created pool, if one is not found then just prompt with "no user created pool found / available, please make one here" with a link to jump them to the pools page. Problem is when people want to use the rockstor_rockstor pool (not ideal). So we just add "skip to advanced mode" link which activates our tick box and lets users through even if they haven't created a pool yet.
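
The decision logic proposed in the quoted issue could be sketched roughly as follows. This is only an illustration of the rules as written there (no user pool, exactly one user pool, several user pools, plus the reuse check for an existing `auto_rock_ons_root` share); the `AutoCreateDecision` type and `decide_auto_create` function are hypothetical names and do not exist in rockstor-core.

```python
from dataclasses import dataclass
from typing import List, Optional

AUTO_SHARE_NAME = "auto_rock_ons_root"  # share name suggested in the issue
MIN_ROOT_SHARE_GB = 5                   # "minimum 5GB Share" from the issue


@dataclass
class AutoCreateDecision:
    enabled: bool          # is the "Auto create and configure" button active?
    pool: Optional[str]    # pool the share would live on, if known
    reuse_existing: bool   # offer to reuse an existing auto-named share
    message: str           # explanation shown next to the button


def decide_auto_create(user_pools: List[str],
                       existing_shares: List[str]) -> AutoCreateDecision:
    """Apply the single-pool / multi-pool rules from the quoted proposal."""
    if not user_pools:
        return AutoCreateDecision(
            False, None, False,
            "no user created pool found / available, please make one here")
    if len(user_pools) > 1:
        return AutoCreateDecision(
            False, None, False,
            "only available on single pool configurations, please select "
            "an existing Share or create one on your desired pool")
    # Exactly one user-created pool: the button is usable.
    pool = user_pools[0]
    if AUTO_SHARE_NAME in existing_shares:
        return AutoCreateDecision(
            True, pool, True,
            f"existing {AUTO_SHARE_NAME} share found, offer to use it")
    return AutoCreateDecision(
        True, pool, False,
        f"create {MIN_ROOT_SHARE_GB}GB share {AUTO_SHARE_NAME} on {pool}")


if __name__ == "__main__":
    print(decide_auto_create(["tank"], []))
    print(decide_auto_create(["tank", "backup"], []))
```

The name-clash check and the reuse offer fall out of the same lookup, which is the point the issue makes about checking for the share "anyway".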

Thanks again for your input. Much appreciated. Pull requests are welcome and we have yet to get your current one in, so apologies for that. Should be next in the list for attention and likely merge as it goes.

Again we are a small team taking on a lot of shifting sands. But once we have our Rockstor 4 out-of-the-door and have addressed the majority of our technical debt (update our Django and do the Python 2 to 3 migration), we can again start to address directly our feature / usability improvements.

Cheers.

DrC · July 22, 2021, 8:10am

(Please just say so if I’m being annoying posting/reporting all of these thoughts. I tend to point things out because I’m an old school unix sysadmin who just can’t hold back his 2 cents, but I’m not a fanatic about the ideas/observations. I’m 100% OK with ‘we are too busy’ or ‘won’t do it’ and you don’t have to explain. You have a groovy project and I’d hate to take time away from that in any way. You folks rock! I’m just an old git giving it a whirl.)

“Rockons may interfere” / “not sharing data/config”. Yep. But /mnt2/all-dockers/syncthing and /mnt2/all-dockers/redis are not going to collide either. It’s not the share that is stopping the collision, it’s the full filesystem path (which is independent of the share). I don’t see what you gain by making them separate shares (other than that maybe hard-linked directories don’t work? Not sure with btrfs) vs just separate directories. They only need their own directory to mount as a volume inside docker; that doesn’t have to be a separate share.

If you are going to ponder auto-creating shares for each Rock-on’s config, why not consider making a config-per-config// folder inside the share you have for holding Rock-on state? That solves the problem (each Rock-on has its own config folder) and yet it doesn’t create lots of shares (in fact it removes some of the user input you need).

#include warning <I haven’t read the code so I’m making assumptions here>

I’m actually interested in the idea of making the docker stuff docker swarm compatible, and it made me wonder how tightly the docker code is integrated into your UI/control suite. I was thinking it would be interesting to have an API/modular plugin system for the container controller. (À la Webmin or something like it, where you have a ‘front end’ which only does UI stuff and a set of plugins that do the work.)

In that way you could have it ‘hands off’ from the implementation details. (I’m all for many small apps working together vs one unified app… I did say I was old school unix, right? grep pipe through word count kinda old school… haha.) I was pondering just how much the UI needs to know about docker, or what info it needs to pass in to let a docker module get its work done.

This is just me pondering quietly while pondering ripping the rockon config info down and seeing if i can get it playing with docker swarm.

Eg, a rockstor box with lots of storage and exported shares. A pile of rockon-workers who pull their config from that.

Syncthing? That config is found at [rockstor:share]+/syncthing/config and media is saved to [rockstor:homeshare]+/blah. Logs go to [local:logs]+/syncthing

DNS? Config for that says [clone_on_start:rockstor:share]+/dns. Logs saved to [local:logs]+/dns (no mounts or external stuff needed)
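
A small parser for the bracketed location notation used in the two examples above might look like the sketch below. The notation is DrC's own shorthand, not an existing Rockstor or docker format, so the field meanings here are an assumption: the last two colon-separated fields inside the brackets are read as source and name, and anything before them as options such as `clone_on_start`. The `Location` type and `parse_location` function are hypothetical.

```python
import re
from typing import NamedTuple, Tuple


class Location(NamedTuple):
    options: Tuple[str, ...]  # e.g. ("clone_on_start",); empty if none
    source: str               # e.g. "rockstor" or "local"
    name: str                 # e.g. "share", "homeshare", "logs"
    subpath: str              # path below that location, e.g. "/dns"


# "[a:b:c]+/some/path" -> bracket contents and the trailing path
_SPEC = re.compile(r"^\[([^\]]+)\]\+(/.*)$")


def parse_location(spec: str) -> Location:
    """Split a '[source:name]+/path' style spec into its parts."""
    match = _SPEC.match(spec)
    if match is None:
        raise ValueError(f"not a location spec: {spec!r}")
    fields = match.group(1).split(":")
    if len(fields) < 2:
        raise ValueError(f"expected at least source:name in {spec!r}")
    # Everything before the final source:name pair is treated as options.
    *options, source, name = fields
    return Location(tuple(options), source, name, match.group(2))


if __name__ == "__main__":
    for spec in ("[rockstor:share]+/syncthing/config",
                 "[local:logs]+/syncthing",
                 "[clone_on_start:rockstor:share]+/dns"):
        print(parse_location(spec))
```

A worker could resolve `source` to either a local directory or a mount exported by the Rockstor box before starting the container; that resolution step is left out here.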

rockon-workers could pull the info they need (rockons, shares, usernames/uids) from the rockstor box and maybe even autoconfigure. The same code would work on the rockstor machine. It would just use an api for pulling the data from the rockstor-queen box.

(Yeah, this is pie in the sky thinking that would be heaps of work. haha. But I’m a dreamer. I’d love to be able to plug in a couple of Raspberry Pis as backup DHCP and DNS for my network, and I was pondering that as a use for swarm mode.)

phillxnet · July 22, 2021, 12:33pm

@DrC No worries on ideas, always welcome. But yes it can take time to review/reference previous posts/issues on what has been discussed and planned over the years. We need some kind of a road-map so folks can see. But again this is all time consuming. Oh well, bit by bit.

The main reason we don’t, at least yet, manage directories as a ‘unit’ is because we don’t yet have an extension to our subvolume management system that everything is based on. Plus adding that degree of flexibility leads to many more possible scenarios that are very likely to introduce a whole host of new bugs, and our current aim is to get everything we currently have working on our new OS via the “Built on openSUSE” effort of the Rockstor 4 release. This, as it happens, has also accidentally headed off the far more recent CentOS change of stance. So that’s a win we weren’t expecting when we embarked on this. Thereafter we need to address the technical debt that the OS move has only compounded by diverting our limited contribution ‘power’. But it’s all good and we are getting there. However I’m super keen to not have significant functional additions (read features) until we have moved our Django and Python over.

Another element here is that a btrfs subvolume is little more expensive than a directory, bar the indicated overhead of UI management. And we get separation of concerns by default, which is always nice. A subvolume in btrfs shares its metadata with the parent pool so it’s not an entirely separate filesystem, but it looks rather like one. So a bit of the best of both worlds here. And again, flexibility brings exponential complexity and we need to debug all that we have before introducing more complexity, or the project will become unmanageable and no good to anyone.

Not that much actually, as our existing UI is just a docker wrapper at heart. But Web-UI and wrapper are tightly coupled via our existing database within Django. This is a system we already have in development and we have many Rock-ons that depend upon it that have taken years to be contributed. To start over and adopt an external project that is not matched to our ‘ways’ is potentially very complex and ultimately may never be a good fit. However there is nothing stopping folks running other docker managers on the base OS we run on. But to expect integration into our btrfs subvolume structure is unrealistic without more years of slow contribution and iteration. But just fine for advanced users who can configure the more advanced (read flexible) specialist docker managers to do their bidding within the limitations of the subvolume structure that Rockstor itself works within. I.e. all mounts within /mnt2 and a limited depth therein with snapshots etc. We need a technical doc for this really and don’t as yet have an updated one: again contributions are welcome to our docs on this front for those willing to put in the time required to explore and explain our own limitations. The simplest route to this is to watch what happens on disk when one does such things as subvol create, clone create, snapshot create, smb shadow compatibility via snapshots etc.
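
For anyone wanting to try the "watch what happens on disk" suggestion by hand, the sketch below only builds and prints the generic btrfs-progs commands involved; it does not run them. `btrfs subvolume create`, `btrfs subvolume snapshot` (with `-r` for read-only) and `btrfs subvolume list` are standard btrfs commands, but the paths are hypothetical and this makes no claim about how Rockstor itself lays out or names its subvolumes and snapshots.

```python
from typing import List


def subvol_create_cmd(path: str) -> List[str]:
    """Command to create a new subvolume at the given path."""
    return ["btrfs", "subvolume", "create", path]


def snapshot_cmd(source: str, dest: str, readonly: bool = True) -> List[str]:
    """Command to snapshot one subvolume into another location."""
    cmd = ["btrfs", "subvolume", "snapshot"]
    if readonly:
        cmd.append("-r")  # read-only snapshot
    return cmd + [source, dest]


def list_cmd(mount_point: str) -> List[str]:
    """Command to list subvolumes below a mounted btrfs filesystem."""
    return ["btrfs", "subvolume", "list", mount_point]


if __name__ == "__main__":
    # Hypothetical paths on a scratch pool; adjust before running anything.
    pool = "/mnt2/testpool"
    plan = [
        subvol_create_cmd(f"{pool}/demo-share"),
        snapshot_cmd(f"{pool}/demo-share", f"{pool}/demo-share-snap"),
        list_cmd(pool),
    ]
    for cmd in plan:
        print(" ".join(cmd))  # dry run: print only, nothing is executed
```

Running the printed commands as root on a disposable pool, and comparing the `list` output before and after doing the equivalent action in the Web-UI, is one way to see the difference between the two.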

Interesting. But is there a widespread desire for this capability within our user base? And any complex code extensions we take on are likely going to have to be maintained indefinitely thereafter, so will have to come with exemplary docs such as we have with our current Rock-ons (but only recently) thanks to @Flox. Again the doc you need to reference, and it in turn references the relevant code, is the following wiki entry in the forum:

What is deceptively difficult here is how easy it looks from the Web-UI what we already do. Take a closer look at the code and keep in mind that we have to continue to support such intricate mechanisms as our replication system if we are to add further complexity to our base unit of the btrfs subvolume. Sometimes however a little change in just the right place can yield some unexpected flexibility and we have seen that. This all depends, as you likely appreciate, on our base design being robust enough to handle these extensions.

I am of the opinion that we must first propagate our existing btrfs-in-partition to the system drive. This is a non-trivial endeavour but will unify and simplify our currently two disparate management systems re partitions/subvolumes. We must also enable base features such as the ability to re-name pools and subvolumes. A far more requested feature than docker swarm and potentially relevant to our entire user base. But don’t get me wrong, I’m all for exploration, but we do rather have our hands full with CentOS having dropped btrfs from even a technical preview, dictating that we must then jump ship (OS) and abandon our then ongoing technical debt endeavour. So it’s a frustrating time to have so many feature requests.

But again if you can become familiar with the code and our ways then dandy. We have more hands and that has to help. But realise that any major feature / capability extension is expected to be drawn out in detail via a wiki article prior to code contribution, as there is then the possibility of wider community involvement that may or may not take place to help refine the ideas as presented. However on the other hand we are rather a do-ocracy here, so you likely get to choose the exact implementation if you are the one standing up to implement it. But again we can’t take on many more major changes as we will soon have to convert all code to Python 3 in addition to the Django changes that may be involved.

So I say take a look at the code itself, we strive to have it self documented where possible and the forum wiki entries may also help in some places, i.e. UPS / Docker etc, and a likely initial up-shoot is “So that’s not right even as-is” which would be fantastic, as then you help to ensure our existing framework can more easily handle the grander ideas we all would like to see implemented in a sustainable manner. My own favourite on this front would be GlusterFS via an easy setup across multiple Rockstor instances. Although a stepping stone on that front would be across multiple pools first. But if done right that then leads to across Rockstor hosted pools and the ability to down a machine in a cluster for maintenance and have it re-assemble / re-integrate into the cluster filesystem upon re-connection. We have to first fully enable our core competencies (ideally), such as share (subvol) and pool renaming. And we have made some progress on the latter but the former still mounts by name, which is a broken approach if we introduce renaming. But the latter now mounts by subvol id (in later 3 and all 4 variants anyway).

Sounds a little like my ideas re clustering capability doesn’t it. I.e. in scope and broad terms involving wavy hands. You should start contributing, oh wait you have :). Nice. Again I’ll get to that pr soon.

You know we now run on Pi4’s I take it. See the Pi4 profiles in:

Nice to hear your enthusiasm and apologies to be the stick in the mud here. And any familiarity you can gain with the code is most likely to breed at least some fixes of our existing systems, so do jump in to anywhere you find interesting. I just can’t promise much hand holding for a while. But the code is really not in that bad a shape, just Python 2 and older Django. However at least our oldest dependency recently got replaced by our newest (django-ztaskd to huey in 4.0.6) so we are getting there.

Hope that helps, at least with some context and links.

@Flox is our Rock-on / docker wrapper main maintainer so I would ultimately back their chosen direction on this front, though I would expect an open discussion in the interim as it goes.

DrC · September 8, 2021, 9:59am

I haven’t exploded or been vaporised by aliens. I still exist (I think, maybe I’m a simulation).

Disappointingly, for my own use I ended up just getting a NAS solution rather than DIY’ing, because I’m lazy and I just want it to work, have lots of drives, and be low low power, AND the Asustor hardware looks quite natty. I ended up going with an https://www.asustor.com/product/spec?p_id=63 if anyone cares (8 drives, 2 NVMe, 10G net, btrfs, can take two more 4-drive expansions via USB3).

I have lots of friends who are IT challenged and I’m wanting to offer them offsite backup via something like Syncthing (and btrfs snapshots for backups).

So where that leaves me with Rockstor is looking to build the bits that I’d need to provide a client for them to run in their homes that would do NAS/Syncthing/other useful plugins for them as easy to use clients. There is a lot of nice work here and it would be nice to be able to leverage that.

So Im still alive and still interested in this project

Just been reading about http://www.45drives.com/blog/uncategorized/what-is-ceph-why-our-customers-love-it/ and I’m curious why you lean towards GlusterFS (#include )

EDIT: to probably answer my own question, ceph needs 3 servers at a minimum, gluster 2.

phillxnet · September 8, 2021, 3:51pm

@DrC Hello again.
Re:

That looks like a nice system. It would also probably boot Rockstor, assuming they haven’t hobbled the boot options.

Because we then leverage our current btrfs setup and management. Whereas Ceph, as I understand it, uses its own base block format. I particularly like the idea that any data is stored in a base file system, so if the ‘fancy’, inevitably complex, cluster management fails/falls on its face, one can simply reference the base file system. A Ceph data ‘unit’ needs Ceph to read it. GlusterFS’s payload (your data) is ultimately held in a btrfs filesystem. Handy as we already manage those. Ergo leave Ceph for projects that start from scratch and we build on our strengths to date, plus the bit about base data being accessible without the fancy cluster bit if all else fails.

GlusterFS can be hosted all on one machine just fine; not ideal but it would be a nice start. I.e. a cluster filesystem across multiple pools in the same machine. But the main draw for me on Gluster is that the data, at its lowest level, is just a filesystem. Whereas with Ceph it’s a block/object management system that needs Ceph to read; it also looks more complicated from my initial perusal of the various options from way back. I appreciate the irony here that btrfs and all linux filesystems in turn use the kernel block layer however. But a ‘real’ filesystem underpinning a clustered filesystem rings more fail-over friendly to me and more generally accessible to our broad user base. I.e. Ceph is more of a stretch conceptually than Gluster. Lead R&D Engineer Brett Kelly of 45Drives eloquently acknowledges this in the first video below at just before the 8:30 minute mark.

Yes, 45drives is quite the experimenter and good information source. See also their older article:

Plus their Ceph vs GlusterFS 3 part video series on YouTube:
https://www.youtube.com/playlist?list=PL0q6mglL88AN_5JegK2JejnUThlVTeYSO

They use both apparently. Also note that Facebook uses GlusterFS; so there’s that. Plus Facebook also uses a lot of btrfs (along with XFS). It may well be that ultimately at the extreme end of storage Ceph will make more sense, or have the edge. But the vast vast majority of current human storage may well be just dandy with an old fashioned filesystem underneath. And given Facebook I believe is the largest human endeavour in storage we may be OK for a bit with our regular old/new filesystems. Plus that bit about reading the data without needing the cluster to be ‘working’ is nice I think.

Gluster and Ceph are both excellent and mature technologies, but I see Gluster as a more natural fit for Rockstor’s immediate future given our small team and that we already have the btrfs management in place. Btrfs was incidentally the preferred/recommended backing filesystem for Gluster, they then did a spell of recommending XFS when btrfs hit some sticky bits. Not sure where they stand currently though but since those sticky times btrfs itself has come more of age, i.e. in mass adoption within consumer NAS and in for example openSUSE and Fedora more of late.

Sorry to have waffled a tad there, super interesting stuff and I’m itching to be pushing ahead on this but all in good time. Django update + Python 3 first then more play with the fancies I think.

Glad to hear it and thanks for your input.

GeoffA · September 8, 2021, 4:19pm

That must be mine - it has the same IP address

Seriously though, looks a nice bit of kit. If I had $1k down the back of the sofa that is lol.

DrC · September 10, 2021, 6:56am

It’s a great piece of hardware but the software is a bit crippled, so I do have some regrets. But it does really blast out bits, which is nice.

It does its btrfs raid with LVM rather than btrfs DUP, which is weird (but it also supports ext4 raid the same way so I guess they bork it down to the lowest common denominator). Not having a full OS to butcher, err I mean hack, makes me sad too.

Oh well, it has all the hardware I need, just a pity I can’t run Rockstor on it.

phillxnet · September 10, 2021, 12:08pm

@DrC Re

Maybe there is some DIY NAS software that more closely fits your needs, or is accessible to your own input maybe. I’ve heard Rockstor v4 is worth a try.

Yes there are several consumer NAS arrangements that do likewise, or via mdraid. Feels a bit half hearted to me but it’s what they know. But again, there are alternative NAS software projects that may come in with such things.

Why not? Have they in fact hobbled it? The listed compatibility has lots of OSes; maybe they mean for connection purposes. Also the eMMC is listed at 4GB, so that’s tight but may be doable. We also have a customisable DIY installer, you know. So if it needs special treatment you may be able to develop a custom profile for it. But likely there is a work-around, like booting it from an unused SATA port or even a USB 3.0 attached system disk. Assuming the not-hobbled bit. Plus I bet they have used a pre-existing (read kernel built-in) module for the built-in display.

Likely you will need to re-do all data on it though as we don’t mess with anything but btrfs down to the hardware to keep the ‘layers’ to a minimum and to keep only one ‘player’ all the way down which is part and parcel of a more predictable arrangement I think. I.e. btrfs only has to play nicely with itself. No other partition management no other device management. Just data pools on bare whole-disk devices.

They also likely have a serial console so if you can hook-up to that you are at least part way there. I don’t see a generic hdmi or vga output? Our ARM64EFI profiles have an example of additional serial console kernel command line additions that may help if you do get a serial adapter hooked up.

Fun to see Rockstor running on all-sorts so let us know if you go that way. It is at least x86_64 based so you may well have a smooth time of it once you get a console rigged up.