A fork in the Lustre road?
By joe
I've been waiting a while to post this, to see how things develop. Lustre does indeed have a future. The question is, will Oracle cede control over Lustre, or will it be forked by OpenSFS/Whamcloud/Xyratex?
A few short months ago, its future was cloudy at best. Oracle seemingly isn't interested in HPC, except where it matters for the database side of things. So most things that are HPC specific (with few possible alternative use cases) have been given the heave-ho. Products/projects with more than a pure HPC use case (Grid Engine, etc.) have been retained by Oracle and given an expanded mission. Which leaves Lustre in a decidedly difficult position, as it is basically an HPC file system (and little more).
Lustre has many issues, and some of the recent discussion on the Lustre mailing list, in replies to the OpenSFS announcement, suggests that some people are getting fed up waiting for fixes and are considering alternatives. Lustre is basically an object store with a single metadata server. Great for large streaming loads, not so great for small files and lots of metadata ops. Lustre doesn't do a good job of enabling anything akin to replication or RAID/RAIN; that is handled at a lower level, in the storage layer, via replicated block servers. The industry has been moving away from this for a while, but Lustre presumes it as a base. Lustre's single centralized metadata server isn't just a bottleneck, it's a liability. No metadata replication or distribution means that it is possible to lose an entire file system by losing the metadata server. Which makes that a very important server.
We have many customers using Lustre. Many are building, or have built, business and process dependencies upon it. We do try to make them aware of the issues (we want our customers to go in with their eyes open so they aren't surprised). One of the big issues has been support. It was difficult to get a quote out of Sun for support, and Oracle will only support Lustre on Sun/Oracle gear. Which isn't meaningful for us or our customers. Hence our partner in support is ClusterStor, now part of Xyratex.
Another big issue is the future of Lustre. Where is it going, and who will decide what happens to it? What features will it get? Oracle owns Lustre, the Lustre IP, and the Lustre logos and copyrights. Oracle could close it up going forward (and from their previous announcements, this appears to be the direction they have, in effect, gone). OpenSFS looks like (from an outsider's perspective) it may fork Lustre. Since Whamcloud just joined the group, I'd expect them to contribute.
Lustre is licensed under GPL v2.0 (at least 1.8.4 is, I haven't checked 2.0), so it can be forked, with the copyrighted material removed from the release. Much as CentOS is a rebuild of RHEL, it is possible to do the same thing with Lustre. If you read the slides, especially page 4, they seem to suggest that they won't fork it, but augment it. Slide 5 suggests they will push these changes back to Oracle for inclusion. So what if they aren't included? What if the bug fixes, feature updates, and other bits simply result in a mute Oracle doing its own thing? Slide 6 suggests they won't fork. Honestly, I think that is likely to be wishful thinking without buy-in from Oracle.
You can look at this arrangement as similar to Ubuntu and Debian. Ubuntu is/was based upon Debian releases, and Canonical adds its own additional value. Which it offers back up to Debian. Who often don't take the contributions. So Canonical gradually carries forward more and more changes, until they diverge. A fork in the code. I think this is where we are headed. I know … it could be more like CentOS and RHEL, where the former is a slightly value-added rebuild of the latter, without the copyrights/etc., and with features Red Hat "forgot" (cough cough … xfsutils … cough cough … among many others).
Oracle could surprise me and be receptive to this model. Or surprise me and donate the whole kit and caboodle to OpenSFS. Both of these would be good, strong, positive directions. Or they could play the same games they played with OpenOffice and OpenSolaris. Which would not be good.
Add to this that there are current and emerging competitors to Lustre. Whether folks like it or not, GlusterFS is maturing nicely, has features Lustre doesn't (replication, distributed metadata), and is quite usable as a parallel file system.
Ceph is maturing, though it sits atop btrfs, which is also having some growing pains (the latest corruption bug is not something I think most folks would trigger, but it still needs to be corrected). There are several others as well.
I'd argue that it's in Oracle's best interest to be receptive to working with OpenSFS, or to donate the Lustre code ownership to them, for a seat at the table. Because if Lustre forks (which I am pretty sure is what will happen if Oracle doesn't work with the OpenSFS teams), they will likely lose many Lustre customers, as people run for a community-supported project over a single-vendor-supported project.
We will see what happens. I know OpenSFS is claiming no fork now. Let's see what happens in 3 months, 6 months, a year, 2 years. In the meantime, what's out there will be supported, and will be extended. The fork, if it occurs, is about who gets to make decisions about direction, features, etc.