sad/exciting time ahead

One of our customers has become fed up with the issues they’ve run into on Gluster. It started about a year ago with some odd outages on the 3.0.x system, and didn’t improve with 3.2.x … in some instances it got worse. RDMA support in 3.0.x was pretty good, though there were other (annoying) bugs. The migration to 3.2.x was rocky. Libraries left over from 3.0.x were somehow picked up, and some things just failed.

Suffice it to say that this customer was experiencing Gluster outages on a weekly basis. Usually involving a long phone call to me for the post mortem. And there came a point in time, after watching Gluster get absorbed by Red Hat, and realizing that the ties that I had with the Gluster engineering team had now been … er … reduced … that our ability to get this customer the support they needed was now problematic.

Add to this a hardware RAID issue (the vendor has trouble admitting that they have a problem, despite a reproducible problem we’ve seen pretty much everywhere). Sadly they are still the best vendor of the lot, even though their support is not what one might call “good”, or even “acceptable”. Or “workable”.

So we absorb most of the brunt of their failures. And the software bugs.

It got to the point with this one customer that we were spending 5-6 hours per week discussing the latest failure (usually at a Gluster level).

So the customer has kicked Gluster to the curb. I am re-commissioning one of the machines now. This is a fairly sizable storage cluster, and one of the early “successes” that we had with Gluster. It had just worked in the early days, though there were some bugs. The mistake appears to have been moving them off of 3.0.x to higher-numbered versions. That’s when things went from annoying to terrible.

That’s the sad part.

The exciting part is that they are going to be giving the Fraunhofer Parallel File System a try. Initial tests, run on our recommendation, have been … very encouraging.

The hardware is solid (modulo environmental changes). We even have a workaround for the recalcitrant RAID vendor. And we have some stuff in development which should handle these issues for us soon, so we won’t have these problems to worry about anymore.

Yeah, it’s an exciting time.
