## Brittle systems

Years ago, we helped a customer set up a Lustre 1.4.x system. This was … well … fun. And not in a good way. Right before the 1.6 transition, we had all sorts of problems. We skipped 1.6, and now we have set up a Lustre 1.8.2 system, and have several on quote now for various RFPs.
From our experience with the 1.8.2 system … I have to say, I have a sense that it is brittle. Yeah, you can call it “subtle and quick to anger”, or even praise some of the design features/elements.
It just has many moving parts. Some work well (MDS), some … well … not so well (OST problem notifications). The failure surface is huge, and figuring out where you are on that surface has effectively become the morning cat-n-mouse game for us.
Speaking with other vendors and with friends running these systems, I get the sense that people design/build Lustre systems defensively. That is, they know and appreciate that it will break, so they aim to limit the damage from this breakage. Control or limit the unknowns.
This could be something of a harsh assessment, but I just caught myself doing exactly this for a customer’s configuration. They requested Lustre, and we went back and designed defensively.

## Imagine … trying to get something as simple as a quote for Lustre support …

… and not being able to. Seems most of the folks at Sun/Oracle haven’t heard of Lustre. I had to explain it to them on several calls yesterday. They didn’t understand why someone would want to pay for support of a GPL licensed system … er … ah … mebbe we found some real nice …

## now OpenSolaris' future in doubt

Sun/Oracle has decided to change strategy around Solaris. The bad news is for the community, as OpenSolaris has an uncertain future. Oracle has made clear that not all features from Oracle Solaris will be added to OpenSolaris. Oracle may not even release the code of most of the new features of Oracle Solaris …

## The fat lady's song is now over, and the curtain is falling

SCO lost. As I had said to some local colleagues who were (for reasons I could not grasp) swayed by SCO’s arguments, this would not end well for SCO. And it didn’t. The game is effectively over. It’s time to wind down SCO as an entity in an orderly manner, to distribute the remaining value …

## This could be huge … and disruptive

ACLU seems to have taken down the BRCA gene patent from Myriad Genetics. This could actually change a chunk of the drug development business model. I am not sure if this is a good thing (the business model change), though I also didn’t think that one could patent what is effectively naturally generated prior art. …

## The evolution of the data center

Way back in the day, data centers used to be cold. Cold air came in, and usually in hot-aisle/cold-aisle configs, left through the back.
Power per rack was measured in a few thousand watts.
Cooling per rack could be mebbe one ton of AC. Up to two in the worst case.
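
As a rough cross-check on those numbers: a ton of refrigeration is 12,000 BTU/h, or about 3.5 kW of heat removal, so a rack drawing a few thousand watts lands right around one ton. Here is a minimal sketch of that arithmetic. The function name and the sample rack loads are illustrative only (not measurements from any particular site), and it assumes essentially all electrical power ends up as heat.

```python
# Back-of-envelope conversion from rack power draw to cooling load.
# Assumption: essentially all electrical power ends up as heat in the room.

BTU_PER_HOUR_PER_WATT = 3.412142   # 1 W = 3.412142 BTU/h
BTU_PER_HOUR_PER_TON = 12_000.0    # 1 ton of refrigeration = 12,000 BTU/h


def cooling_tons(rack_watts: float) -> float:
    """Return the tons of AC needed to remove rack_watts of heat."""
    return rack_watts * BTU_PER_HOUR_PER_WATT / BTU_PER_HOUR_PER_TON


if __name__ == "__main__":
    # Illustrative rack loads, not real measurements.
    for watts in (3_000, 7_000, 15_000):
        print(f"{watts:>6} W  ->  {cooling_tons(watts):.2f} tons of AC")
```

By this estimate, the "two tons in the worst case" figure corresponds to a rack pulling roughly 7 kW.
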
Then stuff got denser. Somewhere along the line someone decided they could run their stuff at higher temperatures. This works fine for machines that are actually mostly open space (blades, sparsely populated server systems, …). It doesn’t work so well for densely populated server systems.
Inlet temps above 72F can be a problem for dense electronics. Poor airflow in a data center (e.g. no real positive pressure on inlet, no real negative (relative) pressure on outlet) is a real problem.
Yet we’ve seen more than our share of such data centers in the last 6 months, enough that I am starting to question some of the designs I see. We might have to start actively asking customers: do you have the following conditions in your data center (and then list them), for the optimal use case? If not, we’ll have to ask some defensive questions, such as: do you have inlet temperatures below 72F? Do you have positive front pressure and negative back pressure?
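
Those defensive questions could be captured as a small pre-install checklist. This is a minimal sketch, assuming a hypothetical `SiteSurvey` record filled in from the customer's answers. The class, field, and function names are made up for illustration, and the 72F threshold is simply the figure quoted above, not a vendor specification.

```python
from dataclasses import dataclass
from typing import List

MAX_INLET_F = 72.0  # threshold quoted in the text for dense electronics


@dataclass
class SiteSurvey:
    """Hypothetical record of a customer's data center conditions."""
    inlet_temp_f: float             # air temperature at the rack inlet
    positive_front_pressure: bool   # cold air is actually pushed at the inlet
    negative_back_pressure: bool    # hot air is actually pulled from the outlet


def airflow_concerns(survey: SiteSurvey) -> List[str]:
    """Return a list of red flags; an empty list means none were found."""
    concerns = []
    if survey.inlet_temp_f > MAX_INLET_F:
        concerns.append(
            f"inlet temperature {survey.inlet_temp_f:.0f}F "
            f"is above {MAX_INLET_F:.0f}F"
        )
    if not survey.positive_front_pressure:
        concerns.append("no positive pressure on the inlet side")
    if not survey.negative_back_pressure:
        concerns.append("no negative (relative) pressure on the outlet side")
    return concerns


if __name__ == "__main__":
    # Made-up example site: warm inlet air and no pull on the hot side.
    site = SiteSurvey(inlet_temp_f=78,
                      positive_front_pressure=True,
                      negative_back_pressure=False)
    for concern in airflow_concerns(site):
        print("ask the customer about:", concern)
```
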


## There is/was a name for my pain

… yeah, the kidney stone saga continues. Had a basket extraction Wednesday, fine most of Thursday till evening, then Friday morning, they decided to remind me who was boss. Off to the ER I went, in terrible pain. Kidney stones are not life threatening, though there are times you wish death was less painful. Well, …

## Don't share anything important or of value via Linkedin … they will own it!

[update] trackbacks/pingbacks temporarily disabled. Waaay too much spam. Seriously. From their updated user agreement: License and warranty for your submissions to LinkedIn. You own the information you provide LinkedIn under this Agreement, and may request its deletion at any time, unless you have shared information or content with others and they have not deleted it, …

## Fixed up some of the siCluster tools

Well … more correctly, fixed the data model to be saner, so that the tools would be easier to develop and use. Still a few more things to do, and one (simple) presentation abstraction to set up.
The gist of it is that (apart from the automatically added nodes), adding nodes by hand should be easy. This also means by XML (not done yet, but I know how to do this), and web (basically XML or CGI-like devices).
So I want to add a node into our database.

    root@manager:/etc/cluster/bin# ./add_nodes.pl --index=4 --slot=4 --name=paul --location=rack5
    Inserting node into cluster.db
And sure enough, it’s there …

    root@manager:/etc/cluster# bin/ls_nodes.pl
    george, eth3=10.100.1.1/255.255.0.0, ipmi=10.101.1.1/255.255.0.0, wifi=10.102.1.3/255.255.0.0[fast]
    harry
    paul
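
The tools above are Perl scripts, and neither their internals nor the layout of `cluster.db` are shown here. Purely as an illustration of how a simple data model keeps "add a node by hand" easy, here is a hedged Python sketch. It assumes the database is an SQLite file, and the table layout, column names, and function names are all hypothetical.

```python
import sqlite3

# Hypothetical schema: the real cluster.db layout is not shown in the post.
SCHEMA = """
CREATE TABLE IF NOT EXISTS nodes (
    name     TEXT PRIMARY KEY,
    idx      INTEGER NOT NULL,
    slot     INTEGER NOT NULL,
    location TEXT
);
"""


def open_db(path=":memory:"):
    """Open (or create) the node database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_node(conn, name, index, slot, location=None):
    """Insert one node, taking the same four values as the command above."""
    with conn:  # commits on success, rolls back if the insert fails
        conn.execute(
            "INSERT INTO nodes (name, idx, slot, location) VALUES (?, ?, ?, ?)",
            (name, index, slot, location),
        )


def list_nodes(conn):
    """Return the node names ordered by index."""
    rows = conn.execute("SELECT name FROM nodes ORDER BY idx")
    return [name for (name,) in rows]


if __name__ == "__main__":
    db = open_db()  # in-memory database, so nothing on disk is touched
    add_node(db, "paul", index=4, slot=4, location="rack5")
    print(list_nodes(db))
```

Making the node name the primary key is one possible design choice: adding the same node twice raises `sqlite3.IntegrityError` instead of silently creating a duplicate.
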
Now let’s attach a network interface to this node
