Storage bandwidth wall writ large

Henry Newman, CEO/CTO of Instrumental, has a great article on Enterprise Storage Forum.
Remember, what we call the storage bandwidth wall, i.e. the time in seconds to read/write your disk, is your capacity divided by your bandwidth to read/write that capacity. It’s a height, measured in seconds, to take one pass through your data.
If you can read/write at 1 GB/s and have 1 TB of data, your wall height is 1000 GB / (1 GB/s) = 1000 s. That gives you a rough (best case) scenario for access.
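As a quick way to play with the numbers, here is a minimal Python sketch of that division. The function name and the decimal unit constants are my own choices for illustration, and the 1 PB case is a made-up example size, not a figure from Henry's article.

```python
# Decimal storage units (assumption: GB = 10^9 bytes, as in the 1000 GB = 1 TB example)
GB = 10**9
TB = 10**12
PB = 10**15


def wall_height_seconds(capacity_bytes, bandwidth_bytes_per_s):
    """Best-case time, in seconds, for one full pass over the data."""
    if bandwidth_bytes_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return capacity_bytes / bandwidth_bytes_per_s


if __name__ == "__main__":
    # The example from the text: 1 TB at 1 GB/s
    print(wall_height_seconds(1 * TB, 1 * GB))  # 1000.0

    # Hypothetical multi-PB style case: 1 PB at the same 1 GB/s, shown in days
    print(wall_height_seconds(1 * PB, 1 * GB) / 86400)
```

At a fixed bandwidth the height grows linearly with capacity, which is why the multi-PB archives are where this starts to hurt.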
Henry does a really good job describing problems with large archives (multi-PB range) which must be bit accurate, and never change. Ever.
Some of the things he calls out are, maybe, less of a problem (apart from some poorly designed data stores). Anyone not using ECC RAM in their units … yeah … well … some things can’t be helped. FWIW, we (and Google etc.) haven’t seen amplified corruption on “consumer level” drives. We have seen some enterprise drives do very … very … bad things. So much so that there are now brands we will not give serious consideration to again for years (to give them time to work the kinks out of their systems), brands that our competitors with … well … a bit less concern, happily put in their systems.
There’s nothing magical about the other issues. But there is a big one which can’t really be addressed very well by many of the designs on the market.


Sometimes you get the bear … other times, the bear gets you

This took guts. The (new) CEO of Nokia noting that there are issues going forward. Nokia has had great handsets. I still recall with great fondness, the E61 that I left in a taxi somewhere in London after visiting a customer …
But Nokia hasn’t innovated in a meaningful way, hasn’t adapted well to the rapid change in market conditions. Like RIM, their phones are competent, excellent phones. Unlike Apple and Google/Android, their phones don’t have a great user experience.
I have a private (family) cell which is a BlackBerry. Long story, but we got it for free when my wife upgraded. Until the BlackBerry 6 OS, I’d say that RIM was in deep trouble on the user experience front. Their “OS” is really one large Java application. And if you know anything about Java, you know it is slow on every platform. Very much including small portable low power CPU platforms. The OS v6 is actually not bad. Pretty easy to use, a radical departure on many fronts from a traditional BlackBerry OS. Give it a touch screen and a few other things, and it could be interesting.
But Nokia … they don’t have much like this. They had the N900 … roughly a way early version of an iPad, without some of the bells and whistles. They don’t have a consistent baseline OS strategy. They don’t have a modern look and feel.
There is lots to like about the iOS (and Android) platforms as a user. Stuff works, and is pretty intuitive. Mostly smooth functioning, though the Android units do feel slower or more clunky in some aspects than the iPhone unit.
Nokia’s problem is that it has all the legacy, and none of the “sizzle” this market demands. Which is a problem.


I know I shouldn't be … but I am …

[update] a bug in my reasoning (thanks Peter!)
a Perl Golf addict. Not a recovering addict, but one that is active.
What is Perl Golf? Well, as in real golf, you try to provide the minimal number of steps to a solution. In this case, you are to solve the specific puzzle.
Detractors of Perl often make snarky comments about Perl’s equivalency to random line noise and other such nonsense. Sure … if it makes you feel good to say that … I am a fan of terse languages, I wrote programs (if you could call them that) in APL … a while ago. It’s a strange sensation when you realize, looking at your code, that you can parse it … and run it … in your head. But I digress.
I ran across this one today.
The problem is:

If we list all the natural numbers below 10 that are multiples of 3 or 5, we get 3, 5, 6 and 9. The sum of these multiples is 23.
Find the sum of all the multiples of 3 or 5 below 1000.

Most folks would start off with elegant looking loops, with well spaced and well named code and variables.
This is the antithesis of what you should do in golf and in Perl golf. Try this in the language of your choice.
Here’s my 5 minute version (call it s.pl):
map$s+=(!($_%3)||!($_%5))*$_,1..999;print$s

[update] My first version used “+” between the two tests. It should be a logical or, otherwise we would double count numbers which are multiples of 15.
To run it
landman@lightning:~$ perl -l s.pl
233168
In octave (Matlab work-alike) …
octave:12> 3*sum(1:333)+5*sum(1:199)-15*sum(1:66)
ans = 233168
That’s pretty cool.
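For anyone who wants to see why the closed form works, here is a non-golfed Python sketch of the same inclusion-exclusion idea: add the multiples of 3 and of 5, then subtract the multiples of 15 that were counted twice. The function names are mine, and the limit is treated as exclusive because the puzzle asks for multiples below 1000.

```python
def sum_of_multiples_below(limit, k):
    """Sum of the positive multiples of k that are strictly below limit."""
    n = (limit - 1) // k          # how many multiples of k are below limit
    return k * n * (n + 1) // 2   # k * (1 + 2 + ... + n)


def sum_3_or_5_below(limit):
    """Inclusion-exclusion: multiples of 15 appear in both the 3 and 5 sums."""
    return (sum_of_multiples_below(limit, 3)
            + sum_of_multiples_below(limit, 5)
            - sum_of_multiples_below(limit, 15))


def brute_force(limit):
    """Straightforward loop, used only to cross-check the closed form."""
    return sum(i for i in range(1, limit) if i % 3 == 0 or i % 5 == 0)


if __name__ == "__main__":
    print(sum_3_or_5_below(10))    # 23, the small case from the puzzle text
    print(sum_3_or_5_below(1000))
    print(brute_force(1000) == sum_3_or_5_below(1000))
```

Not golf by any stretch, but the closed form does constant work no matter how large the limit gets.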
Another of my neat solutions was for the LED problem.
You run it like this:
./led.pl SOME_NUMBER
and it generates what looks like an ascii art version of the number. Like this:

landman@crunch-r:~$ ./led.pl 12345987
  # ### ### # # ### ### ### ###
  #   #   # # # #   # # # #   #
  # ### ### ### ### ### ###   #
  # #     #   #   #   # # #   #
  # ### ###   # ### ### ###   #
landman@crunch-r:~$ ./led.pl 21435687
###   # # # ### ### ### ### ###
  #   # # #   # #   #   # #   #
###   # ### ### ### ### ###   #
#     #   #   #   # # # # #   #
###   #   # ### ### ### ###   #

Neat … huh?
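To make the output format concrete, here is a plain, non-golfed Python sketch that reproduces the two sample runs above from a small glyph table. This is not the Perl from led.pl; the glyph table is read off the sample output, and the glyph for 0 is my assumption since neither sample contains a zero.

```python
# 3-wide, 5-high glyphs, one tuple of rows per digit.
# Digits 1-9 are copied from the sample runs; "0" is an assumed shape.
GLYPHS = {
    "0": ("###", "# #", "# #", "# #", "###"),  # assumption, not in the samples
    "1": ("  #", "  #", "  #", "  #", "  #"),
    "2": ("###", "  #", "###", "#  ", "###"),
    "3": ("###", "  #", "###", "  #", "###"),
    "4": ("# #", "# #", "###", "  #", "  #"),
    "5": ("###", "#  ", "###", "  #", "###"),
    "6": ("###", "#  ", "###", "# #", "###"),
    "7": ("###", "  #", "  #", "  #", "  #"),
    "8": ("###", "# #", "###", "# #", "###"),
    "9": ("###", "# #", "###", "  #", "###"),
}


def render(number):
    """Return the 5-line LED-style rendering of a string of digits."""
    rows = []
    for r in range(5):
        # Row r of every digit, left to right, separated by one space
        rows.append(" ".join(GLYPHS[d][r] for d in number))
    return "\n".join(rows)


if __name__ == "__main__":
    print(render("12345987"))
```

The whole trick is that each output line is one row taken across all the digits, so the loop runs over rows first and digits second.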
So how hard was this? Here’s the code that does it, and it can be improved upon, significantly.


fun with SCSI targets

Had some fun today with our SCSI target. It’s a very nice system, very powerful. Not terribly easy to use. But it works well. We have tools we developed around it to make it easy to use.
Creating iSCSI targets works nicely with our target code. It builds the target, sets up the infrastructure. Done with thin provisioning, it’s pretty fast and mostly painless.
Well, it was until we discovered that the stack, while including /etc/initiators.allow and /etc/initiators.deny support, and even honoring them at a coarse level, had different semantics than the documentation indicated.
Caused us some grief, but the LUN masking was working. Just not the way we thought it was.
LUN masking enables the initiators (the thing requesting the block device) to only see the relevant LUNs for them. But in this case, the LUN masking was moved out of the initiators.allow and initiators.deny.
Ugh. At a coarse access control level, it worked. But at the fine grain LUN masking, we had to use a different approach. Sort of like tcpwrappers for baseline ACL, and a secondary mechanism for finer grain control.
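The two-level idea can be sketched in a few lines of Python. This is only a toy model of the concept, not how our target stack (or any real one) implements it: the class, its field names, the initiator names, and the "deny wins over allow" ordering are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Set


@dataclass
class TargetAcl:
    """Toy model: coarse allow/deny lists plus a per-initiator LUN mask."""
    allow: Set[str] = field(default_factory=set)
    deny: Set[str] = field(default_factory=set)
    lun_masks: Dict[str, Set[int]] = field(default_factory=dict)

    def may_connect(self, initiator: str) -> bool:
        # Coarse check, tcpwrappers style (assumption: deny wins over allow)
        if initiator in self.deny:
            return False
        return initiator in self.allow

    def visible_luns(self, initiator: str, all_luns: Iterable[int]) -> Set[int]:
        # Fine-grained check: only LUNs explicitly masked in for this initiator
        if not self.may_connect(initiator):
            return set()
        return set(all_luns) & self.lun_masks.get(initiator, set())


if __name__ == "__main__":
    # Hypothetical initiator names, for illustration only
    acl = TargetAcl(
        allow={"iqn.2010-01.example:host-a", "iqn.2010-01.example:host-b"},
        deny={"iqn.2010-01.example:host-b"},
        lun_masks={"iqn.2010-01.example:host-a": {0, 2}},
    )
    print(acl.visible_luns("iqn.2010-01.example:host-a", range(4)))
    print(acl.visible_luns("iqn.2010-01.example:host-b", range(4)))
```

The point of the split is that an initiator can pass the coarse check and still see only its own subset of LUNs.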
It’s fine, just caused us to have to do some things differently. For ~50 targets, our workaround isn’t very fast. Our preferred mechanism should be much faster, but it looks like some things haven’t been as well documented in the new code.
Sadly, this SCSI target isn’t the new one going into the kernel. That one, LIO, has a very different management interface, not to mention a different execution model. I am not at all sure how well they will scale to hundreds, thousands, and more targets. This isn’t a critique of ours or the LIO code. Moreover, our stack supports RDMA over IB (SRP), and LIO doesn’t yet. This is a problem: we aren’t ready to give up SRP.
Hopefully this will get fixed soon, and the two teams will work together. Hopefully this won’t be a repeat of the LVM2 – EVMS bit from a few years ago. In retrospect, the wrong stack (LVM2) won there (IMO). Our stack and LIO are both complex, both solving hard problems, in different ways. We can always recode around new management interfaces. Giving up important functionality? Not so much.
LWN.net has an article about this. We are mentioned in it. Search for LIO and SCST. Or Scalable Informatics.


And yet again …

Me: (presents A) “So what do you think?”
Them: “Hmmm … nice but what comes after A?”
Me: “Let’s get another time slot and I’ll go over that”
(time passes … order of weeks)
Me: (presents post-A) “So what do you think”
Them: “Hmmm … nice but what comes after post-A?”
Me: “Let’s get another …