Requiem

She's dead, Jim!

This is the post an entrepreneur hopes to never write. They pour their energy, their time, their resources, their love into their baby. Trying to make her live, trying to make her grow.

And for a while, she seems to. Everything is hitting the right way: 12+ years of uninterrupted growth and profitable operation as an entirely bootstrapped company. A market leading … no … dominating … position, by the metrics customers tell you are important.

And then you get big enough to be visible. And you realize you need more capital to grow faster.

And you can’t get it, because, well … Detroit? 12 years of operations and you aren’t Google-sized?

Then you try to find capital. You go through many paths. One group complains that Pure Storage is younger than you, so you should be bigger than them, and you have a serious case of WTF. Another sets up what looks to be an awesome deal, until they demand all your IP, for free, forever, to do whatever they want with. Then another group … . VCs tell you things, you pitch them, and they can’t get to yes*

Well, let’s just say this is the non-fun side of things.

Or when you go visit customers, they love your product, the benchmark results are mind blowing … and then the CIO/CFO kills the deal because, you know … you are too small.

And then, while you are talking to potential acquirers, your bank, holding your LOC (line of credit), decides to blow you up. In the middle of the conversation. Literally.

Did I mention that this was the non-fun side of things?

You realize that while you may have been able to pull rabbits out of hats for the last 14 years, you just hit a perfect storm. A veritable cartoon buzzsaw from which there really was no escape.

Your baby … the one you put so much energy, time, personal money into. Your baby was dying.

And there was little you could do to stop it.

There is a surreal quality to this. One not well captured in words. Not unlike being an observer in a different temporal reference frame (e.g. your clock is running at a different speed), while all around you things are happening in real time.

There are the 5 stages of grief and coping.

You reflect on what you started, on what you contributed to. A small sampling: defining and using the APU term for AMD, building tightly coupled computing and storage systems before hyperconvergence was a thing, designing accelerators and a whole market/sales plan before GPGPU was a thing, getting major customers signed up to a grid concept before there was a grid, and more than a year before there was AWS.

You reflect on your failures, or things you failed at. Raising capital in 2002-2006 to build accelerators. Raising capital in 2004-2006 to build a cloud. Raising capital in 2008 to build extreme density machines. Raising capital in 2011-2013 to build tightly coupled systems. Raising capital in 2014-2016 to build appliances.

You reflect on your successes. A small sampling: taking a business from zero to multiple million dollar revenue without external investment (though I did try to get it). Setting very hard to beat benchmark numbers by designing/building the best architecture in the market. Solving significant automation and support problems with better implementations, and skipping broken subsystems.

We’ve had people offer to “buy” us. For free. Come work for us, they said, and maybe, maybe we’ll pay you. Others have asked the amount on our LOC and tried to lowball even that.

I’ve spoken with companies that claim we have nothing of value to offer, then moments later offer to bring me on in “any capacity.” I’ve been told “name my position” many times. The mental disconnect required for people to make these contradictory statements is absolutely staggering.

You go through this “woulda, coulda, shoulda” thing. You think through all of it. How, if only, X had happened, then Y. You look for where you messed up. And realize that as hard as you are trying to blame yourself for this, you weren’t the only actor in the saga; the other actors also had choices. You can argue that they made poor choices (and, fundamentally, the number of phone calls you’ve fielded from people regretting their decision to buy “competitive” systems instead of yours … that is the very definition of opportunity cost).

I’ve learned how to pull rabbits out of hats. I always knew there was a finite supply of rabbits and hats. And someday, I would not be able to do that anymore.
Going through this in your mind, you wonder: could you have found another rabbit to pull out of another hat?

And then you realize that you couldn’t have.

That day happened. The hats, and rabbits, were gone.

This has been a ride. A wild ride. A fun, fantastic, terrifying, gut wrenching, stomach turning, frustrating, frightening ride. This has been a humbling experience. I’ve learned a tremendous amount, about human nature, the nature of risk, the nature of competition, how people make decisions, and so forth. I’ve learned how hard you have to work to get customers to buy. How much effort you have to undertake. How much pain you have to absorb, smiling, while you deal with failures you have nothing to do with, but still fall … upon … you.

So here we are. At the end of the process. Life support has been removed. We are just waiting for the rest of nature to take its course.

Selling off the assets, IP, brands. joe.landman at the google mail thingy if you want to talk.

* A new Landman’s law is coming: “Any answer from a VC that is not yes, and not immediately followed by a wire transfer, is a no. If VCs tell you anything other than yes, it is a no. If VCs can’t make up their mind in less than a week, it is a no.”


Some updates coming soon

I should have something interesting to talk about over the next two weeks, though the summary is that Scalable Informatics is undergoing a transformation. The exact form of this transformation is still being determined. In any case, I am no longer at Scalable.

Some items of note in recent weeks.

1) M&A: Nimble was purchased by HPE. Not sure of the specifics of “why”, other than HPE didn’t have much in this space. HPE also tends to destroy what it buys (either accidentally, by rolling over it à la Ibrix, or on purpose).

Simplivity was also purchased by HPE. This gets them a hyperconverged (HC) stack. But … same comment as above.

Intel has been on an acquisition tear. Buying folks to try to get into driverless cars, defeat NVIDIA, etc.

2) scaling back: DSSD was killed. It was NVMe over fabrics … a SAN for NVMe. At first I thought this was a good idea. I no longer think this. SANs are generally terrible ideas, a 1990s/2000s architecture built on a number of critical assumptions that have not been true for a while (at least a decade) now.

3) government budgets: Scaling back across the board, apart from defense. See this link for more. Some are yelling that the sky is falling. Others realize that this is not true, and more correctly understand that fiscal restraint now, while painful, will prevent far harder austerity in the future. Though there are those who are turning this hard economic reality into a political fight (and they shouldn’t, but they are).

4) The economy is looking up in the US. This seems to be correlated with the election (which I am still trying to understand). Real hiring, not the faux version we saw over the last 8 years, is on a tear. There are reductions in U6 (aka the real unemployment numbers). This is surprising, but positive.


Best comment I’ve seen in a bug report about a tool

So … gnome-terminal has been my standard CLI interface on Linux GUIs for a while. I can’t bring myself to use KDE for any number of reasons. Gnome itself went in strange directions, so I’ve been using Cinnamon atop Mint and Debian 8.

Ok, Debian 8. Gnome-terminal. Some things missing when you right mouse button click. Like “open new tab”. Open new window is there. This works. But no tab entry. It’s been there, like … forever.

What happened to it?

Google google google, and I find this page. This page has a comment on it, where the person commenting must have been dead-panning.

The only explanation for removing the menu item “File/Open Tab” is, that maintainers of gnome-terminal do not use Gnome terminal themselves. Also in 3.18.3 (Ubuntu 16.04) if I expand the terminal size to maximum (double click on the menu bar) it would not return back to its previous size when double clicked again.

Yeah … this may sum it up.

Looking at this, I get a sense of “let’s remove something from the code base to simplify things. Wait, it was useful? Who uses this?”

I dunno … everybody?

Cinnamon is the fork of Gnome I use, as it is usable (as compared to the main branch). Most of my desktops are Linux Mint, but for a number of reasons, I need to have this environment be nearly identical to servers I build, hence the same base distro. Which also means I get to see some of the crazy changes people make to code, in the interests of being “better”. Often with the exact opposite effect.

Serious #headpalm and #headdesk on this.

If you are going to change someone’s UI/UX/workflow, make sure it is a meaningful/useful change. Change for change’s sake … is not good.


structure by indentation … grrrr ….

If you have to do this:

:%s/\t/    /g

in order to get a very simple function to compile because of this error

  File "./snd.py", line 13
    return sum
             ^
IndentationError: unindent does not match any outer indentation level

even though your editor (atom!!!!??!?!) wasn’t showing you these mixed tabs and spaces … Yeah, there is something profoundly wrong with the approach.

The function in question was all of 10 lines. The error was undetectable in the atom editor. Vim saves the day yet again, but … it … shouldn’t … have … to …


What is old, is new again

Way back in the pre-history of the internet (really DARPA-net/BITNET days), while dinosaur programming languages frolicked freely on servers with “modern” programming systems and data sets, there was a push to go from statically linking programs to a more modular dynamic linking. The thought process was that it would save precious memory, not having many copies of libc statically linked into binaries. It would reduce file sizes, as most of your code would be in libraries. It would encourage code reuse, which was (then) widely seen as a strong positive approach … one built applications by assembling interoperating modules with understandable APIs. You wrote glue logic to use them, and built ever better modules.

The arguments against this ranged from API versioning (what if the function/method call changed between versions, or was somehow incompatible, or the API endpoint changed so much it went away …) to security … well, sort of. The security argument, though its power was not fully appreciated at the time, was that rogue code libraries could do nefarious things with these function calls, as there was no way to verify, ahead of time, the veracity of the libraries or of the code calling them.

The latter point was prescient. And we’ve still not fully mapped out the nature of the exploits possible with LD_PRELOAD type attacks. You don’t need to hijack the source code, just change the library search path, injecting your own code ahead of the regular library code.

That is, your attack surface is now gigantic.

Ok.

But for the moment, I’ll ignore that. I’ll simply focus on two aspects of static vs dynamic linking I find curious in this day and age.

First, a language growing in popularity within Google and a few other quarters is Go. Go offers, to some degree, a simplified programming model. Not nearly as much boilerplate as Java (yay!), and they make a number of things easier to deal with (multi-processing bits using channels, etc.). While it has a number of very interesting features, one of the aspects of Go I find very interesting is that, by default, it appears to emit a statically linked binary … well … mostly statically linked. Very few external dependencies.

Here’s an example using minio.io code.

        # ldd minio 
	linux-vdso.so.1 (0x00007ffe605cc000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f1227da3000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f12279f8000)
	/lib64/ld-linux-x86-64.so.2 (0x000055e70fc40000)

This means that, as long as the libc, pthread, and loader ABIs don’t change, this code should be able to run on pretty much any Linux system supporting the x86-64 ABI.
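
If you want to see this for yourself, here is a minimal sketch (the file name hello.go is mine, not from minio; building with CGO_ENABLED=0 disables cgo, which for a pure-Go program typically yields a fully static binary that ldd reports as “not a dynamic executable”):

// hello.go — a trivial pure-Go program to inspect with ldd.
// Build:                go build hello.go
// Force a static build:  CGO_ENABLED=0 go build hello.go
package main

import (
	"fmt"
	"runtime"
)

func main() {
	fmt.Printf("hello from a Go binary on %s/%s\n", runtime.GOOS, runtime.GOARCH)
}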

Why is this important?

Very trivial application deployment.

No, really. Very trivial.

Ok, not all of this is due to Go’s propensity to be all-inclusive with a minimal footprint outside of itself. Some of this comes from thoughtful application design … you make environmental setup part and parcel of the application startup. If it doesn’t see an environment, it creates one (~/.minio). Which, with the next enhancement, makes it very easy to deploy fully functional, fully configured software appliances.
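
That pattern is easy to replicate. Here is a minimal sketch of the idea (my own illustration, not minio’s code; the ~/.myapp directory and config.json name are assumptions for the example):

// selfsetup.go — sketch of "create your own environment at startup".
// If the config directory does not exist yet, create it and drop a
// default config file in place, so the bare binary is all you deploy.
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

func main() {
	home, err := os.UserHomeDir()
	if err != nil {
		panic(err)
	}
	cfgDir := filepath.Join(home, ".myapp") // assumption: app-specific dot directory
	if err := os.MkdirAll(cfgDir, 0o700); err != nil {
		panic(err)
	}
	cfgFile := filepath.Join(cfgDir, "config.json")
	if _, err := os.Stat(cfgFile); os.IsNotExist(err) {
		// first run: write a default (empty) configuration
		if err := os.WriteFile(cfgFile, []byte("{}\n"), 0o600); err != nil {
			panic(err)
		}
	}
	fmt.Println("environment ready at", cfgDir)
}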

As a quick comparison, Minio’s purpose in life is to provide an S3 object store with significant AWS compatibility. And to make this drop dead simple. Compare this to a “similar” system (that I also like) in Ceph. Ceph provides many things, has a huge development group, has Red Hat behind it, and lots of vendor participation in its processes. Deployment of Ceph is anything but simple. There are a few efforts in play to try to make it simpler (see below), but nothing I’ve seen so far is either as good as Minio, or even working at this stage for that matter.

Again, to be fair (and I like/use both Ceph and Minio), Ceph is aiming for a bigger scope problem, and I expect greater complexity. Minio is a more recent development, and has exploited newer technologies to enable a faster bring-up.

The second aspect I find interesting in the static vs dynamic linking debate … is the way containers are built in Linux with Docker and variants. Here, instead of simply deploying a code within a zone, a partial installation of the dependencies for that code is made in the container. This often results in very large containers. There have been efforts to slim these down, using Alpine Linux, and various distros working on building minimal footprint baselines. This latter element, the minimal footprint baseline, runs starkly counter to the massive dependency radii that they’ve all grown to love over the last N years. In Red Hat/CentOS, you can’t install qemu/kvm without installing gluster. Even if you will never use gluster. Which means, in many cases, you have a whole lotta dead code in your installations that you will never use.

This dead code is an attack surface. Having it there really … really … doesn’t help you, and means you have a much larger perimeter to defend.

But back to my point about containers … the way we are building containers in Linux is … effectively … statically bundling dynamically linked application code, together with the .so/.dll files it needs, and then defining the interface between the two namespaces. This is why the containers are so damned bulky. Because we are effectively statically linking the apps … after they have been dynamically linked.

Back to Minio for a moment. They provide an S3 compatible object stack, with erasure coding, replication, web management, etc. … many features of the Ceph object store … in this sized package:

        # ls -alFh minio 
        -r-x------ 1 root root 24M Feb 15 20:47 minio

Deployment is

# ./minio help
NAME:
  Minio - Cloud Storage Server.

DESCRIPTION:
  Minio is an Amazon S3 compatible object storage server. Use it to store photos, videos, VMs, containers, log files, or any blob of data as objects.

USAGE:
  minio [FLAGS] COMMAND [ARGS...]

COMMANDS:
  server   Start object storage server.
  version  Print version.
  update   Check for a new software update.
  help, h  Shows a list of commands or help for one command
  
FLAGS:
  --config-dir value, -C value  Path to configuration directory. (default: "/root/.minio")
  --quiet                       Disable startup information.
  --help, -h                    show help
  --version, -v                 print the version
  
VERSION:
  2017-02-16T01:47:30Z

Yeah. That’s not hard at all. Maybe there is something to this (nearly) static linking of applications.

Note, by the way, there are a few Ceph-in-a-container projects running around, though honestly I am not anticipating that any of them will bear much fruit. There are also some “embedded” Ceph projects, but similar comments there … I’ve played with many of them, and IMO the best way to run Ceph is as a stand-alone app with all the knobs exposed. It’s complex to configure, but it works very well.

Minio targets a somewhat overlapping but different scenario, and one that is very intriguing.


That was fun: MySQL update nuked remote access

Update your packages, they said.

It will be more secure, they said.

I guess it was. No network access to the databases.

Even after turning the database server instance to listen again on the right port, I had to go in and redo the passwords and privileges.
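
For what it’s worth, a quick way to confirm the server is actually listening on the network again is a simple TCP probe. A minimal sketch (the 127.0.0.1 address, port 3306, and the timeout are assumptions for the example; this checks connectivity only, not credentials or grants):

// mysqlcheck.go — a tiny probe to confirm the MySQL port is reachable again.
package main

import (
	"fmt"
	"net"
	"time"
)

func main() {
	addr := "127.0.0.1:3306" // assumption: default MySQL port on this host
	conn, err := net.DialTimeout("tcp", addr, 3*time.Second)
	if err != nil {
		fmt.Printf("no listener at %s: %v\n", addr, err)
		return
	}
	conn.Close()
	fmt.Printf("%s is accepting TCP connections\n", addr)
}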

So yeah, this broke my MySQL instance for a few hours. Took longer to debug as it was late at night and I was sleepy, so I put it off until morning with caffeine.

I know containers are all the rage now (and I’ve been a proponent of that for a while), but this was a bare metal system running the database, with a bunch of VM based services (I want stronger isolation guarantees than I can get out of docker and related things on Linux … I know, I know, use SmartOS … planning to, for some other stuff, as I have somewhat more time to play/learn/do).

Still … surprises like this … not so good. Goes back to my theory that distributions should have as small an install as possible, with services offered as VMs and/or containers. So software updates can be trivially rolled back if and when … they break something.

I did this in the past at Scalable Informatics with the software defined appliances, where the entire OS image can be rolled forward/backwards with a simple reboot, as it was immutable. Really a distro needs to be this … the whole concept of a bare metal install should be one of absolutely minimal footprint, with hooks to enable modular services/functions/features.


An article on Rust language for astrophysical simulation

It is a short read, and you can find it on arXiv. They tackled an integration problem, basically using the code to perform a relatively simple trajectory calculation for a particular N-body problem.

A few things leapt out at me during my read.

First, the example was fairly simplistic … a leapfrog integrator, and while it is a symplectic integrator, this particular algorithm is not of high enough order to capture all the features of the N-body interaction they were working on.
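
To make the scheme concrete, here is a minimal kick-drift-kick leapfrog step, sketched in Go (one of the languages the paper compares). This is my illustration of the algorithm’s shape, not the authors’ code: a single body in a fixed central potential with G = M = 1, rather than their full N-body setup.

// leapfrog.go — a minimal kick-drift-kick leapfrog integrator sketch.
package main

import (
	"fmt"
	"math"
)

type vec3 [3]float64

// accel is the acceleration toward a unit mass fixed at the origin: a = -r/|r|^3.
func accel(r vec3) vec3 {
	r2 := r[0]*r[0] + r[1]*r[1] + r[2]*r[2]
	inv := 1.0 / (r2 * math.Sqrt(r2))
	return vec3{-r[0] * inv, -r[1] * inv, -r[2] * inv}
}

// step advances (r, v) by dt using kick-drift-kick leapfrog:
// time reversible and symplectic, but only second order accurate.
func step(r, v vec3, dt float64) (vec3, vec3) {
	a := accel(r)
	for i := range v {
		v[i] += 0.5 * dt * a[i] // half kick
	}
	for i := range r {
		r[i] += dt * v[i] // full drift
	}
	a = accel(r)
	for i := range v {
		v[i] += 0.5 * dt * a[i] // half kick
	}
	return r, v
}

func main() {
	// circular orbit initial conditions: |r| = 1, |v| = 1
	r, v := vec3{1, 0, 0}, vec3{0, 1, 0}
	for i := 0; i < 1000; i++ {
		r, v = step(r, v, 0.01)
	}
	fmt.Printf("r = %v  v = %v\n", r, v)
}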

Second, the statically compiled Rust was (for the same problem/inputs) about equivalent to, if not slightly faster than, the Fortran code in performance, much faster than C++, and even more so than Go. This could be confirmation bias on my part, but I’ve not seen many places where Go was comparable in performance with C++; it is usually a factor of 2 off or worse. There could be many reasons for this, up to and including poor measurement scenarios, allowing the Go GC to run in critical sections, etc. I know C++ should be running near Fortran in performance, but generally, I’ve not seen that as the general case either. Usually it is a set of fixed cases.

The reason for Fortran’s general dominance comes from the 50+ years people have had to write better optimizers for it, and that the language is generally simpler, with fewer corner cases. That and the removal of dangerous things like common blocks and other global state. This said, I fully expect other codes to equal and surpass it soon on a regular basis. I have been expecting Julia to do this in fairly short order. I am heartened to see Rust appear to do this on this one test, though I personally reserve my opinion on this for now. I’d like to see more code do this.

Rust itself is somewhat harder to adapt to. You have to be more rigid about how you think about variables, how you will use them, and what mechanisms you can use to shuttle state around. You have to worry about this a bit more explicitly than other languages. I am personally quite intrigued by its promises: zero cost abstraction, etc. The unlikelihood of a SEGV hitting again is also quite nice … tracing those down can often be frustrating.

My concern is the rate at which Rust is currently evolving. I started looking at it around 0.10.x, and it is (as of this writing) up to 1.14.x, with 1.15.x looming.

Generally, I want languages that get out of my way of doing things, not languages that smother me in boilerplate of marginal (at best) utility, and impose their (often highly opinionated, and highly targeted, often slightly askew) worldview on me. Anything with an explicit garbage collection blocking task which can interfere with calculation is a terrific example of this.

Simple syntax, ease of expression, ability to debug, high performance, accurate results, and freedom from crashing with (*&^*&^$%^&$ SEGVs are high on the list of what I want in a language I spend time with.


Brings a smile to my face … #BioIT #HPC accelerator

Way, way back in the early aughts (2000s), we had built a set of designs for an accelerator system to speed up things like BLAST, HMMer, and other codes. We were told that no one would buy such things, as the software layer was good enough and people didn’t want black boxes. This was part of an overall accelerator strategy that we had put together at the time, and were seeking to raise capital to build. We were thinking that by 2013 or so, accelerators would become a noticeable fraction of the total computing power for HPC and beyond.

Fast forward to today. I saw this.

Yet another idea/concept/system validated. It looks like our only real big miss was the “muscular desktops” concept … big fast processors, memory, storage, accelerators next to desks. Everything else, from accelerators, to clouds, to … everything … we were spot on.

I hope they don’t mind my smiling at this, and tipping my hat to them. Good show folks, may you sell many many of them!


Another article about the supply crisis hitting #SSD, #flash, #NVMe, #HPC #storage in general

I’ve been trying to help Scalable Informatics customers understand these market realities for a while. Unfortunately, to my discredit, I’ve not been very successful at doing so … and many groups seem to assume supply is plentiful and cheap across all storage modalities.

Not true. And not likely true for at least the rest of the year, if not longer.

This article goes into some of the depth that I’ve tried to explain to others in phone conversations and private email threads. I am guessing that they thought I was in a self-serving mode, trying to get them to pull in their business sooner rather than later to help Scalable, versus helping them.

Not quite.

At Scalable we had focused on playing the long game. On winning customers for the long haul, on showing value with systems we produced, with software we built, with services and support we provided, with guidance, research and discussions. I won’t deny that orders would have helped Scalable at that time, but Scalable would still have had to wait for parts to fulfill them, which, curiously, would have put the company at even greater exposure and risk.

The TL;DR of the article goes like this, and they are not wrong.

  1. There is insufficient SSD/Flash supply in market
  2. Hard disks, which have been slowly dropping in manufacturing rates, are not quite able to take up the demand slack at the moment; there are shortages there as well
  3. 10k and 15k RPM drives are not long for this world
  4. New fab capacity needed to increase SSD supply is coming online (I dispute that enough is coming online, or that it is coming online fast enough, but more is, at least organically, coming online)
  5. There is no new investment in HDD/SRD capacity … there is a drying up of HDD/SRD manufacturing lines

All of this conspires to make storage prices rise fast, and storage products harder to come by. Which means if you have a project with a fixed, unadaptable budget, and a sense that you can order later, well … I wouldn’t want to be in anyone’s shoes who had to explain to their management team, board, CEO/CFO, etc. why a critical project suddenly was going to be both delayed and far more expensive (really, I’ve seen the price rises over the last few months, and it ain’t pretty).

This isn’t 5, 15, 25 percent differences either. The word “material” factors into it. Sort of like the HDD shortage of several years ago with the floods in Thailand.

It is curious that we appear to not have learned, with the fab capacity located in similarly risky areas … but that would be a topic for another post some day.

Even more interesting to me personally in this article, is a repetition of something I’ve been saying for a while:

To deal with supply uncertainty, as we move from an industry based on mechanical hard drives (which has dedicated production facilities) to one based on commodity NAND, vertically integrated solutions will be optimal. Organizations that control everything from NAND supply to controllers to the software will be in a much better position to deliver consistently than those that don’t.

Vertical integration matters here. You can’t just be a peddler of storage parts, you need to work up the value chain. I’ve been saying this to anyone who would listen for the last 4 years or so.

Also

This may cause an existential crisis for the external storage array. Creating, validating and successfully marketing a new external storage array in a saturated market is difficult. It is unlikely today’s storage vendors will be trying to move up the value chain by reinventing the array.

Yeah, the array as a market is shrinking fast. There are smaller, faster startups and public companies all feasting on a growing fraction of a decreasing market (external storage arrays). And they stick to what they know, and try to ignore the rest of the world.

There is a component of the market which insists on “commodity” solutions, where the word “commodity” has a somewhat opportunistic definition, so the person espousing the viewpoint can make an argument for their preferred system. These arguments are usually wrong at several levels, but it seems to be a mantra across a fairly wide swath of the industry. It is hard to hold back a tide flowing the wrong way by shouting at it. Sometimes it is simply better to stop resisting, let it wreak its damage, and move on to something else. We can’t solve every problem. Some problem “solutions” are less focused upon analysis, and more focused upon fads and the “wisdom” of the masses.

You may have seen me grouse and shake my head recently over “engineering by data sheet”. This falls solidly into this realm as well, where people compare data sheets, and not actual systems or designs. I see this happen far more often in our “software eats the world” view, where people who should know better, don’t.

Reminds me of my days at SGI, when I, with my little 90MHz R8k processor, was dealing with “engineering by spec sheet” by some end users salivating over the 333MHz DEC Alpha processor. The latter was “obviously” faster; it had better specs.

I asked then, if this were true, how come our actual real-world (constructed by that same customer no less) tests showed quite the opposite?

Some people decide based upon emotions and things they “know” to be true. The rest of us want hard empirical and repeatable evidence. A spec sheet is not empirical evidence. Multiple independent real world benchmark tests? Yeah, that’s evidence.

Hyper-converged solutions, on the other hand, are relatively easy: there are a whole lot of smaller hyper-converged players that can be bought up cheaply and turned into the basis for a storage vendor’s vertically integrated play.

Well, the bigger players are rapidly selling: Nutanix is public, Simplivity was bought by HPE.

Smaller players abound. I know one very well, and it is definitely for sale … the owners are motivated to move quickly. Reach out to me at joe _at_ this domain name, or joe.landman _at_ google’s email system for more information.

For the small virtual administrator, none of this may be relevant. Our needs are simple and we should be able to find storage even if supplies become a little tight. If, however, you measure your datacenter in acres, by the end of next year you may well find yourself negotiating for your virtual infrastructure from a company that last year you would have thought of as just a disk peddler.

This article almost completely mirrors points I’ve made in the past to some of the disk vendors I’ve spoken to, about why they might want to pick up a scrappy upstart with a very good value prop, but insufficient capital to see their plans through. I’ve seen only Seagate take actions to move along the value chain, with the Xyratex purchase, and I thought originally they had done that specifically for the disk testing elements. Turns out I was wrong … they had designs on the storage appliance side as well.

All the disk vendors would do well to cogitate on this. The writing is definitely on the wall. The customers know this. The remaining runway is of limited length, and every single day burns more of it up.

Exactly what are they going to do, and when will they do it?

Customers want to know.


A nice shout out in ComputerWeekly.com about @scalableinfo #HPC #storage

See the article here.

Surely though, widespread adoption is just a matter of time. With its PCIe connectivity, literally slotting in, NVMe offers the ability to push hyper-converged utility and scalability to wider sets of use cases than currently.

There are some vendors that focus on their NVMe/hyper-converged products, such as X-IO (Axellio), Scalable Informatics, and DataON, but NVMe as standard in hyper-converged is almost certainly a trend waiting to happen.

They mention Axellio, and on The Reg article on their ISE product, they say “X-IO partners using Axellio will be able to compete with DSSD, Mangstor and Zstor and offer what EMC has characterised as face-melting performance.”

Hey, we were the first to come up with “face melting performance”. More than a year ago. And it really wasn’t us, but my buddy Dr. James Cuff of Harvard.

For the record, we started shipping the Forte units in 2015. We’ve got a beautiful design for v2 of them as well.

This said, DataON is in a different market, and X-IO is about a very different use case, more similar to Nutanix and now HPE’s Simplivity than the use case we imagined.

/sigh
