Ok, I have been hinting at something we have been working on for a while. Time to talk a little more about this.
This is a server we are calling “JackRabbit”. The L. Flavigularis is a particular sub-species of JackRabbit. It comes in 3U and 5U flavors and, as a storage unit, can support from 6 TB through 36 TB, with 2 to 8 processor cores, up to 64 GB of RAM, and multiple gigabit Ethernet, InfiniBand, and other technologies. The current density of this unit is quite good, and no corners were cut on performance, unlike other designs, where oversubscription of RAID controllers and network interfaces is common.
Initial testing has shown quite good performance overall (especially in comparison with “other” units). We are tuning the initial units for file service, though we are getting lots of interest from people with harder-to-tune workloads. Because the unit is so flexible, we can adjust its internal RAID configuration, stripe size, and so on to optimize performance for particular workloads.
This is not a competitor to Panasas and other cluster file systems. You could run Lustre on it if you wish. You can link these units together as iSCSI devices and set up a petabyte-scale, highly available file system in under 9 racks, with no single points of failure. You can run it as a NAS, as it happily talks NFS and CIFS/SMB, or as a “DAS” if you consider iSCSI to be “direct attached”, or as part of an iSCSI SAN. You can also use it in conjunction with your Panasas units to provide an enterprise-class, high-performance file system with RAID6 for home directories and high capacity.
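To get a feel for the rack count, here is a rough back-of-the-envelope sketch in Python. It is not our sizing tool, just arithmetic under stated assumptions: the 36 TB figure is taken as raw capacity of a 5U unit, the rack is assumed to be a standard 42U with nothing else in it, a petabyte is taken as 1000 TB, and "highly available" is modeled simply as keeping a full second copy (`replicas=2`). The function name `racks_needed` and all of its parameters are made up for this illustration.

```python
import math

def racks_needed(target_tb, unit_tb=36, unit_height_u=5,
                 rack_height_u=42, replicas=1):
    """Estimate units and racks for a raw capacity target.

    Assumptions (not from any spec sheet): capacities are raw TB,
    racks hold only storage units, and redundancy is modeled as
    whole-unit replication.
    """
    # Units needed for one copy of the data, rounded up.
    units_per_copy = math.ceil(target_tb / unit_tb)
    # Total units once every copy is counted.
    units = units_per_copy * replicas
    # How many whole units fit in one rack.
    units_per_rack = rack_height_u // unit_height_u
    # Racks needed, rounded up.
    racks = math.ceil(units / units_per_rack)
    return units, racks

if __name__ == "__main__":
    # One raw petabyte, single copy.
    print(racks_needed(1000))               # (28, 4)
    # One raw petabyte, fully mirrored.
    print(racks_needed(1000, replicas=2))   # (56, 7)
```

Under these assumptions a mirrored petabyte comes out at 7 racks, which leaves some room for switches and other gear. Usable capacity after RAID6 parity and file system overhead would be lower than the raw numbers used here.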
There is much to like about these units from a technology, cost, and management point of view. Playing with the unit has been fun, and running benchmarks on it got me thinking about real workloads on the server. I got to dig into how the IO system functions at a deep level, which was quite interesting.
This is also important, as data sets are growing at an exponential rate, and any cluster or HPC system has to worry not just about processing faster, but also about moving data faster (better networks) and storing data faster. Some of the latter is addressed by object-based storage. Some is not. This is not to take away from Lustre or Panasas, but we keep running into clusters and HPC systems for which these are overkill, usually in the 16-64 node region. This hole in the market is what we are addressing. We can scale down and up. Panasas and Lustre can scale out. Combined, they make for an interesting system.
Ask us about it; we are happy to talk to people genuinely interested in it. We have a unit up on the net back at our lab, so if there is a test you would like us to run before considering such a unit, please let me know. We want to focus on real workloads and not just “running the gauntlet”.