Onward and upward in #HPC

A short note – today was my last day with Joyent. They are a wonderful company, building great things. Excellent technology, and technologists. I wish them nothing but success. For the immediate future, I’ll be working on consulting projects, as well as looking for the next great opportunity within high performance computing, storage, cloud. I’m …

Data loss, thanks to buggy driver or hardware

So this happened on the 3rd, on one of my systems:

Feb 3 03:02:39 calculon kernel: [195271.041118] INFO: task kworker/20:2:757 blocked for more than 120 seconds.
Feb 3 03:02:39 calculon kernel: [195271.048116] Not tainted 4.20.6.nlytiq #1
Feb 3 03:02:39 calculon kernel: [195271.052678] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Feb 3 03:02:39 calculon kernel: [195271.060626] …

Reflections on where we’ve been in HPC, and thoughts on where we are going

Looking back on past reviews from 2013 and a few other posts, and what has changed since then up to 2019 (it’s early, I know), I am struck by a particular thought I’ve expressed for decades now. In 2009 I wrote: HPC has been moving relentlessly downmarket. Each wave of its motion has a destructive …

A bug in s3 buckets with no apparent way to request support to deal with it

This is a fun one that I’ve been playing with for the last 5 days or so. I’m helping someone out with backups, and they changed their mind on what they wanted backed up. So I started deleting the backups they didn’t want. One of the machines contained a set of directories for hashdeep which includes …