Note to self: use the sparse switch when moving data around with tar

I was using a tar pair to move data between two systems over an NFS link. This is faster than going over ssh (ssh isn’t a fast transport layer).
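For reference, here is a minimal sketch of what a tar pair looks like, assuming GNU tar. The `copy_tree` helper name and the scratch directories are made up for illustration; in real use the destination would be the NFS-mounted path.

```shell
#!/bin/sh
# Sketch of a tar pair: one tar writes an archive to stdout, a second
# tar reads it from stdin and unpacks it at the destination.
# Assumes GNU tar; copy_tree is a hypothetical helper name.
set -eu

copy_tree() {
    # $1 = source directory, $2 = destination directory (e.g. an NFS mount)
    # -S (--sparse) asks the creating tar to store holes as holes
    # instead of streaming every zero byte.
    tar -C "$1" -cSf - . | tar -C "$2" -xpf -
}

# Stand-in directories so the sketch runs anywhere.
scratch=$(mktemp -d)
mkdir -p "$scratch/src/run1" "$scratch/dst"
printf 'results\n' > "$scratch/src/run1/output.txt"

copy_tree "$scratch/src" "$scratch/dst"

copied=$(cat "$scratch/dst/run1/output.txt")
echo "copied: $copied"
rm -rf "$scratch"
```

The `-S` only needs to be on the creating side; the extracting tar recreates the holes from what the archive records.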

Some user wrote a sparse file out.

An 11PB sparse file.

Which the tar happily … happily I tell you !!! was trying to copy, in its entirety, over to the backup unit.

Happily.

It took me a quick look to see what was going on. It had copied 9TB of the 11PB file by the time I caught it.
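One way to spot this kind of file ahead of time is to compare its apparent size with the space it actually occupies on disk. The sketch below assumes GNU coreutils (`truncate`, `du --apparent-size`) and uses a made-up 64 MiB test file, not the real data. Compressed filesystems can also report fewer allocated blocks than the apparent size, so treat the result as a hint rather than proof.

```shell
#!/bin/sh
# Sketch: flag a file as sparse when it occupies less disk space than
# its apparent size. Assumes GNU coreutils; the file is a stand-in.
set -eu

scratch=$(mktemp -d)
f="$scratch/sparse.dat"

printf 'header' > "$f"
truncate -s 64M "$f"   # extend the file with a hole; no data blocks are written

apparent_kb=$(du -k --apparent-size "$f" | cut -f1)   # size the file claims to be
allocated_kb=$(du -k "$f" | cut -f1)                  # blocks actually allocated

echo "apparent: ${apparent_kb} KiB, allocated: ${allocated_kb} KiB"

if [ "$allocated_kb" -lt "$apparent_kb" ]; then
    is_sparse=yes
else
    is_sparse=no
fi
echo "sparse: $is_sparse"
rm -rf "$scratch"
```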


5 thoughts on “Note to self: use the sparse switch when moving data around with tar”

  1. That’s a 1.1 PB file. This was an 11PB file, generated as part of a real science run.

    Working now to finish moving their data. Would really … REALLY … like to avoid using rsync on ~8TB of data.

  2. BTW: cp, tar, etc. all have SPARSE detection
    (which just detects contiguous ranges of ZEROs)
    A 2010 marketeer would say ‘dedupe’ 🙂
    But in short: you need to turn it on.

    with gnu-tar: -S or --sparse

  3. Just out of curiosity: Did the user really intend to create an 11PB sparse file, or was that actually a typo in the input parameters or something like that?
