Here’s the basic theory behind swap space. Memory is expensive; disk is cheap. Only use the faster memory for active things, and aggressively swap out the less-used things. This provides a virtual address space larger than physical memory.
No. Here’s why.
- Swap assumes that you can always read from and write to persistent storage (disk/swap). It never allows for that storage failing. Hence, if some amount of paged data on disk suddenly disappeared, well …
Put another way, it increases your likelihood of failure by putting components with a higher probability of failure into a path that assumes no failure.
- It uses 4k pages (on Linux). Just. Shoot. Me. Now. Ok, there are ways to tune this a bit, and we’ve done this, but … but … you really don’t want to do many, many 4k IOs to a storage device. Even an SSD.
NVMe/MCS may help here. But you still have issue number 1 (the failure assumption above), unless you can guarantee atomic/replicated writes to the NVMe/MCS.
- Performance. Sure, go ahead and allocate, and then touch every page of that 2TB memory allocation on your 128GB machine. Go ahead. I’ve got a decade or two to wait.
- Interaction with the IO layer is sometimes buggy in surprising ways. If you use a file system, or a network attached block device (think cloud-ish), and you need to allocate an SKB or some additional memory to write the block out, be prepared for some exciting (and not in the good way) failures, and for some spectacular kernel traces that you would swear are recursive allocation death spirals.
“Could not write block as we could not allocate memory to prepare swap block for write …”
Yeah. These are not fun.
- OOM is evil. There is just no nice way to put this. OOM is evil. If it runs, think “wild west”. kill -9 bullets get lobbed at, often, important things. Using ICL to trace what happened will often leave you agape with amazement at the bloodbath in front of you.
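To put rough numbers on the 2TB-allocation-on-a-128GB-machine point above, here is a back-of-envelope sketch in Python. The IOPS figures are assumptions picked for illustration, not measurements, and the function names are made up for this example. It also assumes one 4k page per IO and only counts writing each evicted page out once; kernel swap clustering would lower the IO count, and reading pages back in would raise it.

```python
# Back-of-envelope estimate only. The IOPS numbers below are assumptions,
# not measurements of any particular device.
PAGE_SIZE = 4 * 1024           # 4k pages, as in the bullet above
ALLOCATION = 2 * 1024**4       # the 2TB allocation
PHYSICAL_RAM = 128 * 1024**3   # the 128GB machine


def pages_that_must_swap(allocation, ram, page_size=PAGE_SIZE):
    """Pages that cannot stay resident even if RAM held nothing else."""
    return max(0, allocation - ram) // page_size


def hours_to_write(pages, iops):
    """Hours to write each page out once, at one page per IO."""
    return pages / iops / 3600.0


if __name__ == "__main__":
    pages = pages_that_must_swap(ALLOCATION, PHYSICAL_RAM)
    print(f"pages forced out to swap: {pages:,}")
    # Hypothetical device speeds, for illustration only.
    for label, iops in [("assumed spinning disk", 200), ("assumed SSD", 50_000)]:
        print(f"{label} at {iops:,} IOPS: {hours_to_write(pages, iops):,.1f} hours")
```

Whatever device numbers you plug in, you are looking at roughly half a billion page-sized writes before a single page ever has to come back in.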
So towards this end, we’ve been shutting off paging whenever possible, and the systems have been generally faster and more stable. We’ve got some ideas on even better isolation of services to prevent self-flagellation of machines. But the take-home lesson we’ve been learning is … buy more RAM … it will save you headache and heartache.
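If you want to check whether a box is actually using swap before and after you shut paging off, a minimal sketch along these lines reads the SwapTotal and SwapFree fields from /proc/meminfo on Linux. The helper names are hypothetical, made up for this example. Turning swap off itself is normally done with swapoff -a as root, plus removing the swap entries from /etc/fstab so it stays off across reboots.

```python
from pathlib import Path

MEMINFO = Path("/proc/meminfo")  # Linux-specific


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field name: integer value}.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are counts.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def swap_in_use_kb(meminfo):
    """kB of swap currently in use; 0 if no swap is configured."""
    return meminfo.get("SwapTotal", 0) - meminfo.get("SwapFree", 0)


if __name__ == "__main__" and MEMINFO.exists():
    info = parse_meminfo(MEMINFO.read_text())
    if info.get("SwapTotal", 0) == 0:
        print("no swap configured")
    else:
        print(f"swap in use: {swap_in_use_kb(info)} kB of {info['SwapTotal']} kB")
```

A box with paging shut off should report no swap configured; a nonzero in-use number on a machine you thought was swap-free is worth chasing down.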