12 hours into the new year, and I have 210 spam in my spam-box.

At that rate, I estimate about 12.8k spam for this month.
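As a sanity check on that estimate, here is a minimal sketch of the linear extrapolation. The function name `project_monthly` and the month lengths are my own assumptions for illustration; the 12.8k figure is consistent with an average-length month of 30.5 days, while a strict 31-day January would come out nearer 13.0k.

```python
def project_monthly(count: int, hours_elapsed: float, days_in_month: float = 31) -> float:
    """Scale a count seen over `hours_elapsed` hours to a whole month.

    Assumes the arrival rate stays constant, which real spam traffic
    certainly does not; this is only a back-of-the-envelope estimate.
    """
    per_hour = count / hours_elapsed
    return per_hour * 24 * days_in_month


# 210 spam in the first 12 hours of the month.
january = project_monthly(210, 12)          # 31-day month
average = project_monthly(210, 12, 30.5)    # average-length month (assumption)
print(f"31 days: {january:.0f}, 30.5 days: {average:.0f}")
```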

Last month I had 10.9k. I heard somewhere that spam is supposedly decreasing.

My measurements (graphs over the last year) show quite the opposite. Our spam-box is on a per-user basis. Each mail runs through an annotation filter gauntlet, and at the end of this gauntlet it is classified as spam or not-spam. Not all mail reaches the gauntlet: most of the “mail” gets rejected first.

I took out our grey-list function, as it wasn’t needed and it delayed legitimate emails. That, and some mailers were horribly broken with respect to SMTP and couldn’t deal with servers returning “450” messages. I will leave you to guess which company markets these on some … exchange …
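To make the gauntlet idea concrete, here is a rough sketch of what an annotation filter chain could look like. This is not our actual pipeline: the `Mail` class, the two toy filters, the scores, and the threshold are all hypothetical, picked only to show the shape of "every filter annotates, the verdict comes at the end".

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Mail:
    sender: str
    subject: str
    body: str
    annotations: List[str] = field(default_factory=list)  # notes left by filters
    score: float = 0.0                                     # accumulated suspicion


# A filter only annotates the mail; it never decides on its own.
Filter = Callable[[Mail], None]


def shouty_subject(mail: Mail) -> None:
    """Hypothetical filter: an all-caps subject line is mildly suspicious."""
    if mail.subject.isupper():
        mail.annotations.append("shouty-subject")
        mail.score += 2.0


def suspicious_words(mail: Mail) -> None:
    """Hypothetical filter: count hits against a tiny made-up word list."""
    words = {"viagra", "lottery", "winner"}
    hits = words & set(mail.body.lower().split())
    if hits:
        mail.annotations.append("suspicious-words")
        mail.score += 3.0 * len(hits)


def run_gauntlet(mail: Mail, filters: List[Filter], threshold: float = 4.0) -> str:
    """Run every filter in order, then classify once at the end."""
    for f in filters:
        f(mail)
    return "spam" if mail.score >= threshold else "not-spam"


gauntlet = [shouty_subject, suspicious_words]
print(run_gauntlet(Mail("a@example.com", "YOU ARE A WINNER",
                        "claim your lottery prize winner"), gauntlet))
print(run_gauntlet(Mail("b@example.com", "Lunch?", "see you at noon"), gauntlet))
```

Keeping the filters as plain functions that only annotate means a new check can be dropped into the list without touching the final classification step.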

Our gauntlet gateway rejected some 36k obvious fakes this last month. This has been growing steadily. I do see specific patterns in the spamming, and I mean that literally. There are specific repeating patterns which suggest that particular mail bots are waking up and probing every 12 hours or so, and I have two that are offset slightly from each other, so I see double peaks from our spam attackers.
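One way to pull a repeating period like this out of hourly counts is autocorrelation. The sketch below is only an illustration: the series is synthetic (a flat baseline plus two made-up bots bursting every 12 hours, two hours apart), not my real measurements, and the function names are my own.

```python
from typing import List


def autocorrelation(series: List[float], lag: int) -> float:
    """Autocorrelation of `series` at `lag`, normalised by the total variance."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    if var == 0:
        return 0.0
    cov = sum((series[i] - mean) * (series[i + lag] - mean) for i in range(n - lag))
    return cov / var


def dominant_period(series: List[float], max_lag: int) -> int:
    """Return the lag in 1..max_lag with the strongest autocorrelation."""
    return max(range(1, max_lag + 1), key=lambda lag: autocorrelation(series, lag))


# Synthetic hourly counts for one week: baseline 10 per hour, one bot
# bursting every 12 hours, and a second bot offset by 2 hours.
hourly = [
    10 + (50 if h % 12 == 0 else 0) + (40 if h % 12 == 2 else 0)
    for h in range(7 * 24)
]
print(dominant_period(hourly, 24))  # 12 for this synthetic series
```

On real data the peaks would be much noisier, so the strongest lag is a hint to look at the graph more closely rather than proof of a 12-hour bot.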

[Figure: weekly spam]

We know their botnets are scale-free networks. We know they are attackable, and susceptible to collapse with some pretty simple jiu-jitsu.

My question is, why are they so pernicious? Is there a market here within which people are exchanging “goods and services”? Do people actually make money from all those spams? Is there a real link in them?

This is, oddly, a high performance streaming computing problem: separating the wheat from the chaff at high ingress speeds. Our pipeline is one of the methods that does work well … we have been mailbombed with a quarter million spam per day without noticeable impact on the server. Compare that with Fortune 500 companies I know of whose central exchange server chokes and dies with 10% of that load.

Maybe it’s time to turn our attention to this problem, stop adopting simple defensive postures, and start walking the cat back to the source. This is an HPC problem. We can use HPC to elucidate the networks and identify the cash flows. Solve the problem.

During the few minutes I spent venting, 5 more spam showed up in the spam box.

