
Who is spamming me - a bit of statistics



Sent to fitug-debate (actually a nontechnical discussion list) and to
spamassassin-talk. Reply-To set to me personally.

Please adjust accordingly.

A corpus of spam, freshly collected:

$ ls -l ~/Mail/OLD
total 96988
-rw-------    1 kris     kiel      1676771 2003-09-24 23:59 spammed-probable.01.gz
-rw-------    1 kris     kiel      2510905 2003-09-23 23:48 spammed-probable.02.gz
-rw-------    1 kris     kiel      1863673 2003-09-22 23:57 spammed-probable.03.gz
-rw-------    1 kris     kiel      1014158 2003-09-21 23:54 spammed-probable.04.gz
-rw-------    1 kris     kiel       617841 2003-09-20 23:16 spammed-probable.05.gz
-rw-------    1 kris     kiel      2861005 2003-09-20 06:13 spammed-probable.06.gz
-rw-------    1 kris     kiel       108846 2003-09-17 21:07 spammed-probable.07.gz
-rw-------    1 kris     kiel        12130 2003-09-16 19:48 spammed-probable.08.gz
-rw-------    1 kris     kiel        14029 2003-09-15 21:09 spammed-probable.09.gz
-rw-------    1 kris     kiel        35414 2003-09-15 01:51 spammed-probable.10.gz
-rw-------    1 kris     kiel     10032896 2003-09-24 23:58 spammed-sure.01.gz
-rw-------    1 kris     kiel     18746508 2003-09-23 23:58 spammed-sure.02.gz
-rw-------    1 kris     kiel     17935355 2003-09-22 23:57 spammed-sure.03.gz
-rw-------    1 kris     kiel     13535730 2003-09-21 23:48 spammed-sure.04.gz
-rw-------    1 kris     kiel     11984834 2003-09-20 23:57 spammed-sure.05.gz
-rw-------    1 kris     kiel     13597743 2003-09-20 08:40 spammed-sure.06.gz
-rw-------    1 kris     kiel       474242 2003-09-17 23:56 spammed-sure.07.gz
-rw-------    1 kris     kiel       665272 2003-09-16 23:59 spammed-sure.08.gz
-rw-------    1 kris     kiel       719339 2003-09-15 23:48 spammed-sure.09.gz
-rw-------    1 kris     kiel       584819 2003-09-15 06:42 spammed-sure.10.gz

Who sent me spam? Let's find out with Perl:

$ cat ~/Mail/p.pl
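#! /usr/bin/perl --
# Print the IP address of every host that handed mail to $hostname,
# taken from the Received: headers of an mbox stream on stdin.

use strict;
use warnings;

my $hostname = "p15104972";	# receiving host, as named after "by"
my $line = "";			# current header, continuation lines joined
my $state = "";			# "newmail", "spammail" or "body"

while (<>) {
	chomp;
	# Header continuation lines start with whitespace: append them.
	if (/^\s+/) {
		$line .= $_;
	} else {
		$line = $_;
	}

	# An mbox "From " line starts a new message.
	if ($line =~ /^From /) {
		$state = "newmail";
	}
	# SpamAssassin attaches the original message as a MIME part.
	if ($line =~ /Content-Description: original message before SpamAssassin/) {
		$state = "spammail";
	}

	# An empty line ends the outer headers ...
	if ($line =~ /^$/ and $state eq "newmail") {
		$state = "body";
	}

	# ... or the MIME part headers; the original headers follow.
	if ($line =~ /^$/ and $state eq "spammail") {
		$state = "newmail";
	}

	# Note: a folded header is tested again on every continuation line,
	# so one Received: header can print its address more than once.
	if ($state eq "newmail" and $line =~ /^Received:/) {
		if ($line =~ /\[(.*?)\].*by\s+$hostname/) {
			print "$1\n" if ($1 ne "" and $1 ne "127.0.0.1");
		}
	}
}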

Applied to my corpus above:

$ cd Mail/OLD
$ gzip -dc *gz | ~/Mail/p.pl > log
$ wc -l ~/Mail/OLD/log
   6614 /home/kris/Mail/OLD/log
$ sort ~/Mail/OLD/log | uniq -c | sort -rn > ~/Mail/OLD/log2
$ wc -l ~/Mail/OLD/log2
   1238 /home/kris/Mail/OLD/log2
$ head -10 ~/Mail/OLD/log2
    980	195.244.243.1
    532	193.98.110.1
    498	193.158.124.58
    196	193.110.157.89
     56	24.201.245.36
     40	209.225.8.34
     40	204.127.202.56
     34	216.148.227.85
     34	209.225.8.29
     32	204.127.202.55

The top entries are my secondaries, an old mail address,
kris@toppoint.de, which I have not used for years, and the
freeswan mailing list, which I can really live without.

$ awk '$1 > 8 { print $2 }' ~/Mail/OLD/log2| xargs -i dig -x {} | grep PTR > ~/Mail/OLD/log3

This finds 64 machines that have each sent me more than 8 spams,
58 of which have a reverse DNS entry.
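
A caveat when counting this way: on dig's full output, `grep PTR`
normally also matches the question line printed for each query (it
starts with `;` and contains `IN PTR`), and that line has no host name
in its fifth field. A minimal sketch that keeps only the answer
records, assuming dig's default answer layout (name, TTL, class, type,
data). The helper name `ptr_answers` and the sample input are made up
for illustration and are not taken from the corpus.

```shell
#!/bin/sh
# ptr_answers (hypothetical helper): read dig output on stdin and print
# only the host names from PTR answer records.
# Assumes dig's default answer layout: name TTL class type data.
ptr_answers() {
	awk '$4 == "PTR" { print $5 }'
}

# Illustrative input only (documentation address and domain, not real data).
sample='; <<>> DiG <<>> -x 192.0.2.1
;; QUESTION SECTION:
;1.2.0.192.in-addr.arpa.	IN	PTR

;; ANSWER SECTION:
1.2.0.192.in-addr.arpa.	3600	IN	PTR	mail.example.com.'

printf '%s\n' "$sample" | ptr_answers
```

Piping the dig output through `ptr_answers` and then `wc -l` would
count only the hosts that actually resolve.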

$ perl -ane 'print join(".", reverse split(/\./, $F[4])), "\n";' ~/Mail/OLD/log3 | sort > ~/Mail/OLD/log4
$ cat ~/Mail/OLD/log4

au.net.iprimus.syd.smtp01
be.skynet.ferengi
be.skynet.gallantin
be.skynet.kira
be.skynet.sarek
be.skynet.sojef
ca.videotron.relais
com.btconnect.dswu26
com.btinternet.protactinium
com.cbeyond.atl.smtp
com.latinmail.smtp
com.ntlworld.mta02-svc
com.ntlworld.mta06-svc
com.rr.nyroc.ms-smtp-02
de.netuse.ns1
de.netuse.nuki
de.netzservice.hh.proxy
de.sczn.secondary
de.toppoint.archer
it.tin.vsmtp1
it.tuttopmi.fep01
lt.takas.mail-src
net.bellsouth.mail.imf16aec
net.bellsouth.mail.imf18aec
net.bellsouth.mail.imf19aec
net.bellsouth.mail.imf20aec
net.bellsouth.mail.imf22aec
net.bellsouth.mail.imf24aec
net.bellsouth.mail.imf25aec
net.charter.cluster1.remt19
net.charter.cluster1.remt20
net.charter.cluster1.remt21
net.charter.cluster1.remt22
net.charter.cluster1.remt23
net.charter.cluster1.remt24
net.charter.cluster1.remt25
net.charter.cluster1.remt26
net.charter.cluster1.remt27
net.charter.cluster1.remt28
net.charter.cluster1.remt29
net.comcast.rwcrmhc11
net.comcast.rwcrmhc12
net.comcast.rwcrmhc13
net.comcast.sccrmhc11
net.comcast.sccrmhc12
net.comcast.sccrmhc13
net.entelchile.ismtp5
net.entelchile.mail.real1.test_web_temp
net.libertysurf.mail
net.qwest.inet.mpls-qmqp-02
net.surewest.smtp2
net.telus.defout
net.telus.outbound02
net.telus.outbound04
org.freeswan.mj2
pt.telepac.mail.fep01-svc
pt.telepac.mail.fep02-svc
ro.rdsnet.mail3

The .de addresses are just my secondaries and the toppoint.de
address. The rest is a surprisingly short list when you look at
just the domains.
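
To look at just the domains, the reversed names in log4 can be cut
down to their first two labels and counted. A minimal sketch: the
helper name `domain_counts` is hypothetical, and taking exactly two
labels is a simplifying assumption that would be wrong for registries
such as co.uk.

```shell
#!/bin/sh
# domain_counts (hypothetical helper): read reversed host names
# (tld.domain.host...) on stdin and print "count tld.domain",
# most frequent first. Empty lines are skipped.
domain_counts() {
	grep -v '^$' | cut -d. -f1,2 | sort | uniq -c |
		awk '{ print $1, $2 }' | sort -k1,1nr -k2,2
}

# A few names taken from the list above.
printf '%s\n' \
	net.comcast.rwcrmhc11 \
	net.comcast.rwcrmhc12 \
	be.skynet.kira \
	net.comcast.sccrmhc11 \
	be.skynet.sarek \
	com.latinmail.smtp | domain_counts
```

On the full list this would be run as `domain_counts < ~/Mail/OLD/log4`.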

Perhaps SpamAssassin should maintain a list of IP addresses that
have sent detected spam within the last n hours, and I should
build a sendmail access table from that list every night.
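
Until such a list exists, the access table entries could be generated
from a count list like log2. A minimal sketch, assuming the two-column
`count address` layout shown earlier. The helper name `access_entries`,
the threshold argument and the whitelist file (one address per line,
for hosts such as the secondaries that must never be blocked) are
hypothetical. The output uses the plain `address REJECT` form of a
sendmail access map; compiling it with makemap is not shown.

```shell
#!/bin/sh
# access_entries MIN WHITELIST  (hypothetical helper)
# stdin:  lines of "count address", as in log2
# stdout: "address<TAB>REJECT" for every address with count > MIN
#         that is not listed in WHITELIST (one address per line)
access_entries() {
	awk -v min="$1" -v wl="$2" '
		BEGIN { while ((getline a < wl) > 0) skip[a] = 1 }
		$1 > min && !($2 in skip) { printf "%s\tREJECT\n", $2 }
	'
}

# Illustrative run. The first three input lines come from log2; the
# last one and the choice of whitelisted addresses are made up.
wl=$(mktemp)
printf '%s\n' 195.244.243.1 193.98.110.1 > "$wl"
printf '%s\n' '980 195.244.243.1' '532 193.98.110.1' \
	'56 24.201.245.36' '4 192.0.2.7' | access_entries 8 "$wl"
rm -f "$wl"
```

Reading the whitelist in the BEGIN block keeps the helper working when
the whitelist file is empty or missing.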

If you repeat that analysis on your corpus, can you reproduce my
results?

Thought for improvement:

What happens if you take only the domain names of the above
hosts, resolve their MXes and list their mail servers - will
that result in a better blocking closure?
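
A rough sketch of how this could be tried. It assumes the registered
domain is always the first two labels of a reversed name, which will
not hold everywhere, and both helper names (`forward_domains`,
`list_mx`) are hypothetical. Only the offline part is run here;
`list_mx` needs network access.

```shell
#!/bin/sh
# forward_domains (hypothetical helper): read reversed host names
# (tld.domain.host...) on stdin and print each "domain.tld" once.
# Assumes the registered domain is always the first two labels.
forward_domains() {
	awk -F. 'NF >= 2 { print $2 "." $1 }' | sort -u
}

# list_mx (hypothetical helper): for each domain read from stdin, print
# "domain preference mailhost" for every MX record. Needs network access.
list_mx() {
	while read -r domain; do
		dig +short "$domain" MX | sed "s/^/$domain /"
	done
}

# Offline part only, with a few names from the list above.
printf '%s\n' net.telus.defout net.telus.outbound02 ro.rdsnet.mail3 |
	forward_domains
```

The full experiment would then be `forward_domains < ~/Mail/OLD/log4 | list_mx`,
followed by resolving the listed mail hosts to addresses.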

Kristian
