Spam dictionary (work in progress)



In 2003, Guy Di Mattina created a dictionary of Spam whilst studying at the School of Information Technology and Electrical Engineering, University of Queensland. “The first step”, explains the author, “was to create a list of features that appeared in Spam or normal mail but not in both.” The results were published in a thesis entitled ‘Spam and Open Relay Blocking System’.
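As a rough illustration of that first step, here is a minimal Python sketch. It assumes that a “feature” is simply a lowercase word token and that the list is the set of tokens found in one class of mail but not the other. The thesis may well have defined features differently. The function names and the sample messages are hypothetical and are not taken from the thesis.

```python
import re


def tokenize(text):
    """Lowercase a message and return its set of word tokens (assumed feature definition)."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def exclusive_features(spam_messages, normal_messages):
    """Return tokens that appear in spam or in normal mail, but not in both."""
    spam_vocab = set().union(*(tokenize(m) for m in spam_messages))
    normal_vocab = set().union(*(tokenize(m) for m in normal_messages))
    # Symmetric difference: keep only tokens exclusive to one class.
    return sorted(spam_vocab ^ normal_vocab)


# Hypothetical sample data, for illustration only.
spam = ["Grandpa wont believe these accomplishments", "visit the museum"]
normal = ["visit the office tomorrow"]

features = exclusive_features(spam, normal)
print(features)
```

Tokens shared by both classes (here “visit” and “the”) drop out, which leaves a list of words that each point to only one kind of mail.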


Here are some examples from the dictionary:


#18. accomplishments

#496. decreased

#1023. ifdlawdodcbpbiazifdlzwtzisehisancjxwigfsawdupsjjz

#1397. museum

#3049. grandpa

#4462. wont


An article reproduced on the Radio Australia website explains more:


“[...] we all got together and we all discussed what was going on and it came out that we were using the Support Vector Machine in an unthought of way, mainly because Guy was not trained in Support Vector Machines so we didn’t know how everybody is trained to use them. We came up with something completely different just purely and simply because he didn’t know what he was doing when he started out and that’s what’s made it so effective.”


Published on September 15, 2014 05:39


Marc Abrahams's Blog
