I've noticed several times that my stepmother (whose family consists of forty or fifty spamaholics) sends me one or two Snopes items a few days after Snopes points to them. It then occurred to me that a component could be built for a spam filter in which Snopes entries would be scraped for keywords, which could be weighted by the date each entry landed on Snopes and used to identify spam.
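Here is a rough sketch of how that weighting might look, with the scraping step left out. Everything in it is an assumption for illustration: the function names, the tiny stop-word list, the 14-day half-life, and the made-up entries. Exponential decay by age is just one plausible way to weight by listing date, not something Snopes or any existing filter prescribes.

```python
import re
from datetime import date

# Hypothetical stop-word list; a real filter would need a much fuller one.
STOP_WORDS = {"the", "and", "that", "this", "for", "you", "your", "with"}


def tokenize(text):
    """Return the set of lowercase words longer than two letters, minus stop words."""
    words = re.findall(r"[a-z']+", text.lower())
    return {w for w in words if len(w) > 2 and w not in STOP_WORDS}


def keyword_weights(entries, today, half_life_days=14.0):
    """Map each keyword to a weight that halves every half_life_days.

    entries is an iterable of (listed_on, text) pairs, where listed_on is
    the date the item appeared. If a keyword shows up in several entries,
    the freshest (largest) weight wins.
    """
    weights = {}
    for listed_on, text in entries:
        age_days = max((today - listed_on).days, 0)
        decay = 0.5 ** (age_days / half_life_days)
        for word in tokenize(text):
            weights[word] = max(weights.get(word, 0.0), decay)
    return weights


def spam_score(message, weights):
    """Average keyword weight over the distinct words in the message."""
    words = tokenize(message)
    if not words:
        return 0.0
    return sum(weights.get(w, 0.0) for w in words) / len(words)


# Made-up entries standing in for scraped listings.
entries = [
    (date(2003, 1, 1), "Free cruise giveaway petition"),
    (date(2002, 12, 18), "Virus warning teddy bear"),
]
weights = keyword_weights(entries, today=date(2003, 1, 1))
print(spam_score("Forward this cruise giveaway", weights))
print(spam_score("lunch tomorrow", weights))
```

A message that reuses words from a recently listed item scores higher than one matching an older item, and a message sharing no keywords scores zero. In a real filter this score would presumably be one signal among several rather than a verdict on its own.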
Once I get the hang of scraping pages (which is something I've been meaning to play with for a long damn time now), I plan to take a shot at this.
As a bonus, the webmaster of Snopes has identified some patterns in topical spam revival (for instance, the series of war-related spam and horror-related spam that followed the WTC collapse), and if these were pinned down more precisely, the keyword weights could be shifted in anticipation.