
All the Perl that's Practical to Extract and Report


chaoticset (2105)

chaoticset
  (email not shown publicly)
http://chaoticset.perlmonk.org/
AOL IM: chaoticset23
Yahoo! ID: illuminatus_foil

JAPH. (That's right -- I'm not Really Inexperienced any more.)

I'm not just here, I'm here [perlmonks.org], and here [javajunkies.org] too. I ramble randomly in my philosophical blog [blogspot.com] and my other blog [blogspot.com]. Soon I'll come in a convenient six-pack.

Journal of chaoticset (2105)

Thursday April 25, 2002
12:00 PM

Notion

[ #4461 ]
I visit Snopes every other day or so, checking the 'Additions' page to see what new (stupid|deranged|utterly nonsensical) thing people are emailing around to each other or passing off as fact in casual conversation.

I've noticed several times that my stepmother (whose family consists of forty or fifty spamaholics) sends me one or two of the Snopes items a few days after Snopes points to them. It then occurred to me that a component could be built for a spam filter: Snopes entries would be scraped for keywords, the keywords weighted by the date each entry landed on Snopes, and the result used to identify spam.
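Here is a rough sketch of how the weighting might look, written in Python only for illustration. It assumes the Snopes entries have already been scraped into (date, text) pairs. The stopword list, the 14-day half-life, and the function names are all invented for the example, and a plain decay is only one guess at how the date should count.

```python
import re
from collections import defaultdict
from datetime import date

# Hypothetical stopword list; a real filter would want a much longer one.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is",
             "that", "this", "for", "you", "your", "on"}

def tokenize(text):
    """Lowercase the text and return its words, minus stopwords."""
    return [w for w in re.findall(r"[a-z']+", text.lower())
            if w not in STOPWORDS]

def keyword_weights(entries, today, half_life_days=14.0):
    """Map each keyword to a weight that halves every half_life_days.

    entries is a list of (date_added, text) pairs, assumed already scraped.
    A keyword seen in several entries keeps the weight of the newest one.
    """
    weights = defaultdict(float)
    for added, text in entries:
        age = max((today - added).days, 0)
        decay = 0.5 ** (age / half_life_days)
        for word in set(tokenize(text)):
            weights[word] = max(weights[word], decay)
    return dict(weights)

def hoax_score(message, weights):
    """Average keyword weight over the distinct words of the message."""
    words = set(tokenize(message))
    if not words:
        return 0.0
    return sum(weights.get(w, 0.0) for w in words) / len(words)
```

A message would then be flagged when `hoax_score` passes some threshold, which would have to be tuned against real mail.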

Once I get the hang of scraping pages (which is something I've been meaning to play with for a long damn time now), I plan to take a shot at this.

As a bonus, there are some patterns that the webmaster of Snopes has identified in topical spam revival (for instance, the series of war-related spam and horror-related spam that followed the WTC collapse), and if these were pinned down more precisely, the keyword weights could be shifted in anticipation.
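Shifting weights in anticipation could be as simple as the sketch below (Python, for illustration only). It assumes a keyword-to-weight table with values between 0 and 1 already exists, and that someone has hand-built a map from topic to keywords. The `factor` and `floor` values and all the names are hypothetical.

```python
def boost_topics(weights, topic_keywords, active_topics, factor=2.0, floor=0.25):
    """Return a copy of weights with the keywords of active topics raised.

    weights:        dict of keyword -> weight in [0, 1]
    topic_keywords: dict of topic name -> list of keywords (hand-built, assumed)
    active_topics:  topics expected to revive after a current event
    Each affected keyword is multiplied by factor, lifted to at least floor,
    and capped at 1.0. The input dict is left unchanged.
    """
    boosted = dict(weights)
    for topic in active_topics:
        for word in topic_keywords.get(topic, ()):
            raised = max(boosted.get(word, 0.0) * factor, floor)
            boosted[word] = min(1.0, raised)
    return boosted
```

The floor matters because a revived topic may bring back keywords that have decayed to nearly nothing, or that have never shown up at all.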
