
All the Perl that's Practical to Extract and Report

  • Yeah, I think this highlights why Bayesian analysis is better than rule-based approaches. It's harder to tweak the message so it slips by. Also, it's my understanding that bogofilter scales much better on a high-volume MTA than SpamAssassin... but I don't know this for a fact, so perhaps I should keep quiet.
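
To make the Bayesian point concrete, here is a minimal sketch of a token-based naive Bayes scorer. It is not bogofilter's or SpamAssassin's actual algorithm; the class name `TinyBayes`, the tokenizer, and the smoothing are all assumptions chosen for illustration. Every token seen in training contributes evidence, so there is no single fixed rule for a spammer to tune a message against.

```python
import math
import re
from collections import Counter

# Crude tokenizer: runs of letters, digits and a few symbols common in spam.
TOKEN_RE = re.compile(r"[a-z0-9$'!-]+")


def tokenize(text):
    """Return the set of distinct lowercase tokens in the text."""
    return set(TOKEN_RE.findall(text.lower()))


class TinyBayes:
    """Hypothetical toy classifier; not any real filter's implementation."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.messages = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Record one message as 'spam' or 'ham'."""
        self.messages[label] += 1
        self.counts[label].update(tokenize(text))

    def spam_probability(self, text):
        """Combine per-token evidence as log-odds, with add-one smoothing."""
        n_spam, n_ham = self.messages["spam"], self.messages["ham"]
        log_odds = math.log((n_spam + 1) / (n_ham + 1))
        for token in tokenize(text):
            p_spam = (self.counts["spam"][token] + 1) / (n_spam + 2)
            p_ham = (self.counts["ham"][token] + 1) / (n_ham + 2)
            log_odds += math.log(p_spam / p_ham)
        # Clamp so math.exp cannot overflow on very long messages.
        log_odds = max(-50.0, min(50.0, log_odds))
        return 1.0 / (1.0 + math.exp(-log_odds))
```

The score depends entirely on the corpus it was trained with, which is why a good collection of spam and non-spam matters so much.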
    • No, it's true. SpamAssassin's rules are slow.

      On the flip side, bogofilter is a personal filter, so it's not going to perform that well on larger installations, which kind of defeats the point of being so much faster, doesn't it?
      • bogofilter can be run from procmail or invoked by your MTA, just like SpamAssassin. It just needs a nice corpus of spam/non-spam. What do you mean by being a "personal filter"?
      • Speed is important when you're dealing with lots of messages.

        BTW, it is also important to reduce server load if you try implementing some automated way to handle spam.

        I've been thinking about writing something that will be run from an alias such as $USER-spam and $USER-ham (it will have to check the origin of the message: only the user himself can send messages to these addresses) and will classify the message as one of those using the user's database. Then, procmail or some other thing can compare messag
        --
        -- Godoy.
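
The $USER-spam / $USER-ham idea above could be sketched roughly like this. The function name `training_request` and the `user_addresses` mapping are hypothetical, and comparing the From address is only a stand-in for the "check the origin" step: headers are easy to forge, so a real setup would need a stronger check.

```python
from email.utils import parseaddr


def training_request(recipient, sender, user_addresses):
    """Return (user, label) for a valid training message, else None.

    recipient      -- address the message was sent to, e.g. "godoy-spam@..."
    sender         -- From value of the message
    user_addresses -- hypothetical mapping: user name -> that user's address
    """
    local_part = parseaddr(recipient)[1].partition("@")[0].lower()
    user, sep, label = local_part.rpartition("-")
    if not sep or not user or label not in ("spam", "ham"):
        return None  # not a training alias at all

    owner = user_addresses.get(user)
    sender_addr = parseaddr(sender)[1].lower()
    # Only the user himself may train his own database.
    if owner is None or not sender_addr or sender_addr != owner.lower():
        return None
    return user, label
```

The returned pair says whose database to update and with which label; the actual training call is left out on purpose.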
        • With SpamAssassin 2.50 (nearly ready) you can have per-user Bayesian databases. But that doesn't scale to a company with (say) 20,000 users. You can't expect everyone to train their systems.
  • So, obviously time to have signatures automatically checked, and rejected if invalid. =)

    [which has its own problems]
    --
      ---ict / Spoon
  • Eventually, I suspect that most email software will implement blacklist/whitelist technologies. If a domain/user is on your blacklist, they get discarded. Period. If they are on your whitelist, they get accepted. If they are on neither, any email received will receive an "auto-reply" saying "please respond with an email message requesting that you be added to my white list". Spammers will thus be forced to respond to millions of "whitelist" requests. (the auto response could also simply be informing th
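
A minimal sketch of the three-way decision described above, assuming the lists are plain sets holding either full addresses or bare domains. The function name `triage` and the return values are made up for illustration; sending the actual auto-reply is left out.

```python
def triage(sender, whitelist, blacklist):
    """Decide what to do with a message based on its sender address.

    whitelist / blacklist -- sets of lowercase addresses or domains
    Returns "discard", "accept", or "challenge".
    """
    sender = sender.lower()
    domain = sender.rpartition("@")[2]

    # Blacklisted users or domains are dropped, period.
    if sender in blacklist or domain in blacklist:
        return "discard"
    # Whitelisted users or domains go straight through.
    if sender in whitelist or domain in whitelist:
        return "accept"
    # Everyone else gets the "please ask to be whitelisted" auto-reply.
    return "challenge"
```

For example, `triage("john@example.com", {"example.org"}, {"spam.example"})` returns `"challenge"`, since the sender is on neither list.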

    • There are already programs like that.

      They demand the answer you talked about. It will, certainly, remove some spammers that use invalid addresses.

      On the other hand, it is not that convenient. Consider: I'm subscribed to several mailing lists; I see a post from John Doe asking something silly about XYZ's software; I'm in a good mood (this is very important for answering a silly question); I send John Doe an answer and his mail server demands another message from me. I'm sorry, John, I'm not in a good mood anym
      --
      -- Godoy.
      • I agree that Bayesian filters are a great way to go, but I still think that blacklist/whitelist systems can work.

        There are a couple of potential ways to get around the issue you mentioned. If you have a mailing list on your whitelist, then any email that is sent to or CC'd to the mailing list could make it past the whitelist. Of course, that also requires that whoever manages the mailing list take the time to manage the spam. I'm only on "members only" lists and that takes care of the problem quite nicel
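
The "sent to or CC'd to a whitelisted list" check could look something like the sketch below, using Python's standard `email` package. The function name `addressed_to_whitelisted_list` and the list addresses are assumptions for illustration only.

```python
from email import message_from_string
from email.utils import getaddresses


def addressed_to_whitelisted_list(raw_message, list_addresses):
    """Return True if any To or Cc recipient is a whitelisted mailing list.

    raw_message    -- the full message text, headers included
    list_addresses -- iterable of mailing list addresses you trust
    """
    msg = message_from_string(raw_message)
    headers = msg.get_all("To", []) + msg.get_all("Cc", [])
    recipients = getaddresses(headers)  # list of (display name, address)
    trusted = {address.lower() for address in list_addresses}
    return any(address.lower() in trusted for _name, address in recipients)
```

This only looks at the visible headers, so it trusts whatever the sender wrote there.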

        • The only problem I see with that scenario is spammers copying a mailing list with every email and having that mailing list slip past your filters. However, at that point, with the spammer sending millions of emails, how can he or she possibly know which mailing lists go past your filters? The amount of work necessary to figure out how to get past the filters would be ridiculous.

          I receive a lot of computer related spam. I suppose they can use messages from the same place where they harvested my email or

          --
          -- Godoy.