All the Perl that's Practical to Extract and Report


Journal of nicholas (3034)

Wednesday January 03, 2007
10:52 AM

message de-duping in exim filters?

[ #32059 ]

Dear lazyweb...

There's a rather simple recipe in the procmail examples for filtering out duplicate e-mail messages using their message ID:

# W: wait for formail and use its exit code; h: feed only the header;
# c: work on a copy so non-duplicates fall through to later recipes
:0 Whc: msgid.lock
| formail -D 8192 msgid.cache

# 'a' fires only if the previous recipe succeeded, i.e. formail found
# the Message-ID in its 8192-byte cache, so the duplicate is filed here
:0 a:
duplicates

I'm wondering (Google has failed me): is there an easy way to achieve the same effect* in an exim filter?

* Or something close. Specifically the ability to remember what message IDs have already been seen, and if the message ID has been seen already, deliver the duplicate message somewhere else.

  • # MBM had the idea of using a filter log file for lsearch lookups.
    logfile $home/.msgid.log
    if ${lookup{$h_message-id:}lsearch{$home/.msgid.log}} is "seen"
    then
        seen
        finish
    else
        save $home/inbox
    logwrite "$h_message-id: seen\n"
    endif
    • I had been thinking that something structured like this might work. But an lsearch lookup is O(n), isn't it? And since most message IDs are never repeated, most lookups will scan the whole file without finding a match. A DBM file would make the lookup O(1), but it would require a custom program to insert seen message IDs into the DBM file, which costs a fork per message.
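      A minimal sketch of that custom-program idea, assuming an SDBM file keyed on the message ID (the function name, file location, and demo IDs are mine, not from the thread):

      ```perl
      #!/usr/bin/perl
      # Check-and-insert against a DBM cache: O(1) per message ID,
      # instead of lsearch's linear scan. SDBM_File ships with core Perl;
      # the temp directory here is purely for demonstration.
      use strict;
      use warnings;
      use Fcntl qw(O_RDWR O_CREAT);
      use SDBM_File;
      use File::Temp qw(tempdir);

      # Returns true if $id has been seen before; otherwise records it.
      sub seen_before {
          my ($db, $id) = @_;
          return 1 if exists $db->{$id};
          $db->{$id} = time();   # constant-time insert keyed on the message ID
          return 0;
      }

      my $dir = tempdir(CLEANUP => 1);
      tie my %seen, 'SDBM_File', "$dir/msgid", O_RDWR | O_CREAT, 0600
          or die "tie: $!";

      print seen_before(\%seen, '<abc@example.com>') ? "dup\n" : "new\n";  # new
      print seen_before(\%seen, '<abc@example.com>') ? "dup\n" : "new\n";  # dup
      ```

      Because the DBM is hashed, the cost stays flat as the cache grows; the price is the one fork per delivery mentioned above.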

      • You can write a persistent program to look after the database and use ${readsocket. Or link your Exim with perl and use ${perl :-)
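        A hedged sketch of the embedded-Perl route: Exim's perl_startup option names a file of Perl code, and an expansion such as ${perl{seen_msgid}{$h_message-id:}} calls into it. The sub name, DBM path, and fail-open policy below are my assumptions, not anything from the thread:

        ```perl
        # Hypothetical contents of the file named by Exim's perl_startup option.
        use strict;
        use warnings;
        use Fcntl qw(O_RDWR O_CREAT);
        use SDBM_File;

        # Called from a filter or ACL as ${perl{seen_msgid}{$h_message-id:}};
        # returns "seen" for a duplicate, "new" otherwise.
        sub seen_msgid {
            my ($id) = @_;
            return 'new' unless defined $id && length $id;
            tie my %seen, 'SDBM_File', "$ENV{HOME}/.msgid-db", O_RDWR | O_CREAT, 0600
                or return 'new';    # fail open: never lose mail over a dedup cache
            my $dup = exists $seen{$id};
            $seen{$id} = time();    # remember this ID for next time
            untie %seen;
            return $dup ? 'seen' : 'new';
        }

        1;
        ```

        A filter could then route on the result with something like: if ${perl{seen_msgid}{$h_message-id:}} is "seen" then save $home/mail/duplicates endif, which would also satisfy the original wish to deliver duplicates somewhere else rather than drop them.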