johnseq's Journal johnseq's use Perl Journal en-us use Perl; is Copyright 1998-2006, Chris Nandor. Stories, comments, journals, and other submissions posted on use Perl; are Copyright their respective owners. 2012-01-25T02:35:03+00:00 pudge Technology hourly 1 1970-01-01T00:00+00:00 johnseq's Journal MIT Spam Conference <p>I wasn't able to attend the Spam Conference over at MIT, but I did catch the webcast. I found most of it quite interesting. It was great to hear from Yahoo, Microsoft and Brightmail about the scaling issues (and opportunities) that come with having billions of spam attacks/day. They're all beginning to leverage Cloudmark's collaborative filtering to some degree, but all hit the same issue: what people consider spam varies quite a bit. The Microsoftie also noted that 60% of spam offers require a domestic presence (e.g. financial services). These cannot be off-shored and are therefore vulnerable to legal remedies. The rest (software, porn, Nigerian 419, etc.) has no such requirement, and its share will probably go up as the laws are applied.</p><p>The first lawyer to present, last year's "Hi I'm Jon Praed, and I sue spammers", came across with:</p><p>Spam laws _could be_ a good start</p><p>Identity, jurisdiction required to pursue legal cases</p><p> CAN-SPAM is good because spammers have long argued that what they were doing was not against the law. That's no longer true (in the U.S. at least).</p><p>An important provision of CAN-SPAM attaches liability to businesses profiting from spam. He thinks this is greatly under-appreciated.</p><p>The Tar Proxy talk wasn't all that interesting, but clearly making spammers pay (in CPU time at least) was very gratifying to the author.</p><p>The Brightmail speaker mentioned that they're implementing Paul Graham's filters that fight back, sort of. He didn't use that phrase, but his company is following links in email to see what is on the other side, and using factors from that to determine spamminess. 
They can leverage this inspection over a huge user base, so they don't risk slashdotting innocent joe-job victims. One challenge in identifying spammers' URLs by domain alone is the number of open redirect scripts web-wide ( being the most often abused) that disguise the ultimate destination of a spam offer. I winced when I realized that I've probably contributed two or three to the pool spammers can use. Also, I had a thought that 'Boy it'd be great if they shared the list of spam URLs' and shortly after he mentioned that they were considering some way of sharing.</p><p>Eric Kidd spoke on real-world experience with sender-pays/e-postage in the camram project. Folks like me ( and Matts) often dismiss the sender-pays idea, either because of joe-jobs launched from virus-compromised computers or because you typically have to upgrade the whole internet at once to make it work. Neither concern was news to Eric, and his presentation did not sidestep them. His points were:</p><p>Sender-pays works great if you redesign the entire internet. Obviously not practical.</p><p>Hybrid sender-pays works well, with filtering s/w accommodating the metric for postage</p><p>What can be used as a stamp? This is a big issue that will likely evolve.</p><p>Money stamps don't work (centralization, theft, regulation)</p><p>Hash collision is very popular now, but memory-based problems are probably more appropriate than anything CPU-based because of Moore's law, spammers building custom h/w, etc.</p><p>Whitelist someone who sends you stamped mail so future correspondence can be verified w/ signatures. "Strangers cost, friends fly free."</p><p>The best presentation was from Peter Kay of Titan Key. The technology was an elegant combination of simple concepts, but I liked it most because the speaker (a Chicago-born Hawaiian transplant) was by far the most dynamic and convincing. 
He proselytized much more than he spoke, and the audience really bought it.</p><p>He described his company's product called KeyMail. Instead of disposable email addresses, you have programmable addresses -- ones that auto-whitelist in various ways (based on domain, exact email address, etc.). So you give out address '' to each person, company or mailing list that you want to correspond with. One typical rule would be that the email address or domain that first responds to the address is whitelisted for it. All subsequent use of that address by anyone else would be put in a challenge/response queue.</p><p>One key differentiator for KeyMail is that it's implemented at the SMTP/MTA level. The whitelisting rules are simple enough that you can reject spam before it is delivered, saving a lot of CPU, bandwidth and disk space in the process.</p><p>Peter mentioned that there's always a need for a general purpose email address (like generic sales addresses on a corporate web site), so filters don't really go away. But he brought up his Outlook inbox and said "Look at this. No filters. No spam. For the last year". I think that's a less challenging result if you're committed to C/R, but the neat thing is that he mostly wasn't. C/R is just used as the filter of last resort, and rarely at that.</p><p>I see user retraining (having to pre-generate an email address for folks you meet on the street seems a drag) and ISP lock-in as the two biggest problems with KeyMail. For the latter, there are a couple of solutions. The rules seem simple enough that they could be as portable as mail filtering rules. Also, the IETF is working on making challenge/response interactions automated, so that you never feel that particular pain. Of course, if you had interoperable C/R, KeyMail's raison d'être might largely disappear.</p><p>[Aside: I would love it if my email forwarding service implemented this. 
I'm pretty locked into them anyway, and it doesn't bind me to an ISP.]</p><p>In summary, from the KeyMail talk and the spam conference in general I think two themes came through: any spam solution needs to painlessly interoperate with the situation we have today ( <i>duh</i> ), and no single solution will really solve the problem. The 'drug cocktail' metaphor was used more than once, and I think it is appropriate on more than one level.</p> johnseq 2004-02-05T01:27:10+00:00 journal Spammers? Kill em all <p>I attended the New Scientist salon on spam last night (also attended by <a href="">Gregor</a>). It was actually hosted by Simson Garfinkel and Paul Graham. Simson claimed that only about 200 people accounted for the world's supply of spam. His (yes, facetious) theory was that only extrajudicial means would solve the spam problem -- meaning hunting down and killing a number of spammers sufficient to deter the remainder, like John Travolta at the end of Swordfish. Since spammers have both teamed up with and provided a profit motive for previously harmless crackers, we now have armies of compromised machines which will make future attempts at micro-payments and digital signatures (and other end-user dependent schemes) pointless.</p><p>I do not think they're pointless, but they probably won't fly on their own. I remember reading about a simulation of an internet super-worm -- a virus that spreads via several vectors at once and aggressively scans for and propagates itself to other machines. The authors of the study determined that it could spread to all vulnerable net-connected hosts in 15 minutes BUT if machines had an extremely simple limit on outbound IP connections it could not even spread fast enough to be a threat.</p><p>Generalizing this super-simple virus-fighting behavior a bit, I think our machines should establish baselines for things like outbound IP connections and the amount of email we send out. 
For the average user on a machine with a consistent usage profile, it should require some kind of user intervention to perform network scans outside the baseline. This is the equivalent of the credit card fraud division calling you up when they notice your recent purchases of Snoop Dogg in a Tijuana record store. Is this fantasy technology that we're years away from having available? Well, I talked to a company named Okena that was writing this software for Windows and Linux a couple of years ago. They instrumented and rolled up the behavior of desktop applications to a central server, so that they could define deviant behavior by comparing a machine with its peers. They could then stop behavior as it emerged, instead of retroactively looking for infected file signatures.</p><p>Microsoft recently floated a trial balloon about enabling firewalls by default and implementing some sort of behavior profiling in the OS. While I'm realistic that this is more about escalation than an end-game, it will be interesting to see what kind of traction it gets with MS's money (and, at this point, desperation) behind it.</p> johnseq 2003-11-12T05:34:38+00:00 journal Language Savant <p>I attended the BloggerCon kickoff party on Friday, and was talking to <a href="">Carl Robert Blesius</a>, a Heidelberg medical student doing an MGH clerkship and fellow openacs/.LRN enthusiast. We got onto the subject of languages, and I mentioned that my wife and I had studied German for about a year and a half after visiting some friends in Munich.</p><p>We were taking lessons from a Harvard Square linguist named Lee Riethmiller, at the <a href="">Intercontinental Foreign Language Program</a>. Lee's approach to foreign language instruction is unusual in several respects. He believes that you learn languages faster and with better recall if you study multiple languages concurrently. He never really said why, but my oversimplified explanation is that this is similar to the better recall/comprehension claims of speed reading. 
There are other reasons why this makes sense from a mnemonics perspective, and it has the added benefit of being very appealing from a student perspective (learn more in less time).</p><p>In addition to encouraging you to take multiple languages simultaneously (you can choose from about 20 that Lee teaches), he also eschews the standard grammar-based approach. Instead, he writes interactive question and answer type scripts that resemble beat poetry -- quite absurd. You don't do 'going to the movies' or 'in the kitchen' vocabulary fests. Instead, you converse with mushrooms, cheese-boys, Italian bees and strawberry girls, and each verb tense you memorize is associated with a flavor of ice cream. Occasionally the scripts will overlap with some 60s pop song, and Lee will break into song.</p><p>The lessons are quite entertaining, and while you can't necessarily recall how to say arbitrary sentences, the ones you know come quite easily. You are encouraged to study the scripts, but Lee is a realist about how much time working folks have to devote to language study, so you never get discouraged.</p><p>Anyway, Carl was enthused by my description, and just let me know he contacted Lee about learning Chinese, French and Spanish.</p> johnseq 2003-10-06T16:44:26+00:00 journal Ambitious Autrijus <a href="">Template::Generate</a><blockquote><div><p> <tt>Template:&nbsp; ($template + $data) ==&gt; $document&nbsp; &nbsp;# normal<br>Template::Extract: ($document + $template) ==&gt; $data&nbsp; &nbsp;# tricky<br>Template::Generate: ($data + $document) ==&gt; $template # very tricky</tt></p></div> </blockquote><p> In terms of ambition, I think this module ranks right up there with Parrot+Perl6. :-) </p><p>Actually, I have read about refactoring tools that look for duplicated or nearly duplicated source code that could be candidates for consolidation into subroutines. 
Implementing something similar would be an enormously useful addition to the Template::Toolkit (the BLOCK-ifier), and I would think it would be an achievable goal. </p><p> I'm using <a href="">LEO's</a> cloned nodes feature to fake this type of refactoring in ASP/CFM/HTML/XML/etc. The benefit is you can apply the concept of subroutine code consolidation to arbitrary blocks of text. A cloned node is simply a pointer to the block of text so that you can update it in one place, and have it reflected immediately in all other references. For the many languages a web/db developer works with that don't support that kind of reuse, it's a godsend. I should add that although most web languages do support some kind of include file syntax, I've found it much easier to use the cloned nodes as an intermediary step before I break things out into different parts of the filesystem. It's much easier to edit the files in context and use LEO's outliner to show the structure of the document you're working with. Delaying the commitment to a certain file structure (in CVS, docs, etc.) until as late as possible also helps you name and organize things better. </p><p> There's a good <a href="">LEO Tutorial</a> w/ a perl example.</p> johnseq 2003-09-18T13:27:29+00:00 journal what is a geepblog? If you correlate GPS data tracking your position and time with date-stamped photos and voice/text commentary, you can come up with a pretty cool <a href="">web-based visualization</a> [ you must click this link - it's awesome ] of the trips you take in your life. Even better, you can easily connect your data feed with those of your friends to have a mob-blogged/collaboratively annotated experience. <p> I was thinking that something like this would be useful after I attended a friend's wedding last weekend. Wouldn't it be cool if I could type in the date/time/location of the event and be able to browse through others' photos and/or thoughts on the event? 
I know, what a geekish thought about a wedding for-crying-out-loud, but OTOH I took some pretty cool photos. It would be nice to lower the barrier to sharing them to simply having interested parties look them up via Google/Kazaa/etc. </p><p> Hmmm. <i> Must... acquire... GPS.</i> </p><p> Resources: </p><p> <a href="">Smart-mobs</a> </p><p> <a href=",2000029587,20276202,00.htm">The merging of GPS and the web</a></p> johnseq 2003-09-16T14:15:29+00:00 journal TT2 vs. ASP <div><blockquote><div><p> <tt>Dim rs, conn&nbsp; ' because you're using option explicit,&nbsp; right?<br>... define connection etc.<br> <br>Set rs = conn.Execute("select distinct place_id from locations")<br>While Not rs.EOF<br>&nbsp; %&gt;&lt;%=rs("place_id")%&gt;&lt;%<br>&nbsp; rs.MoveNext<br>Wend</tt></p></div> </blockquote></div><p> OR</p><blockquote><div><p> <tt>[% "$location.place_id " FOREACH location =&nbsp; DBI.query("select distinct place_id from locations") %]</tt></p></div> </blockquote><p>I like TT2</p> johnseq 2003-09-03T02:44:44+00:00 journal