All the Perl that's Practical to Extract and Report

barbie (2653)

barbie
  reversethis-{ku. ... m} {ta} {eibrab}
http://barbie.missbarbell.co.uk/

Leader of Birmingham.pm [pm.org] and a CPAN author [cpan.org]. Co-organised YAPC::Europe in 2006 and the 2009 QA Hackathon, responsible for the YAPC Conference Surveys [yapc-surveys.org] and the QA Hackathon [qa-hackathon.org] websites. Also the current caretaker for the CPAN Testers websites and data stores.

If you really want to find out more, buy me a Guinness ;)

Links:
Memoirs of a Roadie [missbarbell.co.uk]
Birmingham.pm [pm.org]
CPAN Testers Reports [cpantesters.org]
YAPC Conference Surveys [yapc-surveys.org]
QA Hackathon [qa-hackathon.org]

Journal of barbie (2653)

Tuesday March 07, 2006
04:51 PM

The Irritating Empire

[ #28912 ]

Some call it the evil empire; I'm just finding it persistently annoying. MSN's ignorant robots (they ignore the robots.txt file) have been continually creating dead nodes on Birmingham OpenGuides. So far I've been wasting a lot of my time trying to delete them as soon as they are created. As a consequence I started noting which IPs have been creating nodes, then doing a few reverse DNS lookups. MSN have not been the only culprit, but they have been the most persistent, and by ignoring the robots.txt file they have forced me to write code to block the creation of specific nodes (requests for them now auto-redirect to the home node). I shouldn't have to write that code in the first place.
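For illustration only, here is a rough Python sketch of the kind of check described above. It is not the code running on Birmingham OpenGuides. The hostname suffix, the blocked node name and all function names are assumptions made up for the example. It does a reverse DNS lookup on the requesting IP and decides whether a request for a missing node should create it or be redirected to the home node.

```python
import socket

# Assumed values for illustration; take the real ones from your own logs.
CRAWLER_SUFFIXES = (".search.msn.com",)
BLOCKED_NODES = {"Hypothetical Dead Node"}


def reverse_dns(ip):
    """Return the PTR hostname for an IP address, or None if the lookup fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


def is_crawler_host(hostname, suffixes=CRAWLER_SUFFIXES):
    """True if the hostname ends with one of the known crawler suffixes."""
    if not hostname:
        return False
    return hostname.lower().rstrip(".").endswith(tuple(suffixes))


def action_for_request(node, node_exists, hostname):
    """Decide what to do with a request for a node.

    Existing nodes are shown as usual. A missing node is redirected to
    the home node if it is on the block list or the requester looks
    like a crawler; otherwise the normal create page is offered.
    """
    if node_exists:
        return "show"
    if node in BLOCKED_NODES or is_crawler_host(hostname):
        return "redirect_home"
    return "create"
```

In use, something like action_for_request(name, exists, reverse_dns(remote_ip)) would be called before the wiki offers its create page. Note that a reverse lookup on every request is slow, so a real version would probably cache the results.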

  • It is generally known that some web crawlers (especially spam harvesters) ignore robots.txt and follow links that they should not follow. The best way to deal with this is to make sure that such operations can only be performed by submitting a form, not by following a link.

    Why is that not the case in Birmingham OpenGuides?
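    A minimal sketch of the form-only approach suggested above, in Python. It assumes a hypothetical in-memory store and a made-up handler name, and is not how OpenGuides itself is written. A GET for a missing node never creates anything; only a POST (a submitted form) writes to the store.

```python
def handle_node_request(method, node, store, content=""):
    """Return (status, body) for a request; only POST may change the store."""
    method = method.upper()
    if method == "GET":
        # Following a link can read a node but never creates one.
        if node in store:
            return 200, store[node]
        return 404, "No such node"
    if method == "POST":
        # Creation and edits happen only through a submitted form.
        created = node not in store
        store[node] = content
        return (201 if created else 200), content
    return 405, "Method not allowed"
```

    A crawler that only follows links sends GET requests, so with this arrangement it gets a 404 for a node that does not exist and leaves no dead node behind.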