
Journal of jplindstrom (594)

Wednesday September 25, 2002
05:01 PM

OpenDirectory::Category

[ #8013 ]

(OpenDirectory::Category is a set of classes for matching text and URLs against the OpenDirectory/dmoz.org categories, without access to the http://dmoz.org/ web site. You throw it a URL and get back the OpenDirectory categories it belongs to.)

I have been polishing my OpenDirectory module over the last few days: cleaning things up, refactoring a few things, writing tests. I even wrote a tutorial for spidering dmoz.

Actually, that last thing was a self-defence move. I always try to document manual procedures for myself, because I _know_ I'm gonna need to do them again, and I'm too lazy to figure them out... again. Also, it would be cool if someone wanted to help keep the match data up to date (they keep rearranging the category structure all the time).

One thing I still have trouble with is the namespace. I asked for opinions on PerlMonks some time ago:
http://www.perlmonks.org/index.pl?node_id=165912

After having thought long and hard about this one more time (and having tried out four different sets of names) I finally ended up with

OpenDirectory::Category
OpenDirectory::Category::Matcher::Word
OpenDirectory::Category::MatchResult

for the most important classes.

That way I don't hog the entire OpenDirectory name, and I still get OpenDirectory::Category (which I need).

It might belong in Search::, but I don't know.

It does not belong in WWW::, because this is not a web thing. Sure, the OpenDirectory contents happen to be central to http://dmoz.org/ but that's not the point. And the module doesn't access the web site when it categorizes data.

As for OpenDirectory vs DMOZ, well, the module that actually searches http://dmoz.org/ is called WWW::Search::OpenDirectory, so there is precedent for the OpenDirectory name.

Maybe I'll just try to register the namespace at CPAN and see what happens.
