Stories, comments, journals, and other submissions on use Perl; are Copyright 1998-2006, their respective owners.
Interesting Test Cases (Score:2)
a change of name ? (Score:1)
Perhaps you're looking at only one aspect of how a module like this may be used. Yes, it can be used for detecting plagiarism, should the user choose to do so. But it can also be used as a similarity detection metric, which has uses far beyond seeing if journalists borrowed copy or if students cribbed essays.
Related articles? Contextual matching? I can think of a few more uses for this type of module. I'd actually like to see how you do it, out of academic interest.
Re:a change of name ? (Score:1)
-DA [coder.com]
Re:a change of name ? (Score:2)
Because of the way the code is designed, I seriously doubt that it could be used for related articles or contextual matching. It's slow, but that's because of the algorithm I chose (which, surprisingly, turned out to be faster than some of the other options I was looking at). It does a sentence-by-sentence comparison to determine "how far apart" two sentences are in terms of insertions, deletions and replacements. If they're close enough (under the user-defined threshold), then a match is reported. It's th
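For anyone curious what a sentence-by-sentence comparison like this might look like, here is a rough sketch in Python. It is not the module's actual code. It assumes a plain character-level Levenshtein distance, a naive sentence splitter, and a threshold given as a raw edit count; the names split_sentences, edit_distance and similar_sentences are made up for illustration.

```python
import re


def split_sentences(text):
    # Naive splitter (an assumption): break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def edit_distance(a, b):
    # Levenshtein distance, computed one row at a time.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # replacement (or exact match)
            ))
        prev = curr
    return prev[-1]


def similar_sentences(text_a, text_b, threshold):
    # Compare every sentence of text_a with every sentence of text_b and
    # report the pairs whose distance falls under the threshold.
    matches = []
    for sa in split_sentences(text_a):
        for sb in split_sentences(text_b):
            d = edit_distance(sa, sb)
            if d < threshold:
                matches.append((sa, sb, d))
    return matches


if __name__ == "__main__":
    for sa, sb, d in similar_sentences(
        "The cat sat. Dogs bark.", "The cat sat! Birds sing.", threshold=3
    ):
        print(d, repr(sa), repr(sb))
```

In this sketch every sentence in one text is compared against every sentence in the other, and each comparison costs time proportional to the product of the two sentence lengths, so it gets slow on long documents.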
Re:a change of name ? (Score:1)
------------------------------
You are what you think.
You can bet... (Score:1)
Re:You can bet... (Score:1)