Stories, comments, journals, and other submissions on use Perl; are Copyright 1998-2006, their respective owners.
Breaking it down by word? (Score:2)
What sort of correlations are you looking for? I was focusing on detecting plagiarism [perlmonks.org] at one point and found that breaking things down by sentence was more useful. As I don't know what you're trying to do, I've no idea if that link will prove useful.
Re:Breaking it down by word? (Score:1)
Detecting plagiarism is much more specific than this problem. I want to be able to analyze a document and suggest a handful of other documents that, from their intertextual context at least, appear to discuss similar things. For example, a tutorial about creating homemade pizza dough is probably not very similar to a journal entry about linguistic analysis, but probably is similar to an article discussing different types of pizza ovens.
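One rough way to sketch the "suggest similar documents" idea is to treat each document as a bag of words and rank candidates by cosine similarity of their word counts. This is only an illustration, not the approach anyone in this thread settled on; the function names, the tiny stop-word list, and the sample documents in the usage note are all made up for the example, and it is written in Python rather than Perl purely for brevity.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real one would be much longer.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "for", "in", "is", "about", "with"}


def term_counts(text):
    """Lowercase the text, split it into words, and count the non-stop-words."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in STOP_WORDS)


def cosine_similarity(a, b):
    """Cosine of the angle between two term-count vectors (0.0 if either is empty)."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query, candidates, limit=3):
    """Return up to `limit` (title, score) pairs, best match first.

    `candidates` maps a title to the document's text.
    """
    q = term_counts(query)
    scored = [
        (title, cosine_similarity(q, term_counts(body)))
        for title, body in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:limit]
```

Calling `most_similar("homemade pizza dough tutorial", {"ovens": "types of pizza ovens", "linguistics": "notes on linguistic analysis"})` would rank the pizza-oven text above the linguistics one, because only the former shares any words with the query. Raw word overlap is a crude proxy for topic: it misses synonyms and inflected forms such as "oven" versus "ovens", so weighting terms by rarity or stemming them first would be natural next steps.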
I'm trying to answer the question "Do the relevant topics of these
Re:Breaking it down by word? (Score:2)