I'm looking for reasonable quantities of text in as many languages as I can get my hands on (note: I mean "text in English", "text in French", etc. I do not mean "text with as many languages as possible inside it").
Basically, I'm looking for better training text for my Lingua::Identify project.
If anyone has a couple of pointers (or even a corpus itself, even if it's just for one language), I'd really appreciate it.
Oh, one other thing: by "reasonable", I think I'm aiming for something like 10M per language... but right now I'd just like to get my hands on some corpora (hey, 1M today, 1M tomorrow...).