All the Perl that's Practical to Extract and Report

nkuitse (193)

http://www.nkuitse.com/

Journal of nkuitse (193)

Thursday April 08, 2004
03:25 PM

Harmless drudgery

[ #18240 ]

I don't know if anyone's interested, but I've released a new version of my English word list:

http://www.nkuitse.com/freli/

It's small, sleek, and the cause of much typing and eye strain for yours truly.

Factoids...

Number of entries: 50,000

Number of entries that I've checked: 50,000

Number of entries added since the last release: ca. 11,000

Number of part-of-speech indications: 50,000

Number of definitions: 0

Number of proper nouns: 0

How I add entries:

$ frop add 'kerfuffle (n)'
$ cat new-words | frop add
$ frop review
** Press 'y' to add a word, 'n' to reject it, or space to skip it **
...
$ frop commit
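
The add / review / commit cycle above can be pictured roughly as follows. This is only a Python sketch of the workflow, not frop's actual code: the `WordList` class, the `parse_entry` helper, and the assumption that every entry looks like `word (pos)` are all hypothetical, modelled on the `kerfuffle (n)` example.

```python
import re
from dataclasses import dataclass, field

# Assumed entry format, modelled on the 'kerfuffle (n)' example: a word
# followed by a part-of-speech tag in parentheses.
ENTRY_RE = re.compile(r"^\s*(?P<word>.+?)\s*\((?P<pos>[^()]+)\)\s*$")


def parse_entry(line):
    """Split a line such as 'kerfuffle (n)' into ('kerfuffle', 'n')."""
    match = ENTRY_RE.match(line)
    if match is None:
        raise ValueError(f"not in 'word (pos)' form: {line!r}")
    return match.group("word"), match.group("pos").strip()


@dataclass
class WordList:
    """Hypothetical stand-in for the word list and its review queue."""
    committed: set = field(default_factory=set)    # entries already in the list
    pending: list = field(default_factory=list)    # candidates awaiting review
    accepted: list = field(default_factory=list)   # approved, not yet committed

    def add(self, lines):
        """Queue candidates, skipping blank lines and entries already known."""
        for line in lines:
            if not line.strip():
                continue
            entry = parse_entry(line)
            known = (entry in self.committed or entry in self.pending
                     or entry in self.accepted)
            if not known:
                self.pending.append(entry)

    def review(self, decide):
        """Call decide(entry) for each candidate: 'y' adds, 'n' rejects,
        anything else (e.g. a space) leaves it in the queue for later."""
        still_pending = []
        for entry in self.pending:
            answer = decide(entry)
            if answer == "y":
                self.accepted.append(entry)
            elif answer != "n":
                still_pending.append(entry)
        self.pending = still_pending

    def commit(self):
        """Move accepted entries into the list; return how many were added."""
        count = len(self.accepted)
        self.committed.update(self.accepted)
        self.accepted.clear()
        return count
```

In this sketch, `decide` plays the part of the keypress prompt: passing `lambda entry: input(f"{entry[0]} ({entry[1]})? ")` would make the review interactive, while a fixed function makes it easy to test.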

Resources I use to discover words and decide what to add:

  • Project Gutenberg
  • /usr/share/dict/web2
  • dict
  • Google
  • Random House Webster's Unabridged Dictionary (2nd ed.)
  • The American Heritage Dictionary of the English Language (2nd ed.)
  • The Compact Oxford English Dictionary
  • My poor overtaxed brain

To make a long story short: Lexicography is hard work (and FRELI's only a word list, not a dictionary).

  • Also maybe use the Moby Lexicons [shef.ac.uk] as a source?
  • Thanks for the pointer. I considered using the Moby Lexicon as one of my sources when I first created the list, but I don't recall if I actually did. Time to take another look!

    My principal sources were Roget's International Thesaurus (the 1911 edition that's in the public domain) and the data files in the Link Grammar project, because these provided part-of-speech information.