
chaoticset (2105)

http://chaoticset.perlmonk.org/

JAPH. (That's right -- I'm not Really Inexperienced any more.)

I'm not just here, I'm here [perlmonks.org], and here [javajunkies.org] too. I ramble randomly in my philosophical blog [blogspot.com] and my other blog [blogspot.com]. Soon I'll come in a convenient six-pack.

Journal of chaoticset (2105)

Tuesday July 22, 2003
09:07 AM

Siesta

[ #13613 ]
Wrote a letter frequency analyzer, because I wanted to. (Actually, it's the first step in a not altogether ambitious project to write something that attempts to solve simple shift ciphers automatically.) Did a baseline frequency count on a dictionary file and used Storable to keep those stats. (I know, I need something better, but it was the biggest wad of English text I had available that I was reasonably sure wasn't riddled with errors or "stylistic" typos. Any suggestions for a better text are appreciated.)
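For readers who want to see the idea end to end, here is a minimal sketch in Python. It is not the code described above, which is Perl and keeps its stats with Storable. The function names (`letter_frequencies`, `shift_text`, `crack_shift`) and the squared-difference scoring are assumptions chosen for illustration; other scoring rules would work too.

```python
from collections import Counter
import string

ALPHABET = string.ascii_lowercase


def letter_frequencies(text):
    """Return the relative frequency of each letter a-z in text (case-insensitive)."""
    counts = Counter(ch for ch in text.lower() if ch in ALPHABET)
    total = sum(counts.values())
    if total == 0:
        return {ch: 0.0 for ch in ALPHABET}
    return {ch: counts[ch] / total for ch in ALPHABET}


def shift_text(text, shift):
    """Shift ASCII letters by `shift` places (may be negative); leave other characters alone."""
    out = []
    for ch in text:
        if ch in string.ascii_lowercase:
            out.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")))
        elif ch in string.ascii_uppercase:
            out.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
        else:
            out.append(ch)
    return "".join(out)


def crack_shift(ciphertext, baseline):
    """Try all 26 shifts and return (shift, plaintext) for the best match to baseline.

    The score is the sum of squared differences between the candidate's
    letter frequencies and the baseline frequencies (an assumed metric).
    """
    def score(candidate):
        observed = letter_frequencies(candidate)
        return sum((observed[ch] - baseline[ch]) ** 2 for ch in ALPHABET)

    best = min(range(26), key=lambda s: score(shift_text(ciphertext, -s)))
    return best, shift_text(ciphertext, -best)


if __name__ == "__main__":
    # Hypothetical baseline text; a real run would use a much larger corpus.
    corpus = "the quick brown fox jumps over the lazy dog and then the dog sleeps"
    baseline = letter_frequencies(corpus)
    secret = shift_text("Meet me near the old bridge at seven", 5)
    print(crack_shift(secret, baseline))
```

The baseline only has to be computed once, which is why persisting it makes sense. Short ciphertexts have noisy letter counts, so a sketch like this can pick the wrong shift on them.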

Started learning how to footle with CGI::Application and HTML::Template, and I'm pleasantly surprised. One little hangup -- not being able to make clickable images -- is not such a big deal, and I suspect there's a way to do it that I haven't discovered yet.

No other news...

  • If you need any texts of any size check out Project Gutenberg.
    • Any specific ones you'd recommend there? I wouldn't be against using a group of texts -- the process takes about two minutes with a 1.3 meg text file, and I want this to be really good, so I'd be willing to devote a couple of hours to it. I'd like the stats to be much more precise than normal, so I'd be all for processing ten or twelve texts. It's just that the ones I happen to have on my drive at home (forgive me) aren't normal. Perl docs, while reasonably grammatical and correctly spelled and all, aren't normal
      • Funny you should ask. About a month ago on the Perl Quiz of the Week mailing list, one of the quizzes concerned repeated substrings. One of the folks used the following text (extracted from an email):
        -----------------------
        'The Life and Opinions of Tristram Shandy, Gentleman' by Laurence Sterne, which when downloaded weighs in at around 1 MB (as compared with 27 KB for Dan Schmidt's US constitution).
        -----------------------

        The location of these (from another email):
        http://www.dfan.org/constitution.txt
        http:/
  • If you're intending to solve short newspaper ciphers, consider that they tend to be quotations. Doing a frequency analysis on a set of quotes files might not be a bad idea. Having a separate distribution for "First letter of first word in a sentence, by wordlength" might help gain a toehold.
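The "first letter of first word in a sentence, by word length" distribution suggested in the last comment could be gathered along these lines. This is a rough Python sketch under stated assumptions: the function name `first_letter_by_length` is made up for illustration, sentences are split naively on `.`, `!` or `?` followed by whitespace, and a word is a plain run of ASCII letters, so "It's" counts as the two-letter word "It".

```python
import re
from collections import Counter, defaultdict


def first_letter_by_length(text):
    """Map word length -> Counter of first letters, using only each sentence's first word."""
    table = defaultdict(Counter)
    # Naive sentence split: break after ., ! or ? when whitespace follows.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        match = re.search(r"[A-Za-z]+", sentence)
        if match:
            word = match.group(0).lower()
            table[len(word)][word[0]] += 1
    return table


if __name__ == "__main__":
    quotes = "The cat sat. A dog ran! Then it rained? To be or not to be."
    for length, letters in sorted(first_letter_by_length(quotes).items()):
        print(length, dict(letters))
```

Run over a quotes file, the per-length counters give a prior for the opening word of a cipher, which is where the comment suggests a toehold might come from.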