NOTE: use Perl; is on undef hiatus. You can read content, but you can't post it. More info will be forthcoming forthcomingly.

All the Perl that's Practical to Extract and Report

educated_foo (3106)

educated_foo
  (email not shown publicly)

  Comment: Ugh. (Score 1) on 2010.04.23 15:13

AFAICT, the URL you need to block is "connect.facebook.net". Unchecking the box in your privacy settings doesn't seem to disable this mis-feature.

Comments: 1

  Comment: Nice work! (Score 1) on 2010.03.26 19:56

by educated_foo on 2010.03.26 19:56 (#71798)
Attached to: Understanding world Perl blogging

311 aggregated feeds is a bit too much for me to follow, but it's a good idea and a simple, clean design. Too bad Japanese is so hard to machine-translate, though...

Comments: 3

  Comment: Re:*facepalm* (Score 1) on 2010.03.05 6:25

"the effort and ingenuity involved to work out what s/w you can re-use can make it risky effort-wise to even try"

I had much more hope for what people called "component-oriented programming" in the 90s -- large pieces of functionality with very simple interfaces. Small-grained objects are a symphony of fail.

"If I understand you correctly you had Octave generate batches of instances in the required format and then pumped them directly into the SVM-Light engine?"

Actually, I allowed 2-argument Octave functions as kernels, shoved the relevant data pointer into SVM-Light's data pointer, and used SVM-Light's custom_kernel() interface to call back into Octave. I don't remember if I special-cased the standard 2-norm kernel, but that should not be too hard.

Comments: 62

  Comment: Re:MojoMojo installs fine with cpanm (Score 1) on 2010.03.05 6:12

It looks like 273 dependencies (be patient), which together guarantee that it fails to install. That is nowhere near 99% of CPAN, which contains 8020 authors and 17538 modules. As Sage Mencken declared, "For every complex problem there is an answer that is clear, simple, and wrong." Translated into the modern vernacular, "clear" means "99%."
Comments: 9

  Comment: *facepalm* (Score 1) on 2010.02.28 22:05

How did I miss that? (Probably because the synopsis only used add_instance(), and I skimmed the rest too fast.) SVMLight format is pretty simple, so it's not too hard to dump your data in that format and then call read_instances(). So one minor suggestion -- adding instances in bulk, particularly for training, is far more common than adding them individually, so it should be in the synopsis.

FWIW, when I wrote an Octave binding to SVM-Light some years back, I used direct calls to the SVM-Light C interface (init_doc(), custom_kernel, etc.) to add a whole batch of instances. It was more work, but way more efficient (and flexible!) than serializing and going through the file system.
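
The file-based route described above (dump the data in SVM-Light format, then load it in one call) can be sketched roughly as follows. This is only an illustration in Python, not part of Algorithm::SVMLight or SVM-Light itself: the helper names `format_instance` and `write_instances` are hypothetical, and it assumes that named attributes have already been mapped to integer feature indices starting at 1, which is what the SVM-Light line format (`<label> <index>:<value> ...`, indices in increasing order) expects.

```python
import io
import sys


def format_instance(label, features):
    """Format one example as an SVM-Light line: '<label> <index>:<value> ...'.

    `features` maps integer feature indices (assumed >= 1) to numeric values.
    """
    # SVM-Light expects the feature indices in increasing order.
    pairs = " ".join(f"{idx}:{val:g}" for idx, val in sorted(features.items()))
    return f"{label:+d} {pairs}".rstrip()


def write_instances(instances, out):
    """Write (label, features) pairs to an open text handle, one per line.

    Returns the number of lines written.
    """
    count = 0
    for label, features in instances:
        out.write(format_instance(label, features) + "\n")
        count += 1
    return count


if __name__ == "__main__":
    # Hypothetical data: indices 1, 2, 3 stand in for named attributes.
    demo = [(1, {1: 1, 2: 1, 3: 3}), (-1, {2: 0.5})]
    write_instances(demo, sys.stdout)
```

Because the writer takes an open handle and any iterable, a generator of a million rows can be streamed to disk without holding them all in memory; the resulting file would then be loaded in bulk rather than through one call per instance.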

Comments: 62

  Comment: Re:LOLPERL? (Score 1) on 2010.02.21 21:47

Who, after all, can point at Acme::Pointless or laugh at Acme::LaughAtMe?

Comments: 3

  Comment: LOLPERL? (Score 1) on 2010.02.21 19:50

Requirements...
Spiffy 0.30
Really? Your Enterprise Solution Management Spork has a few extra tines...

Comments: 3

  Comment: Re:Interface (Score 1) on 2010.02.20 21:58

My point is that a non-lousy interface to SVM-Light needs to handle large datasets. Algorithm::SVMLight was clearly written by someone who never used SVMLight on a decent-sized dataset. Such a dataset will almost always contain hundreds of megabytes of data, and come from either (1) a text file you download or (2) a C or FORTRAN function you call.

I don't think specific cases will help here. Here's the general problem: I have one million labeled data points generated by some program, and I want to use them to train a classifier. If I use the current lousy interface, I perform one million Perl function calls, and either run out of patience, or run out of RAM and crash. I don't want to do that.

Comments: 62

  Comment: Speaking of lame... (Score 1) on 2010.02.20 8:39

Do you realize how ridiculous this comment sounds? One word can, amazingly enough, have two definitions:

1 (of a person or animal) unable to walk normally because of an injury or illness affecting the leg or foot : his horse went lame.
• (of a leg or foot) affected in this way.
2 (of an explanation or excuse) unconvincingly feeble : it was a lame statement and there was no excusing his behavior.
• (of something intended to be entertaining) uninspiring and dull.
• (of a person) naive or inept, esp. socially : anyone who doesn't know that is obviously lame.
• (of verse or metrical feet) halting; metrically defective.

Comments: 3

  Comment: Interface (Score 1) on 2010.02.11 12:44

A function to read data from a TSV/CSV file (in C, without going through Perl) would be extremely useful; ideally, $file would be a file handle rather than a name, so I could pipe it from standard input. A function to operate on a double** generated by some other C library would also be useful.

More generally, my point is that biologists need to be able to interface with many, many programs, and you can't expect canned interfaces to all of them to be available on CPAN. These programs are often UNIX commands, in which case writing a text file and calling them works. But if not, you can usually talk to them in C.
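
The "file handle rather than a name" point can be illustrated with a short sketch. It is written in Python purely to show the interface shape; the comment above asks for the parsing itself to happen in C, which this does not attempt. The function name `read_labeled_rows` and the column layout (label in the first column, numeric features in the rest) are assumptions made for the example.

```python
import csv
import io
import sys


def read_labeled_rows(handle, delimiter="\t"):
    """Yield (label, values) tuples from delimited text on an open handle.

    Assumes the first column is an integer label and the remaining
    columns are numeric feature values.
    """
    for row in csv.reader(handle, delimiter=delimiter):
        if not row:
            continue  # skip blank lines
        yield int(row[0]), [float(x) for x in row[1:]]


if __name__ == "__main__":
    # Taking a handle means the same function works on a pipe:
    #   some_generator | python this_script.py
    for label, values in read_labeled_rows(sys.stdin):
        print(label, len(values))
```

Since the function never opens a file by name, it works equally on standard input, an open file, or an in-memory buffer such as `io.StringIO`.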

Comments: 62

  Comment: Re:FYI (Score 1) on 2010.02.11 12:13

That's completely useless for large datasets:

    $s->add_instance(
        attributes => { foo => 1, bar => 1, baz => 3 },
        label      => 1,
    );

C is a good least common denominator, so it helps to make a scripting language's interface to C as painless as possible. XS is hardly "painless," so there's room for someone to create such an interface.

Comments: 62

  Comment: Re:FYI (Score 1) on 2010.02.11 8:07

SVM-Light and the NCBI tools come to mind.

Comments: 62

  Comment: FYI (Score 1) on 2010.02.11 5:07

Speaking from experience in both academia and bioinformatics, BioPerl is the opposite of a selling point; it's over-engineered and half-implemented, almost always more trouble than it's worth. Perl is attractive for its text processing and system scripting. If you want to make Perl more attractive to biologists, make it easier to interface with C and R (e.g. via Inline::).

Comments: 62

  Comment: Re:In what way is Padre changing the Perl communit (Score 1) on 2010.02.10 16:34

"Existing Perl hackers (including vim/emacs users) understanding that novices might be better off with Padre than a text editor."

At least in the long term, they aren't. Someone adept at hacking FORTRAN, C, Perl, Java, LaTeX, and BASH is much more employable than someone who can only hack Perl.

"Padre, the Perl IDE [perlide.org]"

LOL. Even replying to a post criticizing your shameless pimping of some Perl module, you couldn't resist pimping it.

Comments: 62

  Comment: How does your rant address Adam's post? (Score 1) on 2010.02.09 22:39

"... and Padre changes the Perl community." I'll let you figure this one out for yourself.

I use Emacs, but am interested in editors, so I check them out from time to time. I tried Padre last month, and had to substitute my own Perl to get it to launch (I forget if it was a threading or 64-bit-only issue). I eventually ended up with a half-assed text editor with some non-compelling features.

Comments: 62