Arador's Journal http://use.perl.org/~Arador/journal/ Arador's use Perl Journal en-us use Perl; is Copyright 1998-2006, Chris Nandor. Stories, comments, journals, and other submissions posted on use Perl; are Copyright their respective owners. 2012-01-25T02:48:57+00:00 pudge pudge@perl.org Technology hourly 1 1970-01-01T00:00+00:00 Arador's Journal http://use.perl.org/images/topics/useperl.gif http://use.perl.org/~Arador/journal/ Prototypes: the good, the bad, and the ugly http://use.perl.org/~Arador/journal/39536?from=rss <p> <i>This is a reply to chromatic's post <a href="http://www.modernperlbooks.com/mt/2009/08/the-problem-with-prototypes.html">The Problem with Prototypes</a>, but his blog didn't allow me to post it there, so I'm posting it here.</i> </p><p>Pretty much everyone agrees that there are good prototypes (such as block and sometimes reference prototypes) and bad ones (scalar most of all). Few discuss the third class: the ugly glob prototype.</p><p>perlsub describes it as follows:</p><blockquote><div><p>A * allows the subroutine to accept a bareword, constant, scalar expression, typeglob, or a reference to a typeglob in that slot. The value will be available to the subroutine either as a simple scalar, or (in the latter two cases) as a reference to the typeglob.</p></div></blockquote><p>In other words, glob prototypes are the same as scalar prototypes, except that they also accept globs and barewords. 
This is mainly used to pass filehandles, like this:</p><blockquote><div><p> <code>sub foo (*) {...}<br> <br> foo STDIN;</code></p></div> </blockquote><p> but in fact it can be used to pass any bareword to a function, as it leaves the interpretation of the bareword to the function.</p><p>It's tempting to call this bad, but it enables some APIs that would otherwise not be possible, hence I would call it ugly rather than bad <i>per se</i>.</p> Arador 2009-08-26T15:30:37+00:00 journal use 5.011/5.012 should be progressive http://use.perl.org/~Arador/journal/39509?from=rss <p>As some may know, <code>use 5.011</code> activates strictures. I think that's a good idea, but I don't think that's enough. I strongly feel that it should be more progressive than that. I think it should do what <code>Modern::Perl</code> aims to implement: some small, important and uncontroversial pragmas (as opposed to <i>Perl 5i</i>, which does do big and controversial things). My list of things would be:</p><ul> <li> <code>use feature '5.011';</code> </li><li> <code>use warnings;</code> </li><li> <code>use IO::Handle;</code> </li><li> <code>no indirect;</code> </li><li> <code>use mro 'c3';</code> </li></ul><p>Anyway: I'd like to hear what other people think <code>use 5.011</code> (and thus eventually <code>use 5.012</code>) should mean.</p> Arador 2009-08-22T16:10:41+00:00 journal CPAN2 http://use.perl.org/~Arador/journal/39075?from=rss <p>Recently there has been a discussion about what the <i>New CPAN</i> of Perl6, called CPAN6 by some, is going to be like. I've heard a lot of good ideas from different people, but I'd like to take a slightly different approach. I'm all for CPAN6, but let's have CPAN2 first.</p><p>I don't think it's helpful to reinvent all pieces at the same time. Let's keep the good pieces and get rid of the bad pieces. Module::Build and Module::Install got that right. ExtUtils::MakeMaker sucked ass, so people decided to make something better. 
We need more of that.</p><p>CPANPLUS.pm didn't get it right. It's a better implementation of a wrong idea. CPAN's handling of meta-data sucks. In particular, the way it doesn't deal with dependencies is problematic. We can easily come up with something better. The data exists; let's make it more accessible.</p><p>A perfect solution would require a major overhaul, but it could also be fixed by simply adding an extra tier to the CPAN ecosystem that deals with meta-data. I can think of better solutions than that for the future (where the meta-data is integrated into the whole stack), but to be honest I'm more interested in a solution for NOW than a solution in the far future.</p> Arador 2009-06-04T20:43:23+00:00 journal Reddit is lamer than I thought http://use.perl.org/~Arador/journal/38510?from=rss <p>Reddit is way more broken than I thought it was. Someone posted <a href="http://www.wellho.net/archives/2009/02/index.html#002047">this</a> piece of crap (see <a href="http://www.reddit.com/r/perl/comments/7yci5/small_web_server_in_perl/c07qwzw">this</a> link for why I think it's crap) and it gets +10.</p><p>Really, some people must be voting things up way too easily.</p><p>:-|</p> Arador 2009-02-19T01:03:46+00:00 journal Casting magic against segfaults http://use.perl.org/~Arador/journal/38419?from=rss <p> <strong>The problem</strong> </p><p>For years, there has been the <a href="http://search.cpan.org/perldoc?Sys::Mmap">Sys::Mmap</a> module; however, it has a few issues. For example, let's take this piece of code:</p><p> <code>use Sys::Mmap;<br> open my $fd, '+&gt;', 'filename';<br> mmap my $var, -s $fd, PROT_READ|PROT_WRITE, MAP_SHARED, $fd, 0;<br> $var = 'Foobar';<br> munmap $var;</code> </p><p>First of all, it's simply not user-friendly. mmap takes 6 arguments in a weird order, and uses weird constants. 
Also, munmap shouldn't be necessary: variables should dispose of themselves when they go out of scope.</p><p>But more importantly, this program does not do what you think it does, though the only hint of that is an <i>Invalid argument</i> exception when doing munmap. During the assignment, the link between the mapping and the variable is lost, so nothing is written to the file. Worse yet, this can even lead to a segfault in some circumstances.</p><p>Ouch!</p><p> <strong>Tying things up?</strong> </p><p>The documentation clearly says that you shouldn't do this (or anything else that changes the length of the variable), but IMHO this hole shouldn't be left open in the first place, if only because it is extremely counterintuitive (and thus a maintenance nightmare). Modules should fail more gracefully than this.</p><p>Sys::Mmap offers a tied interface as compensation, but this didn't work out. The tied interface is indeed safe, but it creates another problem.</p><p>Every time the variable is read, it copies the <em>whole</em> file into the variable. Every time the variable is modified, it writes the whole new value to the file, even if the change only affects a single byte.</p><p>Ouch!</p><p>Obviously, that doesn't scale at all. One user of the module reported a 10-20 times slowdown of his program after converting to ties. That's not a workable solution.</p><p> <strong>The solution</strong> </p><p>Perl has a powerful but rarely used feature called magic. (Its rare use by module authors is indicated by the fact that the prototypes of the magic virtual table as documented in <a href="http://perldoc.perl.org/perlguts.html#Magic-Virtual-Tables">perlguts</a> aren't even complete: they lack <code>pTHX_</code>.) It is used by the perl core to implement magic variables such as <code>$!</code> and ties (surprise, surprise). 
It offers 8 hooks into different stages of handling a variable, the three most important being fetching (<code>svt_get</code>), storing (<code>svt_set</code>) and destruction (<code>svt_free</code>).</p><p>In my case, I didn't need <i>get</i> magic, but I did use <i>set</i> and <i>free</i> magic. Freeing the variable is not that interesting (it simply unmaps the variable), but setting it is. This function is called just after every write to the variable:</p><p> <code>static int mmap_write(pTHX_ SV* var, MAGIC* magic) {<br> &nbsp; &nbsp; struct mmap_info* info = (struct mmap_info*) magic-&gt;mg_ptr;<br> &nbsp; &nbsp; if (SvPVX(var) != info-&gt;address) {<br> &nbsp; &nbsp; &nbsp; &nbsp; if (ckWARN(WARN_SUBSTR))<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; warn("Writing directly to a to a memory mapped file is not recommended");<br> <br> &nbsp; &nbsp; &nbsp; &nbsp; Copy(SvPVX(var), info-&gt;address, MIN(SvLEN(var) - 1, info-&gt;length), char);<br> &nbsp; &nbsp; &nbsp; &nbsp; SvPV_free(var);<br> &nbsp; &nbsp; &nbsp; &nbsp; reset_var(var, info);<br> &nbsp; &nbsp; }<br> &nbsp; &nbsp; return 0;<br> }</code> </p><p>It checks whether the variable is still linked to the map. If it isn't, it copies the new value into the map, frees the old value and restores the link. As copying is potentially expensive, it will issue a warning if <code>warnings</code> (or actually, <i>'substr'</i> warnings) is in effect.</p><p>There is no perfect solution to this problem, but getting a friendly warning is undeniably better than getting a segmentation fault or data loss.</p><p>Anyway, you can find Sys::Mmap::Simple <a href="http://search.cpan.org/~leont/Sys-Mmap-Simple/">here</a>. 
It offers more goodies, such as portability to Windows, a greatly simplified interface, and built-in thread synchronization.</p> Arador 2009-02-06T21:33:05+00:00 journal Elegance in minimalism http://use.perl.org/~Arador/journal/38352?from=rss <p>Some time ago, I read <a href="http://use.perl.org/~Aristotle/journal/37831">this</a> journal entry by Aristotle. I liked it and suspected it could be easily implemented in XS. It turned out to be the most elegant piece of XS I've ever written.</p><p> <code>void<br> induce(block, var)<br> &nbsp; &nbsp; SV* block;<br> &nbsp; &nbsp; SV* var;<br> &nbsp; &nbsp; PROTOTYPE: &amp;$<br> &nbsp; &nbsp; PPCODE:<br> &nbsp; &nbsp; &nbsp; &nbsp; SAVESPTR(DEFSV);<br> &nbsp; &nbsp; &nbsp; &nbsp; DEFSV = sv_mortalcopy(var);<br> &nbsp; &nbsp; &nbsp; &nbsp; while (SvOK(DEFSV)) {<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; PUSHMARK(SP);<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; call_sv(block, G_ARRAY);<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; SPAGAIN;<br> &nbsp; &nbsp; &nbsp; &nbsp; }</code> </p><p>I assume most readers don't know much C, let alone the perl API or XS, so I'll explain it piece by piece.</p><p> <code>void<br> induce(block, var)<br> &nbsp; &nbsp; SV* block;<br> &nbsp; &nbsp; SV* var;<br> &nbsp; &nbsp; PROTOTYPE: &amp;$</code> <br> This declares the xsub. It has two parameters, both scalar values. The function has the prototype <code>&amp;$</code>. So far, few surprises.</p><p> <code>&nbsp; &nbsp; PPCODE:</code> <br> This declares that a piece of code follows. Unlike <code>CODE</code> blocks, <code>PPCODE</code> blocks pop the arguments off the stack at the start. 
This turns out to be important later on.</p><p> <code>&nbsp; &nbsp; &nbsp; &nbsp; SAVESPTR(DEFSV);<br> &nbsp; &nbsp; &nbsp; &nbsp; DEFSV = sv_mortalcopy(var);</code> <br> These lines localize <code>$_</code> and initialize it to a copy of <code>var</code>.</p><p> <code>&nbsp; &nbsp; &nbsp; &nbsp; while (SvOK(DEFSV)) {</code> <br> This line is equivalent to <code>while (defined($_))</code>.</p><p>Now comes the interesting part:<br> <code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; PUSHMARK(SP);<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; call_sv(block, G_ARRAY);<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; SPAGAIN;</code> <br> To understand what this block does, you have to know how perl works inside.</p><p>If you've read <a href="http://perldoc.perl.org/perlxs.html">perlxs</a>, you may notice this function does not push any values on the stack. A careless reader might mistakenly think this function doesn't return anything: they couldn't be more wrong!</p><p>If you've read <a href="http://perldoc.perl.org/perlcall.html">perlcall</a>, you would notice a lot more is missing. For starters, the function calls <code>SPAGAIN</code> (pretty much equivalent to saying <i>I accept the values you return to me</i>), but it doesn't do anything with them.<br> Also, you may notice that both <code>ENTER/LEAVE</code> (needed to delocalize <code>$_</code>) and <code>SAVETMPS/FREETMPS</code> (needed to get rid of temporary values) are missing. The function that calls the xsub automatically surrounds it with an <code>ENTER/LEAVE</code> pair, so that one isn't necessary. The lack of <code>SAVETMPS/FREETMPS</code>, however, is not only deliberate but also essential.</p><p>The loop calls the block without arguments (<code>PUSHMARK</code> &amp; <code>call_sv</code>). The xsub accepts the return values on the stack and leaves them there. This sequence is repeated. This way it assembles the induced values <b>on the stack</b>. 
<code>PPCODE</code> removing the arguments at the start prevents it from returning those as the first two return values. It also adds a trailer that causes all elements that have been pushed on the stack to be recognized as return values of this function. That's why a <code>SAVETMPS/FREETMPS</code> pair would break this code: the values must live on after the code returns.</p><p>That's the elegance of this function. It doesn't even touch its return values; it delegates everything to the block. All the things that are missing are exactly what make it do what it should do.</p> Arador 2009-01-27T19:05:41+00:00 journal
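<p>For readers who would rather see the stack trick from <i>Elegance in minimalism</i> without the perl API in the way, here is a rough sketch in plain C. It is only an analogy, not the real xsub: <code>struct value_stack</code>, <code>struct topic</code>, <code>push_value</code> and the example block <code>halve</code> are all made-up names, standing in for perl's argument stack, for <code>$_</code>, and for the Perl block respectively. The point it tries to show is the same one, though: the loop never touches the values the block produces, they simply pile up on the stack and become the result.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for perl's argument stack: a fixed array plus a
 * count. None of these names come from the perl API. */
#define STACK_MAX 64

struct value_stack {
    long items[STACK_MAX];
    size_t top;               /* number of values currently on the stack */
};

/* Hypothetical stand-in for $_: a value that can become undefined. */
struct topic {
    long value;
    int defined;
};

/* A "block" sees the topic, may push any number of results on the stack,
 * and updates the topic for the next round. */
typedef void (*block_fn)(struct topic *topic, struct value_stack *stack);

static void push_value(struct value_stack *stack, long value) {
    assert(stack->top < STACK_MAX);   /* the sketch does not grow the stack */
    stack->items[stack->top++] = value;
}

/* Analogue of the xsub's loop: call the block while the topic is defined
 * and never touch what it pushed. Whatever has piled up on the stack is
 * the result list; the return value is its length. */
static size_t induce(block_fn block, long seed, struct value_stack *stack) {
    struct topic topic = { seed, 1 }; /* private copy, like localizing $_ */
    stack->top = 0;                   /* like PPCODE dropping the arguments */
    while (topic.defined)
        block(&topic, stack);
    return stack->top;
}

/* Example block: push the topic, then halve it; undefine it at zero. */
static void halve(struct topic *topic, struct value_stack *stack) {
    push_value(stack, topic->value);
    topic->value /= 2;
    if (topic->value == 0)
        topic->defined = 0;
}
```

<p>Calling <code>induce(halve, 10, &amp;stack)</code> on a <code>struct value_stack stack</code> leaves the values 10, 5, 2 and 1 on the stack, in that order. Note that <code>induce</code> itself contains no code that reads or copies those values, which is the part the XS version gets for free from <code>PPCODE</code>.</p>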