All the Perl that's Practical to Extract and Report

The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • The new Tcl regular expression package has all of the advanced features that Perl has, plus Unicode support which Perl is lacking.

    I am trying to think how the statement could be more wrong. The Tcl regex package does not have all the advanced features Perl has in its regex engine. I don't know when this thing was written, but I doubt very much it supports all the perl 5.005 features, let alone the 5.6 features. This isn't necessarily a bad thing, but it is still a wrong thing.

    As to lack of Unicode su

  • I remember this from a while ago. The bottom of the document includes (c)1998-2000 and was last updated March 9, 2000. The last update was probably s/Scriptics/Ajuba/

    I think we can take this for what it is: marketing.
  • The Tcl regex package does not have all the advanced features Perl has in its regex engine.

    From the last section of the document:

    Tcl has most but not all of the advanced features of Perl's regular expressions and it also has some of its own unique features.

    As to lack of Unicode support, well, that's just plain false.

    Really? Then Dominus must be lying in his summary [perl.com] when he writes:

    Now here's a dirty secret: Overloading the regex engine this way is difficult, and hasn't been done yet. Regex matchin

  • From the last section of the document

    That was from a reader comment, not from the article.

    >As to lack of Unicode support, well, that's just plain false.

    Really? Then Dominus must be lying in his summary when he writes

    Huh? Unicode regexes are supported, as Dominus' quote says. Apparently support could be better, but that still makes the statement patently false.

    And also the last thing you mention is not from the article, but from a reader comment.

    Of course, a lot more is demonstrably false

  • Misconceptions

    If you explore the Tcl Web sites, you'll find various claims about why Tcl is superior to Perl. Many of these claims are either incorrect or out-of-date, including all of the following:

  • They are a commercial company selling products which are based on Tcl - in the modern marketing mind they have to rubbish the competition ...

    /J\
  • Again, I don't know what "marketing" is, though.
  • Again, I don't know what "marketing" is, though.

    Perhaps this will help you understand, grasshopper:

    At the bank for which I used to work, there was a robbery. This is not unusual, since there are a lot of stupid people out there, except that it was unusually stupid... a daylight robbery, a downtown bank branch which was glass on all four walls (giving cameras and drive-through customers a good view of the guy before the dye pack blew up in his face), etc.

    It was sufficiently entertaining that it made t

  • Unicode regexes are supported, as Dominus' quote says.

    Uhm, no. Regexes don't check whether the string they match against is a UTF8 string or not. They follow the abandoned "all or nothing" strategy.

    #!/opt/perl/bin/perl -w

    use strict;

    sub one_char {$_ [0] =~ /^.$/;}

    {  use utf8;
       my $str1 = v1024;   # Unicode char 1024.
       my $str2 = v32;   

  • Hmm, I guess your post got cut off, but there are lots:
    not binary clean.
    everything is a string. no numerics whatsoever.
    no standardized OO implementation.
    not as many prefab modules (a la CPAN)

    are just some of the first that floated to the top of my head.

    At this point, Perl can do everything that Python and Tcl can, and more. IMO. I personally think that it's also superior to C/C++ for all applications programming as well, but then again, I'm crazy. ;p
  • >Unicode regexes are supported, as Dominus' quote says.

    Uhm, no. Regexes don't check whether the string they match against is a UTF8 string or not.

    And this does not appear in any way to make my statement above false. It mitigates the statement, it clarifies it, it makes it look worse than one might originally have thought, but it does not make it false.

    However, your code is curious. Does it fail because v1024 is UTF-16 and not UTF-8?

    As to the "line breaks": they are inserted spaces. When a linke

  • And this does not appear in any way to make my statement above false. It mitigates the statement, it clarifies it, it makes it look worse than one might originally have thought, but it does not make it false.

    I wouldn't call regexes ignoring whether a string is in UTF-8 format "supporting Unicode". That makes the term "support" meaningless. The criticism on this point is correct.

    However, your code is curious. Does it fail because v1024 is UTF-16 and not UTF-8?

    That is a question sho

  • I have used Tcl for a while, mostly for exploiting Tk, and in my opinion Tcl code is far less "neat" and "clean" than Perl code can be. Like nearly any other programming language, it makes you type a lot of conceptually unnecessary details; in this sense Perl is a different animal, and that's why I love it.

    --
    "Love, work and knowledge are the well-springs of our life. They should also govern it." - W. Reich
  • It does not make the term "support" meaningless. Sorry, it just doesn't. If the regex engine supports UTF-8 when the utf8 pragma is in effect, then UTF-8 is de facto supported, though not supported in the way you want it to be.

    As to my not understanding Unicode: I understand the basics of how it works. I understand encodings. I do not know specifics of Unicode, or the specifics of the various Unicode encodings. I did not know if v1024 is a proper character in UTF-8, because if it were, and the perl d

  • I did not know if v1024 is a proper character in UTF-8, because if it were, and the perl docs were true, then it would be supported by the regex engine.

    But, as I am telling you, and showing you with code, the regex machine does not properly support Unicode. It follows the old model, abandoned by the rest of Perl. Only where the old and new models happen to coincide does it "work", but that's purely by accident.

    Old model: everything is assumed to be in Unicode if, and only if, the utf8 pragma is in effect.
    New mo

  • From perldoc utf8:

    In the absence of inputs marked as UTF-8, regular expressions within the
    scope of this pragma will default to using character semantics instead
    of byte semantics.

    Perhaps I am wrong, but that tells me that regexes will be forced to treat text as utf8 if the utf8 pragma is in effect.
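
The byte-versus-character distinction that the perldoc excerpt describes can be sketched roughly as follows. This is only an illustration, not code from the thread: it assumes a later Perl (5.8 or newer, where the Encode module ships in core) in which each string carries its own UTF-8 marking, so it does not reproduce the 5.6 behaviour being argued about. The names `$chars` and `$bytes` are made up for the example; `one_char` mirrors the helper in the truncated snippet earlier in the thread.

```perl
use strict;
use warnings;
use Encode qw(encode);

# True (1) when the regex engine sees exactly one character in the string.
sub one_char { return $_[0] =~ /^.$/ ? 1 : 0 }

# Hypothetical test data: the same code point in two representations.
my $chars = chr(1024);                 # one character, U+0400
my $bytes = encode('UTF-8', $chars);   # the same character as two raw bytes

# With character semantics "." consumes the whole character;
# with byte semantics it consumes only the first of the two bytes.
printf "characters: length=%d one_char=%d\n", length($chars), one_char($chars);
printf "bytes:      length=%d one_char=%d\n", length($bytes), one_char($bytes);
```

Under those assumptions the first line reports one character that matches, and the second reports two bytes that do not, which is the gap between "treated as characters" and "treated as bytes" that the comments here disagree about.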

  • Perhaps I am wrong, but that tells me that regexes will be forced to treat text as utf8 if the utf8 pragma is in effect.

    Exactly my point. Considering what the rest of Perl does, that is outdated, and hence wrong. It might give the right answer by accident. But it will often give the wrong answer.

    -- Abigail

  • But ... if this were true -- that it will be forced to be treated as utf8 -- then why did it not treat v1024 as utf8?

    Oh wait, I think I am being a moron. This is because the regex is outside the pragma's scope. Sigh. It does do exactly what I thought, but your code failed because I can't read; I'm sorry for wasting your time about that whole thing.

    Now, as to being "wrong": should it be different? Yes, that would be nice. But the fact remains that it does work, even though it is a kludge. UTF-8 IS
  • I'm a heavy perl programmer myself, but I have to say, there's one plus I see to tcl. This is not a merit of the language itself, but what's been done with it: There's an excellent web/db toolkit written in tcl called the ACS, or Arsdigita Community System [arsdigita.com]. This toolkit supports a huge amount of web/db functionality (a recent version included over 3000 scripts). Something like this could certainly be done in Perl, but there's not one like this available now that I know of. There are smaller projects-- but