All the Perl that's Practical to Extract and Report

Hansen (4428)

  Comment: Correctness (Score 1) on 2010.04.03 13:30

I'm looking forward to your in-depth review of the actual codebases and not just the syntactic sugar. As you are probably aware, some of the competitors have re-implemented a lot of existing mature codebases on CPAN, such as the implementations of RFCs 2109, 2388, 2616, 4627, etc.

And simple things like I/O and syscalls (logging, error reporting), etc.

-- chansen

Comments: 21

  Comment: UTF-8 (Score 1) on 2010.01.27 17:29

by Hansen on 2010.01.27 17:29 (#71595)
Attached to: Illegal character 0x1FFFF

I'm sorry to disappoint you, but Perl_is_utf8_string can't be used to check for well-formed UTF-8. Perl's utf8 encoding form is a superset of UTF-8 as defined by Unicode and ISO/IEC 10646. Perl's encoding form supports code points up to 2**64-1 and has no problems with encoded UTF-16 surrogates or any other permanently reserved code points.
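
To illustrate what a strict well-formedness check looks like, here is a minimal sketch in Python rather than Perl. It says nothing about how Perl_is_utf8_string works internally. It relies on Python's built-in "utf-8" codec, which is strict by default, and the helper name is_well_formed_utf8 is made up for this example.

```python
def is_well_formed_utf8(data: bytes) -> bool:
    """Return True only if data decodes under Python's strict UTF-8 codec."""
    try:
        # The default "strict" error handler rejects encoded surrogates
        # (U+D800..U+DFFF), overlong forms and code points above U+10FFFF.
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    print(is_well_formed_utf8("\u263a".encode("utf-8")))  # ordinary character
    print(is_well_formed_utf8(b"\xed\xa0\x80"))           # encoded surrogate U+D800
    print(is_well_formed_utf8(b"\xf4\x90\x80\x80"))       # U+110000, beyond Unicode
```

Noncharacters such as U+1FFFF still pass this check: they are well-formed UTF-8, even though they are not intended for open interchange.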

Comments: 7

  Comment: Re: Unicode URLs, wtf? (Score 1) on 2010.01.07 7:50

by Hansen on 2010.01.07 7:50 (#71494)
Attached to: Unicode URLs, wtf?

It usually comes from broken JavaScript applications that use escape() instead of encodeURI():


escape("\u263A") -> %u263A
encodeURI("\u263A") -> %E2%98%BA

--
chansen
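
For readers who want to reproduce both outputs outside a browser, the following Python sketch mimics the two behaviours. legacy_escape is a hypothetical helper written for this example that approximates JavaScript's escape(). utf8_percent_encode wraps urllib.parse.quote and, with safe="", behaves more like encodeURIComponent than encodeURI; for a single non-ASCII character the result is the same.

```python
from urllib.parse import quote

# Characters that JavaScript's legacy escape() leaves untouched.
_ESCAPE_SAFE = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789@*_+-./"
)


def legacy_escape(text: str) -> str:
    """Approximate escape(): %XX for code units below 0x100, %uXXXX above."""
    raw = text.encode("utf-16-be")  # escape() operates on UTF-16 code units
    out = []
    for i in range(0, len(raw), 2):
        unit = (raw[i] << 8) | raw[i + 1]
        ch = chr(unit)
        if ch in _ESCAPE_SAFE:
            out.append(ch)
        elif unit < 0x100:
            out.append("%%%02X" % unit)
        else:
            out.append("%%u%04X" % unit)  # the non-standard form
    return "".join(out)


def utf8_percent_encode(text: str) -> str:
    """Percent-encode the UTF-8 octets of text, as RFC 3986 expects."""
    return quote(text, safe="")


if __name__ == "__main__":
    print(legacy_escape("\u263a"))
    print(utf8_percent_encode("\u263a"))
```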

Comments: 5

  Comment: Iconv (Score 1) on 2010.01.07 7:25

by Hansen on 2010.01.07 7:25 (#71493)
Attached to: Decoding multiple encoded utf-8 in perl or ruby

Iconv can't transcode your data to US-ASCII since it contains octets greater than 0x7F. Your double-encoded data is UTF-8 that was treated as Latin-1 and transcoded to UTF-8 a second time; to reverse that, you need to transcode from UTF-8 to Latin-1.

Change:
Iconv.new( 'UTF-8', 'ASCII' )

To:
Iconv.new( 'LATIN1', 'UTF-8' )

and it should work just fine.

You can also narrow down your regexp to [\xc2-\xc3][\x80-\xbf], since UTF-8 encoded Latin-1 (U+0080 through U+00FF) falls within that range.

--
chansen
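
The same repair can be sketched in Python, assuming the input really is double encoded. The function name undo_double_encoding is hypothetical. It applies the narrowed pattern above to raw bytes and transcodes each matching pair from UTF-8 back to Latin-1, which is what Iconv.new( 'LATIN1', 'UTF-8' ) does for a whole string.

```python
import re

# Two-octet UTF-8 sequences that encode U+0080..U+00FF (the Latin-1 upper half).
_LATIN1_PAIR = re.compile(rb"[\xc2-\xc3][\x80-\xbf]")


def undo_double_encoding(data: bytes) -> bytes:
    """Collapse one layer of Latin-1 -> UTF-8 transcoding; ASCII is untouched."""
    return _LATIN1_PAIR.sub(
        lambda m: m.group().decode("utf-8").encode("latin-1"), data
    )


if __name__ == "__main__":
    original = "caf\u00e9".encode("utf-8")
    # Simulate the damage: UTF-8 octets misread as Latin-1, then encoded again.
    damaged = original.decode("latin-1").encode("utf-8")
    print(undo_double_encoding(damaged) == original)
```

Running this over data that was encoded only once would corrupt it, so it is only safe when the double encoding is known.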

Comments: 7

  Comment: CGI.pm (Score 1) on 2010.01.07 6:31

by Hansen on 2010.01.07 6:31 (#71490)
Attached to: Unicode URLs, wtf?

CGI.pm decodes the non-standard (and invalid according to RFC 3986) %uXXXX percent escape into a UTF-8 octet string, but it doesn't decode it into a Perl Unicode string. I think the current behavior is desirable since the data can contain any octets in any encoding.

--
chansen
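
The distinction between an octet string and a decoded Unicode string can be sketched in Python. This is not CGI.pm's code. unescape_to_octets is a hypothetical helper that turns %XX and %uXXXX escapes into bytes and leaves the choice of character encoding to the caller. It assumes ASCII input and does not handle surrogate pairs.

```python
import re

_ESCAPE = re.compile(r"%u([0-9A-Fa-f]{4})|%([0-9A-Fa-f]{2})")


def unescape_to_octets(value: str) -> bytes:
    """Decode percent escapes to raw octets without guessing an encoding."""
    out = bytearray()
    pos = 0
    for m in _ESCAPE.finditer(value):
        out += value[pos:m.start()].encode("ascii")  # literal text between escapes
        if m.group(1) is not None:
            # Non-standard %uXXXX: emit the code point's UTF-8 octets.
            out += chr(int(m.group(1), 16)).encode("utf-8")
        else:
            # Standard %XX: exactly one octet, whatever it means.
            out.append(int(m.group(2), 16))
        pos = m.end()
    out += value[pos:].encode("ascii")
    return bytes(out)


if __name__ == "__main__":
    octets = unescape_to_octets("%u263A")
    print(octets)                  # still bytes at this point
    print(octets.decode("utf-8"))  # the caller decodes explicitly
```

Returning bytes keeps inputs such as %FF usable: they are valid octets but not valid UTF-8, so decoding them eagerly would fail.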

Comments: 5

  Comment: Re:Strongly, strongly disagree (Score 1) on 2009.10.01 6:42

by Hansen on 2009.10.01 6:42 (#70735)
Attached to: Why is Perl on Mac such a disaster

I have pushed Mac::SystemDirectory 0.02_01 to CPAN. I have added support for DomainMask and for returning multiple directories in list context.

http://idisk.mac.com/christian.hansen/Public/perl/Mac-SystemDirectory-0.02_01.tar.gz

Example: http://idisk.mac.com/christian.hansen/Public/perl/macdirs.pl (I was not allowed to post it here: Your comment violated the "postercomment" compression filter.)

I'll add you and Alias as co-maint; feel free to hack on it as you see fit.

--
chansen

Comments: 44

  Comment: Re:Strongly, strongly disagree (Score 2, Informative) on 2009.09.30 17:14

by Hansen on 2009.09.30 17:14 (#70721)
Attached to: Why is Perl on Mac such a disaster

and a simple Cocoa implementation. Only tested on 10.5.

http://idisk.mac.com/christian.hansen/Public/perl/Mac-SystemDirectory-0.01.tar.gz

perl -MMac::SystemDirectory=:all -wle 'print FindDirectory(NSDocumentDirectory);'
/Users/chansen/Documents

--
chansen

Comments: 44