Stories, comments, journals, and other submissions on use Perl; are Copyright 1998-2006, their respective owners.
UTF-8 (Score:1)
I'm sorry to disappoint you, but
Perl_is_utf8_string can't be used to check for well-formed UTF-8. Perl's utf8 encoding form is a superset of UTF-8 as defined by Unicode and ISO/IEC 10646. It supports codepoints up to 2**64-1 and has no problems with encoded UTF-16 surrogates or any other permanently reserved codepoints.
Re: (Score:1)
Baf, I'm not disappointed. Nothing can be more disappointing than encoding problems...
Hansen, can you have a look at http://github.com/jozef/String-isUTF8/blob/master/t/01_String-isUTF8.t [github.com] and send a patch with failing tests?
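For anyone following along, here is a rough sketch of what a strict well-formedness check could look like in plain C, to show the kind of input the parent comment is talking about. It is not the module's code and not part of the Perl API; the name is_strict_utf8 is made up for illustration. It assumes "well-formed" means the usual Unicode definition: no overlong forms, no encoded surrogates (U+D800..U+DFFF), nothing above U+10FFFF.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not a Perl API function): returns true only if
 * the buffer is well-formed UTF-8 in the strict Unicode sense. */
bool is_strict_utf8(const unsigned char *s, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char b = s[i];
        unsigned long cp;
        size_t need;            /* number of continuation bytes expected */

        if (b < 0x80) {         /* ASCII */
            i++;
            continue;
        } else if (b >= 0xC2 && b <= 0xDF) {
            need = 1; cp = b & 0x1F;
        } else if (b >= 0xE0 && b <= 0xEF) {
            need = 2; cp = b & 0x0F;
        } else if (b >= 0xF0 && b <= 0xF4) {
            need = 3; cp = b & 0x07;
        } else {
            /* stray continuation byte, overlong lead C0/C1, or F5..FF */
            return false;
        }

        if (len - i - 1 < need)
            return false;       /* sequence truncated at end of buffer */

        for (size_t k = 1; k <= need; k++) {
            unsigned char c = s[i + k];
            if ((c & 0xC0) != 0x80)
                return false;   /* not a continuation byte */
            cp = (cp << 6) | (c & 0x3F);
        }

        if (need == 2 && cp < 0x800)
            return false;       /* overlong three-byte form */
        if (need == 3 && cp < 0x10000)
            return false;       /* overlong four-byte form */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return false;       /* encoded UTF-16 surrogate */
        if (cp > 0x10FFFF)
            return false;       /* beyond the Unicode range */

        i += need + 1;
    }
    return true;
}
```

Byte strings such as "\xED\xA0\x80" (an encoded surrogate) or "\xF4\x90\x80\x80" (just past U+10FFFF) are rejected by this sketch, which makes them reasonable candidates for failing tests if the goal is strict UTF-8.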
Wrong wrong wrong (Score:1)
Don’t look at the UTF8 flag. The UTF8 flag does not mean what you think it means. You can have a perfectly valid Unicode string that does not have its UTF8 flag set, and you can have a JPEG image in a string that does have its UTF8 flag set. The UTF8 flag is a lie. It should not have been called the UTF8 flag. There is no flag in Perl that means what you think the UTF8 flag means. Don’t look at the UTF8 flag.
What you want to do is very simple:
Re: (Score:1)
I misunderstood where the problem is in the code, but it’s still wrong. Since it’s XS, you specifically do need to look at the flag, explicitly: