All the Perl that's Practical to Extract and Report
Stories, comments, journals, and other submissions on use Perl; are Copyright 1998-2006, their respective owners.
Accents (Score:2)
I've been surprised at how consistently my name turns up on official documents here in France. The spelling Rafaël is completely abnormal, of course, and no one ever spells it correctly (even I can't be bothered to spell it correctly most of the time), but on my passport it's right.
I remember when I registered François last year, I got asked quite precise questions about the spelling: a dash or no dash between Garcia and Suarez? They care about that kind of stuff.
Re: (Score:2)
Bah, apparently the use.perl comment boxes are not friends with my browser :/
Those were, in order:
00EB LATIN SMALL LETTER E WITH DIAERESIS
00E7 LATIN SMALL LETTER C WITH CEDILLA
Re: (Score:1)
To spell Rafaël and François properly you need to entity-encode the, uh, extravagant characters: use.perl is a Latin-1 Only Zone. Quelle bêtise…
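For illustration, here is a minimal Python sketch of what entity-encoding those two names could look like. It assumes numeric character references are acceptable to the comment form; the variable names are made up for the example.

```python
# Replace every non-ASCII character with a numeric HTML character reference.
# "xmlcharrefreplace" is a standard Python codec error handler.
names = ["Rafaël", "François"]

encoded = [n.encode("ascii", "xmlcharrefreplace").decode("ascii") for n in names]

for original, safe in zip(names, encoded):
    print(f"{original} -> {safe}")
```

The output is plain ASCII, so it should survive whatever charset the form submission ends up labelled with.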
Re: (Score:1)
Hehe, I would say ASCII-only. Rafael's accents perfectly fit in the latin-1 charset.
Re: (Score:2)
I'm suspecting browser character set headers on the form submission, because I can paste a literal ć in no problem. It looks like his browser sent UTF-8, but either described it as ISO-8859-1, or didn't say, resulting in the far end treating it as ISO-8859-1.
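The suspected failure mode is easy to reproduce. A small Python sketch, assuming the browser really sent UTF-8 bytes and the server decoded them as ISO-8859-1 (the variable names are illustrative only):

```python
# What the browser is suspected to have sent: the name as UTF-8 bytes.
name = "Rafaël"
sent = name.encode("utf-8")

# What the far end sees if it treats those bytes as ISO-8859-1:
# each byte of the two-byte UTF-8 sequence becomes its own character.
seen = sent.decode("iso-8859-1")
print(seen)

# The damage is reversible as long as nothing else touched the bytes.
repaired = seen.encode("iso-8859-1").decode("utf-8")
print(repaired)
```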
Ho ho ho. When that ć comes back to me on preview, the HTML source has turned into &#263;.
Which reminds me. Currently, does pod2text use man as an intermediate step when generating its output?
Re:Accents (Score:1)
The initial problem is that the use.perl.org pages declare iso-8859-1
as their charset, so form data also has to be sent as iso-8859-1. Maybe
a browser shouldn't accept any non-latin1 characters when entering or
pasting data into form fields, but at least gecko-based browsers
don't enforce this. To deal with non-latin1 characters,
gecko-based browsers on Unix systems seem to use this heuristic:
* codepoints below 256 are fine
* if there are codepoints in the 0x80-0x9f range of win1252, then they
are sent as is, as the win1252 byte (try LATIN CAPITAL LETTER S WITH CARON for a test)
* every other codepoint is sent as a numerical HTML entity
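The heuristic above can be sketched in Python. This is only a model of the observed behaviour, not gecko's actual code; the function name `encode_form_value` is hypothetical.

```python
def encode_form_value(text: str) -> bytes:
    """Model of the observed heuristic for a page declared as iso-8859-1."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp < 256:
            # Codepoints below 256 go out as a single byte.
            out.append(cp)
            continue
        try:
            b = ch.encode("cp1252")
        except UnicodeEncodeError:
            b = b""
        if len(b) == 1 and 0x80 <= b[0] <= 0x9F:
            # The character lives in the 0x80-0x9f range of win1252.
            out += b
        else:
            # Everything else becomes a numeric HTML entity.
            out += f"&#{cp};".encode("ascii")
    return bytes(out)


for sample in ["ë", "\u0160", "\u0107"]:
    print(repr(sample), encode_form_value(sample))
```

The three samples exercise one rule each: a plain latin1 letter, LATIN CAPITAL LETTER S WITH CARON, and the c with acute mentioned earlier in the thread.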
About pod2text: no, *pod2text* does not use man, but *perldoc* uses
pod2man by default. The plan was to fix the pod2text encoding issues (there
are still some, but they are fixable, unlike those in pod2man) and then
to use something like Pod::Text::Overstrike or Pod::Text::Termcap
instead of Pod::Man.
I have just created and uploaded
Pod-Perldoc-ToTextTermcap-0.00_50.tar.gz to CPAN. Just install it and
set
export PERLDOC=-MPod::Perldoc::ToTextTermcap
or
export PERLDOC=-MPod::Perldoc::ToTextOverstrike
and perldoc will use the new renderer. It looks somewhat different
from man output, but at least bold and underline are done (unlike with
stock Pod::Perldoc::ToText).