Stories, comments, journals, and other submissions on use Perl; are Copyright 1998-2006, their respective owners.
Coordinate with RJBS (Score:1)
Re: (Score:2)
It tries to encode the message into each of a certain set of encodings, ordered from narrow to wide, and checks whether all characters are safely encoded, using Encode:: and possibly Dan's Encode::InCharset.
Anyway I'll think about it more.
Encode::First (Score:2)
I was going to name it Encode::First, and duplicate Encode's encode interface, but with a colon (or perhaps comma) separated list of encodings, of which the first that supports all codepoints will be used. It would return a two-element list: encoding and byte string.
Typical usage would be:
my ($enc, $buf) = encode_first('us-ascii:iso-8859-1:iso-8859-15:utf-8', $string);
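To make the idea concrete, here is a minimal sketch of how such a function might work. Encode::First was only a proposal at this point, so the body of encode_first() below is an assumption, not the module's actual code. It relies only on the core Encode module: find_encoding() and the FB_CROAK check mode, which makes encode() die on the first character that does not map.

```perl
use strict;
use warnings;
use Encode qw(find_encoding FB_CROAK);

# Hypothetical sketch of encode_first(): try each encoding in the order
# given and return the first one that can represent every character.
sub encode_first {
    my ($encodings, $string) = @_;
    for my $name (split /[:,]/, $encodings) {
        my $enc = find_encoding($name) or next;  # skip names Encode does not know
        my $copy = $string;                      # encode() may clobber its source when CHECK is set
        my $bytes = eval { $enc->encode($copy, FB_CROAK) };
        return ($name, $bytes) if defined $bytes;
    }
    return;  # empty list: no candidate could encode the string
}

my ($enc, $buf) = encode_first('us-ascii:iso-8859-1:iso-8859-15:utf-8', "caf\x{e9}");
print "$enc\n";  # prints iso-8859-1
```

The candidates are tried strictly left to right, so listing them from narrow to wide is what makes the result the "smallest" charset.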
Re: (Score:2)
Does it really have anything to do with mail? (Score:1)
It seems to me that email is just what you want to use the module for. I don’t see how the module’s operation actually has anything whatsoever to do with email. “Best” doesn’t really say anything; maybe Encode::MinCharsetPicker?
(Btw, I’d have the module only suggest the minimal applicable charset, but not actually do the encoding itself (or only if you ask for it by way of a convenience function). Probably the main function should simply take a list of encodings and the
Re: (Score:2)
I'd probably make two functions: one that is compatible with encode() (and does the encoding itself), and another like detect_best_encoding(), which returns the name of the encoding but doesn't do the encoding itself.
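A rough sketch of that two-function split, under stated assumptions: detect_best_encoding() is the name used in the comment above, while encode_best() and both argument lists are invented here purely for illustration. Only the core Encode module is used.

```perl
use strict;
use warnings;
use Encode qw(find_encoding FB_CROAK);

# Hypothetical detector: returns the name of the first candidate that can
# represent every character of $string, or undef if none can.
# The trial encoding is thrown away; only the name is returned.
sub detect_best_encoding {
    my ($string, @candidates) = @_;
    for my $name (@candidates) {
        my $enc = find_encoding($name) or next;  # skip unknown names
        my $copy = $string;                      # protect the caller's string
        my $bytes = eval { $enc->encode($copy, FB_CROAK) };
        return $name if defined $bytes;
    }
    return;
}

# Hypothetical encode()-style wrapper built on top of the detector.
sub encode_best {
    my ($string, @candidates) = @_;
    my $name = detect_best_encoding($string, @candidates);
    return defined $name ? Encode::encode($name, $string) : undef;
}
```

Note that this detector still performs a trial encode internally, so it saves the caller nothing in CPU time; the only difference is what gets handed back.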
Re: (Score:2)
my ($enc) = encode_first(...);
Or, have you found another efficient way of finding a suitable encoding?
Re: (Score:2)
The reason we want the encoding itself back is that we'd like to use it in the email header. If we return only the encoded string, the caller doesn't know which encoding it's actually encoded in.
Re: (Score:1)
The reason I suggested that sort of interface is that some APIs expect to receive character strings that they will then encode themselves; XML serialisers come to mind. In such a case, giving the caller an encoded string is pretty useless.
Re: (Score:2)
Minimal Enclosing (Score:2)
What you appear to want is to find the smallest possible character set that contains every single character in your text. That appears to be related to finding a minimum-size geometric shape that contains every vertex in a set. Terms that spring to mind are minimal enclosing circle [google.com] or rectangle [google.com]; the latter is also
You have seen this? (Score:2)