From MyFileFormats.com I found this CSV definition. Nowhere is it prejudiced against non-US users of the format, so why does Text::CSV insist on:
Allowable characters within a CSV field include 0x09 (tab) and the inclusive range of 0x20 (space) through 0x7E (tilde).
Nowhere in the specification I found (and it wasn't easy to find!) does it make any assumption about what can be inside a field. As long as it's contained in quotes, it's valid. As it should be.
The reason I'm taking issue with this is that we have a currency field in our CSV. As we are in the UK, we quite rightly use a £ symbol. Text::CSV rejects it as invalid, even if the field is contained in quotes as the specification states. According to the Text::CSV documentation, it also means that no European language characters, other currency symbols (eg the Euro or the Yen) or special symbols (eg ® or ©) are ever allowed to appear in a CSV file. I wonder if the producers of spreadsheet applications, with the capability of saving to CSV, realise they write out illegal characters?
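To make the restriction concrete, here is a minimal sketch of the rule as quoted above (tab plus 0x20 through 0x7E). It is written in Python purely for illustration; the helper name `field_is_allowed` and the UTF-8 default encoding are assumptions of mine, not anything taken from Text::CSV itself.

```python
# Hypothetical helper mirroring the quoted rule: a field may contain only
# tab (0x09) and the inclusive range 0x20 (space) through 0x7E (tilde).
ALLOWED = {0x09} | set(range(0x20, 0x7F))

def field_is_allowed(field: str, encoding: str = "utf-8") -> bool:
    """Return True if every byte of the encoded field is in the allowed set."""
    return all(byte in ALLOWED for byte in field.encode(encoding))

print(field_is_allowed("19.99"))    # plain ASCII price
print(field_is_allowed("£19.99"))   # the pound sign encodes to bytes above 0x7E
```

Under that rule the first call passes and the second fails, whatever quoting surrounds the field.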
Then again Text::CSV is over 5 years old and still at version 0.01! Seeing as the author hasn't written anything else, I wonder if they've disappeared?
Is this another module I'm gonna have to look at and attempt to patch? I seem to be finding a lot of inaccurate or restricted modules of late!
Standards (Score:2)
Second, don't use Text::CSV. Use Text::CSV_XS [cpan.org]. It's got far more parameters for your tuning enjoyment.
--
xoa
Re:Standards (Score:2)
I'm pretty sure Text::CSV_XS is the successor to Text::CSV. It's always a good idea to search CPAN [cpan.org] and look for more recent modules.
For even more enjoyment, see if you can make use of DBD::CSV.
J. David works really hard, has a passion for writing good software, and knows many of the world's best Perl programmers
Re:Standards (Score:1)
Plus it was easier to parse the file directly rather than store it locally, parse it, then delete it.
Re:Text::CSV_XS (Score:1)
Useful, but I do have a hard time explaining why you have to use binary mode to write non-binary data!
I would love for it to have an eight-bit mode, where control characters are forbidden, ie. 0x00-0x1f, 0x7f-0x9f and 0xff (if I got my ranges right). Of course this would annoy M$ users, who have some printable characters embedded in the high control range (0x80-0x9f).
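As a rough sketch of what such an eight-bit mode might check (in Python, for illustration only): the function name `eight_bit_ok` is hypothetical, and the ranges are my assumption of the conventional control blocks, C0 (0x00-0x1F), DEL (0x7F) and C1 (0x80-0x9F). I have left 0xFF allowed, and note that this version would also reject tab (0x09), so a real mode would need to decide on both.

```python
# Hypothetical "eight-bit mode": reject control bytes, allow everything else.
# Assumed ranges: C0 controls 0x00-0x1F, DEL 0x7F, C1 controls 0x80-0x9F.
FORBIDDEN = set(range(0x00, 0x20)) | set(range(0x7F, 0xA0))

def eight_bit_ok(raw: bytes) -> bool:
    """Return True if the raw field contains no forbidden control byte."""
    return not any(byte in FORBIDDEN for byte in raw)

print(eight_bit_ok("£19.99".encode("latin-1")))  # 0xA3 is outside both ranges
print(eight_bit_ok("€19.99".encode("cp1252")))   # cp1252 puts the euro sign at 0x80
```

The second call shows the Windows problem: a printable character in that code page lands inside the high control range and gets rejected.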
Re:Text::CSV_XS (Score:1)
Well it does the job
Re:Standards (Score:1)
Your example still follows the standard as I understand it. Fields can have quotes around them, or the quotes can be omitted if the field doesn't contain the quote character or the field separator. The standard way of escaping double quotes is to double them. Much like SQL in that respect.
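For anyone who wants to see those quoting rules in action, here is a small sketch using Python's standard `csv` module rather than either Perl module, so it shows the convention only and says nothing about how Text::CSV or Text::CSV_XS behave. The sample row is made up.

```python
import csv
import io

# One made-up row: a plain field, one with the separator, one with quotes,
# and one with a non-ASCII currency symbol.
rows = [["plain", "has,comma", 'says "hi"', "£19.99"]]

buf = io.StringIO()
csv.writer(buf, lineterminator="\n").writerows(rows)
text = buf.getvalue()
print(text, end="")

# Reading the text back recovers the original fields.
parsed = list(csv.reader(io.StringIO(text)))
print(parsed == rows)
```

Only the fields containing the separator or a quote get wrapped in quotes, and the embedded quotes are doubled; the £ field needs no special treatment.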
Re: (Score:1)
The new Text::CSV will include a pure-Perl version of Text::CSV_XS and will itself be just a wrapper. If Text::CSV_XS is installed, it will use it; otherwise, it will use the bundled Text::CSV_PP (or Text::CSV_PurePerl as the snap currently states).
Text::CSV_XS is much faster than the pure-Perl version(s).
See also http://www.perlmonks.org/?node=617577 [perlmonks.org]
Enjoy, have FUN! H.Merijn