
All the Perl that's Practical to Extract and Report

The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • If it's not well formed, it's not XML. Period. Please don't encourage such vileness into the world.

    At least, if you do, don't put it into the XML namespace because that's not what it's producing.

    Sorry to come across mad about this, but dealing with invalid XML (and worse, SGML) over the last 5 years has made me bitter and twisted.

    For your well-formed XML generation needs, I'm open to suggestions as to how I can improve XML::Genx (within the constraints of the underlying library).


    • You really want to see the Perlmonks thread on this (I linked to it in the parent story). XML::Composer can easily produce valid XML and, to be honest, it produces XML much more easily than most of the XML modules out there. Rather than slapping my hand when I use a namespace which is illegal or upper-case tags or improperly escaped data (shudder), it trusts me to really mean what I say. However, the fact is that we often have to deal with bad XML and there's no way around it. I hate it. You hate it. We ha

      • sigh. I certainly see the need. It just makes me cry. :-)

        And I apologise; I should have read the perlmonks thread first.

        As to the name, how about XML::NotWellFormed? It's the most accurate description even if it is an oxymoron.


      • I still don’t like the idea, even as I understand the predicament. I would suggest you use a templating system instead of writing a module for this. Text::Template and the Template Toolkit can easily produce arbitrarily complex and arbitrarily broken XML output.

        (Oh, and please get in touch with the people who’re asking for broken XML from you and call them bozos. Not offensively, of course.)

        • Believe me, I already had a phone call with a Yahoo! rep. He was very apologetic but there's not much I can do as a lone developer to shove Yahoo!

  • A friend and I had a similar discussion about generating XML in a simple way today -- we trawled CPAN and found XML::Generator. Does it do (most of) what you're looking for?

    • I had looked at XML::Generator and I liked it, but it had some problems. First, because of autoload, it's easy to do this:

      print $xml->feild('foobared');

      I probably meant "field". My version forces you to map methods to tags and will die if you try to print a tag that doesn't exist (though you can add methods/tags on the fly).
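The difference can be sketched roughly as follows. This is only an illustration of the idea of registering tags up front, not the interface of XML::Generator or of any released module; the package name `StrictTags` and the method `add_tag` are made up for the example. AUTOLOAD is still used, but it consults a whitelist first and dies on anything unregistered, so a typo such as `feild` fails immediately.

```perl
use strict;
use warnings;

# Hypothetical package, for illustration only.
package StrictTags;

our $AUTOLOAD;

sub new {
    my ($class, @tags) = @_;
    my %allowed = map { $_ => 1 } @tags;
    return bless { allowed => \%allowed }, $class;
}

# Register another tag on the fly.
sub add_tag {
    my ($self, $tag) = @_;
    $self->{allowed}{$tag} = 1;
    return $self;
}

# Any other method call is treated as a tag name, but only if it
# was registered; unknown names die instead of silently emitting.
sub AUTOLOAD {
    my ($self, $content) = @_;
    (my $name = $AUTOLOAD) =~ s/.*:://;
    return if $name eq 'DESTROY';
    die "Unknown tag '$name'\n"
        unless ref $self && $self->{allowed}{$name};
    # Content is deliberately not escaped: the caller is trusted.
    return "<$name>$content</$name>";
}

package main;

my $xml = StrictTags->new(qw(field name));
print $xml->field('foobared'), "\n";

# The misspelled tag is caught at the call site.
my $ok = eval { $xml->feild('foobared'); 1 };
print "caught: $@" unless $ok;
```

Calling `$xml->add_tag('feild')` first would make the second call succeed, which is the "add methods/tags on the fly" escape hatch.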

      Also, as far as I can tell, I would not be able to conveniently dump out data in the Yahoo! IDIF format. Here's a snippet:

      <?xml version="1.0"?>

      • I probably meant “field”. My version forces you to map methods to tags and will die if you try to print a tag that doesn’t exist (though you can add methods/tags on the fly).

        What if you wrote the correct tag name, but it gets inserted in the wrong place? What if the content is bad or a required attribute is missing?

        Of course, a schema can only be used to validate well-formed documents…

      • This is not XML, so really it should not be labelled as such. AFAICT it is pretty close to being SGML though. It might even be SGML: the XML declaration would be seen by an SGML parser as a regular PI, you can add a DTD when you parse the file, and unclosed tags and &\W are valid in SGML. You just need the characters to be in latin1.

        So why don't you go and pollute the SGML namespace ;--)

        Seriously, I don't think you should release your module in the XML namespace. An IDIF module would be OK, it its o

        • I have decided not to put this in the XML namespace. That much I agree on.

          I think my main problem with your module is the way you seem to advertise it, which sounds a bit like "let's generate more quasi XML to p.o. real XML guys" to me.

          I can see how that might appear to be what I was doing. I'll be sure to clarify that. Unfortunately, this is a real problem space that developers constantly face: legacy XML "variants" or third-party resources which require malformed XML. Since it's not always possib

          • down to the ordering of attributes

            I feel your pain, having had to deal with exactly that request for XML::Twig (not a model of XML purity itself): apparently some Microsoft tool needs attributes in a specific order. The easiest solution I found was to use Tie::IxHash objects to store the attributes.
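To show why ordering is a problem at all: a plain Perl hash does not remember insertion order, so attributes come out in an unpredictable sequence. Tie::IxHash gives a hash that does remember it. The sketch below uses a different, core-only technique (a flat list of name/value pairs) to get the same effect without a CPAN dependency; the helper names `start_tag` and `escape_attr` are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: escape the characters that would break a
# double-quoted attribute value.
sub escape_attr {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Hypothetical helper: build a start tag from a name and a flat list
# of attribute name/value pairs, keeping the order the caller gave.
sub start_tag {
    my ($name, @pairs) = @_;
    die "Odd number of attribute arguments\n" if @pairs % 2;
    my $out = "<$name";
    while (my ($attr, $value) = splice(@pairs, 0, 2)) {
        $out .= qq{ $attr="} . escape_attr($value) . q{"};
    }
    return "$out>";
}

# Attributes appear exactly in the order written here.
print start_tag(item => (id => 42, name => 'widget', class => 'a&b')), "\n";
```

The trade-off is that a list of pairs cannot be looked up by key the way a tied hash can, so it suits write-only generation better than editing an existing tree.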

      • Also, as far as I can tell, I would not be able to conveniently dump out data in the Yahoo! IDIF format. Here's a snippet:

        That is not even remotely well-formed XML (note the unclosed "br" tags, for example), but it's perfectly valid for Yahoo!'s IDIF format. Trying to produce a bunch of stuff like this with most XML modules is what finally led me to start writing my own. From what I can see, XML::Generator will not allow me to write that.

        I'm sure the usefulness of this information is long gone for you,

  • In Defense of Not-Invented-Here Syndrome

    Of course, Joel is fun to read because he writes a lot more confrontationally than the subject really warrants. I wrote something related recently.

  • What we need is less bad not-quite-XML and more good XML. For generating good XML, the libraries that guarantee producing correct XML are the way to go. My impression is that they are easier to use than the not-so-nice ones because you don't have to worry about screwing up. They do things like always encoding strings, always using UTF-8, always worrying about namespaces, and complaining about improperly nested tags.
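As a rough illustration of two of those guarantees (always escaping text, and refusing improperly nested tags), here is a minimal sketch of a strict writer. It is not the API of XML::Genx or any other real module; the package `TinyWriter` and its methods are invented for the example, and it ignores attributes, namespaces and encodings entirely.

```perl
use strict;
use warnings;

# Hypothetical package, for illustration only.
package TinyWriter;

sub new {
    my ($class) = @_;
    return bless { out => '', stack => [] }, $class;
}

# Open an element and remember it on the stack.
sub start {
    my ($self, $name) = @_;
    die "Bad element name '$name'\n"
        unless $name =~ /\A[A-Za-z_][\w.-]*\z/;
    push @{ $self->{stack} }, $name;
    $self->{out} .= "<$name>";
    return $self;
}

# Text is always escaped; the caller cannot opt out.
sub text {
    my ($self, $text) = @_;
    die "Text outside any element\n" unless @{ $self->{stack} };
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $self->{out} .= $text;
    return $self;
}

# Close an element; it must be the most recently opened one.
sub end {
    my ($self, $name) = @_;
    my $current = pop @{ $self->{stack} };
    die "No open element to close\n" unless defined $current;
    die "Expected </$current>, got </$name>\n" if $current ne $name;
    $self->{out} .= "</$name>";
    return $self;
}

# Refuse to hand back output while elements are still open.
sub result {
    my ($self) = @_;
    my @open = @{ $self->{stack} };
    die "Unclosed element <$open[-1]>\n" if @open;
    return $self->{out};
}

package main;

my $w = TinyWriter->new;
$w->start('item')->start('name')->text('Fish & Chips <large>')
  ->end('name')->end('item');
print $w->result, "\n";

# Mis-nested close tags are rejected rather than written out.
my $ok = eval { TinyWriter->new->start('a')->start('b')->end('a'); 1 };
print "rejected: $@" unless $ok;
```

A writer like this cannot emit an unclosed "br" at all, which is exactly why it is the wrong tool for a format that requires one.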

    If you have to generate not-quite-XML or bad XML, then you have a bigger problem. I am no

    • But there's plenty of bad XML out there already and there are programmers who have no choice but to implement it, particularly if it's a third party requirement. Usually this bad XML tends to cause plenty of problems. Why have even more problems by creating yet another hand-rolled module which may or may not do what you want? Programmers in this unfortunate situation should at least be able to get the job done and not waste time having to reimplement something.

      The good thing about Data::XML::Variant is