All the Perl that's Practical to Extract and Report

use Perl

TorgoX (1933)

TorgoX
  sburkeNO@SPAMcpan.org
http://search.cpan.org/~sburke/

"He is as beautiful as the retractility of the claws of birds of prey [...] and above all, as the chance meeting on a dissecting table of a sewing machine and an umbrella!" -- Lautréamont

Journal of TorgoX (1933)

Tuesday April 19, 2005
05:25 AM

A disable-output-escaping="yes" workaround

[ #24272 ]
Dear Log,

So I've been using XSL to make RSS pretty: the browser's XSL engine turns the feed into HTML, so it's all nice to look at and stuff.

But there's a problem, or a nasty Gordian knot of misfeatures that collectively de-synergize to make a problem that no single person bears the blame for. The problem is this: there's an option on several "interpolate content here" operators in XSL, called disable-output-escaping="yes", or DOEY for short. It's exactly what I want to use for the situation I'm in, namely: an RSS feed (=XML) contains escaped HTML, and I want to have the browser render that HTML. I want to use DOEY. But sometimes it doesn't work.

Yes, I know.

I know, I know. I shouldn't be escaping the markup, I should be using namespaced XHTML. But, folks, like Larry said, "the Golden Gate wasn't our fault either, but we still put a bridge across it". (And that applies even in cases where the Golden Gate might, arguably, be your fault!)

So, I want to use the DOEY feature, so I can basically say "drop in some stuff here, but not as literal text, but instead as whatever you'd get parsing it as markup" -- so that the given input ten-character string "&lt;br&gt;" doesn't produce the resulting four-character text string "<br>", but instead produces a resulting element node with the tagname of "br".

As far as I understand these things, that's just the sort of problem that DOEY is meant to solve. But the problem I'm facing is made of these parts:

  • Few XSL processors support DOEY -- not even Firefox's supports it.
  • And these DOEY-less processors, instead of yowling when they see DOEY being used (as in: <xsl:value-of disable-output-escaping="yes" select="description"/>), just ignore it -- i.e., disobey it, merrily interpolating the text just as if you'd said "interpolate the text, but like normal" (like a plain <xsl:value-of select="description"/>).
  • And, to add a final spot of glee to the situation, XSL provides a way to have processors report what features they support. But apparently, few processors support that way of reporting whether they support DOEY! Yay!

So, what to do.

I think and I think.

And, eventually, I come upon a brilliant idea:

Once the output text node has been produced (from the input text node that you wish the XSL processor had applied DOEY to), go use JavaScript to find the element containing that text node and do node.innerHTML = node.textContent; on it, forcing the browser to replace the text node with whatever it gets from parsing its current content as HTML source.
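A minimal sketch of that trick, for illustration only. The function name reparseAsHtml is made up here, and the sketch assumes that el is the element wrapping the text node (any object with DOM-style textContent and innerHTML properties will do).

```javascript
// Hypothetical helper: re-parse an element's literal text as HTML source.
// `el` is assumed to be the element that wraps the un-DOEY'd text node.
function reparseAsHtml(el) {
  // With a DOEY-less processor, textContent holds literal markup,
  // e.g. the four characters "<br>".
  const source = el.textContent;
  // Assigning to innerHTML makes the browser parse that string as markup,
  // replacing the text node with the resulting element nodes.
  el.innerHTML = source;
  return el;
}
```

In a browser this might be called as reparseAsHtml(document.getElementById("some-item")), where "some-item" stands in for whatever id the stylesheet gives the element.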

It's a brilliant idea that someone else had already come up with, as I find out later. But, 1) I implement it, and 2) I do it better, because I also come up with a way to figure out whether that JavaScript needs to apply at all! Because some XSL processors do implement DOEY, and in those cases, you don't want to do the node.innerHTML = node.textContent; thing.

What I do is this, very early in the XSL template that produces the HTML:

<html:div id="cometestme" style="display: none;">
  <xsl:text disable-output-escaping="yes">&amp;amp;</xsl:text>
</html:div>

Then I have the JavaScript look at the content of that (hidden) test node -- if its text content is a single character, an ampersand, then we know that the XSL processor actually implemented the DOEY feature, and so we don't need to do anything to "repair" any of the text nodes in this document. If, however, the text content of our test node is the five-character string "&amp;", then we know our wicked little XSL processor silently ignores the DOEY attribute, so we know to go attacking the elements in this document (doing x.innerHTML = x.textContent for each) whose text is liable to contain escaped markup.
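The check-then-repair logic might look roughly like this. It is a sketch, not the original script: the names doeyWasHonored and repairEscapedMarkup are invented, querySelectorAll is a modern convenience, and the selector passed in is an assumption about how the stylesheet marks the elements that hold escaped markup. Only the id "cometestme" comes from the template above.

```javascript
// True when the probe's text is the single character "&", i.e. the XSL
// processor obeyed disable-output-escaping="yes".
function doeyWasHonored(probeText) {
  // trim() is a guard against stray whitespace around the probe text
  // (an assumption; the template above should not produce any).
  return probeText.trim() === "&";
}

// Re-parse escaped markup only when the processor ignored DOEY.
// Returns the number of elements repaired.
function repairEscapedMarkup(doc, probeId, selector) {
  const probe = doc.getElementById(probeId);
  // Assumption: with no probe element we cannot tell, so we leave things alone.
  if (!probe || doeyWasHonored(probe.textContent)) {
    return 0;
  }
  let repaired = 0;
  for (const el of doc.querySelectorAll(selector)) {
    // The text holds literal markup such as "<br>"; parse it as HTML.
    el.innerHTML = el.textContent;
    repaired += 1;
  }
  return repaired;
}
```

A page could run repairEscapedMarkup(document, "cometestme", ".escaped-html") once the transformed document has loaded, where ".escaped-html" is a hypothetical class the stylesheet would have to put on each affected element.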

And that's just what I do, and it works.

Just. Like. That.