How to read a CDATA node with XPath? | Community
April 8, 2011
Question

How to read a CDATA node with XPath?


I'm using Workbench ES2 (9.0) to build a process, and I'm trying to use XPath to read a CDATA node from an XML variable.

But when I do, I get the whole serialized element back instead of the plain text value I get when I read a text node without CDATA. How can I read this data? Any node in the document could have CDATA in it.

xml document

<properties>
  <generation>
    <inputGenDocs>
      <inputGenDoc><![CDATA[\\myserver.com\input\200100647\IFI_Attachments2_V2_Outlier & Historical Data IFI #575.TIF]]></inputGenDoc>
    </inputGenDocs>
  </generation>
</properties>

xpath

/process_data/xmlProperties/properties/generation/inputGenDocs/inputGenDoc[number(/process_data/@numDocIdx)]

actual result

<?xml version="1.0" encoding="UTF-8"?>
<inputGenDoc><![CDATA[\\myserver.com\input\200100647\IFI_Attachments2_V2_Outlier & Historical Data IFI #575.TIF]]></inputGenDoc>

expected result (if not wrapped in CDATA this is the returned value)

\\myserver.com\input\200100647\IFI_Attachments2_V2_Outlier & Historical Data IFI #575.TIF

thanks.


4 replies

October 12, 2012

Hi J_Dev_77,

Did you manage to find a solution to your problem? I'm facing the same issue.

So far I've found two solutions:

- Escape all characters in my CDATA to obtain a "clean" encoded string (with &lt; instead of <, etc.)

- Use XSLT to extract the CDATA value

Regards,

Thomas

April 11, 2013

Hi J_Dev_77,

We 'solved' it the following way:

substring-before(substring-after(serialize(xpath_to_cdata_node, false), "CDATA["), "]]")

I know it's ugly, but it gets the job done without having to use additional steps in the process.
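
For anyone who wants to see what that expression does to the string, here is a rough Python sketch of the same substring logic. It assumes serialize() hands back the node as markup with the CDATA section intact, as in the "actual result" from the question. The helper name extract_cdata is made up for illustration and is not part of LiveCycle.

```python
def extract_cdata(serialized: str) -> str:
    """Mimic substring-before(substring-after(s, "CDATA["), "]]")."""
    # Like XPath substring-after: text after the first "CDATA[", or "" if absent.
    _, found, after = serialized.partition("CDATA[")
    if not found:
        return ""
    # Like XPath substring-before: text before the first "]]", or "" if absent.
    before, found, _ = after.partition("]]")
    return before if found else ""


# Assumed shape of the serialized node, copied from the question.
serialized = (
    r"<inputGenDoc><![CDATA[\\myserver.com\input\200100647"
    r"\IFI_Attachments2_V2_Outlier & Historical Data IFI #575.TIF]]></inputGenDoc>"
)
print(extract_cdata(serialized))
```

Two things to keep in mind with this approach: it only picks up the first CDATA section in the node, and it returns an empty string for nodes that are not wrapped in CDATA at all, so it is not a drop-in for "any node could have CDATA".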

Cheers

WP

J_Dev_77 (Author)
April 11, 2013

I escaped the offending characters in the XML file

"<" --> "&lt;"

">" --> "&gt;"

"&" --> "&amp;"

"\"" --> "&quot;"

"'" --> "&apos;"

This approach was a bit tricky, since the ampersand must be escaped when it stands by itself but must NOT be escaped when it is already part of an escape sequence. Our data was extremely "open", and we couldn't rule out that these escape sequences already existed before we tried to replace the other unescaped characters.

Hindsight being 20/20, it might have been easier to use XSLT to strip away the CDATA tags in the LiveCycle process. Take a look at Willem-Paul's solution. It's a fairly lengthy XPath statement, but potentially a lot less work than what I did.
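
To make the ampersand problem concrete, here is a minimal Python sketch of one way to escape only the "bare" ampersands. This is not the code we actually ran; the function name escape_once and the regular expression are assumptions for illustration, and it is meant to be applied to the text content of a node, not to the whole XML file (that would escape the markup too).

```python
import re

# An ampersand that does NOT already start one of the five predefined
# entities or a numeric character reference.
BARE_AMP = re.compile(r"&(?!(?:lt|gt|amp|quot|apos|#[0-9]+|#x[0-9A-Fa-f]+);)")


def escape_once(text: str) -> str:
    """Escape XML special characters, leaving existing escape sequences alone."""
    # Ampersands go first, so the entities added below are not re-escaped.
    text = BARE_AMP.sub("&amp;", text)
    text = text.replace("<", "&lt;").replace(">", "&gt;")
    text = text.replace('"', "&quot;").replace("'", "&apos;")
    return text


print(escape_once("Outlier & Historical Data"))
print(escape_once("already &amp; escaped < not yet"))
```

Running the function twice gives the same result as running it once, which is the property we needed. The catch is that it cannot tell whether a "&lt;" already in the data was meant as an escape sequence or as those four literal characters, so truly ambiguous input stays ambiguous.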

Something like this I think:

substring-before(substring-after(serialize(/process_data/xmlProperties/properties/generation/inputGenDocs/inputGenDoc[number(/process_data/@numDocIdx)], false), "CDATA["), "]]")

April 11, 2013

Hi guys,

Just found out there is another, easier way. A post on Stack Overflow put me on to this: simply use the XSLT component in a process to 'convert' the incoming XML to the same XML. This automagically converts all CDATA sections to encoded text nodes, which of course can then be accessed using XPath without any problems.

I used the exact xslt given in the answer in the XSLT component and it works like a charm.

Here's the direct link to the answer on stackoverflow: http://stackoverflow.com/a/10543572
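
If you want to see the effect outside of LiveCycle, the sketch below shows the same idea with Python's standard library. To be clear, this is neither the XSLT from the Stack Overflow answer nor the XSLT component; it is only an analogy under the assumption that a parse-and-reserialize round trip behaves like the identity transform here. The parser hands the CDATA content over as ordinary text, and writing the document back out produces escaped text with no CDATA section.

```python
import xml.etree.ElementTree as ET

# Sample document from the question, collapsed onto fewer lines.
source = (
    "<properties><generation><inputGenDocs>"
    r"<inputGenDoc><![CDATA[\\myserver.com\input\200100647"
    r"\IFI_Attachments2_V2_Outlier & Historical Data IFI #575.TIF]]></inputGenDoc>"
    "</inputGenDocs></generation></properties>"
)

root = ET.fromstring(source)

# The parser exposes the CDATA content as the element's plain text.
value = root.findtext("generation/inputGenDocs/inputGenDoc")

# Re-serializing writes that text back escaped, without a CDATA section.
rewritten = ET.tostring(root, encoding="unicode")

print(value)
print(rewritten)
```

After the round trip the ampersand in the path shows up as &amp; in the markup, while the value you read from the node is the plain path.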

Cheers

WP