SOLVED

<div> tag not reported by SAX parser.

Former Community Member

When I add a component to the global pipeline, I am not able to catch events for <div> elements.

I am trying to rewrite certain attributes in the markup generated by CQ5. For this I created a custom transformer component and added it to the default global pipeline.

When I debug, I can see that events are fired for most tags, e.g. <script>, <a>, <body>, but <div> tags never reach the startElement() method.

    ...
    import org.apache.cocoon.xml.sax.AbstractSAXPipe;
    import org.apache.cocoon.xml.sax.AttributesImpl;
    import org.apache.sling.api.SlingHttpServletRequest;
    import org.apache.sling.rewriter.Transformer;
    import org.xml.sax.Attributes;
    import org.xml.sax.SAXException;

    public class CustomLinkTransformer extends AbstractSAXPipe implements Transformer {
        ...
        @Override
        public void startElement(String uri, String loc, String raw, Attributes attributes) throws SAXException {
            Attributes transformedAttr = attributes;
            // do some processing
        }
        ...
    }

In CustomLinkTransformer I set a breakpoint on the line

    Attributes transformedAttr = attributes;

so I can see every element passed to this method, and not a single <div> shows up.

It seems the SAX parser is not generating events for div elements.

Any ideas why?

1 Accepted Solution

Correct answer by
Employee

Hi,

I was actually referring to the generator property of your rewriter configuration node. I'm going to assume it is "htmlparser" which is the CQ-provided HTML parser.

By default, this parser only parses a handful of HTML tags. To change this, create a new node called "generator-htmlparser" under your rewriter configuration node (so something like /apps/myco/config/rewriter/default/generator-htmlparser). The node type should be nt:unstructured. On this node, create a multivalued string property named includeTags and set its values to A, /A, IMG, AREA, FORM, BASE, LINK, SCRIPT, BODY, /BODY, DIV, /DIV

(i.e. each of those is a separate value of the property, not one comma-separated string).

That should do the trick.

Justin
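
To illustrate why the transformer never sees <div> until it is listed, here is a minimal sketch using only the JDK's SAX classes. It assumes the generator behaves like a filter that emits start/end events only for tags named in includeTags; IncludeTagsFilter and seenStartTags are hypothetical names invented for this illustration and are not part of CQ or Sling, and the real htmlparser may handle excluded tags differently.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

public class IncludeTagsDemo {

    /** Hypothetical stand-in for a generator that reports only the listed tags. */
    static class IncludeTagsFilter extends XMLFilterImpl {
        private final Set<String> includeTags;

        IncludeTagsFilter(XMLReader parent, String... includeTags) {
            super(parent);
            this.includeTags = new HashSet<>(Arrays.asList(includeTags));
        }

        @Override
        public void startElement(String uri, String loc, String raw, Attributes atts)
                throws SAXException {
            // Forward the start event only if the tag name (e.g. "DIV") is listed.
            if (includeTags.contains(raw.toUpperCase(Locale.ROOT))) {
                super.startElement(uri, loc, raw, atts);
            }
        }

        @Override
        public void endElement(String uri, String loc, String raw) throws SAXException {
            // End tags are listed separately with a leading slash (e.g. "/DIV").
            if (includeTags.contains("/" + raw.toUpperCase(Locale.ROOT))) {
                super.endElement(uri, loc, raw);
            }
        }
    }

    /** Returns the element names a downstream startElement() would receive. */
    static List<String> seenStartTags(String xhtml, String... includeTags) throws Exception {
        List<String> seen = new ArrayList<>();
        XMLReader parser = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        IncludeTagsFilter filter = new IncludeTagsFilter(parser, includeTags);
        filter.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String loc, String raw, Attributes atts) {
                seen.add(raw);
            }
        });
        filter.parse(new InputSource(new StringReader(xhtml)));
        return seen;
    }

    public static void main(String[] args) throws Exception {
        String markup = "<body><div class=\"c\"><a href=\"/content/x\">x</a></div></body>";
        // Without DIV in the list, the handler never sees the div element.
        System.out.println(seenStartTags(markup, "A", "/A", "BODY", "/BODY"));
        // prints [body, a]
        // With DIV and /DIV added, the div events reach the handler.
        System.out.println(seenStartTags(markup, "A", "/A", "BODY", "/BODY", "DIV", "/DIV"));
        // prints [body, div, a]
    }
}
```

In the sketch the downstream handler plays the role of CustomLinkTransformer.startElement(): it only receives what the upstream stage chooses to report, which is why adding DIV and /DIV to includeTags makes the missing events appear.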

3 Replies

Employee

Which parser specifically are you using in your transformer pipeline?

Former Community Member

I am using AEM 5.6.1, so I assume a Cocoon parser is used internally (cocoon-xml-2.0.2.jar in the bundles view).

Basically, my implementation is very similar to the one you pointed me to in another post.

https://github.com/Adobe-Consulting-Services/acs-aem-commons/blob/master/bundle/src/main/java/com/ad...
