SOLVED

<div> tag not reported by SAX parser.

Former Community Member

When I add a component to a global pipeline, I am not able to catch events for <div> elements.

I am trying to rewrite certain attributes in markup generated by CQ5. For this I created a custom component and added this to a default global pipeline.

When I debug, I can see that events are fired for most tags, e.g. <script>, <a>, <body>, but <div> tags never reach the startElement() method.

    ...
    import org.apache.cocoon.xml.sax.AbstractSAXPipe;
    import org.apache.cocoon.xml.sax.AttributesImpl;
    import org.apache.sling.api.SlingHttpServletRequest;
    import org.apache.sling.rewriter.Transformer;
    import org.xml.sax.Attributes;
    import org.xml.sax.SAXException;

    public class CustomLinkTransformer extends AbstractSAXPipe implements Transformer {
        ...
        @Override
        public void startElement(String uri, String loc, String raw, Attributes attributes) throws SAXException {
            Attributes transformedAttr = attributes;
            // do some processing
        }
        ...
    }

In CustomLinkTransformer I set a breakpoint on the line

Attributes transformedAttr = attributes; 

so I can see every element passed to this method, and I do not see a single <div>.

It seems the SAX parser is not generating events for div elements.

Any ideas why?
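
For reference, the kind of attribute rewriting described above can be sketched with the JDK's own SAX classes. This is only an illustration under stated assumptions: it uses org.xml.sax.helpers.XMLFilterImpl and org.xml.sax.helpers.AttributesImpl in place of the Cocoon AbstractSAXPipe and the Sling Transformer from the question, and the class name AttributeRewriteSketch, the HrefPrefixFilter helper, and the "/ctx" prefix are hypothetical. It also shows that a plain SAX parser does report <div> on well-formed input, which suggests the missing events come from an earlier stage of the pipeline rather than from the transformer itself.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Stdlib-only sketch; NOT the Sling/Cocoon transformer from the question.
final class AttributeRewriteSketch {

    // Hypothetical filter: prefixes every href attribute, standing in for "do some processing".
    static final class HrefPrefixFilter extends XMLFilterImpl {
        private final String prefix;

        HrefPrefixFilter(XMLReader parent, String prefix) {
            super(parent);
            this.prefix = prefix;
        }

        @Override
        public void startElement(String uri, String loc, String raw, Attributes attributes)
                throws SAXException {
            // Copy the attributes, because the incoming object may be read-only or reused.
            AttributesImpl copy = new AttributesImpl(attributes);
            int index = copy.getIndex("href");
            if (index >= 0) {
                copy.setValue(index, prefix + copy.getValue(index));
            }
            // Pass the (possibly modified) event on to the next handler.
            super.startElement(uri, loc, raw, copy);
        }
    }

    // Parses the XML, applies the filter, and records each start tag seen downstream.
    static List<String> rewrite(String xml, String prefix) throws Exception {
        List<String> seen = new ArrayList<>();
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        HrefPrefixFilter filter = new HrefPrefixFilter(reader, prefix);
        filter.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String loc, String raw, Attributes atts) {
                String href = atts.getValue("href");
                seen.add(href == null ? raw : raw + " href=" + href);
            }
        });
        filter.parse(new InputSource(new StringReader(xml)));
        return seen;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<body><div><a href=\"/content/page.html\">x</a></div></body>";
        // prints [body, div, a href=/ctx/content/page.html]
        System.out.println(rewrite(xml, "/ctx"));
    }
}
```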

3 Replies

Employee

Which parser specifically are you using in your transformer pipeline?

Former Community Member

I am using AEM 5.6.1, so I assume a Cocoon parser is used internally (cocoon-xml-2.0.2.jar from the bundles view).

Basically, my implementation is very similar to the one you pointed me to in another post.

https://github.com/Adobe-Consulting-Services/acs-aem-commons/blob/master/bundle/src/main/java/com/ad...

Correct answer by
Employee

Hi,

I was actually referring to the generator property of your rewriter configuration node. I'm going to assume it is "htmlparser" which is the CQ-provided HTML parser.

By default, this parser only reports a handful of HTML tags. To change that, create a new node called "generator-htmlparser" under your rewriter configuration node (so something like /apps/myco/config/rewriter/default/generator-htmlparser). The node type should be nt:unstructured. On this node, create a multivalued string property named includeTags with the values A, /A, IMG, AREA, FORM, BASE, LINK, SCRIPT, BODY, /BODY, DIV, /DIV

(i.e. each of those is a separate value of the property).
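
To make the effect of includeTags concrete, here is a simplified model of include-tag filtering. It is a sketch, not the actual CQ htmlparser code: the class IncludeTagFilterSketch and its methods are hypothetical, and the default list is assumed to be the list above without DIV and /DIV.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Simplified, hypothetical model of include-tag filtering; NOT the CQ htmlparser implementation.
final class IncludeTagFilterSketch {

    // Assumption: the default is the list from this answer minus DIV and /DIV.
    static final List<String> ASSUMED_DEFAULT_TAGS = Arrays.asList(
            "A", "/A", "IMG", "AREA", "FORM", "BASE", "LINK", "SCRIPT", "BODY", "/BODY");

    private final Set<String> includeTags = new LinkedHashSet<>();

    IncludeTagFilterSketch(List<String> tags) {
        for (String tag : tags) {
            includeTags.add(tag.toUpperCase(Locale.ROOT));
        }
    }

    // Returns only the tags that would be reported as events; all others are skipped.
    List<String> reportedTags(List<String> encountered) {
        List<String> reported = new ArrayList<>();
        for (String tag : encountered) {
            if (includeTags.contains(tag.toUpperCase(Locale.ROOT))) {
                reported.add(tag);
            }
        }
        return reported;
    }

    public static void main(String[] args) {
        List<String> page = Arrays.asList("body", "div", "a", "/a", "/div", "/body");

        // With the assumed defaults, div never shows up.
        // prints [body, a, /a, /body]
        System.out.println(new IncludeTagFilterSketch(ASSUMED_DEFAULT_TAGS).reportedTags(page));

        // After adding DIV and /DIV, the div events are reported as well.
        List<String> extended = new ArrayList<>(ASSUMED_DEFAULT_TAGS);
        extended.add("DIV");
        extended.add("/DIV");
        // prints [body, div, a, /a, /div, /body]
        System.out.println(new IncludeTagFilterSketch(extended).reportedTags(page));
    }
}
```

In this model the opening and closing tags are listed separately, which mirrors why both DIV and /DIV appear in the property values.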

That should do the trick.

Justin