SOLVED

Problem with <link> tag in rss; something is causing it to self close

We have custom logic which produces rss output.

We're having an issue where one tag in the RSS output, <link>, has its opening tag rendered as a self-closing tag, which makes the RSS invalid:

<link/>https://example/article.html</link>

instead of:

<link>https://example/article.html</link>

Something in the system appears to be rewriting the opening <link> tag specifically, because if we change <link> to something arbitrary, like:

<foo>https://example/article.html</foo>

the opening <foo> tag remains as it should (it does not change to <foo/>).

We also tried changing the <link> tag to <img> and saw the same behavior as with <link>: the opening <img> tag was self-closed, giving <img/>https://example/article.html</img>.

Maybe this is a clue, since link and img tags are usually self-closing in HTML?
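
To make that hunch concrete: link and img are both void elements in HTML, title is an HTML element that is not void, and pubDate and foo are not HTML elements at all. Below is a minimal Java sketch of the hypothesis, assuming that whatever is rewriting the output serializes it by HTML rules. The class and method names are made up for illustration and are not taken from AEM.

```java
import java.util.Locale;
import java.util.Set;

public class VoidElementCheck {
    // Void elements as listed in the HTML Living Standard. They can never have
    // content or an end tag, so an HTML-aware serializer may emit them as <name/>.
    static final Set<String> HTML_VOID_ELEMENTS = Set.of(
        "area", "base", "br", "col", "embed", "hr", "img",
        "input", "link", "meta", "source", "track", "wbr");

    // HTML tag names are case-insensitive, so compare in lower case.
    static boolean isHtmlVoid(String tagName) {
        return HTML_VOID_ELEMENTS.contains(tagName.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        // The tags tried in this post.
        for (String tag : new String[] {"link", "img", "title", "pubDate", "foo"}) {
            String verdict = isHtmlVoid(tag) ? "void in HTML" : "not void in HTML";
            System.out.println(tag + " -> " + verdict);
        }
    }
}
```

If the hypothesis holds, the tags reported as void here are exactly the ones that come out self-closed in the feed, which would point at an HTML-oriented rewriting step rather than at the RSS markup itself.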

One other thing: though feeds are requested via paths like /foo/bar/articles.rss, we actually proxy this to /foo/bar/articles.rss.html on the back end, because that is required for it to work. Without this proxying (i.e. without the added .html), calling /foo/bar/articles.rss directly gives a 404.

Calling the feed through the proxy (i.e. through Apache and AEM Dispatcher) with curl -I https://domain.tld/foo/bar/articles.rss, I get a 200 and can see that Content-Type: application/rss+xml;charset=utf-8 is set. The feed is delivered, but with the incorrect <link/> tags.
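
Since curl -I only returns headers, confirming the problem means looking at the response body. Below is a small Java sketch of a check that could be run against the feed text after each configuration change. The class name, method name, and regex are assumptions for illustration, and the sample strings are the ones from this post.

```java
import java.util.regex.Pattern;

public class FeedLinkCheck {
    // Matches a bare self-closed <link/> (whitespace before the slash allowed).
    // It deliberately does not match <atom:link href="..."/>, which is
    // legitimately self-closed in many feeds.
    private static final Pattern SELF_CLOSED_LINK = Pattern.compile("<link\\s*/>");

    // Returns true if the feed body contains the broken form of the tag.
    static boolean hasSelfClosedLink(String feedXml) {
        return SELF_CLOSED_LINK.matcher(feedXml).find();
    }

    public static void main(String[] args) {
        String broken = "<link/>https://example/article.html</link>";
        String expected = "<link>https://example/article.html</link>";
        System.out.println("broken form detected: " + hasSelfClosedLink(broken));
        System.out.println("expected form flagged: " + hasSelfClosedLink(expected));
    }
}
```

Running the same check against both the proxied URL and the back-end .rss.html URL would show whether the rewriting happens inside AEM or somewhere in front of it.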

Some other interesting things: <title></title> tags render properly, and so do <pubDate></pubDate> tags. I guess <title> is an example of an HTML tag that is not self-closing, and <pubDate> isn't an HTML tag at all, so it behaves like <foo> did when we tried it: it worked as desired.

I tried disabling link rewriting and link checking (in /system/console/configMgr - Day CQ Link Checker Transformer), deleted the /var/linkchecker contents, and restarted the instance.

I'm wondering what else in the system might be applying this behavior to the output?

More info: in other code that does generate RSS with working <link> tags, I notice the JSP sets the content type with <%@page contentType="application/rss+xml"%>. Our code, though, is Sightly/HTL. I wonder whether including a JSP (with data-sly-include) that does <%@page contentType="application/rss+xml"%> would work, or would it just print this in the body? Is there some other way to set the content type in Sightly/HTL?

1 Accepted Solution

Correct answer by
Community Advisor

Hi @this-that-the-otter 

 

Can you check if any link checker or transformer is doing this?

 

Related link: https://medium.com/@toimrank/link-checker-and-transformer-381d4f245d12

 

Thanks,

Kiran Vedantam.
