October 16, 2015
Solved

How to remove HTML tags from the content of Rich Text Editor?

For auto-tagging, I am storing the contents of the RTE in a string array and comparing it against the tag namespace. When I store the RTE contents in the array, I also get all the HTML tags, \n, &nbsp;, etc. I can remove these manually with the following code:

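// Remove tags; [^>]* also matches tags that span multiple lines
textEntered = textEntered.replaceAll("<[^>]*>", "");
// Turn line breaks into spaces
textEntered = textEntered.replace("\n", " ").replace("\r", " ");
// Replace non-breaking space entities with ordinary spaces
textEntered = textEntered.replace("&nbsp;", " ");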

Is there a more proper or more logical way to remove all the tags from the RTE content? Any help would be much appreciated. Thanks in advance.
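
For reference, the manual clean-up described in the question could be wrapped up roughly as follows. This is only a sketch: the class name RteRegexCleanup and the methods toPlainText and toWords are made up for illustration, and the sample markup is not real RTE output. Unlike the snippet above, it replaces each tag with a space so that adjacent paragraphs do not merge into one word, which also means a word with inline markup in the middle of it would be split.

```java
import java.util.Arrays;
import java.util.Locale;

public class RteRegexCleanup {

    // Hypothetical helper: strips tags and non-breaking spaces, then collapses whitespace.
    static String toPlainText(String html) {
        // Replace each tag with a space so "<p>a</p><p>b</p>" becomes "a b", not "ab".
        String text = html.replaceAll("<[^>]*>", " ");
        // Handle both the &nbsp; entity and the raw U+00A0 character.
        text = text.replace("&nbsp;", " ").replace('\u00A0', ' ');
        // \s+ covers \n, \r, tabs, and repeated spaces in one pass.
        return text.replaceAll("\\s+", " ").trim();
    }

    // Hypothetical helper: lower-cased words, ready to compare against tag names.
    static String[] toWords(String html) {
        String text = toPlainText(html);
        return text.isEmpty() ? new String[0] : text.toLowerCase(Locale.ROOT).split(" ");
    }

    public static void main(String[] args) {
        String rte = "<p>Adobe&nbsp;Experience\n<b>Manager</b></p>";
        System.out.println(Arrays.toString(toWords(rte)));
    }
}
```

A regex like this only removes things that look like tags. It does not decode other entities such as &amp; and can be confused by a > inside an attribute value, which is where a real HTML parser becomes the safer choice.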


3 replies

Chandra_gupta
Level 4
October 16, 2015

I think another way is to rewrite the HTML before you deliver it to the end browser.

http://jcr-nosql.com/2013/12/14/custom-rewriter-transformer-to-rewrite-any-html-output-generated-by-sling-rendering-process/

 

Thanks,

Chandra

Kunal_Gaba_
Accepted solution
October 16, 2015

You should use HTML parser libraries in Java for this use case. See an example here - http://jsoup.org/cookbook/extracting-data/attributes-text-html
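
To illustrate the parser-based approach without adding a dependency, here is a minimal sketch that uses the HTML parser bundled with the JDK (javax.swing.text.html) as a stand-in for jsoup. The class name RteParserCleanup and the method toPlainText are hypothetical, and the sample markup is invented. With jsoup on the classpath, the equivalent step is typically Jsoup.parse(html).text().

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

public class RteParserCleanup {

    // Hypothetical helper: lets the parser find the text nodes and collects only those.
    static String toPlainText(String html) {
        StringBuilder out = new StringBuilder();
        HTMLEditorKit.ParserCallback collector = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleText(char[] data, int pos) {
                // Tags never reach this callback; entities arrive already decoded.
                out.append(data).append(' ');
            }
        };
        try {
            new ParserDelegator().parse(new StringReader(html), collector, true);
        } catch (IOException e) {
            // Not expected when reading from an in-memory string.
            throw new UncheckedIOException(e);
        }
        // &nbsp; is decoded to U+00A0, so normalise it along with other whitespace.
        return out.toString().replace('\u00A0', ' ').replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) {
        System.out.println(toPlainText("<p>Adobe&nbsp;Experience\n<b>Manager</b></p>"));
    }
}
```

Because each text chunk is followed by a space, a word that has inline markup in the middle of it would be split in two. That seems an acceptable trade-off for matching whole words against tag names, but it is an assumption about the use case.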

edubey
Level 10
October 16, 2015

Hi

It seems that every time you get data from the rich text editor, your code removes all the HTML tags. That looks like a simple enough approach to me.

Still, if you want a better way, I would recommend customizing the rich text editor to suit your needs, as has been done here [1].

[1] http://experience-aem.blogspot.in/2014/02/aem-cq-56-extend-richtext-editor-add-new-plugin-pullquote.html