Exporting Content into 3rd party search
Hey All,
I am trying to figure out the best way to export pages and their content to a third-party search provider. The data needs to look like this:
{
  "op": "add",
  "path": "/path/to/page",
  "value": {
    "attributes": {
      "title": "Page Title",
      "url": "https://www.website.com/path/to/page.html",
      "description": "This is basically all of the content on the page. So if there are 2 different text areas on the page, it should put that content inside this description."
    }
  }
}
I know I can ask the page for its properties, including a description, but that doesn't account for the content of the components on the page. For example, let's say I put a new text component on the page and add text that I want to be searchable; that text wouldn't be picked up. I started to look into jsoup (plus HttpClient, since the site is a SPA) to crawl the page, but is that the best option?
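To make the target concrete, here is a rough sketch in plain Java (no libraries) of building one such operation once the text from every component has been collected. The class name, the method names, and the idea of passing the component text in as a list are just placeholders of mine. How that text gets collected (crawling versus reading the content directly) is the part I'm unsure about.

```java
import java.util.List;

public class SearchPayloadSketch {

    // Escape the characters that must be escaped inside a JSON string.
    static String escapeJson(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters use the \\uXXXX form.
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    // Build one "add" operation. textBlocks is assumed to hold the text of
    // each text component on the page, in page order (hypothetical input).
    static String buildAddOperation(String path, String title, String url,
                                    List<String> textBlocks) {
        // All component text is joined into the single description field.
        String description = String.join(" ", textBlocks);
        return "{"
            + "\"op\":\"add\","
            + "\"path\":\"" + escapeJson(path) + "\","
            + "\"value\":{\"attributes\":{"
            + "\"title\":\"" + escapeJson(title) + "\","
            + "\"url\":\"" + escapeJson(url) + "\","
            + "\"description\":\"" + escapeJson(description) + "\""
            + "}}}";
    }

    public static void main(String[] args) {
        String json = buildAddOperation(
            "/path/to/page",
            "Page Title",
            "https://www.website.com/path/to/page.html",
            List.of("Text from the first text area.",
                    "Text from the second text area."));
        System.out.println(json);
    }
}
```

In a real project I would probably use a JSON library instead of hand-built strings; the point here is only that every text component ends up concatenated into the description.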
Thanks to anyone who has an opinion.
