SOLVED

URL on a page

Level 4

Team, 

Any suggestions on how to retrieve a list of URLs for PDF links that are hosted on an external server and appear on an HTML page in AEM?

Edit:

Use case - our authors are adding PDF links to their content (using Content Fragment - RTE), and those PDFs are hosted on SharePoint/Teamsite servers. The business team is looking for a way to create a report that shows these links within the rendered HTML. I'd really appreciate any suggestions you might have on how to make this happen!

 

Thanks

1 Accepted Solution

Correct answer by
Community Advisor

Hi @nj2 

This is a very basic example of how you can extract the list. If you need a solution that authors can use themselves, create a landing page backed by a servlet that returns the results as Excel/CSV.

// Hosts to search for. These are placeholders; use your real SharePoint/Teamsite hosts.
def externalSharepointLink = "www.sharepoint.com"
def externalPdfLink = "www.pdf.com"
def searchRoot = "/content/dam/cf"

// Build a JCR-SQL2 query for nodes whose [text] property contains one of the hosts.
// [text] is only an example; add OR clauses for every RTE or link property in your model.
def buildQuery(String root, List<String> hosts) {
  def queryManager = session.workspace.queryManager
  def conditions = hosts.collect { "text.[text] LIKE '%" + it + "%'" }.join(" OR ")
  def statement = "SELECT * FROM [nt:unstructured] AS text " +
      "WHERE ISDESCENDANTNODE(text, '" + root + "') AND (" + conditions + ")"
  queryManager.createQuery(statement, "JCR-SQL2")
}

// Read the node iterator once into a list so it can be counted and then printed.
def query = buildQuery(searchRoot, [externalSharepointLink, externalPdfLink])
def nodes = query.execute().nodes.toList()

println searchRoot + " nodes with external links: " + nodes.size()
nodes.each { node ->
  println node.path
}
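
The query only lists the nodes that contain a matching host. For the Excel/CSV report you would still need to pull the actual URLs out of each RTE value. A minimal Java sketch of that step follows, assuming the RTE value is available as an HTML string. The class ExternalLinkReport and its methods are hypothetical helpers, not part of any AEM API, and a regex is only a rough fit for HTML, so treat it as a starting point.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of any AEM API.
class ExternalLinkReport {

    // Matches href="..." or href='...' and captures the URL.
    private static final Pattern HREF =
            Pattern.compile("href\\s*=\\s*[\"']([^\"']+)[\"']", Pattern.CASE_INSENSITIVE);

    // Returns every href in the HTML that contains one of the given hosts.
    static List<String> extractLinks(String html, List<String> hosts) {
        List<String> links = new ArrayList<>();
        Matcher matcher = HREF.matcher(html);
        while (matcher.find()) {
            String url = matcher.group(1);
            for (String host : hosts) {
                if (url.contains(host)) {
                    links.add(url);
                    break; // Avoid adding the same link once per matching host.
                }
            }
        }
        return links;
    }

    // Wraps a CSV field in quotes and doubles any embedded quotes.
    static String csvField(String value) {
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }

    // One CSV row per link: node path first, then the URL.
    static String toCsvRow(String nodePath, String url) {
        return csvField(nodePath) + "," + csvField(url);
    }

    public static void main(String[] args) {
        // Sample RTE markup and node path, made up for this example.
        String html = "<p>See <a href=\"https://www.sharepoint.com/docs/guide.pdf\">guide</a>"
                + " and <a href=\"/content/site/en.html\">home</a></p>";
        List<String> hosts = List.of("www.sharepoint.com", "www.pdf.com");
        for (String url : extractLinks(html, hosts)) {
            System.out.println(toCsvRow("/content/dam/cf/sample/jcr:content/data/master", url));
        }
    }
}
```

In a servlet, the same rows could be written to the response with a text/csv content type, one row per node path and link.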

 



Arun Patidar

8 Replies

Community Advisor

@nj2 If the external service exposes a REST API that you can tap into, you can create an OSGi service that fetches the PDF links from the external server through a REST call.

Once you have this data, you can use it on your page via an AEM component backed by a Sling Model whose fields hold the REST response data.

Some useful reading on this:

 https://experienceleaguecommunities.adobe.com/t5/adobe-experience-manager/how-to-call-3rd-part-rest-...

https://medium.com/@codeandtheory/invoke-rest-services-in-aem-the-right-way-c5fb0af43afe
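
As a rough illustration of the REST call such a service would make, here is a minimal sketch using the JDK's java.net.http client (Java 11+). The class PdfLinkClient, the endpoint URL, and the assumption that the server answers with JSON are all hypothetical, and the linked articles may use a different HTTP client. The OSGi and Sling Model wiring is left out on purpose.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical client for illustration; an OSGi service could wrap logic like this.
class PdfLinkClient {

    private final HttpClient client = HttpClient.newBuilder()
            .connectTimeout(Duration.ofSeconds(5))
            .build();

    // Builds a GET request for the (assumed) endpoint that lists the PDF links.
    static HttpRequest buildRequest(String endpoint) {
        return HttpRequest.newBuilder(URI.create(endpoint))
                .timeout(Duration.ofSeconds(10))
                .header("Accept", "application/json")
                .GET()
                .build();
    }

    // Sends the request and returns the raw response body, failing on any non-200 status.
    String fetch(String endpoint) throws IOException, InterruptedException {
        HttpResponse<String> response =
                client.send(buildRequest(endpoint), HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected status " + response.statusCode());
        }
        return response.body();
    }
}
```

A Sling Model would then call something like fetch(...) through the service and expose the parsed result to the component.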

 

Level 4

Thanks for your response. I've updated the original question to better match the real situation.

Community Advisor

Hello @nj2 

 

Please check whether the JavaScript code at the link below helps.

https://www.datablist.com/learn/scraping/extract-urls-from-webpage

 

It extracts all URLs from a webpage. You can probably customize it to extract only PDFs.
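
Narrowing an extracted URL list down to PDFs could look roughly like the Java sketch below. PdfLinkFilter is a hypothetical helper, and it assumes a PDF link is one whose path ends in .pdf, which will miss SharePoint links that serve a PDF through a query parameter or a viewer page.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Hypothetical helper for illustration.
class PdfLinkFilter {

    // True when the URL's path (ignoring query string and fragment) ends in .pdf.
    static boolean isPdf(String url) {
        try {
            String path = new URI(url.trim()).getPath();
            return path != null && path.toLowerCase(Locale.ROOT).endsWith(".pdf");
        } catch (URISyntaxException e) {
            return false; // Malformed URLs are skipped rather than reported.
        }
    }

    // Keeps only the URLs that look like PDF links.
    static List<String> onlyPdfs(List<String> urls) {
        return urls.stream().filter(PdfLinkFilter::isPdf).collect(Collectors.toList());
    }
}
```

Parsing with URI instead of a plain endsWith check on the whole string keeps links such as .../Guide.PDF?web=1 in the result.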


Aanchal Sikka

Community Advisor

Hi,

This is possible with a Groovy script.

Check the link components and RTE fields, and look for values that match the external host.



Arun Patidar

Level 4

Thank you for the solution. Could you provide some additional details or elaborate on it further, please?

Level 4

Thank you for your quick response. However, I am already familiar with Groovy scripts. My question was about the solution approach proposed by @arunpatidar, which I didn't quite understand at first.
