How to determine whether a Page is an AEM Page or Non-AEM page

sumanmaparu

02-10-2018

Hi,

I have an AEM AngularJS single-page application. Is there any way to determine in the AEM code whether a page is an AEM page or an Angular page?

Replies

Arun_Patidar

MVP

02-10-2018

Hi,

In AEM, a page is created from a template, so you can check which template was used to create the page.

Based on the template, you can identify whether the page was created using a page template, is just an HTML snippet, or is a snippet created using another template.

sumanmaparu

02-10-2018

Hi,

Maybe I should elaborate more on the issue.

My homepage is https://mysite.com/gb/en/. This page is in AEM.

There is a link on my homepage that, when clicked, goes to https://mysite.com/gb/en/my-link. This is not an AEM page. It does not exist in AEM; it is created by AngularJS, but it gets its title, description, resourceType and template from https://mysite.com/gb/en/, i.e. the homepage.

Now I have a requirement to update the title and description of pages like https://mysite.com/gb/en/my-link, which do not exist in AEM.

But I am unable to figure out how to do that. How can I identify whether https://mysite.com/gb/en/my-link is a non-AEM page?

Arun_Patidar

MVP

02-10-2018

Hi,

You need to identify something that distinguishes an AEM page from a non-AEM page.

As you said:

There is a link on my homepage that, when clicked, goes to https://mysite.com/gb/en/my-link. This is not an AEM page. It does not exist in AEM; it is created by AngularJS, but it gets its title, description, resourceType and template from https://mysite.com/gb/en/, i.e. the homepage.

How does it do that? Are you using a kind of SPA architecture, where only the content changes but the page remains the same? Or are you setting all those properties via code?

sumanmaparu

02-10-2018

I want to update the title and description of https://mysite.com/gb/en/my-link. Currently AngularJS does some funny stuff and updates the meta info (e.g. title and description), and the title of the page (https://mysite.com/gb/en/my-link) is updated in the browser tab title area. But when I view the source of https://mysite.com/gb/en/my-link, I see the title of the homepage, i.e. https://mysite.com/gb/en, which was set in AEM when the page was created.

I want to update the title of https://mysite.com/gb/en/my-link so that the Google crawler picks it up.

Arun_Patidar

MVP

02-10-2018

Hi,

View source displays the HTML as it was initially loaded. If a script modifies the DOM after load, you can see the changes in the browser's DOM tree but not in view source.

Yes, for SEO you need the correct title to be in the HTML that the server returns, rather than set by a script after the page loads.
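To make the "set it before load" idea concrete, here is a minimal Java sketch that swaps the title and meta description into the HTML before it leaves the server. Everything in it is an assumption for illustration: the class name `RouteMetadataRewriter`, the `RouteMeta` holder, and the route map keyed by request path are hypothetical, and how you would hook this into your AEM page rendering is not shown.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: rewrites <title> and the meta description for known SPA routes. */
final class RouteMetadataRewriter {

    /** Title and description for one client-side route (illustrative holder). */
    static final class RouteMeta {
        final String title;
        final String description;

        RouteMeta(String title, String description) {
            this.title = title;
            this.description = description;
        }
    }

    private static final Pattern TITLE = Pattern.compile(
            "<title>.*?</title>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Assumes the attribute order name=... content=... used in the sample markup.
    private static final Pattern DESCRIPTION = Pattern.compile(
            "<meta\\s+name=\"description\"\\s+content=\"[^\"]*\"\\s*/?>",
            Pattern.CASE_INSENSITIVE);

    /** Escapes the characters that are unsafe inside HTML text and attribute values. */
    static String escapeHtml(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    /** Returns the HTML unchanged when the request path is not a known SPA route. */
    static String rewrite(String html, String requestPath, Map<String, RouteMeta> routes) {
        RouteMeta meta = routes.get(requestPath);
        if (meta == null) {
            return html;
        }
        String withTitle = TITLE.matcher(html).replaceFirst(
                Matcher.quoteReplacement("<title>" + escapeHtml(meta.title) + "</title>"));
        return DESCRIPTION.matcher(withTitle).replaceFirst(
                Matcher.quoteReplacement("<meta name=\"description\" content=\""
                        + escapeHtml(meta.description) + "\">"));
    }
}
```

Calling `RouteMetadataRewriter.rewrite(html, "/gb/en/my-link", routes)` returns the homepage markup with the route's own title and description, so view source and crawlers see the same values the browser tab shows.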

Check the article below:

Common SEO problems of Single Page Applications

There’s a lot of talk about how well Google can handle JavaScript when it comes to crawling and indexation.

Crawling and indexing is critical to ranking.

Google discovers web pages using software called Googlebot during a very fast process often called “crawling” or “spidering”, during which it downloads an HTML file it finds, extracts the links and visits them simultaneously, and then sends the downloaded resources to the indexer.

But when it comes to a JavaScript-based single page application website, the process gets a bit more complicated.

It’s like the process noted above, but there’s a delay and extra step involved because part of the indexer must do some heavy lifting by parsing and executing the JavaScript, and the new links found then must be passed back to the crawler to look at and then sent back to the indexer; you can see that this is less efficient because of the JavaScript.

SEO is more than just having “great content” and earning high-quality links; it’s also about making your web pages easy to discover by search engines like Google and making it simple for them to know which pages are more important than other pages via internal linking.

A “traditional” HTML-based site is far easier to crawl and index, and by extension, rank. Google can get all the links easily and see what the importance of each page is via internal linking.

A JavaScript-based SPA website makes Google’s life more difficult, and some testing would seem to indicate that there may be downsides when relying on JavaScript for purposes of indexation.

Google is evidently willing to do the extra heavy lifting here, and to my mind that indicates that they’ll improve over time rather than announce to webmasters in the future that they have decided they don’t want to bother with the extra work required to crawl and index JavaScript-based sites.

Another potential SEO problem related to the extra work to discover links is that Google may have issues with evaluating the link equity of those pages.

It’s likely that in time, at least some of the SPA frameworks in popular use will evolve the rendering process to make it easier for Google to crawl and index, perhaps even making it on par with “traditional” HTML-based websites.

But in the meantime, we’re where we are and those who’ve tested how well Google can handle JavaScript-based sites have shown that Google’s ability is inconsistent, and we’re also still in a place where those who have developed SPAs frequently must use workarounds, for example using prerender.io along with Angular to serve fully-rendered pages to the crawler.
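The prerender workaround depends on telling crawlers apart from regular browsers. As a rough sketch only, the Java below checks the User-Agent header against a short token list. The class `CrawlerDetector`, the token list, and the variant names are assumptions made for illustration; this is not prerender.io's API, and a real deployment would use a maintained list.

```java
import java.util.List;
import java.util.Locale;

/** Hypothetical helper: decides whether a request should get prerendered HTML. */
final class CrawlerDetector {

    // Illustrative and deliberately incomplete list of crawler user-agent tokens.
    private static final List<String> BOT_TOKENS =
            List.of("googlebot", "bingbot", "duckduckbot");

    /** True when the User-Agent header contains one of the known crawler tokens. */
    static boolean isCrawler(String userAgent) {
        if (userAgent == null) {
            return false;
        }
        String ua = userAgent.toLowerCase(Locale.ROOT);
        return BOT_TOKENS.stream().anyMatch(ua::contains);
    }

    /** Chooses between the fully rendered snapshot and the normal SPA shell. */
    static String chooseVariant(String userAgent) {
        return isCrawler(userAgent) ? "prerendered" : "spa";
    }
}
```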

Another solution is isomorphic JavaScript, sometimes called “Universal JavaScript”, where a page can be generated on the server and sent to the browser, which can immediately render and display the page. This solves the SEO issues as Google doesn’t have to execute and render the JavaScript in the indexer.

Headless Chrome is another option recently proposed as an easy solution by a Google engineer, who also mentions another solution called Preact, which ships with server-side rendering.

It’s also a good idea to create a properly formatted XML Sitemap and submit that to Google Search Console.
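For a sense of what such a sitemap contains, here is a small Java sketch that builds one from a list of URLs, including Angular routes that have no page in AEM. The class name `SitemapBuilder` is hypothetical, and only the `loc` element is emitted; optional elements such as `lastmod` are left out.

```java
import java.util.List;

/** Hypothetical helper: builds a minimal XML sitemap from absolute URLs. */
final class SitemapBuilder {

    /** Escapes the five characters that XML requires to be entity-encoded. */
    static String escapeXml(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;")
                .replace("'", "&apos;");
    }

    /** Emits one <url><loc>...</loc></url> entry per URL, in the given order. */
    static String build(List<String> urls) {
        StringBuilder sb = new StringBuilder();
        sb.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        sb.append("<urlset xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\">\n");
        for (String url : urls) {
            sb.append("  <url><loc>").append(escapeXml(url)).append("</loc></url>\n");
        }
        sb.append("</urlset>\n");
        return sb.toString();
    }
}
```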

Right now, there doesn’t appear to be any single solution or a paint-by-numbers approach to handling the problems you may encounter if you’re an SEO assisting a client with launching or redeveloping a website using an SPA.

It boils down to effectively communicating the correct end result that’s needed, and dealing with issues as they’re presented based on the library or framework being deployed.

sumanmaparu

03-10-2018

Hi,

Thanks for your valuable input. I am going to propose that to the team, but is there anything that can be done on the AEM side to identify whether a page exists in AEM, given that its parent page is in AEM?

raj_mandalapu

03-10-2018

Currently AngularJS does some funny stuff and updates the meta info (e.g. title and description) and the title of the page (https://mysite.com/gb/en/my-link).

Where are you getting the title and description from to update the Angular pages?

Approach 1:

If you have many pages to update, write a script that traverses all anchor links and checks whether each target is an AEM page by looking at its sling:resourceType, then generates a report in Excel form. The report will help you quickly identify which pages are AEM pages and which are not.

Approach 2:

If there are only a few pages, you can write a servlet that takes a page path as a parameter and returns the non-AEM anchor links, using the same resourceType logic as above.

Approach 3:

Check whether the Angular pages are logically segregated. If they are, you don't need the logic above; you can go and update them directly.
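Approaches 1 and 2 share the same core logic, which can be sketched roughly as below. This is plain Java with no AEM dependency: the repository is stood in for by a map from path to sling:resourceType, and the class `LinkReport`, the map keys, and the resource type value are all hypothetical. In a real instance the lookup would go through the Sling resource resolver (for example, `ResourceResolver.getResource(path)` returns null when nothing exists at the path), and you would need to map public URLs to content paths first.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: classifies anchor links as AEM or non-AEM pages. */
final class LinkReport {

    // Captures the href value of <a ...> tags; only double-quoted attributes are handled.
    private static final Pattern HREF = Pattern.compile(
            "<a\\b[^>]*?\\bhref=\"([^\"]+)\"", Pattern.CASE_INSENSITIVE);

    /** Returns every anchor href in document order. */
    static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            links.add(m.group(1));
        }
        return links;
    }

    /**
     * Maps each link to "AEM" when the stand-in repository has a resource type
     * for it, and to "non-AEM" otherwise.
     */
    static Map<String, String> classify(String html, Map<String, String> resourceTypeByPath) {
        Map<String, String> report = new LinkedHashMap<>();
        for (String link : extractLinks(html)) {
            report.put(link, resourceTypeByPath.containsKey(link) ? "AEM" : "non-AEM");
        }
        return report;
    }

    /** Renders the report as CSV, which Excel can open directly. */
    static String toCsv(Map<String, String> report) {
        StringBuilder sb = new StringBuilder("link,type\n");
        for (Map.Entry<String, String> e : report.entrySet()) {
            sb.append(e.getKey()).append(',').append(e.getValue()).append('\n');
        }
        return sb.toString();
    }
}
```

For a homepage that links to `/gb/en/about` (present in the map) and `/gb/en/my-link` (absent), the report marks the first as AEM and the second as non-AEM. A servlet version would return the same map as its response instead of writing a CSV.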