You can identify this based on the page's replication properties, such as cq:lastReplicated, cq:lastReplicatedBy, or cq:lastReplicationAction.
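When a page has never been published, these properties are missing from /content/pagepath/jcr:content. A page that was published and later unpublished usually keeps them, with cq:lastReplicationAction set to Deactivate.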
As soon as the author publishes the page, the system adds these properties to the jcr:content node automatically.
Thus, you can write validation code around these properties to distinguish fresh pages from pages that have already been published.
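As a rough illustration of such a check, here is a minimal sketch in plain Java. It assumes you have already read the properties of the page's jcr:content node into a Map (in AEM, the ValueMap of the jcr:content resource is a Map<String, Object>, so it can be passed in directly). The class and method names are hypothetical, and the "Activate" value for cq:lastReplicationAction is an assumption you should confirm on your own instance.

```java
import java.util.Calendar;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of the AEM API.
final class ReplicationCheck {
    static final String LAST_REPLICATED = "cq:lastReplicated";
    static final String LAST_REPLICATION_ACTION = "cq:lastReplicationAction";

    // True when jcr:content carries no replication timestamp at all,
    // i.e. the page looks like it was never published.
    static boolean isNeverPublished(Map<String, Object> contentProps) {
        return !contentProps.containsKey(LAST_REPLICATED);
    }

    // True when a replication timestamp exists and the last recorded
    // action was an activation (assumed value: "Activate").
    static boolean isPublished(Map<String, Object> contentProps) {
        return contentProps.containsKey(LAST_REPLICATED)
                && "Activate".equals(contentProps.get(LAST_REPLICATION_ACTION));
    }

    public static void main(String[] args) {
        // Stand-in for the properties of a freshly created page.
        Map<String, Object> fresh = new HashMap<>();
        fresh.put("jcr:title", "New page");

        // Stand-in for the properties of a page an author has published.
        Map<String, Object> live = new HashMap<>(fresh);
        live.put(LAST_REPLICATED, Calendar.getInstance());
        live.put("cq:lastReplicatedBy", "author-user");
        live.put(LAST_REPLICATION_ACTION, "Activate");

        System.out.println("fresh never published: " + isNeverPublished(fresh));
        System.out.println("live published: " + isPublished(live));
    }
}
```

Keeping the check on a plain Map means the same logic can be unit tested without a running repository.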
You can also use the jcr:created property of the page to determine its age relative to the current timestamp and filter new or existing pages based on your requirement. cq:lastModified can be used to check the last activity on the page.
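The age comparison could look something like the sketch below. It assumes jcr:created has already been read as a java.util.Calendar (the usual Java type for a JCR DATE property); the class name, method names, and the seven-day threshold are hypothetical and only there for illustration.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;

// Hypothetical helper, not part of the AEM API.
final class PageAgeCheck {

    // Time elapsed between the page's creation and the given instant.
    static Duration ageOf(Calendar created, Instant now) {
        return Duration.between(created.toInstant(), now);
    }

    // A page counts as "new" when its age does not exceed maxAge.
    static boolean isNewPage(Calendar created, Instant now, Duration maxAge) {
        return ageOf(created, now).compareTo(maxAge) <= 0;
    }

    public static void main(String[] args) {
        // Stand-in for the value of jcr:created: three days before "now".
        Instant now = Instant.now();
        Calendar created = new GregorianCalendar(TimeZone.getTimeZone("UTC"));
        created.setTimeInMillis(now.minus(Duration.ofDays(3)).toEpochMilli());

        // Example threshold of seven days; adjust to your requirement.
        System.out.println("new page: " + isNewPage(created, now, Duration.ofDays(7)));
    }
}
```

The same comparison works for cq:lastModified if you want to filter on recent activity instead of creation time.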