Crx2oak migration from AEM 6.1 (all in segment store) to AEM 6.3 (separate DataStore & Segment Store) and push to AWS S3

Mishelini

28-04-2020

Hi Folks, 

 

I have to do a PoC for using an AWS S3 bucket as an external datastore. The instance I have runs AEM 6.1 with the default CRX2 TarMK (everything in the segment store).

 

From what I have read and understood, S3 works only for the data store, so to accomplish my task I have to:

  1. Use the crx2oak migration tool to migrate/upgrade the AEM 6.1 repo to the newer AEM 6.3 layout with a separate Segment Node Store and Data Store
  2. Run the crx2oak tool a second time to "push" the DataStore to the S3 bucket (assuming the S3 connector is already configured).

I have challenges with step 1. When I start the migration tool with:

 

 

/opt/jdk1.8.0_191/bin/java -jar  crx2oak.jar --copy-binaries --src-datastore=/home/me/snap/author/crx-quickstart/repository/segmentstore --datastore=/home/me/new_repo/datastore segment-old:/home/me/snap/author/crx-quickstart/repository /home/me/new_repo

 

 

where /home/me/snap/author/crx-quickstart/repository/segmentstore is the location of the current all-in-one TarMK segment store;

           /home/me/new_repo/datastore is where I want the new Data Store to go;

           /home/me/new_repo is the directory for the new repo

the migration process somehow freezes shortly after start:

 

# /opt/jdk1.8.0_191/bin/java -jar  crx2oak.jar --copy-binaries --src-datastore=/home/me/snap/author/crx-quickstart/repository/segmentstore --datastore=/home/me/new_repo/datastore segment-old:/home/me/snap/author/crx-quickstart/repository /home/me/new_repo
09:23:46.636 INFO   c.a.g.c.CRX2Oak: started with args: [--copy-binaries, --src-datastore=/home/me/snap/author/crx-quickstart/repository/segmentstore, --datastore=/home/me/new_repo/datastore, segment-old:/home/me/snap/author/crx-quickstart/repository, /home/me/new_repo]
09:23:46.715 INFO   c.a.g.c.c.VersionPrinter: CRX2Oak version: 1.10.0 (STANDALONE mode)
09:23:47.004 INFO   c.a.g.c.c.VersionPrinter: crx2oak.jar (version: 1.10, checksum: 8582adb2b4f999866d4b24a48364295c4665e1ab9ea164460e122f555dd3e6421a81f55e74b45d72b86777079b7e48a29e2b8e6703c2a31d3b772e115743bfa5)
09:23:47.014 INFO   c.a.g.c.p.ProfileHandler: Applying partly the command line (before loading a profile): [--copy-binaries, --src-datastore=/home/me/snap/author/crx-quickstart/repository/segmentstore, --datastore=/home/me/new_repo/datastore, segment-old:/home/me/snap/author/crx-quickstart/repository, /home/me/new_repo]
09:23:47.015 INFO   c.a.g.c.p.ProfileHandler: The following template tags has been defined: {}
09:23:47.016 INFO   c.a.g.c.p.ProfileHandler: The command line (after loading a profile): [--copy-binaries, --src-datastore, /home/me/snap/author/crx-quickstart/repository/segmentstore, --datastore, /home/me/new_repo/datastore, segment-old:/home/me/snap/author/crx-quickstart/repository, /home/me/new_repo]
09:23:47.018 INFO   c.a.g.c.c.MigrationSpecGenerator: The effective command line for migration: [--copy-binaries, --src-datastore, /home/me/snap/author/crx-quickstart/repository/segmentstore, --datastore, /home/me/new_repo/datastore, segment-old:/home/me/snap/author/crx-quickstart/repository, /home/me/new_repo]
09:23:50.256 INFO   o.a.j.o.p.s.f.FileStore: TarMK ReadOnly opened: /home/me/snap/author/crx-quickstart/repository/segmentstore (mmap=false)

 

 

And it stays like this forever... I ran this in the background and waited for a day, with no change.

I ran the migration on a mounted LVM snapshot made from the AEM data dir while the instance was stopped.
I can't find any other info in any logs. Nothing points me to a problem.

Any ideas what may be the problem?

I would also appreciate guidance on the migration approach to S3.
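
When a run stalls with nothing in the logs, as described above, a few JVM thread dumps taken some seconds apart can show what the main thread is doing. The sketch below is only an illustration: it assumes the `jstack` tool that ships with the JDK used in this thread (`/opt/jdk1.8.0_191/bin/jstack`), and the helper name `take_dumps` and the output file names are hypothetical.

```shell
# Path to jstack; assumed to sit next to the java binary used in the thread.
JSTACK="${JSTACK:-/opt/jdk1.8.0_191/bin/jstack}"

# take_dumps PID [COUNT] [INTERVAL_SECONDS] [OUTDIR]
# Writes COUNT thread dumps of process PID, INTERVAL_SECONDS apart,
# to OUTDIR/threaddump-PID-N.txt.
take_dumps() {
  pid="$1"
  count="${2:-3}"
  interval="${3:-10}"
  outdir="${4:-.}"
  i=1
  while [ "$i" -le "$count" ]; do
    "$JSTACK" "$pid" > "$outdir/threaddump-$pid-$i.txt" || return 1
    if [ "$i" -lt "$count" ]; then
      sleep "$interval"
    fi
    i=$((i + 1))
  done
}

# Example (the PID can be looked up with: pgrep -f crx2oak):
#   take_dumps 12345 3 10 /tmp
```

If the "main" thread shows the same stack in every dump, that stack is a reasonable starting point for a support ticket or a bug report.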

Accepted Solutions (1)

Jörg_Hoh

Employee

30-04-2020

Hi,

 

Your 6.1 instance doesn't have a datastore, so you must not provide the "src-datastore" parameter. You only have a segmentstore, so the final command should look like this:

crx2oak.jar --copy-binaries  --s3datastore=/home/me/new_repo/datastore --s3config=.... segment-old:/home/me/snap/author/crx-quickstart/repository /home/me/new_rep

(you need to adapt the parameter "--s3config" according to https://jackrabbit.apache.org/oak/docs/migration.html)

 

You can either add the parameter "--trace" to dump all log statements to the console, or use "--log-level TRACE" or "--log-level DEBUG" to cause crx2oak to add more information to a file upgrade.log
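
As a sketch of the logging suggestion above, the snippet below assembles the plain upgrade command from this thread with `--log-level` added. The flag and its values come from the reply above; its position on the command line, the helper name `run_crx2oak`, and the `DRY_RUN` switch are assumptions made for illustration. By default it only prints the command.

```shell
# Paths as used in this thread; adjust for your environment.
JAVA=/opt/jdk1.8.0_191/bin/java
JAR=crx2oak.jar
SRC=segment-old:/home/me/snap/author/crx-quickstart/repository
DST=/home/me/new_repo

# run_crx2oak [LEVEL]
# LEVEL is DEBUG (default) or TRACE, as named in the reply above.
# With DRY_RUN=1 (the default) the command is printed, not executed.
run_crx2oak() {
  level="${1:-DEBUG}"
  set -- "$JAVA" -jar "$JAR" --log-level "$level" "$SRC" "$DST"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Example:
#   run_crx2oak TRACE              # print the command
#   DRY_RUN=0 run_crx2oak TRACE    # actually run it
```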

 

 

Hi Jörg,

Thanks for the response.
I tried the suggested approach, but I keep getting the same behavior. The migration freezes at:

05.05.2020 06:56:20.711 INFO   o.a.j.o.p.s.f.FileStore: TarMK ReadOnly opened: /home/me/snap/author/crx-quickstart/repository/segmentstore (mmap=false)

I get the same result even if I try a simple "upgrade" (CRX2->Oak, according to https://jackrabbit.apache.org/oak/docs/migration.html) with:

# /opt/jdk1.8.0_191/bin/java -jar  crx2oak.jar segment-old:/home/me/snap/author/crx-quickstart/repository /home/me/new_repo
05.05.2020 07:16:25.489 INFO   c.a.g.c.CRX2Oak: started with args: [segment-old:/home/me/snap/author/crx-quickstart/repository, /home/me/new_repo]
05.05.2020 07:16:25.566 INFO   c.a.g.c.c.VersionPrinter: CRX2Oak version: 1.10.0 (STANDALONE mode)
05.05.2020 07:16:25.849 INFO   c.a.g.c.c.VersionPrinter: crx2oak.jar (version: 1.10, checksum: 8582adb2b4f999866d4b24a48364295c4665e1ab9ea164460e122f555dd3e6421a81f55e74b45d72b86777079b7e48a29e2b8e6703c2a31d3b772e115743bfa5)
05.05.2020 07:16:25.857 INFO   c.a.g.c.p.ProfileHandler: Applying partly the command line (before loading a profile): [segment-old:/home/me/snap/author/crx-quickstart/repository, /home/me/new_repo]
05.05.2020 07:16:25.858 INFO   c.a.g.c.p.ProfileHandler: The following template tags has been defined: {}
05.05.2020 07:16:25.859 INFO   c.a.g.c.p.ProfileHandler: The command line (after loading a profile): [segment-old:/home/me/snap/author/crx-quickstart/repository, /home/me/new_repo]
05.05.2020 07:16:25.860 INFO   c.a.g.c.c.MigrationSpecGenerator: The effective command line for migration: [segment-old:/home/me/snap/author/crx-quickstart/repository, /home/me/new_repo]
05.05.2020 07:16:29.041 INFO   o.a.j.o.p.s.f.FileStore: TarMK ReadOnly opened: /home/me/snap/author/crx-quickstart/repository/segmentstore (mmap=false)

which should just transform the repo into Oak format at /home/me/new_repo.

If I run a similar command with a tool version < 1.6.x and Java 1.7, it works, but it only copies the repository (versions < 1.6 don't support the upgrade to Oak):

java -jar crx2oak-1.4.6-standalone.jar /home/me/snap/author/crx-quickstart/repository /home/me/new_repo
05.05.2020 07:24:43.007 [main] *INFO*  org.apache.jackrabbit.oak.upgrade.cli.parser.MigrationOptions - DataStore needs to be shared with new repository
05.05.2020 07:24:43.009 [main] *INFO*  org.apache.jackrabbit.oak.upgrade.cli.parser.MigrationOptions - copyVersions parameter set to 1969-12-31
05.05.2020 07:24:43.010 [main] *INFO*  org.apache.jackrabbit.oak.upgrade.cli.parser.MigrationOptions - copyOrphanedVersions parameter set to 1969-12-31
05.05.2020 07:24:43.010 [main] *INFO*  org.apache.jackrabbit.oak.upgrade.cli.parser.MigrationOptions - Cache size: 256 MB
05.05.2020 07:24:43.014 [main] *INFO*  org.apache.jackrabbit.oak.upgrade.cli.parser.StoreArguments - Source: SEGMENT[/home/me/snap/author/crx-quickstart/repository]
05.05.2020 07:24:43.016 [main] *INFO*  org.apache.jackrabbit.oak.upgrade.cli.parser.StoreArguments - Destination: SEGMENT[/home/me/new_repo]
05.05.2020 07:24:43.018 [main] *INFO*  org.apache.jackrabbit.oak.upgrade.cli.parser.StoreArguments - Using Oak segment format V_11 - please make sure your version of AEM supports that format
05.05.2020 07:24:43.018 [main] *INFO*  org.apache.jackrabbit.oak.upgrade.cli.parser.StoreArguments - Requires Oak 1.0.12, 1.1.7 or later
05.05.2020 07:24:43.202 [main] *INFO*  org.apache.jackrabbit.oak.upgrade.cli.parser.StoreArguments - Source blob store: DummyBlobStore
05.05.2020 07:24:47.884 [main] *INFO*  org.apache.jackrabbit.oak.plugins.segment.file.FileStore - TarMK opened: /home/me/snap/author/crx-quickstart/repository/segmentstore (mmap=false)
05.05.2020 07:24:47.974 [main] *INFO*  org.apache.jackrabbit.oak.plugins.segment.file.FileStore - TarMK opened: /home/me/new_repo/segmentstore (mmap=false)
05.05.2020 07:24:59.312 [main] *INFO*  org.apache.jackrabbit.oak.upgrade.RepositorySidegrade - Copying node #10000: /home/users/H/HP5sB4sp7e5RCWrAY3Jg/activities/dam/2017/11/15/b62a77ba-de49-4f71-a71e-988846c76589/object
05.05.2020 07:25:12.079 [main] *INFO*  org.apache.jackrabbit.oak.upgrade.RepositorySidegrade - Copying node #20000: /home/users/H/HP5sB4sp7e5RCWrAY3Jg/activities/dam/2018/08/03/aab2a585-c32e-49fb-8669-51b86f5bc5fa/target

 

These logs say "TarMK opened"; there is no "TarMK ReadOnly opened" as with tool versions > 1.6.

When I use the migration tool (v. 1.6 and higher) to migrate or copy repos that are already in Oak format, it also works fine.

I suspect the problem is related to the "upgrade" functionality from CRX2 to Oak format.
Could you please review?

Answers (1)

akhoury

30-04-2020

This issue sounds similar to a known issue with crx2oak on AEM 6.3.

 

Please try running the same command using the latest oak-upgrade 1.6.20 jar from here instead of the crx2oak jar:

https://repo1.maven.org/maven2/org/apache/jackrabbit/oak-upgrade/1.6.20/

 

To migrate to S3 from file datastore, the easiest way is to do the following:

1. Run your oak-upgrade or crx2oak migration with --datastore, as you have done.

2. On the destination repository instance, remove all FileDataStore-related config files from crx-quickstart/install (and from crx-quickstart/launchpad/config, to be safe), and make sure your SegmentNodeStoreService OSGi config still has customBlobStore set to true.
3. Add your S3 related jars and configs per the official documentation: https://helpx.adobe.com/experience-manager/6-3/sites/deploying/using/data-store-config.html#AmazonS3...

4. Start the destination instance; it will automatically migrate to S3 in the background.
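
Step 2 above can be checked before starting the instance. The following is a minimal sketch, not an official tool: it assumes the config files carry "FileDataStore" and "SegmentNodeStoreService" in their file names and that the SegmentNodeStoreService config lives under crx-quickstart/install. The function name `check_destination` is hypothetical, and `grep --include` assumes GNU or BSD grep.

```shell
# check_destination QUICKSTART_DIR
# Returns 0 when no FileDataStore config is left and a
# SegmentNodeStoreService config sets customBlobStore to true.
check_destination() {
  qs="$1"
  status=0

  # Leftover FileDataStore configs in either folder named in step 2.
  leftovers=$(find "$qs/install" "$qs/launchpad/config" \
      -type f -name '*FileDataStore*' 2>/dev/null)
  if [ -n "$leftovers" ]; then
    echo "FileDataStore config still present:"
    echo "$leftovers"
    status=1
  fi

  # customBlobStore must still be true (accepts B"true", "true" or true).
  if ! grep -rqsE 'customBlobStore[[:space:]]*=[[:space:]]*(B?"true"|true)' \
      --include='*SegmentNodeStoreService*' "$qs/install"; then
    echo "customBlobStore=true not found in a SegmentNodeStoreService config"
    status=1
  fi

  return "$status"
}

# Example:
#   check_destination /path/to/destination/crx-quickstart && echo "looks ready for S3 configs"
```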

Hi Akhoury,
Unfortunately, I'm getting the same behavior. I've tried different versions and variations of the migration tool: crx2oak-1.10.0-all-in-one.jar, crx2oak.jar, oak-upgrade-1.18.0.jar. The result is always the same:

/opt/jdk1.8.0_191/bin/java -jar oak-upgrade-1.6.20.jar segment-old:/home/me/snap/author/crx-quickstart/repository /home/me/new_repo
05.05.2020 07:51:42.544 [main] *INFO*  org.apache.jackrabbit.oak.upgrade.cli.parser.MigrationOptions - copyVersions parameter set to 1969-12-31
05.05.2020 07:51:42.546 [main] *INFO*  org.apache.jackrabbit.oak.upgrade.cli.parser.MigrationOptions - copyOrphanedVersions parameter set to 1969-12-31
05.05.2020 07:51:42.546 [main] *INFO*  org.apache.jackrabbit.oak.upgrade.cli.parser.MigrationOptions - Cache size: 256 MB
05.05.2020 07:51:42.549 [main] *INFO*  org.apache.jackrabbit.oak.upgrade.cli.parser.StoreArguments - Source: SEGMENT[segment-old:/home/me/snap/author/crx-quickstart/repository]
05.05.2020 07:51:42.551 [main] *INFO*  org.apache.jackrabbit.oak.upgrade.cli.parser.StoreArguments - Destination: SEGMENT_TAR[/home/me/new_repo]
05.05.2020 07:51:46.019 [main] *INFO*  org.apache.jackrabbit.oak.plugins.segment.file.FileStore - TarMK ReadOnly opened: /home/me/snap/author/crx-quickstart/repository/segmentstore (mmap=false)