
SOLVED

Potential performance issues with large scale deactivation?


Level 5

So we have potentially ~300,000 nodes that we would need to remove from the publishers. This is likely to occur no more than once a quarter; the frequency could increase, but we are not talking about multiple times a day. The delete command is called from the author to all 4 publishers like so:

replicator.replicate(session, ReplicationActionType.DELETE, path, replicationOptions); 

although I understand I could use:

replicator.replicate(session, ReplicationActionType.DEACTIVATE, path, replicationOptions); 

I'm thinking the second might be better: wouldn't delete leave orphaned nodes below the node supplied in the "path" variable?

Would the removal/deactivation of these nodes cause enough load on the publishers to take them offline, or is the deactivation handled on a separate thread from the rendering of pages? If there is a risk to the site, do you have any suggestions to reduce it? We are considering moving the nodes that we need to delete on each publisher and then deleting them at a later date when the publishers are quiet, but that is a less desirable solution.

1 Accepted Solution


Correct answer by
Employee Advisor

orotas,

when you delete the root node of the subtree, the dispatcher will invalidate this node plus its siblings. The dispatcher also invalidates all of its child nodes plus their child nodes, as the invalidation is hierarchy-aware. Therefore you don't have to worry about the dispatcher. And in any case, you can do this dispatcher cache cleanup manually if you only execute this operation once a quarter.

Jörg


7 Replies


Level 8

For a publish server the repository action for a delete or a deactivate is the same: the publish server deletes the node and all its children. When you deactivate, the item is retained only on the author server.

Whether or not a deactivation of that scale will impact your publish servers is highly dependent on your environment, so with this limited information it's hard to predict. The best thing to do to limit the impact on the web site would be to put a delay into your code between deactivations: either a timed delay, or duplicate the behavior of the tree activation code, which monitors the replication queues and waits until the replication queue is empty before adding the next item, to prevent overloading the queue.
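The throttling idea described above could be sketched like this. The queue types here are simulated stand-ins, not the real `com.day.cq.replication` API (the real equivalents would be something like `Agent#getQueue()` and `Replicator#replicate(...)`), so treat this as a shape of the loop rather than drop-in code:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class ThrottledDeactivation {

    // Stand-in for AEM's replication queue; in AEM you would inspect
    // the real queue via the Agent API instead.
    static class SimulatedQueue {
        final Deque<String> entries = new ArrayDeque<>();
        final List<String> processed = new ArrayList<>();

        boolean isEmpty() { return entries.isEmpty(); }

        // Simulates the publish agent draining the queue.
        void drain() { while (!entries.isEmpty()) processed.add(entries.poll()); }

        void add(String path) { entries.add(path); }
    }

    // Deactivate paths one at a time, waiting for the queue to empty
    // between items so normal publishing can interleave.
    static List<String> deactivateThrottled(List<String> paths, SimulatedQueue queue) {
        for (String path : paths) {
            // In AEM this would be:
            // replicator.replicate(session, ReplicationActionType.DEACTIVATE, path, replicationOptions);
            queue.add(path);

            // Poll until the queue is empty before enqueuing the next path.
            // In reality: sleep briefly and re-check the agent's queue entries.
            while (!queue.isEmpty()) {
                queue.drain();
            }
        }
        return queue.processed;
    }

    public static void main(String[] args) {
        SimulatedQueue queue = new SimulatedQueue();
        List<String> done = deactivateThrottled(
                List.of("/content/a", "/content/b", "/content/c"), queue);
        System.out.println(done);
    }
}
```

Because each item waits for an empty queue, regular activations queued in between are processed before the next bulk deactivation is added, which is the point of the technique.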


Employee Advisor

When you just have a large subtree which you want to delete it's sufficient to replicate the delete operation for the root node. It will then delete all nodes below and update the status on authoring accordingly. No need to deactivate/delete each node individually.

Jörg


Level 5

OK, monitoring the queue before moving on to the next folder, with a delay to let the publisher recover, sounds like a good idea. Will look into that; hopefully the Agent API lets me see the queue size. Thanks for the idea... will report back.


Level 5

Although, thinking about it, this is going to block up our replication queue so new content can't be activated, so this might not be the best approach.


Level 8

What about the dispatcher? I assumed they were replicating everything in order to get the dispatcher to delete the items as well. If you just delete the parent node, wouldn't you also need to run a script to get the dispatcher to delete the cached versions?


Level 8

That's why you monitor the queue and only add the next item once the queue is empty. If you just dump everything on the queue immediately, it will block your ability to publish. But if you add an item to the queue and then wait until it's empty, normal publishing isn't impacted, because new content gets added to the queue and executed before you add your next item.
