Authors: Jaemi Bremner, Avinash Kumar Gupta, Arun Singh, Nishant Sinha, and Jody Arthur
This article provides a behind-the-scenes look at how Adobe Experience Platform is using event-driven automation to improve reliability and stability.
In today’s world of microservices and loosely coupled software modules, events are often what binds components together to deliver the desired functionality. Software components generate events, and the actions taken in response to those events define how the components behave. These events might be signals generated by users making requests, failure indications sent by underlying computing components, or applications flagging potential issues caused by different inputs.
Even when developers have access to the generated events and know the sequence of actions required to complete a process, they must write a significant amount of code to implement them. Not surprisingly, this resource-intensive process has enterprises looking at available technologies to automate these complex sequences of actions.
Enter event-driven automation (EDA). EDA offers enterprises the ability to increase efficiency in their operations by replacing manual processes with automated workflows.
Event-driven automation defined
Event-driven automations are computer programs written to “listen” for and respond to events generated by the user or the system. Applications rely on programming that separates event-processing logic from the rest of their code. With EDA, an event can be any identifiable occurrence that has significance for the workflow it is designed for. Examples include events caused by a large volume of user-generated requests and system-generated events such as a program failing to load, sensor outputs, or messages from individual threads.
EDA is accomplished through sensors that listen for events and then trigger a potentially complex set of actions that run either sequentially or in parallel. These actions form a workflow in which values derived from one set of actions are passed to a subsequent set of actions based on specified conditions or predetermined criteria. The actions can be written in any programming language to improve responsiveness, throughput, and flexibility in a given workflow.
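To make the sensor, action, and workflow relationship concrete, here is a minimal sketch in Python. It is an illustration only, not the implementation used at Adobe: the Sensor class, the event name disk_space_alert, the threshold values, and the three action functions are all hypothetical. It shows one sequential step whose derived value decides whether two further actions run in parallel.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Event:
    """An identifiable occurrence, with a name and arbitrary payload."""
    name: str
    payload: Dict[str, Any] = field(default_factory=dict)


# Hypothetical actions. Each takes the workflow context and returns new values.
def check_disk(ctx: Dict[str, Any]) -> Dict[str, Any]:
    return {"over_threshold": ctx["used_pct"] > ctx.get("threshold", 90)}


def purge_temp_files(ctx: Dict[str, Any]) -> Dict[str, Any]:
    return {"purged": True}  # a real action would free disk space here


def notify_team(ctx: Dict[str, Any]) -> Dict[str, Any]:
    return {"notified": True}  # a real action would send a message here


def disk_workflow(event: Event) -> Dict[str, Any]:
    """Run one sequential step, then two parallel steps if a condition holds."""
    ctx = dict(event.payload)
    ctx.update(check_disk(ctx))  # sequential: derive a value
    if ctx["over_threshold"]:  # condition on the derived value
        with ThreadPoolExecutor() as pool:  # parallel: independent actions
            futures = [pool.submit(action, dict(ctx))
                       for action in (purge_temp_files, notify_team)]
            for future in futures:
                ctx.update(future.result())
    return ctx


class Sensor:
    """Listens for named events and triggers the workflows registered for them."""

    def __init__(self) -> None:
        self._workflows: Dict[str, List[Callable[[Event], Dict[str, Any]]]] = {}

    def on(self, event_name: str,
           workflow: Callable[[Event], Dict[str, Any]]) -> None:
        self._workflows.setdefault(event_name, []).append(workflow)

    def emit(self, event: Event) -> List[Dict[str, Any]]:
        return [wf(event) for wf in self._workflows.get(event.name, [])]


sensor = Sensor()
sensor.on("disk_space_alert", disk_workflow)
results = sensor.emit(Event("disk_space_alert", {"used_pct": 95}))
```

In this sketch the event-processing logic (the sensor) stays separate from the actions themselves, so new workflows can be registered without changing the code that listens for events.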
EDA offers endless possibilities for improving workflows
At Adobe Experience Platform, we’re exploring how EDA can be used to analyze operational patterns and develop mechanisms to address bottlenecks in our processes. One of the biggest advantages of using EDA for workflows is that it works around the clock without human intervention.
Adobe teams are using EDA internally for several kinds of workflows: auto-remediating alerts received from systems and applications, scaling applications in proportion to user-generated load, remediating security issues, and informing teams about the ongoing health of the system. So far, our teams have identified a wide variety of use cases, including:
Auto-remediation of Resque job deadlock and Sidekiq job failures
Selective remediation of Solr collections
Health check and pipeline restart while messages are queuing up
Recovery of quarantined streaming segments in Siphon Stream
Auto-remediating lag while copying data in MirrorMaker processes
Scheduled and manual recovery of failed Kafka messages
Autoscaling of Kafka brokers under high load
Auto-remediation of alerts coming from Nagios such as journal threads, OOM, disk space, etc.
Detection of vulnerable security policies
Improving CSO’s problem management through auto-remediation
To successfully implement a workflow using EDA, teams must do two things: first, detect the events on which they want actions to be taken, and then identify the appropriate action sequence. The EDA system also needs access to the environment where the automation is to be executed, which requires the networking layer to meet the requirements of both the application and EDA.
While EDA is expected to provide higher uptime and better reliability to the applications that use it, it is even more important for EDA itself to be highly available. Its hosting architecture needs to ensure that the EDA system scales with the load and has redundancy built in, so that it remains available if one part of the system fails.
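The two steps described here, detecting an event and identifying its action sequence, can be sketched as a small rule table. This is a hedged illustration in Python, not Adobe's actual configuration: the Rule type, the alert fields (type, value), the thresholds, and the action names are all assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Sequence, Tuple

Alert = Dict[str, Any]


@dataclass(frozen=True)
class Rule:
    """Pairs an event detector with the ordered actions to run on a match."""
    name: str
    matches: Callable[[Alert], bool]  # step 1: detect the event
    actions: Tuple[str, ...]          # step 2: the action sequence


# Hypothetical rules; thresholds and action names are illustrative only.
RULES: List[Rule] = [
    Rule(
        name="queue_backlog",
        matches=lambda a: a.get("type") == "queue_depth"
        and a.get("value", 0) > 10_000,
        actions=("health_check", "restart_pipeline"),
    ),
    Rule(
        name="disk_space",
        matches=lambda a: a.get("type") == "disk_space"
        and a.get("value", 0) >= 90,
        actions=("rotate_logs", "notify_on_call"),
    ),
]


def plan_actions(alert: Alert, rules: Sequence[Rule] = RULES) -> List[str]:
    """Return the action sequence of the first matching rule, or an empty list."""
    for rule in rules:
        if rule.matches(alert):
            return list(rule.actions)
    return []


plan = plan_actions({"type": "queue_depth", "value": 25_000})
```

Keeping detection and action sequences in data like this, rather than in scattered conditionals, is one way to let teams add a new automation by adding a rule instead of rewriting control flow.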
Adobe Experience Platform uses an API-first design to make all of its functions available to developers for use with Adobe Experience Platform services, Adobe solutions, and third-party applications.
EDA honors the design principles prescribed by Adobe Experience Platform (AEP) and gives AEP developers the freedom to develop in an open manner. Our developers are free to write their own code and workflows to provide actions in response to the events they wish to address.
Early results have been very successful. With one of the first automations in the production environment, built for the social use case, AEP developers executed approximately 900 workflows within the first two months, averting more than 30 potential major outages.
Figure 2: Event Driven Automation — Social Use Case for POC
Automation creates new opportunities for innovation
Innovation has long been recognized as a key driver of success. Virtually any type of mundane, repetitive task or set of tasks currently handled by people could unleash new levels of creativity and productivity if automated.
Here at Adobe, we know that when we optimize our workflows, we optimize our people — freeing them to focus on what they do best. With several event-driven workflows currently under development and more on the way, we are improving the stability and reliability of Adobe Experience Platform and giving our developers more time to focus on developing new and innovative solutions for our customers.