If you think about how we have progressed since the first part of this series, you will notice a pattern: each installment abstracts a concept and discusses it at a higher level than the last. We discussed transmission-layer protocols, then application-layer protocols, and then messages, each a higher level of abstraction over the one before. We also discussed how the Producer-Consumer Problem was handled, initially through a buffer and eventually through a cluster of systems that ran the messaging-server functionality.
Throughout the early years of computing, the goal of a system was limited to processing its input messages, and, more importantly, these messages were treated independently of each other. The better system processed more messages at a faster rate (higher throughput). The traditional way of processing messages had a fixed “state” to deal with, and any change to that state was a change to the system context that needed re-configuration and/or re-deployment. It did not matter to the system if a hundred thousand similar messages were received; it still processed each of them as if it were the first. Any attempt to correlate these messages was an afterthought. For insights into the data, separate processes were built to trigger on a schedule, retrieve the messages from a datastore, run them through aggregation functions, and detect data patterns. Most of the time, it was too late to act on the patterns discovered.
The very essence of an event-based architecture is the ability to process events against a dynamically updating “state” of the system. If you wonder what the “state” of a system refers to, consider the following example. The current shopping cart on your online shopping portal is the “state” of your cart. It is dynamic, and it changes with every addition or deletion of items in your cart. If the online shopping portal provides offers on certain products based on the items in your cart, then each update to the cart needs to be processed against a set of rules to show the expected total price of your order. To extend this further, the offers themselves can represent the state of the application: new offers may be added, older ones may be removed, or they may expire. The addition and removal of these offers could happen either through a set of pre-defined rules that generate appropriate events, or through external events.
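The cart idea above can be sketched in a few lines. This is an illustrative sketch only (the `offers` and `cart` structures and the event shape are invented for this example, not taken from any real portal): every add/remove event updates the cart state, and the current offer rules are re-evaluated against that state on each change.

```python
# Illustrative sketch: cart "state" updated by events, with offer rules
# re-evaluated against the current state after every change.

# Offer rules are themselves part of the application state: each maps a
# product to a discount rate and can be added or removed at runtime.
offers = {"headphones": 0.10}  # 10% off headphones

cart = {}  # product -> (unit_price, quantity)

def apply_event(event):
    """Update the cart state from an add/remove event, then re-price."""
    kind, product, price, qty = event
    if kind == "add":
        _, old_qty = cart.get(product, (price, 0))
        cart[product] = (price, old_qty + qty)
    elif kind == "remove" and product in cart:
        price, old_qty = cart[product]
        if old_qty <= qty:
            del cart[product]
        else:
            cart[product] = (price, old_qty - qty)
    return total()

def total():
    """Price the current cart state against the current offer rules."""
    t = 0.0
    for product, (price, qty) in cart.items():
        discount = offers.get(product, 0.0)
        t += price * qty * (1 - discount)
    return round(t, 2)

print(apply_event(("add", "headphones", 100.0, 1)))  # 90.0
print(apply_event(("add", "book", 20.0, 2)))         # 130.0
offers.pop("headphones")                             # the offer expires
print(total())                                       # 140.0
```

Note how removing the offer is itself a state change: the next pricing pass simply sees the updated rule set, with no re-deployment involved.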
To consider what an event-based system can bring to the table, let us discuss a simple example below.
System S1 is built to provide the possible routes from a source location to a destination location. Its functionality is simple and well-defined, and it follows a traditional (sequential/procedural) approach in its processing. It has a relatively fixed state: the available “routes” between any two points on the map. When customer c1 requests a route from point p1 to point p2 from system S1, the following steps are taken:
- The system gathers all available routes from p1 to p2
- Based on its internal algorithm, S1 calculates all possible routes customer c1 could take
- S1 sends a response back with the top 3 results, r1, r2, r3
- Customer, c1, confirms one of the routes, say r1, and starts the journey
- When a subsequent re-routing request is received from the customer, the system marks journey r1 as “incomplete” and generates the next set of routes, r4, r5, r6, for the customer to take
- Customer c1 may pick route r4 and complete the journey; r4 is marked complete by S1
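The sequential flow above, including the detour, can be sketched as follows. The route table, travel times, and the ranking-by-travel-time scoring are invented for illustration; a real S1 would run a routing algorithm over a map graph.

```python
# Minimal sketch of S1's sequential, fixed-state request handling.
# ROUTES is the relatively fixed "state": available routes per point pair.
ROUTES = {  # (source, destination) -> list of (route_id, travel_minutes)
    ("p1", "p2"): [("r1", 30), ("r2", 35), ("r3", 42), ("r7", 55)],
    ("p3", "p2"): [("r4", 20), ("r5", 25), ("r6", 31)],
}

journeys = {}  # route_id -> "incomplete" | "complete"

def request_route(source, destination):
    # Step 1: gather all available routes between the two points.
    candidates = ROUTES.get((source, destination), [])
    # Step 2: rank them (here, simply by travel time).
    ranked = sorted(candidates, key=lambda r: r[1])
    # Step 3: respond with the top 3 results.
    return [route_id for route_id, _ in ranked[:3]]

def confirm(route_id):
    journeys[route_id] = "incomplete"   # journey started, not yet finished

def complete(route_id):
    journeys[route_id] = "complete"

top3 = request_route("p1", "p2")    # ['r1', 'r2', 'r3']
confirm(top3[0])                    # c1 starts the journey on r1
# Roadblock at p3: c1 issues a second request from the new location.
detour = request_route("p3", "p2")  # ['r4', 'r5', 'r6']
confirm(detour[0])
complete(detour[0])                 # r4 is complete; r1 stays incomplete
```

Notice that each request is handled in isolation: nothing in S1 connects the 100,000 identical detour requests to each other.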
In the above scenario, if the customer is faced with a roadblock at location p3 along the way, he invokes a second request to S1 for the same destination, p2, but this time from source p3.
So, as the system receives the second request from c1, it marks the earlier journey, r1, as unfinished and processes the re-routing request so that c1 can pick an alternate route, r4. That is the use case for one customer. Now imagine 100,000 customers detoured by the same roadblock!
Well, surely, we could do better if only we could correlate incoming events in the system and act on them in real time.
In the example above, the static nature of the “available routes” state makes it difficult for the system to react in real time. Statistics on unfinished journeys, customer requests for a given pair of coordinates, and detours can be gathered in a daily batch process triggered on a schedule, but those are useful only for the future; it is too late by then for any real-time action.
Now for some reflection on our existing system, S1, before we move on:
- What is the definition of “available routes” from Step 1? Can we make it dynamic?
- Who “triggers” the changes to available routes?
- Where do we store the system state? An I/O operation to an external data store for each event can be expensive
- How do we determine when to remove obsolete data?
Let us design the next version of the system, S2. In this version, we plan to keep the needed information in the engine's memory for real-time decision making and to update the system state based on event triggers. Here are the action items to build the next version:
- Store the “routes” data in engine memory
- A process is developed to add or update the available routes and their status (active/inactive)
- Store all customer requests for the past one hour (or a configurable unit of time)
- The process stores requests made by customers with appropriate information
- The process also stores if the journeys are finished or unfinished
- Utilities to retrieve aggregated data by given parameters like sources, destinations, finished journeys, unfinished journeys etc.
- A process that triggers on a schedule to analyze re-routing requests:
- If the system receives too many re-routing requests from a certain location,
- and the number of such requests exceeds a certain threshold,
- and they fall within a certain time range,
- then the system generates an event to update the available routes, marking the affected route “inactive” for an hour (or a configurable amount of time)
- [Existing Functionality] For every request that a customer makes, the system runs the steps to get the latest available “routes”, calculates the routes based on its algorithm, and sends a response with multiple feasible paths to take
- The only change is that the list of “available routes” is fetched from the dynamically updated set in real-time
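The core of S2, the scheduled threshold rule over a sliding window of re-routing requests, can be sketched as below. The window length, threshold, and route names are illustrative placeholders, not prescribed values.

```python
import time
from collections import deque

# Sketch of S2's self-correcting rule: re-routing requests are kept in an
# in-memory sliding window; a scheduled check suspends a route when too
# many re-routes originate from one location within the window.
WINDOW_SECONDS = 3600      # keep the past hour of requests (configurable)
THRESHOLD = 3              # re-routes from one location before we react
SUSPEND_SECONDS = 3600     # how long a route stays "inactive"

active_routes = {"r1": True, "r2": True, "r3": True}
suspended_until = {}       # route_id -> time when it may reactivate
reroute_events = deque()   # (timestamp, location, route_id), oldest first

def record_reroute(location, route_id, now=None):
    reroute_events.append((now or time.time(), location, route_id))

def scheduled_check(now=None):
    """Runs on a schedule: prune old events, then apply the threshold rule."""
    now = now or time.time()
    while reroute_events and reroute_events[0][0] < now - WINDOW_SECONDS:
        reroute_events.popleft()          # drop events outside the window
    counts = {}
    for _, location, route_id in reroute_events:
        counts[(location, route_id)] = counts.get((location, route_id), 0) + 1
    for (location, route_id), n in counts.items():
        if n >= THRESHOLD and active_routes.get(route_id, False):
            # Internal event: suspend the troubled route for a while.
            active_routes[route_id] = False
            suspended_until[route_id] = now + SUSPEND_SECONDS

def available_routes():
    """Existing functionality, now reading the dynamically updated set."""
    return [r for r, ok in active_routes.items() if ok]

t0 = time.time()
for _ in range(3):                  # three detours reported near p3 on r1
    record_reroute("p3", "r1", now=t0)
scheduled_check(now=t0)
print(available_routes())           # ['r2', 'r3'] -- r1 is now suspended
```

A production rule engine would also reactivate routes once `suspended_until` passes and would evaluate far richer criteria, but the shape, window, aggregate, threshold, internal event, is the same.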
After the above changes, the system starts to become self-correcting. It suspends any route that meets the criteria above and helps customers avoid that route before they would have to re-route, reducing incoming requests by self-adjusting the data in “available routes”. A new process evaluates on a schedule and generates internal events to update “available routes”. In practice, the exact criteria for temporarily suspending “routes” would be fine-tuned over time and constantly improved based on many other parameters.
This is obviously a simplistic approach to explaining event-based systems. Real-world scenarios are more complex, with many internal and external events that constantly update the “state” of the system. To keep it simple, I have avoided discussing the additional functionality needed to retain the state of the system across restarts, to maintain it on a distributed system, and other essential properties of a complex-event processing system. Here are some takeaways on what is needed to implement an event-based system.
- In-memory datastore to make quick decisions. Event-based systems need easily accessible data
- This is commonly referred to as cache, and cache-management is one of the critical components in implementing a complex-event processing system
- The cache-management also needs a rich set of features to calculate the needed aggregations on the fly
- Cache-management should also support pre-loading of the system state, adapt to the nature of distributed systems for maintaining that state, etc.
- Event definition
- The system should provide for defining an event and the appropriate attributes that it carries with it
- Object Definition
- An event may create or update an object, and an object could be (1) held as a transient object until the customer request is processed, (2) stored as part of the cache for correlation with later events and flushed out at a later time, or (3) stored in a backend datastore for later use
- A good event-based implementation would provide options to define the lifecycle of the objects
- Event producers and Event consumers
- An event can be external or internal, and it can be inbound or outbound; an event-based system should encapsulate the properties needed for all these scenarios, which helps in building a robust implementation
- Ability to define and implement transmission protocols through which events may be received or sent
- Event processors that perform the actual work, driven by priorities and appropriate condition-based processing; for example, the re-routing analysis needed in our second solution, S2, above
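To make the first takeaway concrete, here is a minimal sketch of an in-memory store with time-based expiry and on-the-fly aggregation. The `EventCache` class and its method names are invented for illustration; a real complex-event processing deployment would typically rely on a purpose-built cache or in-memory data grid rather than hand-rolled code like this.

```python
import time

# Minimal sketch of a time-aware in-memory store with on-the-fly
# aggregation, illustrating cache-management with expiry of obsolete data.
class EventCache:
    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.entries = []   # list of (timestamp, record) pairs

    def put(self, record, now=None):
        self.entries.append((now or time.time(), record))

    def _live(self, now=None):
        now = now or time.time()
        # Expire obsolete data lazily on read, answering "when do we
        # remove obsolete data?" with a TTL policy.
        self.entries = [(t, r) for t, r in self.entries
                        if t >= now - self.ttl]
        return [r for _, r in self.entries]

    def count_where(self, **criteria):
        """Aggregate on the fly: count live records matching all criteria."""
        return sum(
            1 for r in self._live()
            if all(r.get(k) == v for k, v in criteria.items())
        )

cache = EventCache(ttl_seconds=3600)
cache.put({"source": "p1", "destination": "p2", "finished": False})
cache.put({"source": "p3", "destination": "p2", "finished": True})
cache.put({"source": "p3", "destination": "p2", "finished": False})
print(cache.count_where(source="p3"))     # 2
print(cache.count_where(finished=False))  # 2
```

The pre-loading and distributed-state concerns from the takeaways are exactly what this toy version omits, and what makes cache-management one of the hardest parts of a real system.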
As one can see, when systems were enabled to correlate seemingly independent messages, they could do much more than they would with a traditional approach. It is worth adding that we have only touched the tip of the iceberg in terms of event-based processing. A newer variant of event-based data processing is streaming data, a use case that handles “unbounded” data. In the next part of this series, we shall explore this further.
At Prowess, we have an army of integration experts who specialize in event-based architecture implementations. We provide architecture advisory services and help with implementation and full lifecycle support of event-based architecture needs for large enterprises. Whether you need your systems evaluated or want recommendations on tools for complex-event processing, we will be glad to help. Whether you are looking to start your journey or take it to the next level, connect with us!