Three Critical Capabilities for Intelligent Automation of Incident Response

In my last blog, we looked at the various challenges impacting incident response and why it’s mostly a manual process. It’s time to look at what it takes to introduce automation into incident response. We see this as a three-pronged approach: platform, content and context, and customizability. Let me explain.

Open and Extensible Platform

In my last startup, as we scaled our customer base, we were impacted by the classic operations toil that is so common in many acquisition stories. While everyone agrees that we should “automate everything”, this is not quite the future we are living in.

We dug deeper to understand why this is the case.

One of the very first challenges we saw was a lack of standardization. There are a lot of tools, each with its own format, which poses two problems:

  1. Learning curve
  2. Private implementations of reusable modules

Let us start by acknowledging the reality that most operations teams are automating similar use cases in their playbooks, runbooks and scripts. While the exact combination of the building blocks depends on the particular context, the building blocks themselves are mostly identical. In practice, though, a lot of the teams we spoke to are reimplementing these building blocks in their own environment, even though the master copy really lives on some community website that starts with stack and ends in overflow.

Once the master copy has been duplicated into our environment, the second challenge is integration with the adjacent automation steps, i.e., connecting the inputs and outputs of these individual building blocks.

This can get quite complex since there is no standard framework for connecting different connectors and actions in a linear automation construct.
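To make the wiring problem concrete, here is a minimal sketch of what a shared convention could look like: every building block declares the names of its inputs and its output, and a small runner passes values between steps. The `Step` and `run_pipeline` names and the two sample building blocks are hypothetical, invented purely for illustration; this is not the API of any particular tool.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class Step:
    """One building block: reads named inputs from the context, writes one named output."""
    name: str
    func: Callable[..., Any]
    inputs: List[str]
    output: str


def run_pipeline(steps: List[Step], context: Dict[str, Any]) -> Dict[str, Any]:
    """Run steps in order, feeding each step from the initial context or earlier outputs."""
    ctx = dict(context)  # copy so the caller's dict is not mutated
    for step in steps:
        missing = [key for key in step.inputs if key not in ctx]
        if missing:
            raise KeyError(f"step {step.name!r} is missing inputs: {missing}")
        ctx[step.output] = step.func(*(ctx[key] for key in step.inputs))
    return ctx


# Hypothetical building blocks standing in for real connectors.
def list_pods(namespace: str) -> List[Dict[str, Any]]:
    return [
        {"name": "api-1", "namespace": namespace, "restarts": 7},
        {"name": "api-2", "namespace": namespace, "restarts": 0},
    ]


def find_crashlooping(pods: List[Dict[str, Any]], threshold: int) -> List[str]:
    return [pod["name"] for pod in pods if pod["restarts"] > threshold]


steps = [
    Step("list_pods", list_pods, inputs=["namespace"], output="pods"),
    Step("find_crashlooping", find_crashlooping, inputs=["pods", "threshold"], output="suspects"),
]

result = run_pipeline(steps, {"namespace": "prod", "threshold": 5})
print(result["suspects"])
```

Because the contract between steps is just named values, a block copied from elsewhere only needs its inputs and output declared once to be chained with anything else.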

Lack of framework causes messy automation!

We conclude that there is a need for a uniform, open and extensible platform that allows integration of the building block automation with a standard and well understood framework in a safe and secure manner. The first key aspect of the platform is keeping a human in the automation loop, because safety is key to such a platform. Automation can run away very easily, and having the user control the behavior of the automation is almost a mandatory first step towards building confidence in it, especially during the adoption phase, when the maturity of the concepts is still in question.
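As a rough illustration of a human in the loop, the sketch below gates a remediation action behind an explicit approval and records the decision either way. Everything here (`run_with_approval`, `ask_on_terminal`, `restart_service`, the service name) is a hypothetical example, not a description of how any specific platform implements approvals.

```python
from typing import Callable, List


def run_with_approval(
    description: str,
    action: Callable[[], str],
    approve: Callable[[str], bool],
    audit_log: List[str],
) -> str:
    """Run `action` only if `approve` returns True; record the decision either way."""
    if not approve(description):
        audit_log.append(f"SKIPPED: {description}")
        return "skipped"
    result = action()
    audit_log.append(f"RAN: {description} -> {result}")
    return result


def ask_on_terminal(description: str) -> bool:
    """Interactive approver: only an explicit 'y' lets the action proceed."""
    answer = input(f"About to: {description}. Proceed? [y/N] ")
    return answer.strip().lower() == "y"


# Hypothetical remediation step; a real one would call a cloud or cluster API.
def restart_service() -> str:
    return "restarted payments-api"


log: List[str] = []
# A non-interactive approver that always declines, so the sketch runs unattended.
outcome = run_with_approval("restart payments-api", restart_service, lambda _: False, log)
print(outcome, log)
```

In interactive use, `ask_on_terminal` would be passed as the `approve` argument. Defaulting to "no" means the automation does nothing destructive unless the operator explicitly opts in.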

The second key aspect of the platform is the need for an open and non-proprietary environment, to ease the learning curve of tools and encourage reusability of building blocks.

Code, Content, Context

Continuing on the thread of an open ecosystem, a lot of the building blocks are being reused in an ad hoc way. The drivers of this behavior are convenience and lack of a framework to think about automation.

Given an open platform and framework, we can focus on the reusability of the building blocks and also their lineage. As I mentioned earlier, a lot of these code snippets are sourced from social sites. But they are not catalogued well, which leads to a lot of duplication within organizations. For example, different teams in the same organization can end up using different implementations of the same functionality simply because there is no catalogue.

A searchable catalogue also makes it quicker to compose the automation. On top of that, contextual AI can help by suggesting the intermediate steps in the automation, making the process easier.
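A searchable catalogue does not have to be elaborate to be useful. The sketch below is one hypothetical shape for it: each entry records a name, a description, tags, and its lineage (where the snippet came from), and search is plain keyword matching. The `Catalog` and `CatalogEntry` names and the sample entries are assumptions made up for this example; the AI-assisted suggestions mentioned above are out of scope here.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class CatalogEntry:
    name: str
    description: str
    source: str  # lineage: where the snippet originally came from
    tags: Set[str] = field(default_factory=set)


class Catalog:
    def __init__(self) -> None:
        self._entries: Dict[str, CatalogEntry] = {}

    def register(self, entry: CatalogEntry) -> None:
        """Refuse a second entry under the same name to discourage silent duplicates."""
        if entry.name in self._entries:
            raise ValueError(f"{entry.name!r} is already catalogued")
        self._entries[entry.name] = entry

    def search(self, query: str) -> List[CatalogEntry]:
        """Return entries whose name, description, or tags contain every query word."""
        words = query.lower().split()
        hits = []
        for entry in self._entries.values():
            haystack = " ".join([entry.name, entry.description, *sorted(entry.tags)]).lower()
            if all(word in haystack for word in words):
                hits.append(entry)
        return hits


catalog = Catalog()
catalog.register(CatalogEntry(
    "k8s_restart_pod", "Restart a Kubernetes pod", "internal wiki",
    {"kubernetes", "remediation"},
))
catalog.register(CatalogEntry(
    "aws_list_unattached_volumes", "List EBS volumes that are not attached", "community forum",
    {"aws", "cost"},
))

matches = [entry.name for entry in catalog.search("kubernetes restart")]
print(matches)
```

Keeping the `source` field alongside each entry is what preserves lineage: when the upstream snippet changes, there is one place to find out who depends on it.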

Lastly, we recognize that the social knowledge not only includes the code snippets, but also the context of the snippet. There is an opportunity to learn the context from the code snippets and apply this knowledge to suggest the correct automation to select given a use case or a live alert.

Customizability

The last mile of DevOps automation is always about making small tweaks to the boilerplate code that apply to a specific context or environment, i.e., customizability. Customizations could be thought of at two different levels.

First, there is the ability to customize an individual step in a sequence of steps. To accomplish this, the individual steps need to be open and easy to change. In the graphical tools we saw, this ability gets lost: while in theory the tool allows customization, in practice the operator has to work around the interface and conform to the screen in front of them.

The second level of customization applies to how the individual building blocks are connected together to actually accomplish the task at hand. Again, what we see is that the existing tools make these customizations difficult, because an invisible framework is at play that forces the users to work with the screen rather than the tool working with the user.
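Both levels of customization are easy to picture when a workflow is just open code. In the sketch below, which is an illustrative assumption rather than any tool's real model, a workflow is a plain list of functions: the first level tweaks a step's parameters, the second rewires the sequence by inserting a step. The step names and latency samples are invented for the example.

```python
from functools import partial
from typing import Callable, List

Samples = List[int]
WorkflowStep = Callable[[Samples], Samples]


def run(workflow: List[WorkflowStep], samples: Samples) -> Samples:
    """Apply each step to the output of the previous one."""
    for step in workflow:
        samples = step(samples)
    return samples


# Hypothetical boilerplate steps operating on latency samples in milliseconds.
def filter_slow(samples: Samples, threshold_ms: int = 500) -> Samples:
    return [s for s in samples if s >= threshold_ms]


def top_n(samples: Samples, n: int = 3) -> Samples:
    return sorted(samples, reverse=True)[:n]


def dedupe(samples: Samples) -> Samples:
    return sorted(set(samples))


samples = [120, 900, 250, 900, 640, 310]

# Boilerplate workflow with default parameters.
default_workflow = [filter_slow, top_n]

# Level 1: customize individual steps by changing their parameters.
tuned_workflow = [partial(filter_slow, threshold_ms=300), partial(top_n, n=4)]

# Level 2: customize the wiring by inserting a step between existing ones.
rewired_workflow = [tuned_workflow[0], dedupe, tuned_workflow[1]]

default_result = run(default_workflow, samples)
tuned_result = run(tuned_workflow, samples)
rewired_result = run(rewired_workflow, samples)
print(default_result, tuned_result, rewired_result)
```

Neither change required touching the runner or the original step definitions, which is the property graphical tools tend to lose.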

Not having easy customizations is usually the death knell of workflow tools because of the static flows they create.

What does such a platform look like?

What would that open and extensible platform look like? How will it bring content, context, and code together to improve DevOps collaboration and ops shift-left? What would the customizability look like? I plan to cover a lot of the “how” in my next blog, so stay tuned.

Oh, but if you don’t want to wait that long, I happen to be at AWS re:Invent. If you are too, we can geek out on the how. I can also share notes on what we’re seeing with our customers and peers of yours. At the same time, I would like to learn about your specific use cases to see if I can add value to your journey. You can reach me on LinkedIn or drop me a note at abhishek@unskript.com.
