Paradigm shifts often have unanticipated consequences and those consequences can take years to fully understand. Cloud computing is a prime example. Cloud computing ushered in an era of flexible infrastructure and lower capital requirements. Engineers were liberated from waiting for deployments as resources were just an API call away. However, all that was just the beginning.
Nimble companies leveraged cloud to break down silos between operations and development, and embraced agile development for faster cycle times to create strategic advantage. They merged teams of engineers across the application lifecycle from development and test to deployment and operations, and created new roles requiring a range of skills not limited to a single function. Then they pushed the envelope even further with CI/CD and DevOps to automate pipelines for even faster delivery.
Is there a downside to all this? Ask your DevOps team.
DevOps is tasked with maintaining a toolchain for automated delivery of new code, scaling on demand, and five-nines uptime. In their spare time, they work on improving performance and controlling cost. For a sizeable application, there can be thousands of virtual machines or containers, each with a stack of software, plus cloud services like load balancers and auto scalers, all of which have to be configured and maintained. And all of that is in constant motion.
One large unicorn I talked with has hundreds of developers, more than a hundred code pushes per day, over 4,000 virtual machines in the cloud and petabytes of data collected each month. Their DevOps team is just a dozen strong and didn’t even have a VP until last year. This is a Herculean, or rather Sisyphean, task.
Managing these myriad challenges is beyond the speed and scale at which human beings can operate. Fortunately, AIOps is emerging as a solution.
The term AIOps was coined by Gartner, who defined it as:
AIOps platforms enhance IT operations through greater insights by combining big data, machine learning, and visualization. IT leaders should initiate AIOps deployment to refine performance analysis today and augment to IT service management and automation over the next two to five years.
Gartner may have coined the phrase, but in my humble opinion they fell short of the mark. Their definition centers on humans in the loop while basically describing advanced big data analytics. To solve the DevOps Dilemma, we need to aim higher.
So, what should AIOps be?
Let’s start with what it shouldn’t be – a whitewash of existing decision support and ticketing tools by large vendors adding “powered by AI” to their sell sheets. This is already happening, as it always does when new technologies threaten established vendors. Adding an API to an existing tool doesn’t qualify. If decisions require human intervention you don’t have AIOps.
Here are my three key requirements for AIOps:
- AIOps systems learn from your data and adapt to how your app works
  - Meaning they won’t do the same thing every time
- AIOps systems make and implement decisions without human intervention
  - Yes, you can keep a human in the loop until you trust the system
- AIOps systems run continuously
  - They become a standard part of your delivery
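To make the three requirements above a little more concrete, here is a minimal Python sketch of a closed-loop scaler. It is an illustration only, not how Optune or any other product works: the `AdaptiveScaler` class, its `apply` and `approve` hooks, the moving-average rule, and the capacity figure are all hypothetical and chosen for brevity.

```python
import math
from dataclasses import dataclass, field
from statistics import mean
from typing import Callable, List, Optional


@dataclass
class AdaptiveScaler:
    """Hypothetical closed-loop scaler; every name here is an assumption."""
    target_load_per_instance: float   # assumed capacity of one instance (requests/s)
    apply: Callable[[int], None]      # stand-in for a hook into an existing deploy toolchain
    approve: Optional[Callable[[int, int], bool]] = None  # optional human gate
    window: int = 5                   # number of recent samples to learn from
    history: List[float] = field(default_factory=list)
    instances: int = 1

    def observe(self, load: float) -> None:
        # Requirement 1: learn from the application's own data.
        self.history.append(load)
        self.history = self.history[-self.window:]

    def decide(self) -> int:
        # The decision depends on what was observed, so it is not
        # the same answer every time.
        expected = mean(self.history)
        return max(1, math.ceil(expected / self.target_load_per_instance))

    def step(self, load: float) -> int:
        # Requirement 3: call this on every metrics interval, continuously.
        self.observe(load)
        wanted = self.decide()
        if wanted != self.instances:
            # Requirement 2: act without a human, unless a gate is
            # configured while trust is still being built.
            if self.approve is None or self.approve(self.instances, wanted):
                self.apply(wanted)
                self.instances = wanted
        return self.instances


# Usage: record the decisions instead of calling a real deployment tool.
applied: List[int] = []
scaler = AdaptiveScaler(target_load_per_instance=100.0, apply=applied.append)
for load in [80.0, 150.0, 400.0, 420.0, 390.0]:
    scaler.step(load)
# applied is now [2, 3]: the scaler grew to 2, then 3 instances.
```

Passing an `approve` callback keeps a human in the loop; dropping it later lets the same loop run unattended. A real system would learn far more than a moving average, but the observe, decide, act cycle is the part that distinguishes it from a dashboard.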
The transition to AIOps is in its infancy, but the battle is heating up and there are already success stories. VCs are placing bets, and vendors small and large are bringing new solutions to market. Starting with log analysis systems a few years back, we began to see automated root cause analysis and even failure prediction. Intrusion detection systems now learn from aberrant traffic, some even across companies. More recently, predictive auto scaling systems have debuted. Optune, our AIOps product, can make decisions about virtual machine type, instances, and application parameters and deploy them into test or production using the customer’s existing DevOps toolchain and monitoring.
DevOps is replacing the traditional IT department. The titles have changed, as have the roles, but the challenges IT departments were built to address have not gone away; they have only been multiplied by the scale inherent in microservice architectures. Hence, we need systems designed for these new challenges. AIOps must evolve over the coming years beyond Gartner’s vision to enable DevOps to embrace the scale and speed of modern development.