We all know the world is moving at the speed of AI, and nowhere is that clearer than in software development. AI coding assistants like Cursor and Claude Code are writing and shipping code at a velocity that’s hard to imagine. This rapid pace is revolutionizing the way developers work, so it’s natural to wonder: how are on-call engineers and security professionals supposed to keep up? According to Shmuel Kliger, founder of Causely, the answer certainly isn’t collecting even more data:
“For years, the IT industry has struggled to make sense of the overwhelming amounts of data coming from dozens of observability platforms and monitoring tools,” said Kliger.
Instead, Causely automatically maps an application’s topology and service dependencies, then applies a finite set of likely root causes to this data. This approach runs counter to traditional tooling and methods, which encourage businesses to collect as much data as possible. That situation hasn’t fundamentally changed in decades, and it leaves humans to respond to alerts, make sense of patterns, identify the root cause, and ultimately determine the best remediation.
A Paradigm Shift in Observability
The overwhelming amount of data produced in the AI era creates cognitive overload for SREs, making it impossible to isolate the meaningful signals from the noise. And the more complex a system is, the harder it is to perform effective root cause analysis. So while a “collect everything” approach may have been feasible in the cloud era, it’s unsustainable in the modern AI era. As systems become increasingly dynamic, the need for smarter, more targeted data management solutions has never been more critical.
Causely’s causal reasoning system is always running in the background of a user’s environment, attaching symptoms and alerts to a finite set of potential root causes. This is a critical shift. The method centers on a live causal model that continuously maps relationships among services, data flows, and infrastructure.
Rather than static dashboards, the platform provides a continuously updated view of how components interact. That view enables Causely to distinguish between real causes and downstream effects, which is essential for safe automation and effective incident response. This dynamic approach allows for faster, more accurate decision-making, minimizing downtime and enhancing system reliability. By constantly adapting to environmental changes, the platform helps ensure issues are identified and addressed before they escalate.
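To make the distinction between real causes and downstream effects concrete, here is a minimal sketch in Python. It is not Causely’s implementation: the service names, the `DEPENDS_ON` map, and the `likely_root_causes` helper are all hypothetical, and the rule it applies (an alerting service is a candidate root cause only if nothing it depends on is also alerting) is a deliberately simplified assumption.

```python
# Hypothetical dependency map: each service lists the services it calls.
DEPENDS_ON = {
    "checkout": ["payments", "cart"],
    "payments": ["postgres"],
    "cart": ["redis"],
    "postgres": [],
    "redis": [],
}


def transitive_dependencies(service, depends_on):
    """Collect every service reachable from `service` via dependency edges."""
    seen = set()
    stack = list(depends_on.get(service, []))
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            stack.extend(depends_on.get(dep, []))
    return seen


def likely_root_causes(alerting, depends_on):
    """Return alerting services with no alerting dependency beneath them.

    Simplifying assumption: if something a service depends on is also
    alerting, the service's own alert is treated as a downstream effect.
    """
    alerting = set(alerting)
    roots = []
    for service in sorted(alerting):
        upstream = transitive_dependencies(service, depends_on)
        if not (upstream & alerting):
            roots.append(service)
    return roots


# Three alerts fire, but only one has no alerting dependency beneath it.
alerts = ["checkout", "payments", "postgres"]
print(likely_root_causes(alerts, DEPENDS_ON))  # ['postgres']
```

In this toy example, three alerts collapse to a single candidate because `checkout` and `payments` both sit above the failing database. A production system would presumably also need to weigh symptom types and timing, which this sketch ignores.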
The Autonomous Future with Causely
Causely is one of a handful of software vendors leading the conversations around the rise of AI SRE. Kliger recently participated in a roundtable hosted by The New Stack, which tackled difficult questions about the limitations of LLMs and the evolving role of the human operator.
Kliger was the most provocative of the panelists, and his opinion carries a lot of weight: he previously founded Turbonomic (acquired by IBM) and served as CTO of SMARTS (acquired by EMC). He states there’s no reason to keep humans in places where machines can perform the tasks better. And while we may never reach a point where our online systems are fully self-healing and self-operating, this should ultimately be the future we strive for.
In an article titled “If Planes Can Fly Themselves Then Why Can’t IT Management be Autonomous?”, Kliger writes, “When it comes to IT operations, our goal should be to get humans out of the loop as much as possible. In a world run by software, where SLAs guarantee 99.999% uptime, even one incident in a month, which requires human intervention, is enough downtime to violate your commitments to your customers.”
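For a sense of scale, the arithmetic behind that 99.999% figure can be worked out in a few lines of Python. This is a rough sketch that assumes a 30-day month and an SLA measured over that month; the `allowed_downtime_seconds` helper is a hypothetical name used only for illustration.

```python
def allowed_downtime_seconds(availability_percent, days=30):
    """Downtime budget in seconds for a given availability target.

    Assumes the SLA window is `days` long (30 here, an assumption).
    """
    total_seconds = days * 24 * 60 * 60
    return total_seconds * (1 - availability_percent / 100)


budget = allowed_downtime_seconds(99.999)
print(f"{budget:.2f} seconds per 30-day month")  # 25.92 seconds per 30-day month
```

Under those assumptions, the entire monthly budget is roughly 26 seconds, which leaves little room for a person to acknowledge an alert, let alone diagnose and fix the problem.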
This is certainly a lot to think about. What we know for sure is that simply throwing LLMs at the problem won’t do the job when it comes to managing the reliability, performance, and security of online systems. A causal reasoning system that deeply understands the context of applications and how they connect to the business may be the best way forward.











