I would like to share my first-hand experience with InsightFinder, one of the most powerful AIOps tools on the market.

I was an investment banking technology leader until I recently joined InsightFinder as a principal solution architect. After working with most of the banks on the street, I can confidently say that the financial impact of an outage can range anywhere from a few hundred thousand to several million. Because of this significant business impact, it is essential to find a tool that helps avoid downtime.

In one of my previous organizations, I led a few high-touch applications that delivered real-time data to multiple businesses. Data delivery fell short of its SLA targets for the vast majority of those businesses. My objective has always been to achieve zero downtime for business applications, at least during business hours across the globe. Hence, I was on a continuous hunt for the right partner who could help me achieve this target and for a solution that could be scaled organization-wide.

That’s when a colleague in the organization introduced me to InsightFinder. I was on a mission to try any product that could help me achieve zero downtime. Helen, the founder of InsightFinder, promised to partner with us on the journey and lead us there. In the first meeting, Helen introduced the capabilities of the product, and it all looked “too good to be true.” We started by scoping a PoC (Proof of Concept) and wanted to run all the common outages through the platform to understand its real-time unsupervised learning and incident prediction capabilities.

In all, my team ran six different PoCs, covering hardware issues, application issues, batch processes, and change-related issues. We were able to easily stream in different hardware metrics such as CPU, memory, and I/O from our central data lake. Streaming application logs into InsightFinder was also a cakewalk, thanks to the product's built-in integrations for the most common observability products, such as Splunk and Elastic. InsightFinder could also source past incidents directly from ServiceNow, making it easier for the team to start getting insights.

Once all the data was ingested and real-time streaming was set up, the magic started to happen. The Unified Health View started showing all the applications on separate cards with timeline charts. Once I drilled into an application, I could see the detected and predicted incidents with their corresponding anomalies, which InsightFinder generated after correlating all the metrics and logs using its multivariate, multi-modality mechanism. These predictions also had a lead time of at least 3 hours, which gave my Ops team time to react and prevent the outage from happening. Diving deeper into the product, I could easily figure out the root cause of a predicted incident, which could help the team reduce MTTR (mean time to resolution). My team could also automate actions using scripts and integrations, for example to create tickets or add disks as required.

Around the time the team was doing this PoC, one of my applications had a major production outage. Due to market volatility, the volume of incoming data increased, and the disk ran out of space within a few hours. The Ops team had no idea what was going on at the time, and it took more than 20 hours to resolve the incident, costing us a significant financial loss. This was the perfect use case to run through InsightFinder because it had never happened before in that application and no one on the team could have anticipated it. Cases like this are the most difficult to predict and resolve unless the root cause is known. When we ran the data through InsightFinder, I was surprised to find that it could have predicted the outage with a lead time of 10 hours.
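To make the idea of "lead time" on a disk-full incident concrete, here is a minimal sketch that fits a straight line to recent disk-usage samples and estimates how many hours remain before the disk fills. This is not InsightFinder's method, which is described above as unsupervised and multivariate; it is a deliberately simple single-metric extrapolation, and the function name `hours_until_full` and the sample numbers are hypothetical.

```python
from typing import Optional, Sequence


def hours_until_full(
    hours: Sequence[float],
    used_pct: Sequence[float],
    capacity_pct: float = 100.0,
) -> Optional[float]:
    """Estimate hours from the last sample until disk usage hits capacity.

    Fits a least-squares line through (hours, used_pct). Returns None when
    usage is flat or shrinking, since no fill time can be projected.
    """
    n = len(hours)
    if n < 2 or n != len(used_pct):
        raise ValueError("need at least two samples of equal length")

    mean_t = sum(hours) / n
    mean_u = sum(used_pct) / n
    var_t = sum((t - mean_t) ** 2 for t in hours)
    if var_t == 0:
        raise ValueError("timestamps must not all be identical")

    # Slope is the growth rate in percentage points per hour.
    slope = sum((t - mean_t) * (u - mean_u) for t, u in zip(hours, used_pct)) / var_t
    if slope <= 0:
        return None

    intercept = mean_u - slope * mean_t
    t_full = (capacity_pct - intercept) / slope
    # Clamp at zero: the disk may already be projected as full.
    return max(0.0, t_full - hours[-1])


# Hypothetical samples: usage grows 2 points per hour starting from 60%.
remaining = hours_until_full([0, 1, 2, 3], [60, 62, 64, 66])
print(f"Projected hours until disk is full: {remaining:.1f}")
```

A straight-line projection like this only works when growth is steady. A sudden volume surge such as the one described above changes the slope partway through, which is one reason a single-metric threshold or trend line tends to warn late.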

That was the moment I was sure I had found the right product and partner to achieve the objective of near-zero downtime and an MTTR of minutes, if not seconds. I was so thankful to Helen and her team for all the support they provided during the PoC and the extended scope of that last use case.

AIOps is evolving, and having the right product and company to partner with is key to the success of your business. InsightFinder uses its patented technology to solve the most difficult industry challenges and has a great team to support its customers.

If you want more information, please reach out. You can try InsightFinder for free.
