Today we’re announcing $15 million in Series B funding, led by Yu Galaxy, bringing our total raised to $35 million. The raise is driven by our scaling customer traction rather than capital needs. Our revenue has multiplied, we’ve closed million-dollar deals with Fortune 50 companies, and we’ve supported fast-growing AI startups.
We’ll use these funds to build our go-to-market team to match: customer success expansion, enterprise sales reach, and the ability to deliver on a service model that looks very different from the traditional Silicon Valley “ship-it-and-done” playbook. Let’s look at our broader story to see how we help you build AI that understands your business.
An AI company from day one
InsightFinder has been an AI company since before “AI company” was a category worth claiming. I founded InsightFinder on the belief that the hardest operational problems in distributed systems—anomaly detection, root cause analysis, incident prediction and prevention—required fundamentally different AI technologies than traditional AI models offer.
That foundation shapes everything at InsightFinder. When the industry shifted to LLMs and AI agents, we didn’t bolt on an AI layer. We extended what we’d already built. The patented composite AI techniques that power our IT observability platform are the same techniques we’ve applied to AI reliability. The ability to extract root causes from noisy, multi-source, multi-modality data in production infrastructure is exactly what it takes to diagnose why an AI agent behaved unexpectedly, why a model’s outputs drifted, or why an agentic workflow failed in a way that didn’t trigger any obvious alerts.
We didn’t pivot into AI reliability. It was the natural progression of the same hard problem: making complex, probabilistic systems trustworthy in production.
The problem that’s bigger than it looks
Here’s what we hear from enterprise teams running AI in production: everything works great in demos and in development. But when AI systems (or agents!) start interacting with the real world, that’s when the real damage begins. There’s a gap between users’ language, workflows, and edge cases and what the system expects, and things start to break in ways dashboards just can’t easily show.
The reason is simple: general-purpose AI models don’t know your business. They know natural language. They know patterns. But they don’t know what “normal” looks like for your payment processing system, what your support organization considers a high-severity escalation, or what quality means for your specific domain. The gap between what a foundational model can do and what your operations actually require is where unpredictable risk lives.
The market’s current answer is to add monitoring. Track latency. Count tokens. Alert on error rates. That’s necessary, but it doesn’t address the underlying problem. As we told TechCrunch this week: the biggest misconception in AI observability is that it’s limited to LLM evaluation during development and testing. An effective platform needs to support development, evaluation, and production; it must support the entire lifecycle and close the loop between all three.
Most tooling today addresses one or two of those areas. Nobody connects them. That’s the gap we’re filling with InsightFinder.
Closing the feedback loop
The core of InsightFinder’s AI Reliability platform is a feedback loop that most of the market treats as someone else’s problem: taking what happens in production and using it to continuously improve the AI systems that run there.
You need far more than evals, guardrails, or observability to deliver reliability. Yes, those things are essential. But you also need to close the feedback loop: take what happens in production and use it to tune reliability performance.
That loop runs through several capabilities we’ve built and recently shipped:
Multidimensional prompt comparison. When a team says “Prompt B is better,” the right follow-up question is always “better where?” A prompt that performs well on internal test data can fail on real-world inputs filled with partial context, domain jargon, or edge cases your test suite didn’t anticipate. InsightFinder’s Prompt Comparison capability in LLM Labs lets teams version prompts, test them across a matrix of models and datasets, and evaluate quality, token cost, and latency simultaneously. It surfaces a winner backed by evidence, not instinct.
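To make “better where?” concrete, here is a minimal, vendor-neutral sketch of a prompt-comparison matrix. The evaluate() stub, the metric fields, and the prompt, model, and dataset names are all illustrative assumptions, not InsightFinder’s actual interface.

```python
# A minimal sketch of multidimensional prompt comparison. The evaluate() stub
# and the score fields are illustrative assumptions, not a vendor API.
from dataclasses import dataclass
from itertools import product

@dataclass
class EvalResult:
    quality: float    # e.g. judge score in [0, 1]
    tokens: int       # total tokens consumed
    latency_ms: float

def evaluate(prompt: str, model: str, dataset: str) -> EvalResult:
    """Placeholder: in practice this runs the prompt against the model on
    every example in the dataset and aggregates the metrics."""
    return EvalResult(quality=0.0, tokens=0, latency_ms=0.0)

prompts  = {"prompt_a": "Summarize the ticket...",
            "prompt_b": "You are a support analyst..."}
models   = ["gpt-4o-mini", "llama-3.1-8b"]
datasets = ["internal_test", "prod_sample_jargon", "prod_sample_partial_context"]

results = {
    (name, model, ds): evaluate(text, model, ds)
    for (name, text), model, ds in product(prompts.items(), models, datasets)
}

# "Better where?" -- compare per cell instead of declaring one global winner.
for (name, model, ds), r in sorted(results.items()):
    print(f"{name:9s} | {model:13s} | {ds:27s} | "
          f"quality={r.quality:.2f} tokens={r.tokens} latency={r.latency_ms:.0f}ms")
```

The grid is the point: a prompt that leads on internal test data can still lose on production samples full of jargon or partial context, and the comparison should show exactly where.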
SLMs as domain-tuned judges. General-purpose LLMs make poor evaluators of domain-specific quality. They don’t know your standards, your terminology, or your definition of “correct.” InsightFinder supports the creation of Small Language Models (SLMs) fine-tuned to your specific business context and deployed as Judge SLMs—evaluators that understand your domain and apply a quality bar relevant to your business, not a generic one. The difference between a generalist judge and a domain-tuned judge is the difference between evaluations that merely sound plausible and evaluations you can trust.
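As a rough illustration of what a judge SLM looks like in code, the sketch below assumes a Hugging Face-style text-classification checkpoint fine-tuned on your own quality rubric; the model name and label scheme are hypothetical.

```python
# A minimal sketch of a "judge SLM": a small classifier fine-tuned on your own
# quality rubric, used to grade outputs. The checkpoint name is hypothetical.
from transformers import pipeline

# Hypothetical domain-tuned checkpoint, e.g. trained on labeled support transcripts.
judge = pipeline("text-classification", model="your-org/support-judge-slm")

def grade(question: str, answer: str) -> dict:
    """Score an answer against the domain rubric the judge was trained on."""
    verdict = judge(f"QUESTION: {question}\nANSWER: {answer}", truncation=True)[0]
    # e.g. {"label": "meets_escalation_policy", "score": 0.93}
    return verdict

print(grade("Customer reports duplicate charge on invoice #4417",
            "Ask the customer to retry the payment tomorrow."))
```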
Model fine-tuning from production failures. When prompts fail in production, InsightFinder can automatically generate training datasets from those failures and run fine-tuning jobs against them. That adapts foundation models into custom models tuned for your domain and your reality. Adapting to real production evidence is what drives continuous, meaningful improvement toward reliable AI.
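Here is a compact sketch of the data side of that loop: failure records become a supervised fine-tuning dataset. The record shape is hypothetical, and the submission step uses the OpenAI Python SDK only as one concrete example of a fine-tunable endpoint.

```python
# A minimal sketch of turning production failures into fine-tuning data.
# The failure-record shape is an assumption for illustration.
import json

# In practice these come from the observability pipeline: the prompt that
# failed, plus the corrected answer captured during incident review.
failures = [
    {"input": "Why did checkout latency spike at 02:14?",
     "bad_output": "Everything looks normal.",
     "corrected_output": "Payment-gateway retries tripled after the 02:10 deploy; roll back."},
]

with open("failures.jsonl", "w") as fp:
    for rec in failures:
        fp.write(json.dumps({"messages": [
            {"role": "user", "content": rec["input"]},
            {"role": "assistant", "content": rec["corrected_output"]},
        ]}) + "\n")

# Submit the dataset as a fine-tuning job (OpenAI-style endpoint shown as an example).
from openai import OpenAI
client = OpenAI()
training_file = client.files.create(file=open("failures.jsonl", "rb"), purpose="fine-tune")
job = client.fine_tuning.jobs.create(training_file=training_file.id,
                                     model="gpt-4o-mini-2024-07-18")
print(job.id, job.status)
```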
Multi-agent tracing. Traditional observability was built for call graphs that repeat. Agents break that assumption. A single user request can fan out across retrieval, tool calls, model invocations, and internal services, and those execution paths can vary every time. When something goes wrong, you can stitch together hundreds of event logs to guess at what happened, or you can clearly reason about it using a causal trace. InsightFinder’s distributed tracing preserves the full execution story across complex agent workflows, so teams can answer the questions that actually matter during an incident.
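For a sense of what preserving the full execution story means mechanically, here is a minimal sketch using vanilla OpenTelemetry: every step of the fan-out shares one trace, so the causal path survives even when the shape of the request changes. Span names and attributes are illustrative, not InsightFinder’s schema.

```python
# A minimal sketch of one causal trace across a fan-out agent request,
# using vanilla OpenTelemetry. Span names and attributes are illustrative.
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import SimpleSpanProcessor, ConsoleSpanExporter

provider = TracerProvider()
provider.add_span_processor(SimpleSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)
tracer = trace.get_tracer("agent-demo")

def handle_request(user_query: str) -> None:
    with tracer.start_as_current_span("agent.request") as root:
        root.set_attribute("user.query", user_query)
        with tracer.start_as_current_span("retrieval.search"):
            pass  # vector search, ranked documents...
        with tracer.start_as_current_span("tool.call", attributes={"tool": "billing_api"}):
            pass  # external tool invocation
        with tracer.start_as_current_span("llm.invoke", attributes={"model": "gpt-4o-mini"}):
            pass  # model call; record tokens, latency, finish reason
        # Every span above shares one trace_id, so the execution path can be
        # reconstructed after an incident instead of stitched from raw logs.

handle_request("Why was invoice #4417 charged twice?")
```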
Together, these capabilities form something the market currently lacks: a closed loop between production behavior and development decisions, built to run continuously and automatically, grounded in your specific business context.
A different kind of service model
The way we deliver InsightFinder also doesn’t look like typical enterprise SaaS.
We don’t hand customers a login and a documentation link and measure time-to-first-value in weeks. We go deep. Our team works alongside customers to build and tune AI models that work for their environments—the system architecture, the business logic, the evaluation criteria, the domain context that makes the platform’s insights actionable in their specific workflows. Think of it as “AI reliability as a service,” not AI observability tools you’re left to configure yourself.
In part, this funding will help us scale that capability: hiring the customer success talent and building the operational infrastructure that let us deliver that kind of depth to more organizations without diluting what makes it work.
What comes next
Yu Galaxy led this round because they see what we see: as AI moves from experimental to operational, reliability stops being a feature request and starts being a matter of public accountability. Today, AI runs in places like hospitals, financial infrastructure, logistics networks, and government services—environments where probabilistic failure at scale has real social impact.
As PR Yu, Managing Partner at Yu Galaxy, said, “InsightFinder isn’t just optimizing IT, it’s building the immune system for the digital infrastructure that powers hospitals, banks, and other mission-critical industries.” That resonates with us, and it isn’t hyperbole. It’s where the stakes are rapidly heading. Reliability is no longer a luxury. It’s a requirement for a safe society.
This funding gives us resources to further accelerate the product roadmap, build the go-to-market infrastructure to reach the teams who need this platform most, and deepen the customer success model that makes enterprise AI truly functional in production.
We’re not arriving in the AI reliability category. We’ve been building toward it for years. What’s different now is that the market is finally asking the right questions. We have the answers, the customer evidence, and the platform to deliver real help.
Thank you
To our customers who trusted our ambitious mission before the category had a name: the work you’ve done putting InsightFinder inside production systems that matter is what makes all of this real. Thank you.
To the InsightFinder team: what you’ve built holds up in the places that count. That doesn’t happen by accident. Thank you.
To Yu Galaxy and our broader investor community: thank you for the conviction.
There’s a lot left to build. We’re just getting started.
Read the full press release here. Read the TechCrunch coverage here. Or request a demo and see how the platform works with your own reliability challenges.