Welcome to InsightFinder Docs!

Monitoring Application Availability

Purpose

This example shows how to monitor application availability: onboard application logs, check application status, and detect anomalies that indicate application issues.

Data Source

Logs that include the API return status, such as HAProxy or Nginx logs, can come from any integration or agent, such as AWS CloudWatch or Elastic. The log format can be plain text or JSON.

Plain text example:

[24/May/2024:12:02:19 +0000] 24.199.98.33 52.91.69.202 "/autodiscove/" 404 0.000 -
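To make the status code in a plain text line like the one above easier to reason about, here is a minimal Python sketch that splits the sample line into fields with a regular expression. The field names (`client_ip`, `server_ip`, `request_time`, `extra`) are guesses made for illustration, since the source does not label the columns, and the pattern is only an assumption about this one sample format.

```python
import re

# Field names are illustrative guesses; the source does not label the columns.
LINE_RE = re.compile(
    r'\[(?P<time>[^\]]+)\]\s+'     # timestamp in square brackets
    r'(?P<client_ip>\S+)\s+'       # first address (assumed to be the client)
    r'(?P<server_ip>\S+)\s+'       # second address (assumed to be the server)
    r'"(?P<path>[^"]*)"\s+'        # quoted request path
    r'(?P<status>\d{3})\s+'        # three-digit HTTP status code
    r'(?P<request_time>\S+)\s+'    # assumed to be the request duration
    r'(?P<extra>\S+)'              # trailing field, "-" in the sample
)


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    fields["status"] = int(fields["status"])
    return fields


sample = '[24/May/2024:12:02:19 +0000] 24.199.98.33 52.91.69.202 "/autodiscove/" 404 0.000 -'
parsed = parse_line(sample)
print(parsed["status"], parsed["path"])
```

Lines that do not fit the pattern return `None`, so they can be skipped or counted separately instead of raising an error.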

JSON example:

{
  "time": "2024-05-24T08:46:13+00:00",
  "status": 200,
  "path": "/agent-upload-instancemetadata",
  "request_query": "",
  "upstream_status": 200,
  "upstream_name": "insightfinder-insightfinder-dataserver-443",
  "remote_addr": "",
  "remote_user": "",
  "bytes_sent": 2256,
  "request_time": 0.372,
  "vhost": "app.insightfinder.com",
  "request_proto": "HTTP/2.0",
  "request_length": 488,
  "duration": 0.372,
  "method": "POST",
  "http_referrer": "",
  "http_user_agent": "ReactorNetty/1.1.16"
}
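For JSON logs, the fields that matter for availability can be read directly once the entry is parsed. The sketch below assumes each log entry arrives as one valid JSON object on a single line and uses a shortened copy of the sample record; the helper name `availability_fields` is hypothetical.

```python
import json

# A shortened, valid-JSON copy of the sample record, used only for illustration.
raw = (
    '{"time": "2024-05-24T08:46:13+00:00", "status": 200, '
    '"path": "/agent-upload-instancemetadata", "upstream_status": 200, '
    '"request_time": 0.372, "method": "POST"}'
)


def availability_fields(raw_line):
    """Parse one JSON log entry and keep the fields relevant to availability."""
    record = json.loads(raw_line)
    return {
        "time": record.get("time"),
        "status": int(record["status"]),  # the status code drives alerting and the SLI
        "path": record.get("path"),
        "method": record.get("method"),
    }


fields = availability_fields(raw)
print(fields)
```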

Project Setup

Log Ingestion

Check out our integration guide and set up any log project.

View Log Data

The data can be viewed on the InsightFinder Log/Trace analysis page. The data may take a few minutes to begin streaming and showing up in the UI.

Configure Alerting

Alerting can be configured based on keywords that appear in the logs, such as the HTTP status codes “500” or “400”. Keywords can also match custom messages such as “application is not responding”.
Keywords can be defined with regular expressions; logs that match them generate keyword alerts.
To configure, go to System setting → project setting → Advanced setting → Labels → Detection Keywords.
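Before entering detection keywords, it can help to try candidate regular expressions against a few sample log lines locally. The sketch below does this with Python's `re` module. The pattern names and the patterns themselves are hypothetical, and the regex dialect accepted by the Detection Keywords setting may differ from Python's, so treat this as a way to reason about patterns rather than as the product's matching logic.

```python
import re

# Hypothetical patterns; adjust them to your own log format.
DETECTION_KEYWORDS = {
    # A standalone three-digit token starting with 5 or 4, surrounded by whitespace.
    "http_5xx": re.compile(r'\s5\d{2}\s'),
    "http_4xx": re.compile(r'\s4\d{2}\s'),
    # A custom message, matched case-insensitively.
    "not_responding": re.compile(r'application is not responding', re.IGNORECASE),
}


def matching_keywords(line):
    """Return the names of all keyword patterns found in a log line."""
    return [name for name, pattern in DETECTION_KEYWORDS.items() if pattern.search(line)]


access_line = '[24/May/2024:12:02:19 +0000] 24.199.98.33 52.91.69.202 "/autodiscove/" 404 0.000 -'
app_line = 'ERROR: Application is not responding'

print(matching_keywords(access_line))
print(matching_keywords(app_line))
```

Requiring whitespace on both sides of the status code keeps the patterns from matching digits inside IP addresses or timestamps in the same line.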

View Alerts

Generated alerts can be viewed on the Incident Investigation page. Choose the Log Anomalies tab.

Configure SLI (Service Level Indicator)

To configure, go to System settings → project setting → Advanced setting → Log To Metric setting.
Count all status messages. Take the count of the different status logs and transform it into a metric. This metric can also be used to detect increases or decreases in status log counts.
Count all successful HTTP status messages. The difference between this count and the total is the number of unsuccessful requests, which indicates a drop in service level. A regex filters for only the successful HTTP statuses.
The operation is division: the actual value (all successful status messages) is divided by the base value (all status messages).
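The arithmetic behind the SLI can be sketched in a few lines of Python: count the status messages that match a "successful" regex and divide by the count of all status messages. Which codes count as successful is an assumption here (2xx and 3xx); the source only says that a regex selects the correct statuses, and the function name `availability` is hypothetical.

```python
import re

# Assumption: 2xx and 3xx responses count as successful.
SUCCESS_RE = re.compile(r'^[23]\d{2}$')


def availability(status_codes):
    """Return successful / total for a window of status codes, or None if empty."""
    total = len(status_codes)  # base value: all status messages
    if total == 0:
        return None            # no traffic in the window, so the ratio is undefined
    successful = sum(1 for code in status_codes if SUCCESS_RE.match(str(code)))
    return successful / total  # actual value divided by base value


codes = [200, 200, 404, 200, 500, 301, 200, 200, 200, 200]
sli = availability(codes)
print(f"{sli:.0%}")
```

In this sample window, 8 of the 10 requests are successful, so the availability is 0.8.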

View Availability Chart

Now you can see the availability (SLI) in the top left chart. More details on the status codes are available in the other charts.

Configure Incident Detection

Enable zero filling on the error code metrics so that anomalies are detected automatically when error codes increase.

Set the metric as a KPI so that it escalates to an incident when the KPI duration is exceeded.
Set “Near constant detection” to catch minor deviations from the baseline.

Now the system will automatically detect anomalies.
Note: You can set a specific SLA value if desired.
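To illustrate why zero filling matters for error code metrics: a metric derived from error logs has no data point for intervals in which no error was logged, so there is no baseline of zeros to compare a later increase against. The sketch below is a conceptual illustration only, not InsightFinder's implementation; the function name `zero_fill`, the 60-second interval, and the integer timestamps are all assumptions.

```python
def zero_fill(counts, start, end, step=60):
    """Return a dense series with 0 for every interval that has no recorded count.

    counts maps an interval start timestamp (in seconds) to an error count.
    """
    return {ts: counts.get(ts, 0) for ts in range(start, end + 1, step)}


# Errors were logged only in the intervals starting at 0 s and 180 s.
sparse = {0: 2, 180: 5}
dense = zero_fill(sparse, start=0, end=240)
print(dense)
```

With the gaps filled, the series shows the metric resting at zero between the two intervals, which makes the rise to 5 errors stand out against the baseline.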

View Incident

The generated incident and its associated anomaly can be viewed and analyzed on the Incident Investigation page.
