You can access Causal Analysis by clicking: Analysis > Causal Analysis
Causal Analysis is used to analyze the relationship between metrics, log entries, and KPIs. It provides a cross-project view of data by analyzing different projects and instance groups within a Causal Group.
To view an existing Causal Group, select it from the list on the left. To create a new one, click “Create Causal Group”. After creating a Causal Group, you can create Causal Analyses for different time windows.
With Causal Analysis, you can analyze causal relationships for Metric projects and Log projects. For both project types, Causal Prediction uses the causal analysis results to show the events that a selected event is likely to lead to. This lets users understand the likely impact of the current event and take action to avoid it. For example, based on historical results, when CPU on stg-app is higher than normal, it will most likely cause LoadAvg5 on stg-cnn to be higher than normal.
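The idea behind Causal Prediction can be sketched as a lookup over previously learned relations. This is only an illustration, not the product's implementation: the `relations` table, the `predict_events` function, the event labels, and every probability value are hypothetical and invented for the example.

```python
def predict_events(relations, current_event, min_probability=0.5):
    """Return (effect, probability) pairs that the current event is likely to lead to.

    relations maps (cause, effect) -> probability estimated from historical results.
    Results are sorted from most to least likely.
    """
    candidates = [
        (effect, probability)
        for (cause, effect), probability in relations.items()
        if cause == current_event and probability >= min_probability
    ]
    return sorted(candidates, key=lambda item: item[1], reverse=True)


# Hypothetical relations; the numbers are made up for illustration only.
relations = {
    ("CPU high on stg-app", "LoadAvg5 high on stg-cnn"): 0.8,
    ("CPU high on stg-app", "MemUsed high on stg-app"): 0.3,
    ("NetworkIn high on stg-worker", "LogAnalysisPage high"): 0.7,
}

predicted = predict_events(relations, "CPU high on stg-app")
print(predicted)
```

With the default threshold, only the LoadAvg5 relation is reported for the CPU event, because the other relation with the same cause falls below `min_probability`.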
For both Metric and Log projects, the Causal Graph shows the causal relationships or correlations between historically detected events. The events must be Metric or Log events. The Causal Graph settings are configured in the Causal Analysis. Once they are set, the system automatically links the causal analysis results with the events.
The Causal Graph helps users quickly find the possible root causes of an event. For example, in the figure below, a higher-than-normal LogAnalysisPage metric is mostly caused by higher-than-normal NetworkIn on stg-worker and higher-than-normal MemUsed on stg-app.
For metric or log projects, we can obtain a variety of anomalies in the data, such as detected or predicted metric anomalies, rare log events, and hot/cold log events. By analyzing how these anomalies relate to one another in time, in space (host instance), by type, and so on, we can identify the causal relationships and correlations between anomalies or between the corresponding host instances, applications, metrics, etc. Identifying these relationships automatically gives us an effective way to find, locate, analyze, and predict problems.
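As a rough illustration of relating anomalies in time, the sketch below counts how often one anomaly type is followed by another within a fixed time window. It is a simplified stand-in for the idea, not the algorithm the product uses: the `follow_ratios` function, the event labels, the timestamps, and the 60-second window are all assumptions made for the example. A high ratio only indicates temporal precedence and does not by itself establish causation.

```python
from collections import defaultdict


def follow_ratios(events, window_seconds):
    """Estimate how often anomaly A is followed by anomaly B within a time window.

    events is a list of (timestamp_seconds, label) tuples.
    Returns {(a, b): fraction of occurrences of a followed by b within the window}.
    """
    events = sorted(events)
    occurrences = defaultdict(int)
    followed = defaultdict(int)

    for i, (t_a, a) in enumerate(events):
        occurrences[a] += 1
        seen = set()  # count each follower label at most once per occurrence of a
        for t_b, b in events[i + 1:]:
            if t_b - t_a > window_seconds:
                break  # events are sorted, so later ones are outside the window too
            if b != a and t_b > t_a:
                seen.add(b)
        for b in seen:
            followed[(a, b)] += 1

    return {pair: count / occurrences[pair[0]] for pair, count in followed.items()}


# Hypothetical anomaly timeline (seconds since start), for illustration only.
events = [
    (0, "CPU high on stg-app"),
    (30, "LoadAvg5 high on stg-cnn"),
    (600, "CPU high on stg-app"),
    (640, "LoadAvg5 high on stg-cnn"),
    (1200, "CPU high on stg-app"),
]

ratios = follow_ratios(events, window_seconds=60)
print(ratios)
```

In this made-up timeline, two of the three CPU anomalies are followed by a LoadAvg5 anomaly within 60 seconds, so the pair gets a ratio of 2/3, while the reverse direction never occurs and is absent from the result.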
Moreover, by combining KPI or SLA requirements with the causal analysis results, we can work backward from the KPI or SLA to deduce the requirements for individual metrics. This deduction helps us quickly and accurately locate the best optimization point for improving quality of service.