Welcome to InsightFinder AI Observability Docs!

Traceviewer

LLM Traces Page is a powerful and intuitive tool designed to help users understand anomalous behavior in your LLM and ML models. Powered by InsightFinder’s advanced anomaly detection engine, the LLM Traces Page automatically identifies and surfaces unusual patterns, performance issues, and problematic behaviors in AI systems and Models. Whether you’re a data scientist investigating model degradation, a developer troubleshooting unexpected AI agent responses, a DevOps engineer monitoring ML system health, or a team lead ensuring reliable AI service performance, AW LLm Traces Page gives you focused visibility to improve efficiently.

What are LLM Traces?

An LLM trace is a timeline of events that occur when a large language model processes a request or generates a response.

Every trace is made up of spans, which represent individual operations or steps within that process. For example, when a user submits a prompt, you might have spans for prompt parsing, model tokenization, retrieving context or embeddings, running the forward pass through the model, applying safety or relevance checks, and streaming the generated tokens back to the user. Each span includes details such as when it started, how long it took, and what computations or interactions occurred during that stage.

LLM traces can capture many different types of activities. Model traces show the flow of input through tokenization, inference, and generation. System traces record lower-level operations like GPU kernel executions, memory transfers, or API calls to external tools. In multi-component or agentic LLM systems, traces can follow a request across different steps such as retrieval, reasoning, tool use, and response synthesis—providing end-to-end visibility into how the model and supporting infrastructure handled the request.

Who can use Traceviewer?

AW LLM Traces page can universally be used by any role/profile and can provide a unique view and insights into application performance depending on the use case. Some examples of use cases:

Data Scientists use AW LLM Traces to quickly identify when their LLM and ML models are behaving unexpectedly. Rather than manually sifting through thousands of normal traces, they can focus on the anomalous ones that indicate model drift, performance degradation, or data quality issues. This targeted approach helps them validate experimental results, detect when models need retraining, and understand the root causes of unexpected model outputs

DevOps and Site Reliability Engineers use AW LLM Traces to proactively monitor AI system health by focusing on anomalous patterns rather than normal operations. The tool highlights unusual service interactions, performance spikes, and error patterns across distributed ML and LLM systems, making it easier to prevent incidents before they impact users and reduce downtime in case of incidents.

System Administrators appreciate AW LLM Traces Page’s ability to automatically flag resource utilization anomalies and unusual performance patterns. When users report issues, administrators can immediately see if there are corresponding anomalous traces related to infrastructure problems, model inference delays, or system bottlenecks

Trace Ingestion

When creating a model in InsightFinder AW, need to have at least 1 data type as ‘Trace’ to ingest traces and perform Trace Analysis on the ingested data.

After choosing the Data Source (OpenTelemetry in this example), a pop up window will allow the users to enter all the endpoint and credentials information that will be needed to configure InsightFinder’s trace collector(s) as well as any additional configuration that might be required.

To learn more information about configuring the collector agents, users can click on the ‘Deploy Agent’ button at the bottom which will redirect the users to the ReadMe documentation for the specific data collectors.

Once the required fields are filled, users can click on the ‘Generate Config’ button that becomes clickable to have InsightFinder generate the configurations required to set up streaming.

If need more information on Model Creation, please refer to the <Model-Creation-document-link>

InsightFinder Integrations with Different Tracing SDKs

Overview

This document covers 4 approaches to integrating LLM agent tracing with InsightFinder. By capturing LLM calls, responses, and metadata, it enables you to monitor and analyze LLM performance and response quality in real time.

1. Arize (via OpenInference)

Automatic LLM instrumentation
Child spans for each API call
Best for: Modern Python agents, cleaner code

2. Temporal SDK

Distributed workflow framework
Built-in tracing interceptor
Best for: Complex async workflows, Temporal users

3. DDTrace (Datadog)

Lightweight auto-instrumentation
Environment variable configuration
Best for: Existing Datadog users, quick setup

4. Pure OpenTelemetry (OTel)

Explicit manual span creation
Full control, minimal magic
Best for: Simple agents, explicit logging

Arize Tracing SDK Integration with InsightFinder

Overview

This section explains how to integrate the Arize tracing SDK into agent systems to send traces to InsightFinder for LLM prompt/response monitoring and analysis. The Arize tracing sdk SDK provides automatic instrumentation of LLM calls with OpenInference and OpenTelemetry while maintaining full compatibility with InsightFinder’s OTLP backend.

NOTE: Examples in this document use OpenAI as the LLM provider for reference, but can be extended to any LLM provider library that is supported by Arize’s openinference-instrumentation-* libraries.

Architecture

The system consists of the following components:

1. Agent Application (Your Code)

↓ OpenTelemetry spans (OpenInference-instrumented)

2. Instrumentation Layer

→ Auto-captures LLM calls, responses, and metadata

→ Adds standard LLM attributes (llm.prompt, llm.response, etc.)

↓ OTLP/gRPC

3. OTLPSpanExporter with InsightFinder Headers

→ Exports spans with InsightFinder authentication

↓ gRPC to InsightFinder Trace Server

4. InsightFinder Platform

→ Analysis & visualization of traces and LLM interactions

Prerequisites

Required Dependencies

Install the following Python packages:
opentelemetry-api>=1.20.0
opentelemetry-sdk>=1.20.0
opentelemetry-exporter-otlp-proto-grpc>=0.41b0
# Depending on llm provider used, import the corresponding package
openinference-instrumentation-openai

Implementation Steps

Step 1: Import Required Modules

Add these imports to your agent code:

import os
import uuid
import openai # OpenTelemetry imports
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.resources import Resource# OpenInference instrumentation for automatic LLM tracing
from openinference.instrumentation.openai import OpenAIInstrumentor

Step 2: Configure OpenTelemetry Provider

Set up the tracer provider with InsightFinder headers and OpenInference instrumentation:

# Configure InsightFinder authentication headers
insightfinder_headers = {
“ifuser”: os.getenv(“INSIGHTFINDER_USER”),
“iflicensekey”: os.getenv(“INSIGHTFINDER_LICENSE_KEY”),
“ifproject”: os.getenv(“INSIGHTFINDER_PROJECT”),
“ifsystem”: os.getenv(“INSIGHTFINDER_SYSTEM”),
} # Configure resource with service information
resource = Resource.create({
“service.name”: “llm-agent”,
“service.version”: “1.0.0”,
})# Create OTLP exporter with InsightFinder backend
otlp_exporter = OTLPSpanExporter(
endpoint=os.getenv(“INSIGHTFINDER_ENDPOINT”, “localhost:4317”),
insecure=False, # Set to True if not using TLS
headers=insightfinder_headers, # Include InsightFinder authentication
)

# Create and configure tracer provider
tracer_provider = TracerProvider(resource=resource)
tracer_provider.add_span_processor(BatchSpanProcessor(otlp_exporter))

# Register globally
trace.set_tracer_provider(tracer_provider)

# Get the tracer
tracer = trace.get_tracer(__name__)

Step 3: Instrument LLM Client with OpenInference

Enable automatic tracing of all OpenAI API calls using OpenInference:

# Instrument OpenAI client with OpenInference (modern approach)
# This automatically captures all LLM calls (prompts, responses, tokens, etc.)
# and creates detailed child spans for each OpenAI API call
OpenAIInstrumentor().instrument(tracer_provider=tracer_provider) # Initialize OpenAI client – all calls are now automatically traced
client = openai.OpenAI(api_key=os.getenv(“OPENAI_API_KEY”))

Once instrumented, OpenInference automatically captures:

✓ LLM prompts and responses (in a dedicated `openai.chat.completions` span)

✓ Token counts (prompt, completion, total)

✓ Model name and parameters

✓ API latency and errors

✓ Request/response timing

✓ Detailed hierarchical span structure

Step 4: Create Manual Spans for Workflow Control

While OpenInference library auto-instruments LLM calls, you can create manual spans for application logic:

def your_workflow(user_query):
“””Main workflow – spans are automatically created and hierarchies maintained.””” with tracer.start_as_current_span(“your_workflow”) as span:# add session and user context to parent span
span.set_attribute(“x-session-id”, session_id)

span.set_attribute(“x-username”, user_id)

# Step 1: Gather context (contains auto-instrumented LLM call)
context = gather_context(user_query)
span.set_attribute(“step_1_complete”, True)

# Step 2: Analyze (contains auto-instrumented LLM call)
analysis = analyze_data(user_query, context)
span.set_attribute(“step_2_complete”, True)

span.set_attribute(“workflow_status”, “success”)

return {
“query”: user_query,
“context”: context,
“analysis”: analysis
}

Attributes for InsightFinder:

Attribute	Purpose	When to Set
x-username (Optional)	User identifier	Before LLM call
x-session-id (Optional)	Session/workflow correlation ID	Before LLM call

Step 5: Call LLM Functions (Automatically Traced)

Once OpenInference instrumentation is enabled, all LLM calls are automatically traced:

def ask_llm(prompt, model=”gpt-3.5-turbo”):
“””
Call OpenAI – OpenInference automatically captures:
– llm.prompt (input prompt in child span)
– llm.response (output response in child span)
– llm.token_count_prompt (input tokens in child span)
– llm.token_count_completion (output tokens in child span)
– llm.token_count_total (total tokens in child span)
– llm.model (model name in child span)
“”” with tracer.start_as_current_span(“ask_llm”) as span:
# Add session and user context to span
span.set_attribute(“x-session-id”, session_id)span.set_attribute(“x-username”, user_id)

# This call triggers OpenInference instrumentation
# which creates a child span “openai.chat.completions” with all LLM details
response = client.chat.completions.create(
model=model,
messages=[{“role”: “user”, “content”: prompt}],
max_tokens=150,
temperature=0.7,
)

# OpenInference instrumentation automatically adds LLM attributes to child span
# You can still add custom attributes to parent span if wanted
span.set_attribute(“request_successful”, True)

return response.choices[0].message.content.strip()

How It Works

Automatic Instrumentation Flow

1. Application Start: OpenInference instrumentation is registered with OpenAI client

2. OpenAI Call: When client.chat.completions.create() is called:

├─ OpenInference interceptor is triggered

├─ Input prompt is captured

├─ API call is executed

├─ Response and metadata are captured

└─ Child span “openai.chat.completions” is created with all LLM attributes

3. Span Hierarchy: OpenInference creates detailed parent-child structure:

your_workflow (manual span – your code)

├── gather_context (manual span – your code)

│ └── ask_llm (manual span – your code)

│ └── openai.chat.completions (AUTO CHILD SPAN by OpenInference!)

│ • llm.prompt: “Provide context about…”

│ • llm.response: “Here’s the context…”

│ • llm.token_count_prompt: 15

│ • llm.token_count_completion: 89

│ • llm.token_count_total: 104

└── analyze_data (manual span – your code)

└── ask_llm (manual span – your code)

└── openai.chat.completions (AUTO CHILD SPAN by OpenInference!)

llm.prompt: “Analyze this…”
llm.response: “Analysis results…”
llm.token_count_prompt: 120
llm.token_count_completion: 156
llm.token_count_total: 276

4. Export: Spans (including auto-generated child spans) are batched and sent to InsightFinder via OTLP with headers

5. Processing: InsightFinder extracts LLM interactions from the detailed span hierarchy

6. Visualization: Data is analyzed and displayed in InsightFinder dashboard

Temporal SDK Tracing Integration with InsightFinder

Overview

This section explains how to integrate OpenTelemetry tracing into Temporal-based agent systems to send traces to InsightFinder for LLM prompt/response monitoring and analysis. This approach uses Temporal SDK’s built-in TracingInterceptor for automatic distributed tracing while adding custom attributes for LLM-specific data extraction.

Architecture

The system consists of the following components:

1. Temporal Agent (Your Code)

↓ OpenTelemetry spans with custom attributes

2. TracingInterceptor (Temporal SDK)

→ Provides automatic Temporal tracing

↓ OTLP/gRPC

3. Trace Server (InsightFinder)

→ Extracts prompt/response pairs and formats data for InsightFinder

↓ HTTP API

4. InsightFinder Platform

→ Analysis & visualization

Prerequisites

Required Dependencies

Install the following Python packages:

temporalio
opentelemetry-api
opentelemetry-sdk
opentelemetry-exporter-otlp-proto-grpc

Implementation Steps

Step 1: Import Required Modules

Add these imports to your Temporal agent code:

import os
import uuid # OpenTelemetry imports
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.resources import Resource# Temporal OpenTelemetry integration
from temporalio.contrib.opentelemetry import TracingInterceptor

# Temporal SDK
from temporalio.client import Client
from temporalio import activity, workflow

Step 2: Configure OpenTelemetry Provider

Create a function to set up the OpenTelemetry tracer with InsightFinder headers:

def setup_opentelemetry():
“””Configure OpenTelemetry with OTLP exporter for InsightFinder trace server.””” # Create resource with service information
resource = Resource.create({
“service.name”: “agent-service-name”,
“service.version”: “1.0.0”,
})# Read configuration from environment
endpoint = os.getenv(“OTEL_EXPORTER_OTLP_ENDPOINT”, “https://localhost:4317”)

# InsightFinder authentication headers (sent as gRPC metadata)
headers = {
“ifuser”: “<INSIGHTFINDER_USER>”,
“iflicensekey”: “<INSIGHTFINDER_LICENSE_KEY>”,
“ifproject”: “<INSIGHTFINDER_PROJECT_NAME>”,
“ifsystem”: “<INSIGHTFINDER_SYSTEM_NAME>”,
}

# Create OTLP exporter
otlp_exporter = OTLPSpanExporter(
endpoint=endpoint,
headers=headers,
insecure=False # Set to False if using TLS
)

# Set up tracer provider with batch span processor
provider = TracerProvider(resource=resource)
processor = BatchSpanProcessor(otlp_exporter)
provider.add_span_processor(processor)

# Set as global tracer provider
trace.set_tracer_provider(provider)

return trace.get_tracer(__name__)

Key Configuration Points:

Resource: Identifies your service in traces
Headers: InsightFinder authentication (passed as gRPC metadata)
Endpoint: Your trace server URL (OTLP over gRPC)
BatchSpanProcessor: Batches spans for efficient export

Step 3: Initialize Temporal Client with TracingInterceptor

In your main function, connect the Temporal client with the tracing interceptor:

async def main():
# Set up OpenTelemetry
tracer = setup_opentelemetry() # Connect to Temporal with tracing interceptor
client = await Client.connect(
os.getenv(“TEMPORAL_ADDRESS”, “localhost:7233”),
interceptors=[TracingInterceptor(tracer)] # ← Enable automatic tracing
)# Start your workflow
result = await client.execute_workflow(
YourWorkflow.run,
“your-input”,
id=f”workflow-{uuid.uuid4()}”,
task_queue=”your-task-queue”,
)

return result

if __name__ == “__main__”:
import asyncio
asyncio.run(main())

Step 4: Add Custom Attributes to Activities

For any activity that makes LLM calls, add custom attributes to the current span:

@activity.defn
async def your_llm_activity(task: str) -> str:
“””Example activity that calls an LLM.””” # Get the current span created by TracingInterceptor
current_span = trace.get_current_span()# Prepare the LLM prompt (This is a placeholder, prompt should contain actual prompt to LLM)
user_prompt = f”Your prompt text here: {task}”

# Add attributes BEFORE making the LLM call
if current_span:
# The input prompt sent to the LLM
current_span.set_attribute(“chat.prompt”, user_prompt)

# User and session identification (Optional)
current_span.set_attribute(“x-username”,”workflow-user”)

workflow_id = activity.info().workflow_id
current_span.set_attribute(“x-session-id”,)

try:
# Make your LLM call (example with OpenAI)
response = await your_llm_client.chat.completions.create(
model=”gpt-4o”,
messages=[{“role”: “user”, “content”: user_prompt}]
)

llm_response = response.choices[0].message.content

# Add response and token usage AFTER receiving the response
if current_span:
current_span.set_attribute(“chat.response”, llm_response)
current_span.set_attribute(“chat.model”, response.model)

# Token usage (if available)
if response.usage:
current_span.set_attribute(“chat.prompt_tokens”, response.usage.prompt_tokens)
current_span.set_attribute(“chat.completion_tokens”, response.usage.completion_tokens)
current_span.set_attribute(“chat.total_tokens”, response.usage.total_tokens)

return llm_response

except Exception as e:
# Add error information to span
if current_span:
current_span.set_attribute(“error”, str(e))
raise

Required Attributes for InsightFinder:

Attribute	Purpose	When to Set
chat.prompt	Input prompt sent to LLM	Before LLM call
chat.response	LLM’s response	After LLM call
x-username (Optional)	User identifier	Before LLM call
x-session-id (Optional)	Session/workflow correlation ID	Before LLM call
chat.prompt_tokens	Token usage (input)	After LLM call
chat.completion_tokens	Token usage (output)	After LLM call
chat.total_tokens	Total token usage	After LLM call
chat.model	Model used	After LLM call
error	Error message (if any)	On exception

How It Works

─

Trace Flow

1. Client Start: When you execute a workflow, TracingInterceptor creates a root span

2. Workflow Execution: A child span is created for the workflow execution

3. Activity Execution: A child span is created for each activity

Automatic: Temporal metadata (workflow_id, activity_type, etc.)
Manual: Your custom LLM attributes (prompt, response, tokens)

4. Export: Spans are batched and sent to trace server via OTLP/gRPC

5. Processing: InsightFinder extracts prompt/response pairs based on span attributes

6. Storage: Data is sent to InsightFinder for analysis

Span Hierarchy Example

workflow-execution (span.kind=server)

└── StartWorkflow:YourMainWorkflow

└── StartChildWorkflow:YourSubWorkflow

└── RunActivity:your_llm_activity (span.kind=server) ← Your custom attributes here

chat.prompt: “Analyze this task…”
chat.response: “Based on analysis…”
chat.prompt_tokens: 45
chat.completion_tokens: 120

Why TracingInterceptor is Important

The TracingInterceptor provides:

✓ Automatic span creation for workflows and activities

✓ Trace context propagation across Temporal boundaries

✓ Parent-child relationships maintaining execution hierarchy

✓ Standard OpenTelemetry attributes (span.kind, service.name, etc.)

✓ Temporal-specific attributes (workflow_id, run_id, activity_type)

Without it, you would need to manually:

Create spans for each operation
Pass trace context as workflow/activity parameters
Manage parent-child span relationships
Set all standard attributes yourself

DataDog Trace SDK Integration with InsightFinder

Overview

This guide explains how to integrate Datadog’s ddtrace library into your agent systems to send distributed traces to InsightFinder for LLM prompt/response monitoring and analysis. This approach uses ddtrace’s automatic instrumentation combined with custom span tags for LLM-specific data extraction.

NOTE: Examples in this document use OpenAI as the LLM provider for reference, but can be extended to any LLM provider library that is supported by ddtrace instrumentation.

Architecture

The system consists of the following components:

1. Agent Code (Your Python Application)

↓ LLM client calls (auto-instrumented by ddtrace)

↓ Custom span tags (prompt, response, tokens)

2. DDTrace (Python Instrumentation Library)

→ Auto-instruments OpenAI library

→ Captures LLM interactions

↓ GRPC to InsightFinder Trace Collector

3. InsightFinder Trace Collector

→ Receives traces from ddtrace

→ Forwards to InsightFinder

↓ HTTP/gRPC API

4. InsightFinder Platform

→ Processes trace data

→ Extracts prompt/response pairs

→ Analysis & visualization

Prerequisites

──

Required Dependencies

Install the following Python packages:

ddtrace>=1.0.0

DDTrace Configuration

Method 1: Environment Variables (Recommended)

# Trace backend configuration
export DD_TRACE_ENABLED=true
export DD_SERVICE=”<agent-service-name>”
export DD_ENV=”production”# Trace backend – set to InsightFinder’s trace endpoint
export DD_AGENT_HOST=”<InsightFinder-Trace-Endpoint>”
export DD_TRACE_AGENT_PORT=”<InsightFinder-Trace-Port>”

Method 2: In-Code Configuration

from ddtrace import tracertracer.configure(
service=”<agent-service-name>”,
hostname=”<InsightFinder-Trace-Endpoint>”,
port=<InsightFinder-Trace-Port>,
https=True/False,
)

Implementation Steps

──

Step 1: Import Required Modules

Add these imports to your agent code:

import os
from ddtrace import tracer, patch

Key Components

ddtrace.tracer: Core tracing object for creating and managing spans
ddtrace.patch(): Enables automatic instrumentation of libraries
os: For accessing environment variables

Step 2: Enable Automatic Library Instrumentation

Enable the auto instrumentation of the library using ddtrace patch.

# Patch OpenAI to enable automatic tracing
patch(openai=True)# Initialize OpenAI client
OPENAI_API_KEY = os.getenv(“OPENAI_API_KEY”)
client = openai.OpenAI(api_key=OPENAI_API_KEY)

Step 3: Create LLM Call Function with Custom Tags

Using the ddtrace tracer, wrap LLM calls in a traced function with custom span tags for InsightFinder:

def ask_openai(prompt, model=”gpt-3.5-turbo”):
# Create a custom span for the function
with tracer.trace(“ask_openai”, service=”llm-agent”) as span: # Tag the span metadata and prompt for InsightFinderspan.set_tag(“span.kind”, “internal”)
span.set_tag(“chat.prompt”, prompt)

span.set_tag(“x-username”,”agent-user”)

span.set_tag(“”x-session-id”, session_id)

# Make the LLM call
response = client.chat.completions.create(
model=model,
messages=[{“role”: “user”, “content”: prompt}],
max_tokens=150,
temperature=0.7,
)

answer = response.choices[0].message.content.strip()

# Tag the response content for InsightFinder
span.set_tag(“chat.response”, answer)

# Add token usage metadata
span.set_tag(“chat.completion_tokens”, response.usage.completion_tokens)
span.set_tag(“chat.prompt_tokens”, response.usage.prompt_tokens)
span.set_tag(“chat.total_tokens”, response.usage.total_tokens)

# Add model information
span.set_tag(“chat.model”, model)

return answer

Required Attributes for InsightFinder:

Attribute	Purpose	When to Set
span.kind	Identifies Type of span for processing	Before LLM Call
chat.prompt	Input prompt sent to LLM	Before LLM call
chat.response	LLM’s response	After LLM call
x-username (Optional)	User identifier	Before LLM call
x-session-id (Optional)	Session/workflow correlation ID	Before LLM call
chat.prompt_tokens	Token usage (input)	After LLM call
chat.completion_tokens	Token usage (output)	After LLM call
chat.total_tokens	Total token usage	After LLM call
chat.model	Model used	After LLM call

Step 4: Set Attributes for Parent Spans

For multi-step agent workflows, add the required attributes along with any other wanted:

def agentic_workflow(user_query):
“””
Parent span containing multi-step agent reasoning and LLM calls
“””
with tracer.trace(“agentic_workflow”, service=”llm-agent”) as parent_span: parent_span.set_tag(“span.kind”, “internal”)parent_span.set_tag(“x-username”,”agent-user”)

parent_span.set_tag(“”x-session-id”, session_id)

…

# Step 1: Reasoning (internal step, no LLM call)
with tracer.trace(“agent_reasoning_step”, service=”llm-agent”) as span:
…

# Step 2: Context gathering (LLM call 1)
with tracer.trace(“agent_gather_context”, service=”llm-agent”) as span:
…

# Step 3: Synthesis (LLM call 2)
with tracer.trace(“agent_synthesize_insight”, service=”llm-agent”) as span:
…

Required Attributes for InsightFinder:

Attribute	Purpose	When to Set
span.kind	Identifies Type of span for processing	Before LLM call
x-username	User identifier	Before LLM call
x-session-id	Session/workflow correlation ID	Before LLM call
chat.model	Model used	After LLM call

Step 5: Add Error Handling

Tag errors into spans if wanting to record and analyze the errors:

def ask_openai_with_error_handling(prompt, model=”gpt-3.5-turbo”):
with tracer.trace(“ask_openai”, service=”llm-agent”) as span: …try:

…

return answer

except Exception as e:
# Tag error information
span.set_tag(“error”, str(e))
span.set_tag(“error.type”, type(e).__name__)
raise

Required Attributes for InsightFinder:

Attribute	Purpose	When to Set
error	Capture error during LLM call	After LLM call

How It Works

Trace Flow

1. Initialization: Your code imports ddtrace and patches OpenAI
2. Environment Setup: DDTrace reads `DD_AGENT_HOST`, `DD_TRACE_AGENT_PORT`, etc. from environment
3. Span Creation: `tracer.trace()` creates spans
4. Auto–Instrumentation**: DDTrace automatically captures all OpenAI API calls (via the patch)
5. Tag Addition: You explicitly set custom tags like `chat.prompt` and `chat.response`
6. Span Export: Spans are batched and sent to trace collector
7. Collection: Trace collector receives and processes the traces
8. Forwarding: Trace collector forwards to InsightFinder
9. Processing: InsightFinder extracts prompt/response pairs and processes them

Span Hierarchy Example

agentic_workflow (parent span)

├── agent_reasoning_step (internal, no LLM)

│ └── tags: step, query, agent.reasoning

├── agent_gather_context (LLM call 1)

│ └── ask_openai (child span)

│ ├── chat.completions.create (auto-instrumented by patch)

│ └── tags: chat.prompt, chat.response, chat.prompt_tokens, etc.

└── agent_synthesize_insight (LLM call 2)

└── ask_openai (child span)

├── chat.completions.create (auto-instrumented by patch)

└── tags: chat.prompt, chat.response, chat.prompt_tokens, etc.

OpenTelemetry SDK Tracing Integration with InsightFinder

Overview

This guide explains how to integrate OpenTelemetry tracing into agent systems to send traces to InsightFinder for LLM prompt/response monitoring and analysis. This approach uses OpenTelemetry SDK’s manual span instrumentation for complete control over distributed tracing while adding custom attributes for LLM-specific data extraction.

Architecture

The system consists of the following components:

1. Agent Application (Your Code)

↓ OpenTelemetry spans with custom attributes

2. TracerProvider (OpenTelemetry SDK)

→ Manages tracer and span processors

↓ OTLP/gRPC

3. OTLPSpanExporter (OpenTelemetry)

→ Exports spans with InsightFinder headers

↓ gRPC to InsightFinder Trace Server

4. InsightFinder Platform

→ Analysis & visualization of traces and LLM interactions

Prerequisites

Required Dependencies

Install the following Python packages:

opentelemetry-api>=1.20.0opentelemetry-sdk>=1.20.0

opentelemetry-exporter-otlp-proto-grpc>=0.41b0

Implementation Steps

──

Step 1: Import Required Modules

Add these imports to your Temporal agent code:

Step 2: Configure OpenTelemetry Provider

Create a function to set up the OpenTelemetry tracer with InsightFinder headers:

def setup_opentelemetry():
“””Configure OpenTelemetry with OTLP exporter for InsightFinder trace server.””” # Create resource with service information
resource = Resource.create({
“service.name”: “agent-service”,
“service.version”: “1.0.0”,
})# Read configuration from environment
endpoint = os.getenv(“OTEL_EXPORTER_OTLP_ENDPOINT”, “localhost:4317”)

# InsightFinder authentication headers (sent as gRPC metadata)
headers = {
“ifuser”: os.getenv(“INSIGHTFINDER_USER”),
“iflicensekey”: os.getenv(“INSIGHTFINDER_LICENSE_KEY”),
“ifproject”: os.getenv(“INSIGHTFINDER_PROJECT_NAME”),
“ifsystem”: os.getenv(“INSIGHTFINDER_SYSTEM_NAME”),
}

# Create OTLP exporter
otlp_exporter = OTLPSpanExporter(
endpoint=endpoint,
headers=headers,
insecure=True/False # Set to True for non-TLS endpoints
)

# Set up tracer provider with batch span processor
provider = TracerProvider(resource=resource)
processor = BatchSpanProcessor(otlp_exporter)
provider.add_span_processor(processor)

# Set as global tracer provider
trace.set_tracer_provider(provider)

return trace.get_tracer(__name__)

Key Configuration Points:

Resource: Identifies your service in traces
Headers: InsightFinder authentication (passed as gRPC metadata)
Endpoint: Your trace server URL (OTLP over gRPC)
BatchSpanProcessor: Batches spans for efficient export

Configuration via Environment Variables

You can override configuration using environment variables:

# Exporter endpoint
export OTEL_EXPORTER_OTLP_ENDPOINT=trace.insightfinder.com:4317 # InsightFinder credentials
export INSIGHTFINDER_USER=your_user
export INSIGHTFINDER_LICENSE_KEY=your_license_key
export INSIGHTFINDER_PROJECT_NAME=your_project
export INSIGHTFINDER_SYSTEM_NAME=your_system# Service identification
export OTEL_SERVICE_NAME=agent-service
export OTEL_SERVICE_VERSION=1.0.0

Step 3: Initialize Tracer in Your Application

Initialize the tracer at application startup:

# At module level or in your main function tracer = setup_opentelemetry() session_id = str(uuid.uuid4())

Step 4: Create Spans for Your Workflow

Use context managers to create and manage spans. Example:

def your_workflow(user_query):
“””Main workflow that coordinates agent tasks.””” # Create a parent span for the entire workflow
with tracer.start_as_current_span(“your_workflow”) as span:span.set_attribute(“x-username”, agent-user)
span.set_attribute(“x-session-id”, session_id)

# Each function has its own tracer creating spans as needed, shown in step 5

# Step 1: Reasoning
reasoning = agent_reasoning_step(user_query)
span.set_attribute(“step_1_reasoning”, “complete”)

# Step 2: Context gathering with LLM call
context = agent_gather_context(user_query)
span.set_attribute(“step_2_context_gathered”, “complete”)

# Step 3: Insight synthesis with LLM call
insight = agent_synthesize_insight(user_query, context)
span.set_attribute(“step_3_insight_synthesized”, “complete”)

# Final summary
span.set_attribute(“workflow_status”, “success”)
span.set_attribute(“total_llm_calls”, 2)

return {
“query”: user_query,
“reasoning”: reasoning,
“context”: context,
“insight”: insight
}

Step 5: Add Custom Attributes to Activities

For any activity that makes LLM calls, add custom attributes to the current span:

def ask_llm(prompt, model=”gpt-3.5-turbo”):
“””Call LLM and track the interaction in traces.””” # Create a span for this LLM operation
with tracer.start_as_current_span(“ask_llm”) as span:
span.set_attribute(“x-username”, agent-user)
span.set_attribute(“x-session-id”, session_id)# Add the input prompt BEFORE making the LLM call
span.set_attribute(“chat.prompt”, prompt)

try:
# Make your LLM call (example with OpenAI)
response = client.chat.completions.create(
model=model,
messages=[{“role”: “user”, “content”: prompt}],
max_tokens=150,
temperature=0.7,
)

llm_response = response.choices[0].message.content.strip()

# Add response and metadata AFTER receiving the response
span.set_attribute(“chat.response”, llm_response)
span.set_attribute(“chat.model”, model)

# Token usage information
if response.usage:
span.set_attribute(“chat.prompt_tokens”, response.usage.prompt_tokens)
span.set_attribute(“chat.completion_tokens”, response.usage.completion_tokens)
span.set_attribute(“chat.total_tokens”, response.usage.total_tokens)

return llm_response

except Exception as e:
# Add error information to span
span.set_attribute(“error”, str(e))
span.set_attribute(“error.type”, type(e).__name__)
raise

Required Attributes for InsightFinder:

Attribute	Purpose	When to Set
chat.prompt	Input prompt sent to LLM	Before LLM call
chat.response	LLM’s response	After LLM call
x-username (Optional)	User identifier	Before LLM call
x-session-id (Optional)	Session/workflow correlation ID	Before LLM call
chat.prompt_tokens	Token usage (input)	After LLM call
chat.completion_tokens	Token usage (output)	After LLM call
chat.total_tokens	Total token usage	After LLM call
chat.model	Model used	After LLM call
error	Error message (if any)	On exception

How It Works

─

Trace Flow

1. Application Start: TracerProvider is initialized with OTLP exporter and InsightFinder headers

2. Workflow Execution: A root span is created for the entire workflow

3. Sub-operations: Child spans are created for each operation (reasoning, context gathering, etc.)

Automatic: Standard OpenTelemetry attributes (span.kind, service.name, etc.)
Manual: Your custom LLM attributes (prompt, response, tokens)

4. Export: Spans are batched and sent to trace server via OTLP/gRPC with InsightFinder metadata headers

5. Processing: InsightFinder extracts prompt/response pairs based on span attributes

6. Storage: Data is sent to InsightFinder for analysis and visualization

Span Hierarchy Example

your_workflow (root span)

├── agent_reasoning_step (child span)

│ └── [reasoning logic]

├── agent_gather_context (child span)

│ └── ask_llm (child span) ← LLM interaction

│ • chat.prompt: “Provide key facts about: quantum computing”

│ • chat.response: “Quantum computing uses quantum mechanics…”

│ • chat.prompt_tokens: 15

│ • chat.completion_tokens: 89

│ • chat.total_tokens: 104

└── agent_synthesize_insight (child span)

└── ask_llm (child span) ← LLM interaction

chat.prompt: “Given this context… What are implications?”
chat.response: “Key implications include…”
chat.prompt_tokens: 120
chat.completion_tokens: 156
chat.total_tokens: 276

Context Propagation

OpenTelemetry automatically maintains context across span creation:

# Context is automatically propagated to child spzans
with tracer.start_as_current_span(“parent”) as parent_span:
parent_span.set_attribute(“operation”, “workflow”) # This child span automatically has parent context
with tracer.start_as_current_span(“child”) as child_span:
child_span.set_attribute(“operation”, “task”)
# child_span is automatically linked as a child of parent_span

Iftracer-sdk

The iftracer-sdk is InsightFinder’s Python SDK built on top of OpenTelemetry that empowers developers to monitor and debug Large Language Model (LLM) executions with minimal intrusion. It enables distributed tracing of prompt calls, LLM responses, vector database interactions, and more—while integrating smoothly with your existing observability stack (e.g., Datadog, Honeycomb, etc.)

Setup and Quick Start

Install the SDK (via PyPI):

pip install iftracer-sdk

Initialize in your code, typically at entry point (e.g., __init__.py):

from iftracer.sdk import IftracerIftracer.init(

api_endpoint=”https://otlp.insightfinder.com”,

iftracer_user=”YOUR_USER”,

iftracer_license_key=”YOUR_LICENSE_KEY”,

iftracer_project=”YOUR_TRACE_PROJECT”,

)

Instrument functions using decorators to capture tracing spans automatically:

from iftracer.sdk.decorators import workflow, task@workflow(name=”my_workflow”)

def my_workflow(…):

…

@task(name=”my_task”)

def my_task(…):

…

Choosing Between Iftracer Decorators
- Use @aworkflow, @atask over an asynchronous function. Use @workflow, @task over a synchronous function.
- Use @aworkflow or @workflow when the function calls multiple tasks or workflows and combines their results, when the function is a high-level orchestration of a process, when you need to get more tags, or when you intend to create a logical boundary for a workflow execution. Otherwise, use @atask or @task.

Once configured, each decorated function generates trace data—including LLM model names, embedding model details, and RAG datasets as tags—allowing for full-stack observability and real-time analytics via the InsightFinder platform

InsightFinder LLM Labs Trace Projects

When using InsightFinder LLM Labs, a dedicated trace project/model is automatically created and populated with your LLM traces. This eliminates the need for manual project setup—every LLM request, response, and intermediate step is automatically ingested into the trace model.

The resulting project in InsightFinder provides:

End-to-end visibility of prompt execution and downstream dependencies.
Automatic correlation of LLM spans with context such as datasets, model type, and evaluation results.
Real-time monitoring of performance, failures, and safety issues.

This means developers can start analyzing and troubleshooting LLM workflows immediately, without extra configuration.

InsightFinder Trace Analysis

LLM Traces Page

From anywhere on the InsightFinder AW UI, click on the LLM Traces on the side navigation bar to go to the AW LLM Traces Page.

Once on the AW LLM Traces Page, Select the model from the drop down (or can type it in to narrow the search).

Once the model is selected, all the trace data that was detected as anomalous by InsightFinder WatchTower will be displayed for that model’s traffic.

Filtering

By default all the anomalous traces for the past 24 hours are loaded and displayed. The search can be further filtered based on the following selections:

Time: Time and Date range for which to load the data. Can manually update the time by editing the time for start and end dates.

Duration(s): Filter based on the time duration of the traces

Token Count: Filter based on minimum and maximum token count per trace

Analyzing the Data

Once the data is loaded, a holistic view of the total token count, total anomaly count, total traces and spans can be seen on top of the page.

Each anomalous trace can also be further analyzed by looking at the individual spans within the trace by clicking on the > to show the respective Spans for the Trace

Each Span within the trace has its own analysis performed by Insightfinder to identify issues on both trace level and specific spans

The following information is available to see immediately once the data is loaded:

Anomaly Score: A measure of how big of an anomaly the trace is. Higher the score, the worse the anomaly is, determined by InsightFinder AW. Can be sorted on by clicking the header
Anomaly Type: The type of anomaly that was detected by InsightFinder. There are 4 major categories of anomalies for trace data (The anomaly type can also be identified by hovering over the icons, figure 1)
- Trace Response Time: If the full trace took longer than normal or threshold
- Span Response Time: If a particle (or multiple) spans within the trace took longer than normal/threshold
- Span No Prompt Response: If there is no response returned for a particular span
- Span Errors: If the trace data has errors set to true in span data
Timestamp: Timestamp for the anomalous Trace and/or Span. Can be sorted
Trace / Span ID: The Trace ID and/or Span ID for the anomalous Trace/Span
Duration: How long the specific trace/span lasted. Can be sorted on by clicking the header
Token Count: Total number of tokens used for the model per Trace call
Token Size: Total size of the tokens for the LLM/ML Model during the Trace call
Json: The Trace log in json format
- Copy Code – Copies the full json trace log
- Download – Downloads the log as a json file

Sensitive Data Filtering in TraceServer

InsightFinder’s TraceServer includes built-in sensitive data filtering to help protect privacy and comply with legal requirements. When enabled, the sensitive data filter automatically detects and masks sensitive information—such as social security numbers (SSNs), addresses, and other personal identifiers—within trace data before it is displayed or processed.

How Sensitive Data Filtering Works

Configuration: Sensitive data filtering is enabled by setting sensitiveDataFilterEnabled: true in your TraceServer configuration file. You can customize the types of sensitive data detected using regular expressions under sensitiveDataRegex. (If using the Saas production, this will be handled by the InsightFinder Team)

Detection: The filter scans incoming trace data for patterns matching sensitive information (e.g., SSNs, addresses, medical IDs).

Masking: When sensitive data is detected, it is masked or removed from the trace output, ensuring that such information is not exposed in the UI or logs.

Example: SSN Filtering

Below are two screenshots demonstrating the effect of sensitive data filtering:

Without Sensitive Data Filter (SSN visible):

In this example, the trace input contains an SSN (123-456-7890). Without filtering, the SSN is visible in the trace details.

With Sensitive Data Filter Enabled (SSN masked):

Here, the sensitive data filter is enabled. The SSN in the trace input is detected and masked, and the response indicates that sensitive data handling policies are enforced.

Benefits

Compliance: Helps meet privacy regulations (e.g., GDPR, HIPAA).

Security: Prevents accidental exposure of sensitive information.

Customizable: Easily extend filtering to new data types via regex patterns.

From the Blog

AI Reliability Platform uniting through composite AI

Blogs

InsightFinder 2025 Retrospective: From Observability Insights to Operational Actions

In 2025, the reliability bar kept moving forward. Engineering teams shipped more distributed systems,…

Traditional observability vs AI complexity

Blogs

Why Traditional Observability Fails in AI Production (And What to Do Instead)

AI systems are forcing engineering leaders to confront an uncomfortable reality: the observability practices…

OpenTelemetry AI Observability and InsightFinder

Blogs

OTel me more about using InsightFinder

Teams often presume that trying a new AI observability tool means re-instrumenting code, swapping…

Explore InsightFinder AI

Take InsightFinder AI for a no-obligation test drive. We’ll provide you with a detailed report on your outages to uncover what could have been prevented.

AI Observability

IT Observability

Unified Intelligence Engine - UIE

Integrations

Release Notes

Welcome to InsightFinder AI Observability Docs!

Traceviewer

What are LLM Traces?

Who can use Traceviewer?

Trace Ingestion

InsightFinder Integrations with Different Tracing SDKs

Overview

Arize Tracing SDK Integration with InsightFinder

Overview

Architecture

Prerequisites

Required Dependencies

Implementation Steps

Step 1: Import Required Modules

Step 2: Configure OpenTelemetry Provider

Step 3: Instrument LLM Client with OpenInference

Step 4: Create Manual Spans for Workflow Control

Step 5: Call LLM Functions (Automatically Traced)

How It Works

Automatic Instrumentation Flow

Temporal SDK Tracing Integration with InsightFinder

Overview

Architecture

Prerequisites

Required Dependencies

Implementation Steps

Step 1: Import Required Modules

Step 2: Configure OpenTelemetry Provider

Step 3: Initialize Temporal Client with TracingInterceptor

Step 4: Add Custom Attributes to Activities

How It Works

Trace Flow

Why TracingInterceptor is Important

DataDog Trace SDK Integration with InsightFinder

Overview

Architecture

Prerequisites

Required Dependencies

DDTrace Configuration

Method 1: Environment Variables (Recommended)

Method 2: In-Code Configuration

Implementation Steps

Step 1: Import Required Modules

Key Components

Step 2: Enable Automatic Library Instrumentation

Step 3: Create LLM Call Function with Custom Tags

Step 4: Set Attributes for Parent Spans

Step 5: Add Error Handling

How It Works

Trace Flow

OpenTelemetry SDK Tracing Integration with InsightFinder

Overview

Architecture

Prerequisites

Required Dependencies

Implementation Steps

Step 1: Import Required Modules

Step 2: Configure OpenTelemetry Provider

Configuration via Environment Variables

Step 3: Initialize Tracer in Your Application

Step 4: Create Spans for Your Workflow

Step 5: Add Custom Attributes to Activities

How It Works

Trace Flow

Context Propagation

Iftracer-sdk

Setup and Quick Start

InsightFinder LLM Labs Trace Projects

InsightFinder Trace Analysis

LLM Traces Page

Filtering

Analyzing the Data

Sensitive Data Filtering in TraceServer

How Sensitive Data Filtering Works

Without Sensitive Data Filter (SSN visible):