OpenTelemetry Integration

InsightFinder provides an OTLP/gRPC service that accepts trace and log data from an OpenTelemetry Collector and forwards it to the InsightFinder engine.

The InsightFinder OTLP Service can be deployed in any cloud environment or inside a data center behind a firewall.

Prerequisites

  • Java JRE 21 or higher
  • OpenTelemetry Collector

The InsightFinder OTLP Service requires an OpenTelemetry Collector that is configured to stream data to it.

Setup

OpenTelemetry Collector

Installation

InsightFinder requires an OpenTelemetry Collector to be set up to stream data.

1. Download the OpenTelemetry Collector release from the GitHub releases page.
2. Extract the downloaded tarball:


tar -xvzf otelcol-<platform>-amd64.tar.gz

3. Run the collector with the configuration file:

./otelcol --config=config.yaml

Configuration

The following example configures an OpenTelemetry Collector to receive OTLP data and send it to the InsightFinder server:


receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
processors:
  batch:

exporters:
  # >>> InsightFinder Exporter <<<
  otlp/insightfinder:
    endpoint: otlp.insightfinder.com:4317
    tls:
      insecure: true
    headers:
      ifuser: "userName" # The username from InsightFinder
      iflicenseKey: "123445" # The licenseKey from InsightFinder
      

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp/insightfinder]
    logs:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp/insightfinder]
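
If the InsightFinder OTLP Service is self-hosted rather than reached at otlp.insightfinder.com, the exporter would presumably point at that host instead. The sketch below shows only the exporter section. The host name is a hypothetical placeholder, and port 4317 is taken from the service's example listen port.

```yaml
exporters:
  otlp/insightfinder:
    # Hypothetical internal host running the InsightFinder OTLP Service.
    endpoint: otlp-service.internal.example:4317
    tls:
      # Plaintext gRPC; only suitable when the service runs without TLS.
      insecure: true
    headers:
      ifuser: "userName"       # The username from InsightFinder
      iflicenseKey: "123445"   # The licenseKey from InsightFinder
```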


InsightFinder OTLP Service

Installation

1. Get the precompiled JAR file from the GitHub repo.
2. Create the configuration files server.yaml and data.yaml in the same folder as the JAR file, following the examples in the Configuration section below.
3. Start the service by running:


java -jar insightfinder-otlpserver-1.0.0-SNAPSHOT.jar

Configuration

Server
The server.yaml file controls the gRPC server and the data-processing workers.
Edit it based on the following example:


# InsightFinder Engine URL
insightFinderUrl: "https://stg.insightfinder.com"

# The gRPC server listen port
port: 4317

# TLS/SSL settings.
tls:
  enabled: false

  # Path to full-chain certificate file (PEM)
  certificateFile: ''

  # Path to the private key file.
  privateKeyFile: ''

# Worker Thread Configurations.
worker:
  # DataProcess worker: Extracting information from the original data.
  processThreads: 1

  # Streaming worker: Send data to InsightFinder.
  streamingThreads: 1
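
To enable TLS, the same file can presumably be adapted as in the sketch below. It uses only the keys shown in the example above. The certificate and key paths are hypothetical placeholders, not defaults of the service.

```yaml
# server.yaml with TLS enabled: a sketch, not a verified configuration.
insightFinderUrl: "https://stg.insightfinder.com"
port: 4317

tls:
  enabled: true
  # Hypothetical paths; replace them with your own PEM files.
  certificateFile: '/etc/insightfinder-otlp/fullchain.pem'
  privateKeyFile: '/etc/insightfinder-otlp/privkey.pem'

worker:
  processThreads: 1
  streamingThreads: 1
```

With TLS enabled on the service, the collector's exporter can no longer use insecure: true under its tls key, because that setting makes the OTLP gRPC exporter connect without TLS.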


Data Processing
The data.yaml file configures the core ETL process.
It defines the rules for extracting key metadata from the raw data sent over the OTLP protocol:


users:
  # InsightFinder username
  maoyuwang:

    # InsightFinder licenseKey
    licenseKey: "123"

    # Rules to process log data.
    log:
      extraction:

        # The rule to extract projectName from the data
        projectFrom:
          - source: "body"
            field: "_source.project"
            regex: '.*'

        # (Optional)
        # The rule to extract systemName from the data
        systemFrom:
          - source: "body"
            field: "_source.datacenter"
            regex: '.*'

        # The rule to extract instanceName from the data
        instanceFrom:
          - source: "body"
            field: "_source.hostname"
            regex: '.*'

        # (Optional)
        # The rule to extract component from the data
        componentFrom:
          - source: "body"
            field: ''
            regex: '.*'

        # The rule to extract timestamp from the data
        timestampFrom:
          - source: "body"
            field: "_source.timestamp"

    # Rules to process trace data.
    trace:
      extraction:

        # The rule to extract projectName from the data
        projectFrom:
          - source: "body"
            field: "_source.dataset"
            regex: '.*'

        # (Optional)
        # The rule to extract systemName from the data
        systemFrom:
          - source: "body"
            field: "_source.datacenter"
            regex: ''

        # The rule to extract instanceName from the data
        instanceFrom:
          - source: "body"
            field: "spanAttributes.net.host.name"
            regex: ".*"

        # (Optional)
        # The rule to extract component from the data
        componentFrom:
          - source: "body"
            field: "traceAttributes.service.name"
            regex: ".*"


Each rule entry starts with a source field that defines where in the raw data to search.

Searching Rules

The supported sources and their effects are listed below:

Source: body
  Description: Searches the data body. If the data is JSON, it searches the specified field.
  Example:
    source: "body"
    field: "_source.hostname"
    regex: '.*'
  Use with: field, regex
  Supported metadata: projectFrom, systemFrom, instanceFrom, componentFrom, timestampFrom

Source: header
  Description: Searches the headers configured in OpenTelemetry.
  Example:
    source: "header"
    field: "myheader"
    regex: '.*'
  Use with: field, regex
  Supported metadata: projectFrom, systemFrom, instanceFrom, componentFrom

Source: static
  Description: Uses the content of the value field as the result.
  Example:
    source: "static"
    value: "elastic-log-project"
  Use with: value
  Supported metadata: projectFrom, systemFrom, instanceFrom, componentFrom

Source: sender
  Description: Uses the time the data was received as the timestamp.
  Example:
    source: "sender"
  Use with: (none)
  Supported metadata: timestampFrom
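
As an illustration of the header source, the sketch below pairs a custom header on the collector's exporter with a matching rule in data.yaml. The header name ifproject and its value are hypothetical and chosen only for this example. The sketch also assumes that entries under the exporter's headers key are what the service searches when source is "header".

```yaml
# File: config.yaml (collector, exporter section only)
exporters:
  otlp/insightfinder:
    endpoint: otlp.insightfinder.com:4317
    tls:
      insecure: true
    headers:
      ifuser: "userName"
      iflicenseKey: "123445"
      ifproject: "checkout-logs"   # hypothetical custom header

---
# File: data.yaml (log rules for one user, other rules omitted)
users:
  userName:
    licenseKey: "123445"
    log:
      extraction:
        projectFrom:
          - source: "header"
            field: "ifproject"     # assumed to match the header name set above
            regex: '.*'
```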

A single metadata rule can also list multiple source entries.

Multi-source Searching

You can define a list of rules for one metadata field. The service tries each rule in order and uses the first match as the result.

The example below resolves the value used as instanceName in InsightFinder:

  1. First, it tries to use _source.hostname in the message body.
  2. If that field does not exist, it extracts an IP address from _source.message using a regex pattern.
  3. If neither rule yields an instance name, it uses unknownInstance as the instanceName for this data entry.

instanceFrom:

  - source: "body"
    field: "_source.hostname"
    regex: '.*'

  - source: "body"
    field: "_source.message"
    regex: '(?:[0-9]{1,3}\.){3}[0-9]{1,3}'

  - source: "static"
    value: "unknownInstance"
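
The same first-match pattern could plausibly apply to other metadata fields. The sketch below assumes that timestampFrom accepts a list of rules in the same way as instanceFrom; the examples in this document only show it with a single rule. It prefers a timestamp from the body and falls back to the receive time.

```yaml
# Sketch: timestamp fallback, assuming timestampFrom accepts a rule list.
timestampFrom:

  # 1. Prefer the timestamp carried in the data body.
  - source: "body"
    field: "_source.timestamp"

  # 2. Otherwise use the time the data was received.
  - source: "sender"
```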