A self-hosted, open-source analysis engine that ingests OpenTelemetry traces from distributed Go services and automatically identifies where latency occurs, and why.
Modern distributed systems rely heavily on OpenTelemetry-based tracing to understand request latency across services. Existing open-source tracing tools such as Jaeger and Grafana Tempo are excellent at answering:
Which service or span is slow?
However, they fundamentally fail to answer the more important and actionable question:
Why was this span slow?
A single span may represent tens or hundreds of milliseconds, but current tools treat it as a black box. They do not explain whether the delay was caused by:
Kernel-level I/O blocking
Network handshake overhead (DNS/TCP/TLS)
Garbage collection pauses
Goroutine scheduling delays
Actual application logic
As a result, engineers are forced to guess, add ad-hoc logs, or use heavyweight profilers that cannot be correlated back to individual requests.
The goal of this project is to build a Free and Open Source Distributed Latency Breakdown Tool that:
Decomposes a single OpenTelemetry span into its true runtime latency components
Instead of only reporting how long a span took, the tool explains where the time went inside the span, in real time.
This project introduces a custom Go-based OpenTelemetry Collector that acts as an intelligent pre-processor for traces.
The collector:
Receives standard OTLP traces
Collects low-level runtime signals from the host
Correlates these signals with active spans
Produces an augmented span with a detailed latency breakdown
This approach preserves OpenTelemetry compatibility while extending it with deep runtime visibility.
Jaeger and Tempo:
Show span duration
Show service dependencies
Do not explain span internals

This tool:
Explains what happened inside a span
Attributes latency to concrete runtime causes
Works without proprietary agents or SaaS services
Uses only open-source technologies
In short:
Jaeger shows the symptom.
This tool shows the cause.
Instrumented Go Services
(OpenTelemetry SDK)
|
| OTLP Traces
v
Custom Go Collector ← Core Innovation
├─ OTLP Receiver
├─ Span Lifecycle Tracker
├─ eBPF Syscall Tracing
├─ Go Runtime Metrics (GC, Scheduler)
├─ Network Timing Attribution
├─ Span ↔ Runtime Correlation Engine
└─ Latency Decomposition Engine
|
v
JSON API / CLI / Minimal UI
A sample Go microservice is instrumented using the OpenTelemetry Go SDK:
Incoming HTTP requests create root spans
Outgoing HTTP calls create child spans
Context propagation ensures trace continuity
The application uses net/http/httptrace to capture:
DNS resolution time
TCP connection time
TLS handshake time
These values are attached to spans as attributes.
This collector is the heart of the system. Its responsibilities:
Receive OTLP spans
Track active spans in memory
Collect host-level runtime events
Correlate events with spans
Decompose span duration into components
Expose enriched trace data via an API
Unlike standard collectors, this collector understands time, not just telemetry formats.
Using eBPF, the collector measures:
Duration of blocking syscalls (read, write, connect, etc.)
Kernel-level waiting time invisible to application code
This allows accurate attribution of:
Disk I/O delays
Network blocking
Context switching overhead
The collector reads:
Garbage collection pause durations
Goroutine scheduling behavior
These signals explain latency caused by:
Memory pressure
Concurrency contention
By integrating httptrace, the tool breaks down network delays into:
DNS
TCP
TLS
The key technical challenge is correlating low-level runtime events with high-level spans.
A runtime event is attributed to a span if:
The process ID matches
The event’s timestamp overlaps the span’s lifetime
This simple, deterministic rule enables accurate attribution without invasive instrumentation.
Each span’s total duration is decomposed as:
Span Duration =
DNS +
TCP +
TLS +
Syscall Time +
GC Pause +
Scheduler Delay +
Application Logic (Residual)
The residual represents actual business logic time and ensures correctness even when some signals are missing.
For a slow request, the tool produces a clear breakdown:
/checkout (92ms)
├─ DNS lookup: 14ms
├─ TCP connect: 11ms
├─ TLS handshake: 6ms
├─ Kernel syscalls: 21ms
├─ GC pause: 9ms
└─ App logic: 31ms
This output can be viewed via:
JSON API
CLI tool
Minimal web UI
Typical use cases include:
Debugging tail latency (p95 / p99)
Identifying kernel vs application bottlenecks
Understanding GC-induced latency spikes
Diagnosing network-related slowdowns
Performance tuning of Go microservices
Latency is not just a performance metric — it is a reliability and user experience issue.
By exposing the real causes of latency:
Engineers debug faster
Optimizations are targeted
Guesswork is eliminated
This project bridges the gap between:
Distributed tracing
Low-level system profiling
All while remaining fully open-source and vendor-neutral.
Planned future work:
Cross-node correlation
Tail-latency anomaly detection
Integration with Jaeger or Tempo
Support for additional languages
Advanced scheduling and lock contention analysis