Observability (OTel, Loki, Prometheus)
Inject trace context, assert on spans in Tempo, query logs in Loki, and validate metrics in Prometheus.
TestMesh ships three built-in observability plugins covering the full LGTM stack. Together they let you write flows that verify not just API responses, but the traces, logs, and metrics your services emit.
How it fits together
Flow step (otel.inject)
  → sets traceparent header
  → http_request carries header to your service
  → service emits span to OTel Collector → Tempo
  → otel.assert queries Tempo by trace ID

Flow action triggers your service
  → service logs via zap/OTLP → Loki
  → loki.assert queries Loki with LogQL

Metric counter increments in your service
  → prometheus.query captures baseline
  → ... actions run ...
  → prometheus.assert verifies counter increased

otel.inject
Creates an OTel span and injects W3C traceparent / tracestate into the flow context. Pass the output as a header in subsequent HTTP steps.
- id: start_trace
  action: otel.inject
  config:
    service_name: testmesh-flow   # optional
    span_name: create-order       # optional, defaults to step id
  output:
    traceparent: $.traceparent
    trace_id: $.trace_id
- id: create_order
  action: http_request
  config:
    method: POST
    url: http://order-service:5003/orders
    headers:
      traceparent: "{{traceparent}}"
    body: { user_id: 1, product_id: 2, quantity: 1 }

Output: { "traceparent": "00-...", "tracestate": "", "trace_id": "...", "span_id": "..." }
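To make the header value concrete, here is a minimal Python sketch of the W3C `traceparent` format (version `00`, a 32-hex-digit trace ID, a 16-hex-digit span ID, and a 2-hex-digit flags field). The helper names `new_traceparent` and `parse_traceparent` are hypothetical and are not part of TestMesh; the sketch only illustrates the kind of value `otel.inject` is described as producing.

```python
import re
import secrets

# W3C Trace Context, version 00: "00-<trace-id>-<parent-id>-<trace-flags>".
TRACEPARENT_RE = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def new_traceparent(sampled: bool = True) -> str:
    """Build a traceparent header with random IDs (hypothetical helper)."""
    trace_id = secrets.token_hex(16)  # 16 random bytes -> 32 hex digits
    span_id = secrets.token_hex(8)    # 8 random bytes -> 16 hex digits
    flags = "01" if sampled else "00"  # lowest bit is the "sampled" flag
    return f"00-{trace_id}-{span_id}-{flags}"


def parse_traceparent(header: str) -> dict:
    """Split a version-00 traceparent header into its fields."""
    match = TRACEPARENT_RE.match(header)
    if match is None:
        raise ValueError(f"not a version-00 traceparent: {header!r}")
    trace_id, span_id, flags = match.groups()
    # The spec treats all-zero trace and span IDs as invalid.
    if set(trace_id) == {"0"} or set(span_id) == {"0"}:
        raise ValueError("all-zero trace_id or span_id is invalid")
    return {
        "trace_id": trace_id,
        "span_id": span_id,
        "sampled": int(flags, 16) & 1 == 1,
    }
```

The `trace_id` field extracted here corresponds to the value that a later step would use to look the trace up in Tempo.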
otel.assert
Queries Grafana Tempo for spans by trace ID (or service + operation) and asserts on them.
- id: verify_trace
  action: otel.assert
  config:
    backend_url: http://tempo:3200   # required
    trace_id: "{{trace_id}}"         # required (or use service + operation)
    within: 10s                      # poll until spans appear or timeout
  assert:
    - "len(spans) > 0"
    - "spans[0].duration_ms < 500"
    - "spans[0].status == 'ok'"

Output: { "spans": [{ "trace_id": "...", "service": "order-service", "operation": "POST /orders", "duration_ms": 42, "status": "ok", "attributes": {...} }] }
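Spans usually reach the tracing backend a little after the request completes, which is why a `within` window is useful. The following Python sketch shows one way such poll-until-timeout behavior could work. It is an illustration under assumptions, not TestMesh's implementation: `poll_until` is a hypothetical helper, and `fetch` stands in for whatever call retrieves the spans.

```python
import time
from typing import Any, Callable


def poll_until(
    fetch: Callable[[], Any],
    predicate: Callable[[Any], bool],
    timeout_s: float,
    interval_s: float = 0.5,
) -> Any:
    """Call fetch() repeatedly until predicate(result) is true or time runs out.

    fetch is always called at least once, even when timeout_s is 0.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        result = fetch()
        if predicate(result):
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout_s}s")
        time.sleep(interval_s)
```

A caller would pass a function that queries the backend for the trace and a predicate such as `lambda spans: len(spans) > 0`. The same pattern applies to any assertion that has to wait for asynchronous telemetry to arrive.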
loki.query
Queries Grafana Loki using LogQL.
- id: fetch_logs
  action: loki.query
  config:
    url: http://loki:3100   # required
    query: '{service="order-service"} |= "created order"'   # required (LogQL)
    start: "-5m"            # optional, relative or RFC3339
    end: "now"              # optional
    limit: 100              # optional, default 100
  output:
    log_count: $.count

Output: { "lines": ["2026-03-30T10:00:00Z order-service created order id=abc"], "count": 1 }
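For readers who want to see what a LogQL range query looks like at the HTTP level, the sketch below builds a request URL for Loki's `/loki/api/v1/query_range` endpoint and flattens a stream-type response into the `lines` / `count` shape shown above. The function names are hypothetical, and the mapping from Loki's response to that output shape is an assumption rather than TestMesh's documented behavior.

```python
from urllib.parse import urlencode


def build_query_range_url(
    base_url: str, query: str, start_ns: int, end_ns: int, limit: int = 100
) -> str:
    """Build a Loki query_range URL; start and end are Unix nanoseconds."""
    params = urlencode(
        {"query": query, "start": start_ns, "end": end_ns, "limit": limit}
    )
    return f"{base_url.rstrip('/')}/loki/api/v1/query_range?{params}"


def extract_lines(response: dict) -> dict:
    """Flatten a Loki 'streams' response into {"lines": [...], "count": n}."""
    lines = []
    for stream in response.get("data", {}).get("result", []):
        # Each entry is [timestamp_ns_as_string, log_line, ...].
        for entry in stream.get("values", []):
            lines.append(entry[1])
    return {"lines": lines, "count": len(lines)}
```

Resolving a relative value such as `-5m` into a nanosecond timestamp is left out here, since the exact rules TestMesh applies are not stated in this page.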
loki.assert
Queries Loki and asserts on the results, with optional polling.
- id: verify_log
  action: loki.assert
  config:
    url: http://loki:3100
    query: '{service="order-service"} |= "created order"'
    start: "-30s"
    within: 5s   # poll until assertion passes or timeout
  assert:
    - "count > 0"

prometheus.query
Runs a PromQL instant query against Prometheus.
- id: baseline
  action: prometheus.query
  config:
    url: http://prometheus:9090   # required
    query: 'http_requests_total{service="order-service",status="201"}'
  output:
    baseline: $.value

Output: { "value": 42.0, "metric": { "service": "order-service", "status": "201" } }
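An instant query against Prometheus's `/api/v1/query` endpoint returns a vector of samples, each carrying its labels and a `[timestamp, "value"]` pair with the value encoded as a string. The Python sketch below reduces such a response to the `value` / `metric` shape shown above. `extract_instant_value` is a hypothetical name, and treating anything other than exactly one series as an error is an assumption made for this example, not a statement about how TestMesh handles multi-series results.

```python
def extract_instant_value(response: dict) -> dict:
    """Reduce a Prometheus instant-query response to one value plus labels."""
    if response.get("status") != "success":
        raise ValueError(f"query failed: {response.get('error', 'unknown error')}")
    data = response["data"]
    if data["resultType"] != "vector":
        raise ValueError(f"expected a vector result, got {data['resultType']!r}")
    result = data["result"]
    # Assumption for this sketch: the query must match exactly one series.
    if len(result) != 1:
        raise ValueError(f"expected exactly one series, got {len(result)}")
    sample = result[0]
    _timestamp, raw_value = sample["value"]  # the value arrives as a string
    return {"value": float(raw_value), "metric": sample["metric"]}
```

Storing the returned `value` as a baseline before running the actions under test lets a later check compare against it, which avoids depending on the counter's absolute value.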
prometheus.assert
Runs PromQL and asserts on the result, with optional polling until the assertion passes.
- id: verify_counter
  action: prometheus.assert
  config:
    url: http://prometheus:9090
    query: 'http_requests_total{service="order-service",status="201"}'
    within: 15s
  assert:
    - "value >= baseline + 3"

The full LGTM stack (OTel Collector, Tempo, Loki, Prometheus, Grafana) is included in the local dev setup. Start everything with ./infra.sh up. Grafana is available at http://localhost:3002 with Tempo, Loki, and Prometheus pre-configured as datasources.