Observability
Full execution visibility with per-step timing, request/response inspection, real-time log streaming, and dashboard trend analysis.
TestMesh gives you complete visibility into every test execution. When a test fails, you see exactly what request was sent, what response came back, which assertion failed, and what the variable state was at the time of failure.
Execution Dashboard
The dashboard shows all executions with real-time status updates via WebSocket:
Executions
checkout-flow 2m ago
5/5 steps • 2.3s • staging-agent
payment-flow 5m ago
3/7 steps • Failed at "charge_card" • prod-agent
HTTP 402: Payment Required
user-registration-flow Running
2/5 steps • 12.5s elapsed
[████████░░░░░░░░░░] 40%
- Real-time status (running, success, failed)
- Progress indicator during execution
- Error summary visible without clicking through
- Filter by status, flow, agent, or date range
Step-Level Detail
Click any execution to see a full timeline. Each step is expandable to show complete request, response, timing, and variable state.
Expanded Failed Step
Step 3: charge_card 2.1s FAILED
POST /payments/charge
→ 402 Payment Required
Error Details
─────────────
Status: 402 Payment Required
Message: Insufficient funds
Assertion Failed:
Expected: status == 200
Actual: status == 402
Request
─────────────
POST https://api.company.com/payments/charge
Headers:
  Content-Type: application/json
  Authorization: Bearer sk_test_***abc123
  X-Request-ID: req_abc123xyz
Body:
  {
    "amount": 9999,
    "currency": "usd",
    "source": "card_123",
    "customer": "cus_456"
  }
Response
─────────────
Status: 402 Payment Required
Headers:
  Content-Type: application/json
  X-Request-ID: req_abc123xyz
Body:
  {
    "error": {
      "type": "insufficient_funds",
      "message": "Insufficient funds",
      "code": "insufficient_funds"
    }
  }
Timing Breakdown
─────────────────
DNS Lookup: 12ms
TCP Connect: 45ms
TLS Handshake: 89ms
Request Send: 3ms
Wait (TTFB): 1.2s
Response Download: 5ms
Total: 2.1s
Variables at this point
─────────────────
cart_id: "cart_123"
total_amount: 9999
customer_id: "cus_456"
card_token: "card_123"
You can copy the request as cURL from any failed step to reproduce the issue in your terminal instantly.
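As a rough illustration of what a copied command contains, the sketch below renders the failed request from the panel above as a cURL command line. The helper name `to_curl` is hypothetical and the exact text TestMesh produces may differ. The masked `Authorization` header is left out because the real token would have to be supplied by hand.

```python
import json
import shlex

def to_curl(method, url, headers, body=None):
    """Render an HTTP request as a single-line cURL command (illustrative only)."""
    parts = ["curl", "-X", method, shlex.quote(url)]
    for name, value in headers.items():
        # shlex.quote protects spaces and quotes from the shell.
        parts += ["-H", shlex.quote(f"{name}: {value}")]
    if body is not None:
        parts += ["--data", shlex.quote(json.dumps(body))]
    return " ".join(parts)

command = to_curl(
    "POST",
    "https://api.company.com/payments/charge",
    {"Content-Type": "application/json", "X-Request-ID": "req_abc123xyz"},
    {"amount": 9999, "currency": "usd", "source": "card_123", "customer": "cus_456"},
)
print(command)
```

Quoting every argument with `shlex.quote` keeps the JSON body intact when the command is pasted into a POSIX shell.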
Execution Tabs
Each execution has dedicated tabs for different views:
Timeline
Step-by-step waterfall with expandable request/response details
Logs
Structured log stream with level filtering and search
Variables
All context variables and their values at each step
Network
All HTTP requests made during the execution
Metrics
Timing breakdown and performance data
Log Streaming
The Logs tab streams structured output in real-time with millisecond timestamps:
[14:23:45.123] [INFO] Execution started
[14:23:45.124] [INFO] Agent: prod-agent
[14:23:45.125] [INFO] Flow: payment-flow (v1.2.0)
[14:23:45.126] [DEBUG] Loading environment variables
[14:23:45.127] [DEBUG] API_URL=https://api.company.com
[14:23:45.130] [INFO] Setup
[14:23:45.200] [INFO] Setup completed in 0.2s
[14:23:45.202] [INFO] Step 1: create_cart
[14:23:45.205] [DEBUG] POST https://api.company.com/cart
[14:23:45.512] [INFO] → 201 Created (0.31s)
[14:23:45.513] [DEBUG] Saved: cart_id = "cart_123"
[14:23:45.514] [INFO] Step 2: add_items
[14:23:46.014] [INFO] → 200 OK (0.50s)
[14:23:46.015] [INFO] Step 3: charge_card
[14:23:46.016] [DEBUG] POST https://api.company.com/payments/charge
[14:23:48.116] [ERROR] → 402 Payment Required (2.10s)
[14:23:48.117] [ERROR] Assertion failed: status == 200
[14:23:48.118] [ERROR] Expected: 200, Actual: 402
[14:23:48.119] [ERROR] Execution failed at step: charge_card
- Filter by level: Debug, Info, Warn, Error
- Search across all log entries
- Toggle auto-scroll to keep up with live output
- Download full log as plain text
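A downloaded log can be filtered offline in the same spirit. The sketch below parses lines in the `[timestamp] [LEVEL] message` layout shown above and applies a level filter plus a text search. It assumes the level filter acts as a minimum severity, which is a guess about the UI's behavior; `parse` and `filter_entries` are hypothetical helper names.

```python
import re
from typing import NamedTuple

LINE_RE = re.compile(r"^\[(\d{2}:\d{2}:\d{2}\.\d{3})\] \[(DEBUG|INFO|WARN|ERROR)\] (.*)$")
LEVELS = {"DEBUG": 0, "INFO": 1, "WARN": 2, "ERROR": 3}

class LogEntry(NamedTuple):
    timestamp: str
    level: str
    message: str

def parse(lines):
    """Turn raw log lines into LogEntry tuples, skipping lines that do not match."""
    entries = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            entries.append(LogEntry(*match.groups()))
    return entries

def filter_entries(entries, min_level="DEBUG", search=""):
    """Keep entries at or above min_level whose message contains the search text."""
    threshold = LEVELS[min_level]
    needle = search.lower()
    return [
        e for e in entries
        if LEVELS[e.level] >= threshold and needle in e.message.lower()
    ]

sample = [
    "[14:23:46.015] [INFO] Step 3: charge_card",
    "[14:23:46.016] [DEBUG] POST https://api.company.com/payments/charge",
    "[14:23:48.116] [ERROR] → 402 Payment Required (2.10s)",
    "[14:23:48.117] [ERROR] Assertion failed: status == 200",
]
entries = parse(sample)
errors = filter_entries(entries, min_level="ERROR")
matches = filter_entries(entries, search="charge")
for entry in errors:
    print(entry.timestamp, entry.message)
```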
Debug Mode
Debug mode enables interactive step-by-step execution from the CLI:
testmesh debug my-flow.yaml
Debug Mode — my-flow.yaml
> Step 1: create_cart
Action: http_request
POST https://api.company.com/cart
[n]ext [s]kip [v]ariables [b]reakpoint [q]uit
> n
Response: 201 Created
Saved: cart_id = "cart_123"
> Step 2: add_items
[n]ext [s]kip [v]ariables [b]reakpoint [q]uit
> v
Variables:
cart_id: "cart_123"
customer_id: "cus_456"
API_URL: "https://api.company.com"
Debug mode lets you:
- Step through one step at a time
- Inspect all variables at any point
- Set breakpoints on specific steps
- Skip steps to isolate failures
Variables Tab
Inspect the full variable context at any point in the execution — what was saved from previous steps, what was available when a step ran, and what changed after it completed.
Variables
[After Step 2]
cart_id: "cart_123"
item_count: 3
customer_id: "cus_456"
total_amount: 9999
API_URL: "https://api.company.com"
AUTH_TOKEN: "Bearer sk_test_***" (masked)
Sensitive values (tokens, passwords, API keys) are automatically masked in the UI.
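The rules TestMesh uses to decide what counts as sensitive are not described here. The sketch below shows one plausible approach, assuming a variable is treated as sensitive when its name contains a word such as "token" or "password". The pattern, the prefix length, and the helper names are all hypothetical.

```python
import re

# Assumption: sensitivity is decided by the variable name, not by its value.
SENSITIVE_NAME = re.compile(r"token|password|secret|api_?key", re.IGNORECASE)

def mask_value(value, visible=4):
    """Keep a short prefix of long values; hide short values completely."""
    text = str(value)
    if len(text) > 2 * visible:
        return text[:visible] + "***"
    return "***"

def mask_variables(variables):
    """Return a copy of the variables with sensitive values masked."""
    return {
        name: mask_value(value) if SENSITIVE_NAME.search(name) else value
        for name, value in variables.items()
    }

variables = {
    "cart_id": "cart_123",
    "item_count": 3,
    "AUTH_TOKEN": "Bearer sk_test_example",  # made-up value
    "db_password": "hunter2",                # made-up value
}
masked = mask_variables(variables)
for name, value in masked.items():
    print(f"{name}: {value}")
```

Masking a copy rather than the original dictionary keeps the real values available to later steps while only the displayed view is redacted.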
Metrics
The Metrics tab shows timing data for the full execution and per-step breakdowns:
Execution Metrics
Total Duration: 3.2s
Setup: 0.2s
Steps: 2.9s
Teardown: 0.1s
Slowest Steps:
charge_card 2.1s (66% of total)
add_items 0.5s (16% of total)
create_cart 0.3s (9% of total)
setup 0.2s (6% of total)
teardown 0.1s (3% of total)
Historical Trends
The dashboard tracks pass rates, duration trends, and flaky test detection over time.
Pass Rate Over Time
View test stability across the last 30 days. Dips in the chart can often be traced to specific deployments or infrastructure issues.
Duration Trends
Track whether a flow is getting slower over time. Sudden spikes usually point to performance regressions or infrastructure problems.
Flaky Test Detection
Tests that pass and fail inconsistently are flagged automatically:
Flaky Tests (Last 50 Runs)
Test: Update User Email
Flakiness: 24%
Pass rate: 76% (38/50)
Common errors:
- Timeout waiting for response (8 times)
- Status code 500 (3 times)
- Connection refused (1 time)
Real-Time WebSocket Updates
The dashboard connects to the TestMesh API via WebSocket and receives live updates as executions run. You don't need to refresh the page — step status, log entries, and variable values appear in real time as each step completes.
WebSocket updates are scoped per execution. Opening the execution detail page for a running flow will show live progress regardless of when the execution started.
API Endpoint Coverage
The dashboard also tracks which API endpoints have been exercised by your flows:
API Endpoint Coverage
/users POST tested 2m ago
/users/:id GET tested 2m ago
/users/:id PUT tested 2m ago
/users/:id DELETE tested 2m ago
/orders POST never tested
/orders/:id GET never tested
Covered: 57% (4/7 endpoints)
Use this view to find gaps in your test coverage, then use the AI integration feature to auto-generate tests for the missing endpoints.
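The coverage figure is a ratio of exercised endpoints to known endpoints. The sketch below computes it for a small subset of the routes in the panel, so its result is not meant to match the panel's figure. The function name and data layout are assumptions and do not reflect how TestMesh stores this information.

```python
def coverage(endpoints, tested):
    """Return (percent covered, list of untested endpoints)."""
    missing = [e for e in endpoints if e not in tested]
    covered = len(endpoints) - len(missing)
    percent = round(100 * covered / len(endpoints)) if endpoints else 0
    return percent, missing

# Hypothetical API surface as (method, path) pairs.
endpoints = [
    ("POST", "/users"),
    ("GET", "/users/:id"),
    ("POST", "/orders"),
    ("GET", "/orders/:id"),
]
# Endpoints that at least one flow has called.
tested = {("POST", "/users"), ("GET", "/users/:id")}

percent, missing = coverage(endpoints, tested)
print(f"Covered: {percent}% ({len(endpoints) - len(missing)}/{len(endpoints)} endpoints)")
for method, path in missing:
    print(f"{path} {method} never tested")
```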
Asserting on Service Observability
Beyond TestMesh's own execution visibility, you can use the built-in LGTM plugins to assert on the traces, logs, and metrics that your services emit. This closes the loop: not just "did the API return 201?" but "did the service emit the right span, write the right log line, and increment the right counter?"
Trace propagation
otel.inject creates a span and returns W3C traceparent / tracestate headers. Pass these to your service — any OTel-instrumented service will attach its own spans as children of the injected trace. Then otel.assert queries Tempo to verify the full trace.
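For reference, a W3C `traceparent` header has four dash-separated hex fields: version, a 32-character trace ID, a 16-character parent span ID, and trace flags. The sketch below generates and parses a header in that format. It illustrates the header layout only and is not how `otel.inject` is implemented; the function names are hypothetical.

```python
import re
import secrets

# version - trace-id (16 bytes) - parent-id (8 bytes) - flags, all lowercase hex.
TRACEPARENT_RE = re.compile(r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")

def new_traceparent(sampled=True):
    """Build a version-00 traceparent value with random trace and span IDs."""
    trace_id = secrets.token_hex(16)
    span_id = secrets.token_hex(8)
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{span_id}-{flags}"

def parse_traceparent(header):
    """Split a traceparent value into its fields, rejecting malformed input."""
    match = TRACEPARENT_RE.match(header)
    if not match:
        raise ValueError(f"malformed traceparent: {header!r}")
    version, trace_id, parent_id, flags = match.groups()
    # All-zero IDs are invalid in the W3C Trace Context format.
    if trace_id == "0" * 32 or parent_id == "0" * 16:
        raise ValueError("trace_id and parent_id must not be all zeros")
    return {
        "version": version,
        "trace_id": trace_id,
        "parent_id": parent_id,
        "sampled": bool(int(flags, 16) & 0x01),
    }

header = new_traceparent()
fields = parse_traceparent(header)
print(header)
print(fields["trace_id"])
```

A downstream service that honors the header reuses `trace_id` and records `parent_id` as the parent of its own span, which is what lets a later query by trace ID return the service's spans.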
steps:
  - id: inject
    action: otel.inject
    config:
      span_name: place-order
    output:
      traceparent: $.traceparent
      trace_id: $.trace_id
  - id: place_order
    action: http_request
    config:
      method: POST
      url: http://order-service/orders
      headers:
        traceparent: "{{traceparent}}"
      body: { user_id: 1, product_id: 2, quantity: 1 }
  - id: verify_span
    action: otel.assert
    config:
      backend_url: http://tempo:3200
      trace_id: "{{trace_id}}"
      within: 10s
    assert:
      - "len(spans) > 0"
      - "spans[0].duration_ms < 500"
Log assertion
loki.assert queries Loki with LogQL and asserts that specific log lines appeared — with optional polling until they arrive.
- id: verify_log
  action: loki.assert
  config:
    url: http://loki:3100
    query: '{service="order-service"} |= "created order"'
    start: "-30s"
    within: 5s
  assert:
    - "count > 0"
Metrics assertion
prometheus.assert runs PromQL and asserts on the result. Capture a baseline before your action, then assert the delta after.
- id: baseline
  action: prometheus.query
  config:
    url: http://prometheus:9090
    query: 'http_requests_total{service="order-service",status="201"}'
  output:
    baseline: $.value
# ... run some orders ...
- id: verify
  action: prometheus.assert
  config:
    url: http://prometheus:9090
    query: 'http_requests_total{service="order-service",status="201"}'
    within: 15s
  assert:
    - "value >= baseline + 3"
See the Observability Actions reference for full config options and the observability examples for complete flows.
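The `within` setting matters because telemetry arrives with a delay: a counter may not reflect the latest requests on the first query. The sketch below shows the general poll-until-deadline pattern behind a baseline-and-delta check, assuming `within` means "retry until this much time has passed". The function `assert_within` and the canned readings are hypothetical stand-ins for a real metrics query.

```python
import time

def assert_within(read_value, predicate, within_s, interval_s=0.5):
    """Poll read_value() until predicate(value) holds or within_s seconds elapse."""
    deadline = time.monotonic() + within_s
    while True:
        value = read_value()
        if predicate(value):
            return value
        if time.monotonic() >= deadline:
            raise AssertionError(
                f"condition not met within {within_s}s; last value: {value}"
            )
        time.sleep(interval_s)

# Stand-in for successive reads of a request counter.
readings = iter([10, 11, 12, 13])

baseline = next(readings)  # captured before the action under test
final = assert_within(
    lambda: next(readings),
    lambda value: value >= baseline + 3,
    within_s=5,
    interval_s=0.01,
)
print(f"baseline={baseline} final={final}")
```

Comparing against a captured baseline rather than an absolute number keeps the check valid when other traffic has already incremented the counter.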