CLEO Metrics Value Proof Specification

Version: 0.1.0
Status: DRAFT
Created: 2026-02-01
Epic: T2833

Problem Statement

CLEO claims to save context tokens and prevent hallucinations, but there is currently no mechanism to prove either claim:

Token consumption: All metrics show 0 because no data source feeds them
Manifest savings: In theory, manifest reads save tokens, but nothing measures the savings
Hallucination prevention: Validators exist, but there is no before/after comparison
Skill composition: Only a single skill is loaded, so progressive loading cannot be measured

Goals

Measure actual token usage - Before and after CLEO
Prove manifest efficiency - Full file vs manifest-only reads
Track validation impact - Violations caught, fixes applied
Enable skill composition - Multiple skills with progressive disclosure

Part 1: Token Consumption Tracking

The Solution: OpenTelemetry Integration

Claude Code already reports actual token counts through its OpenTelemetry metrics:

claude_code.token.usage (tokens)
├── type: "input" | "output" | "cacheRead" | "cacheCreation"
└── model: "claude-sonnet-4-5-20250929" etc.

Available data per API request:

input_tokens - Actual input tokens consumed
output_tokens - Actual output tokens generated
cache_read_tokens - Tokens read from cache
cache_creation_tokens - Tokens used to create cache
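
To make the data model above concrete, here is a minimal sketch that totals token counts per `type`. It assumes the telemetry has already been flattened into plain `<type> <count>` lines, which is a hypothetical intermediate format and not the exporter's native output. The name `sum_tokens_by_type` and the sample numbers are illustrative only.

```shell
# Sum token counts per type from lines of the form "<type> <count>".
# The input format is an assumption for illustration; real exporter output
# would need to be converted into this shape first.
sum_tokens_by_type() {
    awk '{ totals[$1] += $2 }
         END { for (t in totals) print t, totals[t] }' | sort
}

# Hypothetical samples covering two API requests.
printf '%s\n' \
    'input 1200' \
    'output 300' \
    'cacheRead 5000' \
    'input 800' \
    'output 150' | sum_tokens_by_type
```

Keeping the four types separate matters because cached reads and fresh input are different quantities, so a single grand total would hide where any savings come from.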

How to Enable Telemetry

Option 1: Console Export (development)

export CLAUDE_CODE_ENABLE_TELEMETRY=1
export OTEL_METRICS_EXPORTER=console
export OTEL_METRIC_EXPORT_INTERVAL=5000

Option 2: File Export (CLEO integration)

export CLAUDE_CODE_ENABLE_TELEMETRY=1
export OTEL_METRICS_EXPORTER=otlp
export OTEL_EXPORTER_OTLP_PROTOCOL=http/json
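# Unverified: OTLP exporters normally expect an http(s):// endpoint, so this
# file:// target may need a local collector that writes to .cleo/metrics/otel/
export OTEL_EXPORTER_OTLP_ENDPOINT=file://.cleo/metrics/otel/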
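
Before starting a measured session it can help to confirm that the variables are actually set. The following is a minimal sketch of such a check; `check_telemetry_env` is a hypothetical helper name, not an existing CLEO command, and it only tests the two variables common to both options.

```shell
# Report which telemetry variables are missing from the environment.
# check_telemetry_env is a hypothetical helper, not part of CLEO.
check_telemetry_env() {
    local missing=0 name value
    for name in CLAUDE_CODE_ENABLE_TELEMETRY OTEL_METRICS_EXPORTER; do
        # Indirect lookup of the variable named in $name (names are fixed above).
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage: configure console export, then verify.
export CLAUDE_CODE_ENABLE_TELEMETRY=1
export OTEL_METRICS_EXPORTER=console
check_telemetry_env && echo "telemetry configured"
```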

Part 2: Manifest Token Savings

Hypothesis

Reading manifest summaries instead of full agent output files saves significant tokens.

Measurement Approach

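# Track manifest reads.
# estimate_tokens and log_token_event are assumed to be defined elsewhere.
track_manifest_read() {
    local entry_id="$1"
    local entry_content="$2"   # manifest entry text (was undefined in the draft)
    local tokens
    tokens=$(estimate_tokens "$entry_content")
    log_token_event "manifest_read" "$tokens" "$entry_id"
}

# Estimate the token cost of reading the full file instead.
estimate_full_file_tokens() {
    local file_path="$1"
    local char_count
    char_count=$(wc -c < "$file_path")   # byte count, used as a proxy for characters
    echo $((char_count / 4))  # ~4 chars per token
}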
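
The snippet above calls `estimate_tokens` and `log_token_event` without defining them. As a rough sketch of what they might look like, assuming the same 4-characters-per-token heuristic and a JSON-lines event log: the function bodies, the `TOKEN_LOG` variable, and its default path are all assumptions, not CLEO's actual implementation.

```shell
# Hypothetical helper implementations. The names mirror the calls above,
# but the bodies are assumptions for illustration.

# Estimate tokens in a string with the ~4 chars per token heuristic.
estimate_tokens() {
    local text="$1"
    echo $(( ${#text} / 4 ))
}

# Append one JSON line per event. TOKEN_LOG is an assumed location.
TOKEN_LOG="${TOKEN_LOG:-./token-events.jsonl}"
log_token_event() {
    local event="$1" tokens="$2" entry_id="$3"
    printf '{"event":"%s","tokens":%s,"entry_id":"%s"}\n' \
        "$event" "$tokens" "$entry_id" >> "$TOKEN_LOG"
}

# Usage: estimate a short manifest summary.
estimate_tokens "Research summary: 3 findings, 2 open questions"
```

The character-based estimate is only an approximation; where OpenTelemetry data is available, the reported token counts should take precedence.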

Expected Results

Approach	Tokens per Entry	10 Entries
Manifest only	~200	~2,000
Full file	~2,000	~20,000
Savings	~1,800 (90%)	~18,000 (90%)
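
The arithmetic behind the table can be checked with a short sketch. The per-entry figures are the expected values from the table, not measurements, and `savings_percent` is an illustrative name.

```shell
# Percentage of tokens saved by reading the manifest instead of the full file.
# savings_percent is an illustrative name; integer arithmetic rounds down.
savings_percent() {
    local full="$1" manifest="$2"
    echo $(( (full - manifest) * 100 / full ))
}

entries=10
manifest_total=$(( 200 * entries ))   # ~200 tokens per manifest entry
full_total=$(( 2000 * entries ))      # ~2,000 tokens per full file
echo "saved: $(( full_total - manifest_total )) tokens ($(savings_percent "$full_total" "$manifest_total")%)"
# prints "saved: 18000 tokens (90%)"
```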

Part 3: Validation Impact Measurement

Tracking Violations Caught

{
  "timestamp": "2026-02-01T01:23:45Z",
  "source_id": "T1234",
  "validation_result": {
    "passed": false,
    "score": 75,
    "violations": [
      {
        "requirement": "RSCH-001",
        "severity": "error",
        "message": "Research task modified code",
        "fix": "Revert code changes"
      }
    ]
  }
}

Value Demonstration

Violations caught: Count per period
Prevention rate: Violations / Total completions
By protocol: Which protocols catch most issues
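
As a sketch of how the prevention rate could be derived from logged records, the function below assumes one compact JSON record per line (JSONL) in the shape shown earlier. It counts each failed completion once, regardless of how many violations the record lists; that reading of the formula, the file layout, and the name `prevention_rate` are all assumptions.

```shell
# Share of completions (as a whole-number percentage) that failed validation.
# Assumes one JSON record per line; pattern matching stands in for a JSON parser.
prevention_rate() {
    local log_file="$1" total failed
    # grep -c exits non-zero on zero matches, so guard with || true.
    total=$(grep -c '"validation_result"' "$log_file" || true)
    failed=$(grep -c '"passed": *false' "$log_file" || true)
    if [ "$total" -eq 0 ]; then
        echo "0"
        return
    fi
    echo $(( failed * 100 / total ))
}

# Usage with four hypothetical records, one of which failed.
log=$(mktemp)
printf '%s\n' \
    '{"source_id":"T1234","validation_result":{"passed":false,"score":75}}' \
    '{"source_id":"T1235","validation_result":{"passed":true,"score":100}}' \
    '{"source_id":"T1236","validation_result":{"passed":true,"score":100}}' \
    '{"source_id":"T1237","validation_result":{"passed":true,"score":100}}' > "$log"
echo "prevention rate: $(prevention_rate "$log")%"
rm -f "$log"
```

A real implementation would likely use a JSON-aware tool, since the grep patterns here break if records span multiple lines.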

Part 4: A/B Testing Framework

Test Scenarios

Scenario	Description
Baseline	Direct implementation without CLEO
With CLEO	Orchestrator + subagents + manifest

Metrics to Compare

Metric	Baseline	With CLEO	Expected
Total tokens	Higher	Lower	At least 50% lower
Files read	Many	Few	At least 80% fewer
Validation failures	N/A	Caught	More than 0 caught
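
Once both scenarios have been run, the comparison reduces to a percent change checked against the expected threshold. A minimal sketch follows; the totals are made-up placeholders rather than measured results, and `percent_change` is an illustrative name.

```shell
# Percent change from baseline to treatment; negative means a reduction.
# percent_change and the sample totals are illustrative, not measured data.
percent_change() {
    local baseline="$1" treatment="$2"
    echo $(( (treatment - baseline) * 100 / baseline ))
}

baseline_tokens=40000    # hypothetical run without CLEO
cleo_tokens=18000        # hypothetical run with CLEO
change=$(percent_change "$baseline_tokens" "$cleo_tokens")
if [ "$change" -le -50 ]; then
    echo "token target met: ${change}%"
else
    echo "token target missed: ${change}%"
fi
# prints "token target met: -55%"
```

The same function applies to the files-read metric with a threshold of -80.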

Implementation Status

References

METRICS-VALIDATION: Complete metrics system
TOKEN-TRACKING-ARCHITECTURE: Tracking tiers
