Find Command Specification

Version: 1.0 | Status: DRAFT | Date: 2025-12-18
Task: T376 - Research: Fuzzy task search command for LLM agents
Category: Read command (LLM-Agent-First)

Executive Summary

The find command gives LLM agents an efficient way to discover tasks: fuzzy search with minimal output. It directly addresses context bloat: on this project, list --format json returns 355KB for 352 tasks, whereas find returns only the matching tasks with a minimal set of fields (typically 500 bytes to 2KB).

Problem Statement

Scenario                        Current Approach           Context Cost
------------------------------  -------------------------  ---------------------
Find task by partial title      list --format json | jq    355KB + parsing
Find task by ID prefix          list --format json | grep  355KB + parsing
Fuzzy search for related tasks  list then manual scan      355KB + LLM reasoning
Check if task name exists       list + full scan           355KB

Solution: find Command

Scenario                        New Approach               Context Cost  Reduction
------------------------------  -------------------------  ------------  ---------
Find task by partial title      find "auth"                ~1KB          99.7%
Find task by ID prefix          find --id 37               ~500B         99.9%
Fuzzy search for related tasks  find "user registration"   ~2KB          99.4%
Check if task name exists       find --exact "Task title"  ~300B         99.9%
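
The Reduction column follows from comparing each estimated find output size with the 355KB list baseline. A small sketch of that arithmetic, with sizes in KB; reduction_pct is an illustrative helper name, not part of ct:

```shell
#!/usr/bin/env bash
# Percentage reduction of find output relative to the full list output.
# reduction_pct is a hypothetical helper used only for this illustration.
reduction_pct() {
  local baseline_kb="$1" find_kb="$2"
  awk -v b="$baseline_kb" -v f="$find_kb" \
    'BEGIN { printf "%.1f\n", (1 - f / b) * 100 }'
}

reduction_pct 355 1     # partial title search, ~1KB   -> 99.7
reduction_pct 355 0.5   # ID prefix search, ~500B      -> 99.9
reduction_pct 355 2     # fuzzy search, ~2KB           -> 99.4
reduction_pct 355 0.3   # exact title check, ~300B     -> 99.9
```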

RFC 2119 Conformance

This specification uses RFC 2119 keywords:
  • MUST: Absolute requirement
  • SHOULD: Recommended but not mandatory
  • MAY: Optional

Part 1: Command Definition

Basic Syntax

ct find <query> [OPTIONS]
ct find --id <id-pattern> [OPTIONS]

Arguments

Argument  Type    Required  Description
--------  ------  --------  ----------------------------------
<query>   string  Yes*      Search query for title/description

* Either <query> or --id is required.

Options

Option             Short  Type    Default            Description
-----------------  -----  ------  -----------------  --------------------------------------------------------
--id               -i     string  -                  Search by task ID pattern (prefix match)
--field            -      string  title,description  Fields to search: title, description, labels, notes, all
--status           -s     string  -                  Filter by status
--limit            -n     int     10                 Maximum results to return
--threshold        -t     float   0.3                Minimum match score (0-1)
--exact            -e     bool    false              Exact match instead of fuzzy
--include-archive  -      bool    false              Search archived tasks too
--format           -f     string  auto               Output format: text, json, jsonl
--quiet            -q     bool    false              Suppress non-essential output
--verbose          -v     bool    false              Include full task objects in output
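
As a rough illustration of how --threshold and --limit could interact, the sketch below drops candidates under the minimum score, orders the rest by descending score, and caps the row count. The input format (score, tab, task ID per line) and the name filter_matches are assumptions made for this example, not the actual find.sh internals.

```shell
#!/usr/bin/env bash
# Hypothetical helper: keep candidates with score >= threshold,
# order them by score (highest first), and return at most `limit` rows.
# Input on stdin: "<score><TAB><task-id>" per line.
filter_matches() {
  local threshold="${1:-0.3}" limit="${2:-10}"
  awk -F'\t' -v t="$threshold" '$1 + 0 >= t + 0' \
    | LC_ALL=C sort -t $'\t' -k1,1nr \
    | head -n "$limit"
}

# Four scored candidates, default threshold 0.3, limit 2:
printf '0.95\tT042\n0.20\tT300\n0.45\tT201\n0.80\tT123\n' | filter_matches 0.3 2
```

With these inputs, T300 falls below the threshold and T201 is cut by the limit, which is the situation the summary's truncated flag would presumably report.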

Exit Codes

Code  Constant            Description
----  ------------------  -------------------------------
0     EXIT_SUCCESS        Matches found
2     EXIT_INVALID_INPUT  Invalid query or options
100   EXIT_NO_DATA        No matches found (not an error)
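
Because exit code 100 signals an empty result rather than a failure, callers may want to branch on the three codes separately instead of treating every non-zero status as an error. A minimal sketch; describe_find_exit is an illustrative helper name:

```shell
#!/usr/bin/env bash
# Map the documented find exit codes to a short outcome label.
# describe_find_exit is a hypothetical helper, not part of ct.
describe_find_exit() {
  case "$1" in
    0)   echo "matches" ;;        # EXIT_SUCCESS
    2)   echo "invalid-input" ;;  # EXIT_INVALID_INPUT
    100) echo "no-matches" ;;     # EXIT_NO_DATA (not an error)
    *)   echo "unexpected" ;;     # anything outside the documented codes
  esac
}

# Usage with the real command would look like:
#   ct find "auth" --quiet; outcome=$(describe_find_exit "$?")
describe_find_exit 100
```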

Part 2: Search Modes

2.1 Fuzzy Title/Description Search (Default)

Searches task titles and descriptions using substring matching with relevance scoring.
# Find tasks mentioning "auth" anywhere in title or description
ct find "auth"

# Find tasks related to user registration
ct find "user registration" --field title

# Search only in labels
ct find "bug" --field labels
Matching Algorithm:
  1. Case-insensitive substring match
  2. Word boundary bonus (matching whole word scores higher)
  3. Title match scores higher than description match
  4. Multiple query terms use AND logic
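
Rules 1 and 4 can be sketched in plain bash: lowercase both sides, then require every query term to appear as a substring. This only illustrates the described behavior (scoring and the word-boundary bonus are left out), it needs bash 4 or later for the lowercase expansion, and matches_all_terms is a hypothetical name.

```shell
#!/usr/bin/env bash
# Return 0 when every whitespace-separated query term occurs in the text,
# ignoring case (rule 1: substring match, rule 4: AND logic).
matches_all_terms() {
  local query="${1,,}" text="${2,,}" term
  local -a terms
  read -r -a terms <<< "$query"
  [ "${#terms[@]}" -gt 0 ] || return 1   # an empty query matches nothing
  for term in "${terms[@]}"; do
    [[ "$text" == *"$term"* ]] || return 1
  done
  return 0
}

matches_all_terms "user registration" "Add User Registration form" && echo "match"
matches_all_terms "user login" "Add User Registration form" || echo "no match"
```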

2.2 ID Pattern Search

Searches task IDs by prefix, suffix, or substring.
# Find tasks with IDs starting with T37
ct find --id 37

# Find specific ID range
ct find --id "T37[0-9]"  # T370-T379

# Partial ID lookup
ct find --id 001  # Returns T001
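
One way to read the prefix examples is that a pattern without the leading "T" is normalized first, after which the ID is compared as a glob prefix. The sketch below covers only that prefix behavior; id_matches is a hypothetical helper and the normalization rule is an assumption drawn from the examples, not from the implementation.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of prefix matching on task IDs.
# A pattern without the leading "T" is normalized, so "37" and "T37"
# behave the same; glob classes such as [0-9] are passed through.
id_matches() {
  local pattern="$1" id="$2"
  [[ "$pattern" == T* ]] || pattern="T${pattern}"
  # Unquoted right-hand side: bash treats it as a glob pattern.
  [[ "$id" == ${pattern}* ]]
}

# Prints the IDs that start with T37.
for id in T001 T037 T370 T376; do
  if id_matches "37" "$id"; then echo "$id"; fi
done
```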

2.3 Exact Match Mode

Returns only exact title matches (useful for duplicate checking).
# Check if exact task title exists
ct find "Implement authentication middleware" --exact

2.4 Field-Specific Search

# Search all text fields
ct find "security" --field all

# Search specific fields
ct find "bug" --field labels,notes

Part 3: Output Format (LLM-Agent-First)

3.1 JSON Output (Default for Non-TTY)

MUST follow LLM-AGENT-FIRST-SPEC envelope:
{
  "$schema": "https://cleo.dev/schemas/v1/output.schema.json",
  "_meta": {
    "format": "json",
    "version": "0.19.0",
    "command": "find",
    "timestamp": "2025-12-18T10:00:00Z",
    "execution_ms": 15
  },
  "success": true,
  "query": {
    "text": "auth",
    "mode": "fuzzy",
    "fields": ["title", "description"],
    "threshold": 0.3
  },
  "summary": {
    "total_searched": 352,
    "matches": 3,
    "truncated": false
  },
  "matches": [
    {
      "id": "T042",
      "title": "Implement auth middleware",
      "status": "pending",
      "priority": "high",
      "score": 0.95,
      "matched_in": ["title"]
    },
    {
      "id": "T123",
      "title": "Add authentication tests",
      "status": "done",
      "priority": "medium",
      "score": 0.80,
      "matched_in": ["title"]
    },
    {
      "id": "T201",
      "title": "Security review",
      "status": "pending",
      "priority": "high",
      "score": 0.45,
      "matched_in": ["description"]
    }
  ]
}

3.2 Minimal Match Object

For context efficiency, match objects are minimal by default:
Field        Type    Included      Description
-----------  ------  ------------  ----------------------------
id           string  Always        Task ID
title        string  Always        Task title
status       string  Always        Task status
priority     string  Always        Task priority
score        float   Always        Match relevance (0-1)
matched_in   array   Always        Fields where match was found
phase        string  If present    Task phase
labels       array   If --verbose  Task labels
description  string  If --verbose  Full description

3.3 Verbose Output

With --verbose, include full task objects:
{
  "matches": [
    {
      "id": "T042",
      "score": 0.95,
      "matched_in": ["title"],
      "task": {
        "id": "T042",
        "title": "Implement auth middleware",
        "description": "Add JWT authentication...",
        "status": "pending",
        "priority": "high",
        "labels": ["auth", "backend"],
        "depends": ["T040"],
        "createdAt": "2025-12-01T10:00:00Z"
      }
    }
  ]
}

3.4 Text Output (Interactive TTY)

FIND: "auth" (3 matches)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

  T042  [pending]  Implement auth middleware           (0.95)
        high • auth, backend

  T123  [done]     Add authentication tests            (0.80)
        medium

  T201  [pending]  Security review                     (0.45)
        high • matched in description

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Use 'ct show T042' to view full details

Part 4: Use Cases for LLM Agents

4.1 Task Discovery Before Update

# Agent needs to update a task about authentication
# OLD WAY: ct list --format json | jq '.tasks[] | select(.title | contains("auth"))'
# NEW WAY:
ct find "auth" --limit 5
Context saved: ~355KB → ~1KB (99.7% reduction)

4.2 Dependency Resolution

# Agent needs to find related tasks before adding dependency
ct find "database schema" --status pending --limit 3

4.3 Duplicate Checking Before Add

# Check whether a task with this exact title already exists
# (exit 0 = match found, exit 100 = no match)
if ct find "Implement user login" --exact --quiet; then
  echo "Task already exists"
fi

4.4 ID Lookup with Partial Memory

# Agent remembers "something around T370"
ct find --id 37 --limit 5

4.5 Label-Based Task Discovery

# Find all bug-related tasks
ct find "bug" --field labels --status pending

Part 5: Implementation Requirements

5.1 Foundation Libraries (MUST)

Per LLM-AGENT-FIRST-SPEC:
source "${LIB_DIR}/exit-codes.sh"
source "${LIB_DIR}/error-json.sh"
source "${LIB_DIR}/output-format.sh"

COMMAND_NAME="find"

5.2 TTY-Aware Format Resolution (MUST)

# After argument parsing
FORMAT=$(resolve_format "$FORMAT")
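
The real resolve_format lives in output-format.sh and is not reproduced here. As a hedged illustration of TTY-aware resolution, the stand-in below keeps an explicit format and maps auto to text on a terminal or json otherwise. It stores the result in a variable instead of echoing it, because inside a command substitution fd 1 is a pipe and a `[ -t 1 ]` check there would never see the terminal. The names resolve_format_sketch and RESOLVED_FORMAT are assumptions.

```shell
#!/usr/bin/env bash
# Illustrative stand-in for resolve_format; the real function may differ.
# Sets RESOLVED_FORMAT; returns 2 (EXIT_INVALID_INPUT) for unknown formats.
resolve_format_sketch() {
  local requested="${1:-auto}"
  case "$requested" in
    text|json|jsonl)
      RESOLVED_FORMAT="$requested" ;;
    auto)
      # fd 1 is the caller's stdout because no subshell is involved.
      if [ -t 1 ]; then RESOLVED_FORMAT="text"; else RESOLVED_FORMAT="json"; fi ;;
    *)
      echo "unknown format: $requested" >&2
      return 2 ;;
  esac
}

resolve_format_sketch "${FORMAT:-auto}" && FORMAT="$RESOLVED_FORMAT"
echo "format: $FORMAT"
```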

5.3 JSON Output Requirements (MUST)

  • Include $schema field
  • Include complete _meta envelope
  • Include success boolean
  • Use output_error() for errors

5.4 Performance Requirements

Metric                         Requirement
-----------------------------  ---------------------------
Search time (1000 tasks)       < 100ms
Memory usage                   O(n) where n = result count
JSON output size (10 matches)  < 5KB

5.5 Scoring Algorithm

score = base_score * field_weight * position_bonus

base_score:
  - Exact match: 1.0
  - Word boundary match: 0.9
  - Substring match: 0.7
  - Fuzzy match: 0.5

field_weight:
  - title: 1.0
  - labels: 0.9
  - description: 0.7
  - notes: 0.5

position_bonus:
  - Match at start: +0.1
  - Match at word boundary: +0.05
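
Since the position bonuses are listed as increments (+0.1, +0.05), the sketch below treats position_bonus as additive on top of base_score * field_weight and caps the result at 1.0. That reading is an assumption, and compute_score is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Hypothetical scoring helper. Assumes the position bonus is added to
# base_score * field_weight and the result is capped at 1.0.
compute_score() {
  local base="$1" weight="$2" bonus="${3:-0}"
  awk -v b="$base" -v w="$weight" -v p="$bonus" 'BEGIN {
    s = b * w + p
    if (s > 1.0) s = 1.0
    printf "%.2f\n", s
  }'
}

compute_score 0.9 1.0 0.05   # word-boundary match in title      -> 0.95
compute_score 0.7 0.7        # substring match in description    -> 0.49
compute_score 1.0 1.0 0.1    # exact title match at start, capped -> 1.00
```

Under this reading, the first call reproduces the 0.95 score shown for T042 in the sample output.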

Part 6: CLI Integration

6.1 Command Registration

Add to install.sh CMD_MAP:
[find]="find.sh"
Add to CMD_DESC:
[find]="Fuzzy search tasks by title, ID, or labels"

6.2 Aliases

[search]="find"
[f]="find"

6.3 Help Integration

ct find --help
ct help find

Part 7: Error Handling

7.1 Error Responses

{
  "$schema": "https://cleo.dev/schemas/v1/error.schema.json",
  "_meta": {"command": "find", "timestamp": "...", "version": "..."},
  "success": false,
  "error": {
    "code": "E_INPUT_MISSING",
    "message": "Search query required",
    "exitCode": 2,
    "recoverable": true,
    "suggestion": "Use 'ct find <query>' or 'ct find --id <pattern>'"
  }
}

7.2 No Matches Response

{
  "$schema": "https://cleo.dev/schemas/v1/output.schema.json",
  "_meta": {"command": "find", "timestamp": "...", "version": "..."},
  "success": true,
  "query": {"text": "nonexistent", "mode": "fuzzy"},
  "summary": {"total_searched": 352, "matches": 0, "truncated": false},
  "matches": []
}
Exit code: 100 (EXIT_NO_DATA) - indicates no matches but not an error.

Part 8: Testing Requirements

8.1 Unit Tests

# Test basic fuzzy search
ct find "test" | jq -e '.matches | length > 0'

# Test ID search
ct find --id T001 | jq -e '.matches[0].id == "T001"'

# Test exact match
ct find "Exact Title" --exact | jq -e '.query.mode == "exact"'

# Test no matches returns 100
ct find "zzzznonexistent" --quiet
[[ $? -eq 100 ]]

# Test JSON envelope
ct find "test" | jq -e '."$schema" and ._meta.command == "find"'

8.2 Performance Tests

# Search should complete in <100ms for 1000 tasks
time ct find "test" --quiet
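
The time call above reports a duration but does not fail when the budget is exceeded. A sketch of an asserting wrapper follows; it assumes GNU date (for %N nanoseconds), includes process start-up overhead in the measurement, and assert_under_ms is a hypothetical name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command and fail if it takes at least the
# given number of milliseconds. Relies on GNU date for nanoseconds.
assert_under_ms() {
  local limit_ms="$1"; shift
  local start_ns end_ns elapsed_ms
  start_ns=$(date +%s%N)
  "$@" >/dev/null 2>&1 || true   # the exit status is not under test here
  end_ns=$(date +%s%N)
  elapsed_ms=$(( (end_ns - start_ns) / 1000000 ))
  if [ "$elapsed_ms" -ge "$limit_ms" ]; then
    echo "FAIL: ${elapsed_ms}ms >= ${limit_ms}ms" >&2
    return 1
  fi
  echo "OK: ${elapsed_ms}ms"
}

# Intended usage against a 1000-task fixture:
#   assert_under_ms 100 ct find "test" --quiet
assert_under_ms 1000 true
```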

Part 9: Comparison with Existing Commands

Feature            list  exists      show        find
-----------------  ----  ----------  ----------  ----
Returns all tasks  Yes   No          No          No
Fuzzy search       No    No          No          Yes
ID prefix search   No    Exact only  Exact only  Yes
Minimal output     No    Binary      Full task   Yes
Match scoring      No    No          No          Yes
Context efficient  No    Yes         Moderate    Yes

Use Case Decision Tree

Need task info?
├── Know exact ID? → ct show T042
├── Check if ID exists? → ct exists T042
├── Need all tasks? → ct list
├── Need filtered list? → ct list --status pending
└── Need to search/discover? → ct find "query"

Appendix A: Alternative Names Considered

Name    Pros                            Cons                            Decision
------  ------------------------------  ------------------------------  --------
find    Familiar (Unix), clear purpose  Might conflict with shell find  SELECTED
search  Clear, no conflicts             Longer to type                  Alias
lookup  Clear                           Less common                     Rejected
query   Database-like                   Too generic                     Rejected
match   Describes behavior              Less intuitive                  Rejected

Appendix B: Context Savings Analysis

Based on this project (352 tasks, 357KB todo.json):
Operation      list Context  find Context  Savings
-------------  ------------  ------------  -------
Find 1 task    355KB         ~0.5KB        99.9%
Find 5 tasks   355KB         ~1.5KB        99.6%
Find 10 tasks  355KB         ~3KB          99.1%
ID lookup      355KB         ~0.3KB        99.9%

Estimated annual token savings (assuming 100 find operations/day at $0.003/1K tokens):
  • Current: 355KB * 100 = 35.5MB/day = ~$10.65/day
  • With find: 3KB * 100 = 300KB/day = ~$0.09/day
  • Savings: ~$10.56/day = ~$3,850/year
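
The dollar figures can be reproduced with a short calculation. They work out only if roughly 10 bytes map to one token (1KB is about 100 tokens); that ratio is inferred from the numbers above rather than stated in the source, and daily_cost is an illustrative helper.

```shell
#!/usr/bin/env bash
# Reproduce the daily cost estimate.
# Assumption: ~10 bytes per token, inferred from the published figures.
daily_cost() {
  local kb_per_op="$1" ops_per_day="$2"
  awk -v kb="$kb_per_op" -v ops="$ops_per_day" 'BEGIN {
    tokens_k = kb * ops / 10           # thousands of tokens per day
    printf "%.2f\n", tokens_k * 0.003  # $0.003 per 1K tokens
  }'
}

current=$(daily_cost 355 100)   # 10.65
with_find=$(daily_cost 3 100)   # 0.09
awk -v a="$current" -v b="$with_find" 'BEGIN {
  printf "daily savings: $%.2f, yearly: $%.0f\n", a - b, (a - b) * 365
}'
```

The yearly result lands slightly above the rounded ~$3,850 quoted in the list.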

Specification v1.0 - Research Complete Task T376 - Research: Fuzzy task search command for LLM agents Created: 2025-12-18