
Backup System Architecture

CLEO implements a four-layer data integrity system designed for defense-in-depth. Each layer serves a distinct purpose, and together they provide comprehensive protection against data loss and corruption.
The backup layers are complementary by design, not competing mechanisms. Each layer addresses different failure modes and recovery scenarios.

Architecture Overview

Layer Summary

| Layer | Purpose | Trigger | Location |
|---|---|---|---|
| 1: Atomic Operations | Per-write safety | Every write | .cleo/backups/operational/ |
| 2: Recovery Backups | Point-in-time snapshots | User/system events | .cleo/backups/{type}/ |
| 3: Schema Validation | Data integrity | Manual/CI | Schema definitions |
| 4: Git Version Control | Project history | User commits | Git repository |

Layer 1: Atomic Operations

Purpose

Protect against I/O failures, system crashes, and interrupted writes. Every write operation is atomic and reversible.

How It Works

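Every write operation follows the same sequence: the new content goes to a temporary file, the current target is copied to a numbered backup, and the temporary file is then atomically renamed over the target. If the rename fails, the target is restored from the backup.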

Key Functions

# Primitive atomic write (no validation dependencies)
aw_atomic_write "$file" "$content"

# Create numbered backup with rotation
aw_create_backup "$file"

# Rotate backups (keep last N)
_aw_rotate_backups "$file" "$max_backups"

Backup Rotation

  • Location: .cleo/backups/operational/
  • Naming: {filename}.{1|2|3|...} (lower = newer)
  • Retention: Configurable (default 5-10 backups)
  • Rotation: FIFO - oldest backup deleted when limit reached
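
The naming and rotation rules above can be sketched as follows. This is a minimal illustration, not CLEO's `_aw_rotate_backups`: the function name `rotate_numbered_backups`, its argument order, and the default of 5 are assumptions made for the example.

```shell
# Hypothetical helper, not CLEO's actual implementation.
# Rotate numbered backups of file $1 inside directory $2, keeping at most $3 copies.
rotate_numbered_backups() {
    local file="$1" backup_dir="$2" max="${3:-5}"
    local base i
    base="$(basename "$file")"
    mkdir -p "$backup_dir" || return 1

    # Drop the oldest copy (highest number) once the limit is reached.
    rm -f "$backup_dir/$base.$max"

    # Shift the remaining copies up by one: .4 -> .5, ..., .1 -> .2
    i=$(( max - 1 ))
    while [ "$i" -ge 1 ]; do
        if [ -f "$backup_dir/$base.$i" ]; then
            mv "$backup_dir/$base.$i" "$backup_dir/$base.$(( i + 1 ))"
        fi
        i=$(( i - 1 ))
    done

    # The current file becomes the newest backup (.1).
    cp -p "$file" "$backup_dir/$base.1"
}
```

Calling `rotate_numbered_backups .cleo/todo.json .cleo/backups/operational 5` before each write would keep todo.json.1 as the most recent copy and discard anything older than todo.json.5.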

Automatic Rollback

# If atomic rename fails, restore from backup
if ! aw_atomic_move "$temp_file" "$file"; then
    if [[ -f "$backup_file" ]]; then
        cp "$backup_file" "$file"
    fi
fi
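
To show how the rollback fits into a complete write, here is a minimal sketch of the temp-file-plus-rename pattern. It is not the source of `aw_atomic_write`: the function name `atomic_write_sketch` and the single `.bak` backup path are assumptions chosen to keep the example short.

```shell
# Hypothetical sketch, not CLEO's aw_atomic_write.
# Write content $2 to file $1 through a temp file and an atomic rename.
atomic_write_sketch() {
    local file="$1" content="$2"
    local dir temp_file backup_file
    dir="$(dirname "$file")"
    backup_file="$file.bak"   # assumption: one backup next to the target

    # Create the temp file in the target's directory so the final mv is a
    # same-filesystem rename, which replaces the target in a single step.
    temp_file="$(mktemp "$dir/.tmp.XXXXXX")" || return 1

    if ! printf '%s\n' "$content" > "$temp_file"; then
        rm -f "$temp_file"
        return 1
    fi

    # Keep a copy of the current target for rollback.
    if [ -f "$file" ]; then
        cp -p "$file" "$backup_file" || { rm -f "$temp_file"; return 1; }
    fi

    # Swap the new content into place; on failure, restore the backup.
    if ! mv -f "$temp_file" "$file"; then
        rm -f "$temp_file"
        if [ -f "$backup_file" ]; then
            cp -p "$backup_file" "$file"
        fi
        return 1
    fi
}
```

Because an interrupted write can only damage the temp file, a reader of the target sees either the old content or the new content, never a partial file.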

Layer 2: Recovery Backups

Purpose

Enable point-in-time recovery for disaster scenarios. Recovery backups are either user-initiated or created automatically before destructive operations.

Backup Types

Snapshot

  • Purpose: Full system backup
  • Trigger: Manual cleo backup
  • Retention: Count-based (default 10)
  • Contents: All CLEO data files

Safety

  • Purpose: Pre-operation protection
  • Trigger: Before destructive operations
  • Retention: Time (7d) + count (5)
  • Contents: Affected file only

Archive

  • Purpose: Pre-archive protection
  • Trigger: Before cleo archive
  • Retention: Count-based (3)
  • Contents: todo.json + archive

Migration

  • Purpose: Schema migration safety
  • Trigger: Before cleo migrate
  • Retention: PERMANENT
  • Contents: All affected files
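
Count-based retention, as used by the snapshot and archive types, could be implemented along these lines. This is a sketch under assumptions: `prune_by_count` is a hypothetical name, and the code relies on backup directory names beginning with a sortable timestamp and containing no newlines. It must not be pointed at the migration directory, whose backups are permanent.

```shell
# Hypothetical helper, not part of CLEO.
# Keep only the newest $2 entries in backup directory $1.
# Assumes entry names start with a sortable timestamp
# (for example 2026-01-26_153000_snapshot) and contain no newlines.
prune_by_count() {
    local dir="$1" keep="$2"
    local total excess
    total="$(ls -1 "$dir" | wc -l)"
    excess=$(( total - keep ))
    [ "$excess" -gt 0 ] || return 0

    # ls sorts by name, so the oldest timestamps come first.
    ls -1 "$dir" | head -n "$excess" | while IFS= read -r name; do
        rm -rf -- "${dir:?}/$name"
    done
}
```

For example, `prune_by_count .cleo/backups/snapshot 10` would leave the ten most recent snapshots in place. The time-based part of the safety policy (7d) is not covered by this sketch.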

Directory Structure

.cleo/backups/
├── operational/           # Layer 1 numbered backups
│   ├── todo.json.1
│   ├── todo.json.2
│   └── sessions.json.1
├── snapshot/              # Full system snapshots
│   └── 2026-01-26_153000_snapshot/
│       ├── todo.json
│       ├── config.json
│       └── metadata.json
├── safety/                # Pre-operation backups
├── archive/               # Pre-archive backups
├── migration/             # Schema migration backups
└── backup-manifest.json   # Index for O(1) lookup

Metadata Format

Each backup includes metadata for verification:
{
  "id": "backup_20260126_153000",
  "type": "snapshot",
  "createdAt": "2026-01-26T15:30:00Z",
  "trigger": "manual",
  "neverDelete": false,
  "files": [
    {
      "name": "todo.json",
      "size": 12345,
      "checksum": "sha256:abc123..."
    }
  ]
}
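
The checksum field makes it possible to verify a backed-up file before restoring it. The following is a minimal sketch, not CLEO's own verification code: `verify_checksum` is a hypothetical helper that takes a file path and a value in the `sha256:<hex>` form shown above.

```shell
# Hypothetical helper, not part of CLEO.
# Succeed only if the SHA-256 digest of file $1 matches $2 ("sha256:<hex>").
verify_checksum() {
    local file="$1" expected="${2#sha256:}"
    local actual
    # Read from stdin so the output is just "<hex>  -".
    if command -v sha256sum >/dev/null 2>&1; then
        actual="$(sha256sum < "$file" | cut -d' ' -f1)"
    else
        actual="$(shasum -a 256 < "$file" | cut -d' ' -f1)"
    fi
    [ -n "$actual" ] && [ "$actual" = "$expected" ]
}
```

A restore script could call this for every entry in `files` and refuse to continue on the first mismatch.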

Recovery Commands

# List available backups
cleo backup --list

# Create manual snapshot
cleo backup

# Restore from backup
cleo restore [backup-id]

# Restore specific file
cleo restore --file todo.json

Layer 3: Schema Validation

Purpose

Ensure that data is semantically correct by detecting invalid data structures, missing fields, and constraint violations.
Schema validation is currently detection-only, not prevention. Only JSON syntax is checked before a write (validate_json_syntax()); data that violates the schema can still be written, and validation catches it afterward.
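
For contrast, a prevention-style check would validate a candidate file before it replaces the target. The sketch below illustrates that idea only and does not describe current CLEO behavior: `replace_if_valid` is a hypothetical function, and the validator is passed in as an arbitrary command.

```shell
# Hypothetical gate, not part of CLEO.
# Replace target $1 with candidate $2 only if the validator command
# (all remaining arguments) succeeds when given the candidate path.
replace_if_valid() {
    local target="$1" candidate="$2"
    shift 2
    if ! "$@" "$candidate" >/dev/null 2>&1; then
        echo "validation failed; $target left unchanged" >&2
        return 1
    fi
    # Keep the candidate in the target's directory so this is a same-filesystem rename.
    mv -f "$candidate" "$target"
}
```

Assuming jq is installed, `replace_if_valid .cleo/todo.json .cleo/todo.json.new jq empty` would reject a candidate that is not valid JSON; a schema validator could be substituted for `jq empty`.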

Validation Functions

| Function | Purpose | When Called |
|---|---|---|
| validate_json_syntax() | JSON parsing | Before write |
| validate_schema() | Schema compliance | Manual/CI |
| validate_task() | Task object rules | Manual/CI |
| validate_all() | Comprehensive check | Manual/CI |

Schema Locations

schemas/
├── todo.schema.json        # Task data
├── config.schema.json      # Configuration
├── sessions.schema.json    # Session state
├── archive.schema.json     # Archived tasks
└── log.schema.json         # Audit log

Validation Commands

# Run full validation
cleo validate

# Attempt automatic fixes
cleo validate --fix

# Check specific aspects
cleo validate --check-orphans
cleo validate --check-deps

Migration System

Schema versions are tracked and migrated safely:
# Check current versions
cleo upgrade --status

# Run migrations
cleo migrate

# Migrations are recorded in .cleo/migrations.json

Layer 4: Git Version Control

Purpose

Provide project history, collaboration support, and disaster recovery via version control.

What’s Git-Tracked

Core data files ARE tracked in git:
  • .cleo/todo.json - Active tasks
  • .cleo/todo-archive.json - Completed tasks
  • .cleo/config.json - Configuration
  • .cleo/sessions.json - Session state
  • .cleo/todo-log.jsonl - Audit trail

Git Integration Points

| Location | Command | Purpose |
|---|---|---|
| scripts/safestop.sh | git add -A && git commit | WIP snapshot on agent shutdown |
| dev/hooks/pre-commit | git add | Auto-stage generated migrations |

# After significant task work
cleo archive                    # Clean up completed tasks
git add .cleo/todo.json .cleo/todo-archive.json
git commit -m "chore: Update task status"
Git commits are intentionally not automated to preserve user control over commit granularity and messages.

Recovery Procedures

Scenario 1: Corrupted Write

Problem: Write interrupted mid-operation
Recovery:
  1. The Layer 1 atomic write confines any corruption to the temp file, not the target
  2. The target file remains intact
  3. If the target is corrupted anyway, it is rolled back automatically from a numbered backup

Scenario 2: Bad Data Written

Problem: Invalid data passed validation but is semantically wrong
Recovery:
# Check what's wrong
cleo validate

# Restore from operational backup
cp .cleo/backups/operational/todo.json.1 .cleo/todo.json

# Or restore from snapshot
cleo restore

Scenario 3: Schema Migration Failed

Problem: Migration script had a bug
Recovery:
# Migration backups are PERMANENT
cleo backup --list --type migration

# Restore pre-migration state
cleo restore <migration-backup-id>

Scenario 4: Disaster Recovery

Problem: All local data lost
Recovery:
  1. Restore from git (data files are tracked)
  2. Or restore from a Layer 2 snapshot backup
  3. Run cleo validate --fix to repair any issues

Best Practices

  1. Regular Snapshots: Run cleo backup before major work sessions
  2. Commit Regularly: Git commit CLEO data files after significant changes
  3. Validate Periodically: Run cleo validate as part of CI/CD or before releases
  4. Monitor Backups: Check cleo backup --list to ensure backups exist

Configuration

Backup Settings

# View current backup configuration
cleo config get backup

# Set operational backup retention
cleo config set backup.operationalRetention 10

# Set snapshot retention
cleo config set backup.snapshotRetention 5

Validation Settings

# Enable/disable pre-write validation (future)
cleo config set validation.preWriteEnabled true

# Set validation strictness
cleo config set validation.strictMode true