Backup System Architecture
CLEO implements a four-layer data integrity system designed for defense-in-depth. Each layer serves a distinct purpose, and together they provide comprehensive protection against data loss and corruption.
The backup layers are complementary by design, not competing mechanisms. Each layer addresses different failure modes and recovery scenarios.
Architecture Overview
Layer Summary
| Layer | Purpose | Trigger | Location |
|---|---|---|---|
| 1: Atomic Operations | Per-write safety | Every write | .cleo/backups/operational/ |
| 2: Recovery Backups | Point-in-time snapshots | User/system events | .cleo/backups/{type}/ |
| 3: Schema Validation | Data integrity | Manual/CI | Schema definitions |
| 4: Git Version Control | Project history | User commits | Git repository |
Layer 1: Atomic Operations
Purpose
Protect against I/O failures, system crashes, and interrupted writes. Every write operation is atomic and reversible.
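A common way to get this kind of per-write guarantee is to write to a temporary file and then rename it over the target. The sketch below illustrates that general pattern in Python. It is not CLEO's actual implementation, and the function name `atomic_write_json` is hypothetical.

```python
import json
import os
import tempfile


def atomic_write_json(path, data):
    """Write data to path so readers see either the old or the new file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Serialize first, so an unserializable object never touches the disk.
    payload = json.dumps(data, indent=2)

    # Create the temp file next to the target so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(payload)
            tmp.flush()
            os.fsync(tmp.fileno())  # Push the bytes to disk before renaming.
        # os.replace swaps the file in one step, overwriting any existing target.
        os.replace(tmp_path, path)
    except BaseException:
        # The target is untouched at this point; only the temp file needs cleanup.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

If the process dies before `os.replace`, the only damaged file is the temporary one, which is the property the recovery scenarios later in this page rely on.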
How It Works
Every write operation follows this sequence:
Key Functions
Backup Rotation
- Location: `.cleo/backups/operational/`
- Naming: `{filename}.{1|2|3|...}` (lower = newer)
- Retention: Configurable (default 5-10 backups)
- Rotation: FIFO; the oldest backup is deleted when the limit is reached
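The numbering and FIFO rules above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not CLEO's own code: the function name `rotate_backup` and the `max_backups` parameter are hypothetical, and the default of 5 is simply the low end of the range quoted above.

```python
import os
import shutil


def rotate_backup(path, backup_dir, max_backups=5):
    """Copy path into backup_dir as <name>.1, shifting older copies up by one."""
    os.makedirs(backup_dir, exist_ok=True)
    name = os.path.basename(path)

    def slot(n):
        return os.path.join(backup_dir, f"{name}.{n}")

    # FIFO: drop the oldest copy if the limit has already been reached.
    if os.path.exists(slot(max_backups)):
        os.remove(slot(max_backups))

    # Shift the remaining copies, oldest first, so nothing is overwritten.
    for n in range(max_backups - 1, 0, -1):
        if os.path.exists(slot(n)):
            os.replace(slot(n), slot(n + 1))

    # The current file becomes the newest backup (lowest number).
    shutil.copy2(path, slot(1))
    return slot(1)
```

Shifting from the highest number down keeps each rename safe, because the destination slot is always empty by the time it is used.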
Automatic Rollback
Layer 2: Recovery Backups
Purpose
Enable point-in-time recovery for disaster scenarios. Backups are either user-initiated or created automatically before destructive operations.
Backup Types
Snapshot
- Purpose: Full system backup
- Trigger: Manual `cleo backup`
- Retention: Count-based (default 10)
- Contents: All CLEO data files
Safety
- Purpose: Pre-operation protection
- Trigger: Before destructive operations
- Retention: Time (7d) + count (5)
- Contents: Affected file only
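The combined time-and-count retention listed for safety backups could be evaluated along these lines. The sketch assumes a backup is deleted when it is either older than 7 days or outside the newest 5. The source does not spell out how the two limits combine, so treat that rule, and the name `select_expired`, as assumptions.

```python
from datetime import datetime, timedelta


def select_expired(backups, now, max_age=timedelta(days=7), max_count=5):
    """Return the names of backups to delete.

    backups is a list of (name, created_at) tuples.
    """
    newest_first = sorted(backups, key=lambda item: item[1], reverse=True)
    expired = []
    for index, (name, created_at) in enumerate(newest_first):
        too_old = now - created_at > max_age   # time limit (assumed 7 days)
        over_count = index >= max_count        # count limit (assumed newest 5)
        if too_old or over_count:
            expired.append(name)
    return expired
```

A count-only policy, as listed for snapshot and archive backups, is the same function with `max_age` set to a very large `timedelta`.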
Archive
- Purpose: Pre-archive protection
- Trigger: Before `cleo archive`
- Retention: Count-based (3)
- Contents: todo.json + archive
Migration
- Purpose: Schema migration safety
- Trigger: Before `cleo migrate`
- Retention: PERMANENT
- Contents: All affected files
Directory Structure
Metadata Format
Each backup includes metadata for verification:
Recovery Commands
Layer 3: Schema Validation
Purpose
Ensure the semantic correctness of the data. Detect invalid data structures, missing fields, and constraint violations.
Validation Functions
| Function | Purpose | When Called |
|---|---|---|
| `validate_json_syntax()` | JSON parsing | Before write |
| `validate_schema()` | Schema compliance | Manual/CI |
| `validate_task()` | Task object rules | Manual/CI |
| `validate_all()` | Comprehensive check | Manual/CI |
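To illustrate the difference between the syntax check that runs before every write and the heavier rule checks, here is a small Python analogue. It is not the code behind the functions in the table: `check_json_syntax` and `missing_fields` are hypothetical names, and the required-field rule stands in for whatever the real schema demands.

```python
import json


def check_json_syntax(text):
    """Return (ok, message) for a JSON document held in a string."""
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        return False, f"line {exc.lineno}, column {exc.colno}: {exc.msg}"
    return True, ""


def missing_fields(obj, required):
    """Return the required keys absent from obj (a stand-in schema rule)."""
    return [key for key in required if key not in obj]
```

The syntax check is cheap enough to run on every write, while field and constraint rules fit the Manual/CI column because they need the schema in hand.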
Schema Locations
Validation Commands
Migration System
Schema versions are tracked and migrated safely:
Layer 4: Git Version Control
Purpose
Provide project history, collaboration support, and disaster recovery via version control.
What’s Git-Tracked
Core data files ARE tracked in git:
- `.cleo/todo.json` - Active tasks
- `.cleo/todo-archive.json` - Completed tasks
- `.cleo/config.json` - Configuration
- `.cleo/sessions.json` - Session state
- `.cleo/todo-log.jsonl` - Audit trail
Git Integration Points
| Location | Command | Purpose |
|---|---|---|
| `scripts/safestop.sh` | `git add -A && git commit` | WIP snapshot on agent shutdown |
| `dev/hooks/pre-commit` | `git add` | Auto-stage generated migrations |
Recommended Workflow
Apart from the integration points listed above, git commits are intentionally not automated, which preserves user control over commit granularity and messages.
Recovery Procedures
Scenario 1: Corrupted Write
Problem: Write interrupted mid-operation
Recovery:
- The Layer 1 atomic write confines any corruption to the temp file, not the target
- Target file remains intact
- If the target is corrupted, it is rolled back automatically from a numbered backup
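The rollback step in this scenario might look like the following. It is a hedged Python sketch rather than CLEO's implementation: `rollback_from_backups` is a hypothetical name, the numbered-backup layout is taken from the Layer 1 naming rule, and "valid" is assumed here to mean only "parses as JSON".

```python
import json
import os
import shutil


def rollback_from_backups(path, backup_dir, max_backups=5):
    """Restore path from the newest numbered backup that parses as JSON."""
    name = os.path.basename(path)
    for n in range(1, max_backups + 1):  # .1 is the newest copy
        candidate = os.path.join(backup_dir, f"{name}.{n}")
        if not os.path.exists(candidate):
            continue
        try:
            with open(candidate, encoding="utf-8") as handle:
                json.load(handle)
        except (json.JSONDecodeError, UnicodeDecodeError):
            continue  # This backup is damaged too; try the next older one.
        # Copy to a temp name first, then swap it in with a single rename.
        tmp_path = path + ".restore.tmp"
        shutil.copy2(candidate, tmp_path)
        os.replace(tmp_path, path)
        return candidate
    return None  # No usable backup found; fall back to Layer 2 or git.
```

Walking from the lowest number upward means the most recent good state wins, at the cost of silently skipping a newer backup that is itself corrupted.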
Scenario 2: Bad Data Written
Problem: Invalid data passed validation but is semantically wrong
Recovery:
Scenario 3: Schema Migration Failed
Problem: Migration script had a bug
Recovery:
Scenario 4: Disaster Recovery
Problem: All local data lost
Recovery:
- Restore from git (data files are tracked)
- Or restore from a Layer 2 snapshot backup
- Run `cleo validate --fix` to repair any issues
