76. Shell Script Interface for AI Agent

Status: Accepted
Date: 2025-07-06

Context

We need to define the contract, or API, through which our AI agent will interact with the michi task management system. A traditional REST or gRPC API would re-introduce the server-side complexity we are trying to avoid. We need an interface that is extremely simple, robust, stateless, and easy for an AI agent to call from a command-line environment.

Decision

The exclusive interface for the michi system will be a set of simple, composable shell scripts. The AI agent will perform all task operations by executing these scripts (e.g., via pnpm michi:add ...).

These scripts will be designed to be "agent-safe":

Stateless: Each script execution is self-contained.
Clear, Parsable Output: All output intended for the agent will be in a clean, easily parsable format like JSON. There will be no noisy, human-readable output unless explicitly requested.
Idempotent-like Behavior: The scripts will be designed to be safe to re-run. For example, the underlying Git workflow (pull-rebase-push) handles state convergence.
Standard Exit Codes: The scripts will use standard Unix exit codes to signal success (0) or failure (non-zero), which is a simple and universal error handling mechanism.
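
As an illustration of the properties listed above, a command might look like the following sketch. The function name michi_add, the JSON fields ok, title and error, and the choice of exit code 2 for invalid input are assumptions made for this example; they are not the actual michi implementation.

```shell
#!/bin/sh
# Hypothetical sketch of an agent-safe command (not the real michi script).
# Contract: one JSON object on stdout, exit 0 on success, non-zero on failure.

# json_escape STRING: escape backslashes and double quotes so STRING can be
# embedded in a JSON string. Control characters are not handled here; a real
# script might delegate JSON construction to jq instead.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# michi_add TITLE: validate the input and report the result as JSON.
# It reads no state left behind by earlier invocations.
michi_add() {
  title=${1-}
  if [ -z "$title" ]; then
    printf '{"ok":false,"error":"title is required"}\n'
    return 2
  fi
  printf '{"ok":true,"title":"%s"}\n' "$(json_escape "$title")"
}
```

In this sketch the error is also reported as JSON on stdout, so the agent can parse one format in both cases and rely on the exit code to tell them apart.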

Consequences

Positive:

Extreme Simplicity: There is no network protocol, no serialization format beyond command-line arguments in and JSON out, and no complex client library needed. The "API documentation" is the script's --help output.
Robustness: The shell has been a stable part of computing environments for decades, so the interface is not subject to the churn of programming language frameworks or libraries.
Easy for AI to Use: Executing a shell command and parsing its JSON output is a fundamental and reliable capability for an AI coding assistant.
Debuggability: A human developer can easily debug the system by running the exact same shell commands as the AI agent, providing perfect reproducibility.
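
The reproducibility point can be sketched with a small wrapper that runs a command exactly as the agent would and shows the two things the agent sees: the exit status and the captured stdout. The helper name run_and_report is hypothetical and exists only for this example.

```shell
#!/bin/sh
# run_and_report CMD [ARGS...]: run CMD, capture its stdout, and print both
# the exit status and the output. Returns CMD's exit status unchanged.
# (Hypothetical debugging helper, not part of michi.)
run_and_report() {
  status=0
  output=$("$@") || status=$?
  printf 'exit=%s\n' "$status"
  printf 'stdout=%s\n' "$output"
  return "$status"
}
```

A developer could wrap any command the agent issued, for example the pnpm michi:add invocation with the same arguments, and compare the result with what the agent reported.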

Negative:

Limited Expressiveness: A shell script interface is less expressive than a rich client library in a high-level programming language. Complex queries or logic are harder to implement.
Cross-Platform Compatibility: Shell scripts run natively on Linux and macOS, but on Windows they can have portability issues unless they are written carefully or run inside an environment like WSL.
Security: The scripts have direct access to the filesystem and Git. A bug in a script could have significant consequences, though this is also true of any backend application.

Mitigation:

Fit for Purpose: The interface is not designed for complex queries. It's designed for a specific, limited set of atomic operations (add, list, done), which it handles well.
POSIX Compliance: The scripts will stick to POSIX shell features as far as possible and will depend only on widely available tools like jq, git, and find to maximize portability (jq and git are not part of POSIX and must be installed separately). The primary target environment is the Unix-like system used for development.
Script Testing: The shell scripts will be treated as first-class code and will have their own automated tests (e.g., using a framework like BATS) to ensure they are correct and robust.
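
A minimal sketch of what such a test checks is shown below, written in plain sh so that it has no dependencies; a BATS suite would express the same assertions with its own helpers. The check helper and the fake_michi_list stand-in are hypothetical and exist only to keep the example self-contained.

```shell
#!/bin/sh
# Plain-sh test sketch. fake_michi_list stands in for a real script and
# always reports an empty task list.
fake_michi_list() {
  printf '[]\n'
}

failures=0

# check NAME WANT_STATUS WANT_STDOUT CMD [ARGS...]
# Runs CMD and compares its exit status and stdout with the expected values.
check() {
  name=$1
  want_status=$2
  want_out=$3
  shift 3
  got_status=0
  got_out=$("$@") || got_status=$?
  if [ "$got_status" -eq "$want_status" ] && [ "$got_out" = "$want_out" ]; then
    printf 'ok - %s\n' "$name"
  else
    printf 'not ok - %s\n' "$name"
    failures=$((failures + 1))
  fi
}

check "list prints an empty JSON array" 0 '[]' fake_michi_list
check "a failing command is reported via its exit status" 1 '' false
# A real runner would finish by exiting non-zero when $failures is above 0.
```

Each test asserts on exactly the two channels the agent relies on, the exit code and stdout, so a passing suite covers the interface contract itself.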
