76. Shell Script Interface for AI Agent
Status: Accepted
Date: 2025-07-06
Context
We need to define the contract, or API, through which our AI agent will interact with the michi task management system. A traditional REST or gRPC API would re-introduce the server-side complexity we are trying to avoid. We need an interface that is extremely simple, robust, stateless, and easy for an AI agent to call from a command-line environment.
Decision
The exclusive interface for the michi system will be a set of simple, composable shell scripts. The AI agent will perform all task operations by executing these scripts (e.g., via pnpm michi:add ...).
These scripts will be designed to be "agent-safe":
- Stateless: Each script execution is self-contained.
- Clear, Parsable Output: All output intended for the agent will be in a clean, easily parsable format like JSON. There will be no noisy, human-readable output unless explicitly requested.
- Idempotent-like Behavior: The scripts will be designed to be safe to re-run. For example, the underlying Git workflow (pull-rebase-push) handles state convergence.
- Standard Exit Codes: The scripts will use standard Unix exit codes to signal success (0) or failure (non-zero), which is a simple and universal error handling mechanism.
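As an illustration of the agent-safe properties above, a command could follow the shape sketched below. This is a minimal sketch, not the actual michi implementation: the function name michi_add, the json_escape helper, the JSON field names, and the use of exit code 2 for usage errors are all assumptions made for this example, and the escaping covers only backslashes and double quotes.

```shell
#!/bin/sh
# Hypothetical sketch of an agent-safe command. All names and JSON fields
# here are illustrative assumptions, not the real michi scripts.

# json_escape STRING: escapes backslashes and double quotes so STRING can be
# embedded in a JSON string literal. Control characters are not handled.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# michi_add TITLE: stateless; reads only its arguments.
# On success it prints exactly one JSON object on stdout and returns 0.
# On a usage error it prints a message on stderr and returns 2, so stdout
# never carries human-readable noise.
michi_add() {
  if [ "$#" -ne 1 ] || [ -z "$1" ]; then
    printf '%s\n' 'usage: michi_add TITLE' >&2
    return 2
  fi
  printf '{"status":"ok","title":"%s"}\n' "$(json_escape "$1")"
}
```

Keeping diagnostics on stderr means the agent can treat stdout as JSON whenever the exit code is 0.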
Consequences
Positive:
- Extreme Simplicity: The interface is incredibly simple. There is no network protocol, no serialization format beyond command-line arguments and JSON output, and no complex client library needed. The "API documentation" is the script's --help output.
- Robustness: The shell has been a stable part of computing environments for decades, so the interface is not subject to the churn of programming language frameworks or libraries.
- Easy for AI to Use: Executing a shell command and parsing its JSON output is a fundamental and reliable capability for an AI coding assistant.
- Debuggability: A human developer can easily debug the system by running the exact same shell commands as the AI agent, providing perfect reproducibility.
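The calling side of this contract can be sketched in a few lines of shell. The wrapper below is a hypothetical helper (run_michi is not part of michi): it runs whatever command it is given, checks the exit code first, and only passes stdout through when the command succeeded.

```shell
#!/bin/sh
# Hypothetical caller-side helper; run_michi is an illustrative name.

# run_michi CMD [ARG...]: runs CMD, capturing its stdout.
# On success, prints the captured output and returns 0.
# On failure, reports the exit code on stderr and returns that same code.
run_michi() {
  if output=$("$@"); then
    printf '%s\n' "$output"
  else
    rc=$?
    printf 'command failed with exit code %s\n' "$rc" >&2
    return "$rc"
  fi
}
```

A human or an agent could invoke it the same way, for example run_michi pnpm michi:add ..., and then pipe the result into jq to extract a field.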
Negative:
- Limited Expressiveness: A shell script interface is less expressive than a rich client library in a high-level programming language. Complex queries or logic are harder to implement.
- Cross-Platform Compatibility: Shell scripts run natively on Linux and macOS, but on Windows they need an environment such as WSL, and they can have portability issues if not written carefully.
- Security: The scripts have direct access to the filesystem and Git. A bug in a script could have significant consequences, though this is also true of any backend application.
Mitigation:
- Fit for Purpose: The interface is not designed for complex queries. It's designed for a specific, limited set of atomic operations (add, list, done), which it handles well.
- POSIX Compliance: The scripts will be written to be as POSIX-compliant as possible and will rely only on widely available tools such as jq, git, and find to maximize portability. The primary target environment is the Unix-like system used for development.
- Script Testing: The shell scripts will be treated as first-class code and will have their own automated tests (e.g., using a framework like BATS) to ensure they are correct and robust.
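One way to make the dependency on external tools explicit is a preflight check at the top of each script. The sketch below uses only POSIX shell features; the function name require_tools and the error message format are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical preflight check; require_tools is an illustrative name.

# require_tools NAME...: returns 0 if every named tool is found on PATH.
# Otherwise lists the missing tools on stderr and returns 1.
require_tools() {
  missing=""
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      missing="$missing $tool"
    fi
  done
  if [ -n "$missing" ]; then
    printf 'missing required tools:%s\n' "$missing" >&2
    return 1
  fi
}
```

A script could start with require_tools jq git find || exit 1, so that a missing dependency fails fast with a non-zero exit code. A check like this is also a natural candidate for the automated tests mentioned above.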