117. Human-in-the-Loop Validation
Status: Accepted
Date: 2025-07-06
Context
The Maat module collects "Bad Signal" (BS) candidates from across the system. These are situations that are potentially anomalous or incorrect. A key question is: what happens after a candidate is collected? Should the system attempt to automatically decide if it's a "true positive" and take action?
Automating this final judgment is extremely difficult. The definition of "bad" can be highly contextual and nuanced, requiring domain knowledge and an understanding of the broader market or system state that the isolated module might not have. An automated system that tries to self-correct based on this data could easily make things worse.
Decision
The Maat system is designed for human-in-the-loop validation. The system's role is to detect, collect, and present potential issues, but the final analysis and judgment are the responsibility of a human reviewer.
Maat's primary output is not an automated action but a well-structured report (e.g., a GitHub Gist; see adr://artifact-based-reporting) that gives a human expert all the context they need to investigate the BS candidate. The goal of the system is to make this human review process as efficient and effective as possible.
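To make the shape of such a report concrete, here is a minimal Python sketch of what a context-rich BS candidate report might look like. The class name `BSReport` and its fields are illustrative assumptions, not part of Maat's actual interface:

```python
from dataclasses import dataclass, field


@dataclass
class BSReport:
    """One 'Bad Signal' candidate, packaged for human review (hypothetical shape)."""
    candidate_id: str   # unique identifier for this candidate
    source_module: str  # where in the system the signal was detected
    detected_at: str    # ISO-8601 timestamp of detection
    description: str    # what looked anomalous, in plain language
    context: dict = field(default_factory=dict)  # raw data the reviewer needs

    def to_markdown(self) -> str:
        """Render the report body that would be published, e.g. as a Gist."""
        lines = [
            f"# BS Candidate {self.candidate_id}",
            f"- Source: {self.source_module}",
            f"- Detected: {self.detected_at}",
            "",
            self.description,
        ]
        for key, value in sorted(self.context.items()):
            lines.append(f"- {key}: {value}")
        return "\n".join(lines)
```

The point of the structure is that everything the reviewer needs travels with the candidate, so investigation does not require re-querying the originating module.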
The feedback loop is closed when a human reviews the BS report, investigates the root cause, and then takes action, such as:
- Creating a bug ticket.
- Improving an AI prompt.
- Adjusting a configuration parameter.
- Acknowledging it as a false positive and dismissing it.
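The four outcomes above could be recorded as an explicit resolution type, so every reviewed candidate carries a machine-readable verdict when the loop closes. A minimal sketch; `Resolution` and `close_candidate` are hypothetical names, not part of Maat:

```python
from enum import Enum


class Resolution(Enum):
    """Possible outcomes of a human review of a BS candidate."""
    BUG_TICKET = "bug_ticket"            # root cause is a defect; ticket created
    PROMPT_IMPROVED = "prompt_improved"  # an AI prompt was adjusted
    CONFIG_ADJUSTED = "config_adjusted"  # a configuration parameter was changed
    FALSE_POSITIVE = "false_positive"    # dismissed; the signal was benign


def close_candidate(reviews: dict, candidate_id: str, resolution: Resolution) -> None:
    """Record the reviewer's verdict, closing the feedback loop for one candidate."""
    reviews[candidate_id] = resolution
```

Recording the verdict explicitly also makes it possible to measure, over time, what fraction of candidates were false positives.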
Consequences
Positive:
- Safety and Robustness: Prevents the system from taking incorrect, automated actions based on a flawed understanding of a "bad signal". It keeps a human with superior context and judgment in control.
- Drives Systemic Improvement: The goal of the system shifts from "auto-remediation" to "providing high-quality data for human analysis". This process of human review is what leads to deep, meaningful improvements in the underlying system logic, rather than just patching over symptoms.
- Handles Nuance: A human can easily handle the ambiguity and context that would be incredibly difficult to encode in an automated system.
Negative:
- Creates a Manual Bottleneck: This process is dependent on the availability of a human reviewer. A large influx of BS candidates could create a backlog of items waiting for review.
- Slower Feedback Loop: The time from detection to corrective action is longer than in a fully automated system, as it's gated by human review time.
Mitigation:
- Prioritization and Rich Reporting: The BS reports generated by Maat are designed to be rich with context, making the review process as fast as possible. We can add metadata to help prioritize which candidates are most critical to review first (e.g., based on the source module or the potential impact).
- AI-Assisted Review: While the final decision is human, we can use LLMs to perform a pre-analysis of the BS candidate. The report can include an AI-generated summary of the potential issue, its likely root cause, and suggested next steps, significantly accelerating the human's review process.
- Sampling: adr://sampling-based-collection ensures the volume of candidates remains manageable.
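The prioritization mitigation could be as simple as a scoring function over candidate metadata, so the review queue surfaces the most critical items first. A sketch under assumed names: the module weights, field names, and `prioritize` function are all hypothetical:

```python
# Hypothetical per-module weights; real values would be tuned from experience.
MODULE_WEIGHT = {"order-router": 3, "pricing": 2, "reporting": 1}


def priority(candidate: dict) -> int:
    """Higher score = review sooner. Combines source weight and estimated impact."""
    weight = MODULE_WEIGHT.get(candidate["source_module"], 1)
    return weight * candidate.get("impact", 1)


def prioritize(candidates: list[dict]) -> list[dict]:
    """Return candidates ordered most-critical-first for the human review queue."""
    return sorted(candidates, key=priority, reverse=True)
```

Because the score is derived purely from report metadata, it can be computed at collection time and shown in the report itself, with no extra work for the reviewer.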