Skip to main content

39. Manual Restart Control for Experiments

Status: Accepted Date: 2025-07-06

Context

In the Mercury experimental framework, configuration files can be modified directly in the source code, sometimes by an AI agent. If these changes were to be applied automatically to a running instance (e.g., via hot-reloading), it could lead to unexpected behavior, unstable states, or disruptive changes without any human oversight. This is particularly risky in a system that influences trading decisions.

Decision

We will enforce a policy of Manual Restart Control. Changes made to any code-based configuration files for the experimental framework will not be automatically hot-reloaded or applied to a running application instance. A deliberate, manual restart of the instance by a human operator is the only way to load and apply new configurations. This acts as a critical safety gate, ensuring that a human reviews and explicitly approves any changes before they can affect the system's behavior.

Consequences

Positive:

  • Enhanced Safety: Prevents automated or unintended changes from immediately impacting the system, giving operators a chance to review and validate them.
  • User in Control: The human operator has the final say on when a new experimental configuration goes live.
  • Predictability: The behavior of a running instance is predictable and will not change unexpectedly until it is deliberately restarted.

Negative:

  • Slower Iteration Cycle: The need for a manual restart introduces friction and slows down the process of iterating on experimental configurations.
  • Requires Manual Intervention: This approach is inherently manual and requires an operator to be present to apply changes.

Mitigation:

  • Intentional Friction: The slower iteration cycle is a deliberate and accepted trade-off for safety and control in this sensitive context. The friction is a feature, not a bug.
  • Clear Process Documentation: The workflow for making and applying configuration changes (i.e., edit file, commit, get review, then manually restart the instance) will be clearly documented for all developers and operators.
  • Staging Environments: For more rapid iteration during development, developers can use local or staging environments where they have the freedom to restart instances quickly.