63. Hardware and Credential Isolation for Instances
Status: Accepted
Date: 2025-07-06
Context
Within our three-tier instance architecture (adr://instance-purpose-clarity), one instance type (W) trades with real money and two (R, ABH) do not. The single greatest operational risk is that a non-live instance (e.g., the R demo instance) is accidentally configured with live API keys and places unintended real-money trades. We need a policy that rules this out by design rather than by convention.
Decision
We will enforce a policy of Strict Credential and Hardware Isolation.
- Credential Segregation: Live, real-money API keys for the exchange must never be present in the configuration files, environment variables, or secrets management scope of the R (Demo) or ABH (Intelligence) instances.
- Physical/Logical Isolation: The W (Live) instance(s) should be run on physically or logically separate hardware (e.g., a different server, a different Kubernetes namespace with stricter network policies) from the non-live instances.
- Principle of Least Privilege: The IAM roles or user accounts associated with the R and ABH instances must not have permission even to read the secrets (e.g., in HashiCorp Vault or AWS Secrets Manager) that contain the live API keys.
A configuration error should never be able to cause a non-live instance to access live keys. It must be impossible by design at the infrastructure level.
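The least-privilege rule above lends itself to an automated check. The following is a minimal sketch, not our actual tooling: it assumes read permissions can be exported as (instance, path prefix) pairs, and the `LIVE_SECRET_PREFIX` value, the `ReadGrant` type, and the function names are hypothetical, chosen only for illustration. It flags any grant that would let R or ABH read a path under the live prefix.

```python
from dataclasses import dataclass

# Hypothetical values for illustration; the ADR does not specify secret paths.
LIVE_SECRET_PREFIX = "secret/trading/live/"
NON_LIVE_INSTANCES = frozenset({"R", "ABH"})


@dataclass(frozen=True)
class ReadGrant:
    """One read permission: `instance` may read every secret under `path_prefix`."""
    instance: str
    path_prefix: str


def exposes(grant_prefix: str, protected_prefix: str) -> bool:
    """Return True if a grant on grant_prefix covers any path under protected_prefix.

    Two prefixes share at least one path exactly when one is a prefix of the other.
    """
    return (protected_prefix.startswith(grant_prefix)
            or grant_prefix.startswith(protected_prefix))


def find_violations(grants):
    """List the grants that would let a non-live instance read live secrets."""
    return [
        g for g in grants
        if g.instance in NON_LIVE_INSTANCES
        and exposes(g.path_prefix, LIVE_SECRET_PREFIX)
    ]


if __name__ == "__main__":
    example_grants = [
        ReadGrant("W", "secret/trading/live/"),
        ReadGrant("R", "secret/trading/demo/"),
        ReadGrant("ABH", "secret/trading/"),  # too broad: also covers live/
    ]
    for grant in find_violations(example_grants):
        print(f"VIOLATION: {grant.instance} can read {grant.path_prefix}")
```

A check like this is deliberately conservative: a grant on a parent path is reported even if no live secret has been written there yet. It complements, and does not replace, the access controls enforced by the secrets manager itself.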
Consequences
Positive:
- Maximum Safety: Accidental live trading from a non-live instance is blocked at the infrastructure level, so safety does not depend on every configuration file being correct. This is a critical safety feature.
- Clear Security Boundary: Creates a clear, unambiguous security boundary between live and non-live environments, which is easy for developers, operators, and auditors to understand.
- Reduced Blast Radius: A security compromise of a non-live instance (which might have less stringent security) cannot lead to a compromise of live trading credentials.
Negative:
- Increased Infrastructure Complexity and Cost: Maintaining separate hardware or strictly segregated virtual environments adds to the cost and complexity of the infrastructure.
- Stricter Deployment Procedures: Deployments and configuration management become more complex, as we have to manage different sets of credentials and target environments for each instance type.
Mitigation:
- Infrastructure as Code (IaC): The entire environment segregation policy will be defined and enforced through our Ansible playbooks and Docker/Kubernetes configurations. This makes the setup automated, repeatable, and auditable.
- Secrets Management: We will use a dedicated secrets management tool (like HashiCorp Vault) with strict access control policies to manage credentials. The application code will be identical on all instances; the instance's identity and associated permissions will determine which secrets it is allowed to retrieve.
- Cost-Benefit Analysis: We accept the additional cost and complexity as a non-negotiable investment in operating a real-money trading system safely. An incident caused by a lack of isolation would cost far more.
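The Secrets Management mitigation above relies on the secrets store, not the application, deciding who may read a credential. The sketch below illustrates that idea with an in-memory stand-in for a real tool such as HashiCorp Vault; it does not use any real client library. The class and function names, the secret paths, and the assumption that only R reads the demo key are all hypothetical.

```python
class AccessDenied(Exception):
    """Raised when an identity requests a secret it is not allowed to read."""


class InMemorySecretsStore:
    """Toy stand-in for a secrets manager that enforces access control itself."""

    def __init__(self, secrets, readable_by):
        self._secrets = dict(secrets)
        # Maps each secret path to the set of identities allowed to read it.
        self._readable_by = {
            path: frozenset(ids) for path, ids in readable_by.items()
        }

    def read(self, identity, path):
        # The permission check lives in the store, outside application control.
        if identity not in self._readable_by.get(path, frozenset()):
            raise AccessDenied(f"{identity} may not read {path}")
        return self._secrets[path]


def load_exchange_key(store, identity, configured_path):
    """Identical on every instance; the store decides whether the read succeeds."""
    return store.read(identity, configured_path)


def build_example_store():
    """Example data only; the paths and key values are placeholders."""
    return InMemorySecretsStore(
        secrets={
            "live/exchange-api-key": "LIVE-PLACEHOLDER",
            "demo/exchange-api-key": "DEMO-PLACEHOLDER",
        },
        readable_by={
            "live/exchange-api-key": {"W"},
            "demo/exchange-api-key": {"R"},  # assumption: only R uses the demo key
        },
    )


if __name__ == "__main__":
    store = build_example_store()
    try:
        # Simulates R being misconfigured with the live secret path.
        load_exchange_key(store, "R", "live/exchange-api-key")
    except AccessDenied as err:
        print(f"Blocked as intended: {err}")
```

In this model, pointing the R instance at the live path by mistake produces an access error instead of a usable key, which is the behavior the decision requires of the real infrastructure.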