Personal AI infrastructure: fleet provisioning for non-technical users

The problem

My parents want to use AI assistants. They will not configure them. They will not debug them. They will not read documentation. If something breaks, they will stop using it.

This is not a criticism. This is the design constraint.

The question is: can I build infrastructure where the operator provisions, governs, and verifies - and the user just talks?

What I built

halos is a personal AI infrastructure layer. A single Node.js process connects messaging channels (Telegram, WhatsApp, Slack, Discord) to Claude agents running in isolated Docker containers. Each agent has its own filesystem, memory, and conversation history.

The fork adds fleet management. HAL-prime (my instance) can spawn and maintain independent instances for other users. Each instance gets its own bot token, personality, and sandboxed environment.

~/code/nanoclaw/          HAL-prime (operator instance)
~/code/halfleet/
  microhal-ben/           Independent instance
  microhal-dad/           Independent instance (The Captain)
  microhal-mum/           Independent instance
  microhal-gains/         Independent instance
  microhal-money/         Independent instance

The provisioning pipeline

halctl create --name dad --personality dad

This command:

  1. Copies the prime source tree (excluding store/, memory/, groups/, .env)
  2. Composes CLAUDE.md from base template + personality blocks + user context
  3. Runs npm install && npm run build
  4. Locks governance files (chmod 444/555: CLAUDE.md, .claude/, src/, halos/)
  5. Registers the operator chat as the main group
  6. Generates ecosystem.config.cjs with hardcoded bot token and proxy ports
  7. Appends to FLEET.yaml
  8. Starts via pm2
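
Step 1's exclusion rule can be pictured as a small path filter. This is a sketch, not halctl's actual code: the helper name `shouldCopy` is hypothetical, and only the exclusion list comes from the step above.

```javascript
// Sketch of the step-1 copy filter: the prime source tree is copied,
// except per-instance state (store/, memory/, groups/) and secrets (.env).
const EXCLUDED = ['store/', 'memory/', 'groups/', '.env'];

function shouldCopy(relativePath) {
  return !EXCLUDED.some((ex) =>
    ex.endsWith('/')
      ? relativePath === ex.slice(0, -1) || relativePath.startsWith(ex)
      : relativePath === ex
  );
}

console.log(shouldCopy('src/index.ts'));    // true  - source files are copied
console.log(shouldCopy('memory/notes.md')); // false - per-instance state stays behind
console.log(shouldCopy('.env'));            // false - secrets never leave prime
```

The point of excluding state and secrets is that a fresh instance starts with the operator's code but none of the operator's data.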

The user never sees any of this. They open Telegram, find their bot, and start talking.

The personality engine

Non-technical users cannot write system prompts. But they have preferences - brevity, warmth, how strongly the assistant should express opinions.

The personality engine solves this with dimension profiles:

# templates/microhal/personalities/dad.yaml
name: dad
display_name: The Captain
dimensions:
  brevity: 7      # 1-10, higher = more concise
  warmth: 6       # 1-10, higher = warmer tone
  formality: 4    # 1-10, higher = more formal
  opinion_strength: 8  # 1-10, higher = stronger opinions
  humor: 5        # 1-10, higher = more humor
  
context: |
  The user is a retired 737 pilot. Former RAF navigator.
  He values precision, clarity, and directness.
  Technical explanations are welcome.
  Aviation metaphors will land.

The template renderer composes these dimensions into governance blocks that become part of CLAUDE.md. The user gets an assistant calibrated to their preferences without ever seeing a configuration file.
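
The composition step can be sketched roughly like this. The bucket thresholds and the exact wording of each rule are illustrative assumptions, not halos' real templates; only the dimension names and the dad profile come from the YAML above.

```javascript
// Sketch: map dimension scores (1-10) to governance lines for CLAUDE.md.
// Thresholds and phrasing are invented for illustration.
const RULES = {
  brevity: (v) => (v >= 7 ? 'Keep responses short and dense.' : 'Default response length.'),
  warmth: (v) => (v >= 7 ? 'Warm, personal tone.' : 'Moderately friendly tone.'),
  opinion_strength: (v) => (v >= 7 ? 'State opinions directly; do not hedge.' : 'Offer balanced views.'),
};

function renderGovernance(profile) {
  const lines = Object.entries(profile.dimensions)
    .filter(([dim]) => RULES[dim])
    .map(([dim, score]) => `- ${RULES[dim](score)}`);
  return [`## Personality: ${profile.display_name}`, ...lines, '', profile.context.trim()].join('\n');
}

const dad = {
  display_name: 'The Captain',
  dimensions: { brevity: 7, warmth: 6, opinion_strength: 8 },
  context: 'The user is a retired 737 pilot. Former RAF navigator.',
};
console.log(renderGovernance(dad));
```

Whatever the real rendering looks like, the design choice is the same: preferences live in a small numeric profile, and the prompt engineering is done once, in the template, by the operator.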

Isolation architecture

Fleet instances are isolated at multiple levels:

Container isolation. Each message spawns a Docker container. The container mounts the instance’s source tree, memory, and group directory. Containers cannot see each other or the host filesystem beyond their mounts.
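
A per-message container launch might assemble its mounts like this. The image name, mount targets, and read-only choice are assumptions for illustration, not the actual runner in src/container/.

```javascript
// Sketch: build a docker run argument list for one instance's container.
// Only the three mounted trees come from the text; everything else is assumed.
function containerArgs(instanceRoot) {
  return [
    'run', '--rm',
    '-v', `${instanceRoot}/src:/app/src:ro`,    // source tree, read-only
    '-v', `${instanceRoot}/memory:/app/memory`, // per-instance memory
    '-v', `${instanceRoot}/groups:/app/groups`, // conversation/group state
    'halos-agent:latest',                       // hypothetical image name
  ];
}

console.log(containerArgs('/home/mrkai/code/halfleet/microhal-dad/nanoclaw').join(' '));
```

Because the mounts are the only paths the container receives, two instances' containers have no shared filesystem surface at all.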

Credential isolation. Containers never see raw API tokens. They connect to a credential proxy (port 3001 for prime, unique ports for fleet) that injects authentication. If a container is compromised, it cannot exfiltrate credentials.
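
The injection idea reduces to one move: the container's outbound request carries no token, and the proxy adds it before forwarding. A minimal sketch, assuming a bearer-token header and an env var name that are purely illustrative:

```javascript
// Sketch of credential injection: the proxy rewrites headers on the way out.
// Header name and env var are assumptions; the token exists only on the host.
function injectAuth(headers, env = process.env) {
  return {
    ...headers,
    Authorization: `Bearer ${env.ANTHROPIC_API_KEY}`, // never present inside the container
  };
}

const out = injectAuth(
  { 'content-type': 'application/json' },   // what the container sent
  { ANTHROPIC_API_KEY: 'sk-host-only' }     // what only the proxy can read
);
console.log(out.Authorization);
```

The compromise scenario follows directly: an attacker inside the container can make requests through the proxy, but has nothing to exfiltrate, because the secret is added after the request leaves the sandbox.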

Governance lock. After provisioning, governance files are chmod 444 (read-only). The agent cannot modify its own CLAUDE.md, even if instructed to. The user cannot accidentally (or deliberately) alter the governance layer.

halctl push --all   # Push governance updates to all fleet instances

When I update the base template or fix a governance issue, one command propagates it to the entire fleet. Users don’t need to update anything.

The fleet manifest

# FLEET.yaml
instances:
- name: dad
  path: /home/mrkai/code/halfleet/microhal-dad/nanoclaw
  telegram_bot_token_env: MICROHAL_DAD_BOT_TOKEN
  personality: dad
  services:
  - gh
  created: '2026-03-17T20:37:53Z'
  status: active

halctl list renders this as an audit table with instance names, bot IDs, group counts, note counts, and status. One view of the entire fleet.
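
The rendering is straightforward once the manifest is parsed. A sketch with the manifest entries inlined as objects (field names follow FLEET.yaml above; the column set here is a subset of what halctl list actually shows):

```javascript
// Sketch: render fleet manifest entries as an aligned audit table.
function renderFleetTable(instances) {
  const header = ['NAME', 'PERSONALITY', 'STATUS'];
  const rows = instances.map((i) => [i.name, i.personality, i.status]);
  const widths = header.map((h, c) => Math.max(h.length, ...rows.map((r) => r[c].length)));
  const fmt = (row) => row.map((cell, c) => cell.padEnd(widths[c])).join('  ');
  return [fmt(header), ...rows.map(fmt)].join('\n');
}

console.log(renderFleetTable([
  { name: 'dad', personality: 'dad', status: 'active' },
  { name: 'gains', personality: 'gains', status: 'active' },
]));
```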

Lifecycle management

Instances have a lifecycle: active, frozen, folded, fried.

  • freeze - Stop the instance but preserve all data. User sees the bot as offline.
  • fold - Archive the instance. Data preserved, instance removed from active fleet.
  • fry - Delete everything. Requires --confirm flag.
  • reset - Restore a frozen instance to active.

halctl freeze dad     # Stop but preserve
halctl fold dad       # Archive
halctl fry dad --confirm  # Permanent deletion
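
The lifecycle is easy to express as a transition table. Which transitions halctl actually permits is an assumption here, apart from the two stated above: reset restores frozen to active, and fry is permanent.

```javascript
// Sketch of the instance lifecycle as a state machine. 'fried' is terminal.
const TRANSITIONS = {
  active: ['frozen', 'folded', 'fried'],
  frozen: ['active', 'folded', 'fried'], // reset: frozen -> active
  folded: ['fried'],                     // assumption: folded can only be fried
  fried: [],                             // nothing comes back from fry
};

function canTransition(from, to) {
  return (TRANSITIONS[from] || []).includes(to);
}

console.log(canTransition('frozen', 'active')); // true  - reset
console.log(canTransition('fried', 'active'));  // false - deletion is permanent
```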

What this enables

I provisioned 5 instances in 48 hours. Each one is:

  • Independently deployed (own bot token, own database, own memory)
  • Personality-calibrated (dimension profiles composed into governance)
  • Governance-locked (users cannot alter their own CLAUDE.md)
  • Operator-maintainable (push updates to all instances with one command)
  • Auditable (FLEET.yaml + halctl list = complete fleet state)

The users - my parents, friends testing domain-specific instances - just talk to a Telegram bot. They have no idea there’s a fleet management layer underneath.

Source

  • Architecture diagrams: docs/d1/architecture-diagrams.md
  • Fleet provisioning: halos/halctl/
  • Personality engine: templates/microhal/personalities/
  • Container runner: src/container/

This is part 1 of the halos series. Next: the assessment system - how to verify that an AI assistant behaves correctly before deploying it to users who cannot debug it.