# Bauxite: Architecture Overview

## System Components

### `bauxite-conduit` — The Data Plane Library

The foundational library containing the core logic for mesh connectivity. It is internally split into six subsystems that share a single facade struct, `BauxiteNode`:

| Subsystem | Responsibility |
| --- | --- |
| **Identity** | Loads or generates sovereign Ed25519 (signing) and X25519 (key exchange) keys. Per-session ephemeral X25519 keys provide Perfect Forward Secrecy (PFS). |
| **PeerRegistry** | Tracks virtual-IP → peer-ID mappings (`HashMap<u32, String>`), populated from Hub topology sync or local config. |
| **SessionManager** | Owns ICE agents, session keys, and per-peer crypto state (`SecureAgent` / `SecureChannel`). |
| **DataPlaneScheduler** | Three-lane outbound packet scheduler (Critical, Telemetry, Bulk) with RBAC sandbox checks. |
| **TUNController** | Bridges local traffic into the mesh via a TUN device (`bauxite0`, MTU 1350). Falls back to `tokio::io::duplex(65536)` when TUN creation fails. |
| **MetricsCollector** | Atomic counters for `tx_packets`, `tx_bytes`, and `tx_dropped`. |
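
As a rough illustration of the `PeerRegistry` row above, the following sketch shows a virtual-IP to peer-ID lookup backed by `HashMap<u32, String>`. The method names (`insert`, `resolve`) and the choice to key the map by the integer form of an `Ipv4Addr` are assumptions made for this example, not the crate's confirmed API.

```rust
use std::collections::HashMap;
use std::net::Ipv4Addr;

/// Hypothetical stand-in for the PeerRegistry subsystem.
#[derive(Default)]
pub struct PeerRegistry {
    /// Virtual IP (as a u32) -> peer ID, mirroring `HashMap<u32, String>`.
    peers: HashMap<u32, String>,
}

impl PeerRegistry {
    /// Record a mapping learned from Hub topology sync or local config.
    pub fn insert(&mut self, virtual_ip: Ipv4Addr, peer_id: impl Into<String>) {
        // Assumption: the key is the address converted to its integer form.
        self.peers.insert(u32::from(virtual_ip), peer_id.into());
    }

    /// Resolve the peer that owns a destination virtual IP, if any.
    pub fn resolve(&self, virtual_ip: Ipv4Addr) -> Option<&str> {
        self.peers.get(&u32::from(virtual_ip)).map(String::as_str)
    }
}
```

A lookup that returns `None` corresponds to a destination with no known peer, which the scheduler would have to treat as unroutable.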

Additional capabilities exposed by `BauxiteNode`:

- **HubConnector**: Registers the node with `bauxite-dispatch` over mTLS gRPC and syncs mesh topology.
- **PolicyEngine**: Firewalls outbound packets against allow-listed rules (populated by Citadel).
- **eBPF Offload** (`DataPlaneOffload` trait): Kernel-level packet classification when available; falls back to userspace.
- **State Snapshots**: Serializes `StateSnapshot` (peer routing table + X25519 public keys) using **postcard** for warm starts during OTA updates. Session keys are explicitly excluded to preserve PFS.

### `bauxite` — The Edge Binary

The executable that runs on edge devices. It:

- Interfaces with the control plane (`bauxite-dispatch`) for provisioning and signaling.
- Manages the local TUN interface and lifecycle of the data-plane node.
- Executes P2P hole punching via ICE (STUN/TURN).
- Provides a CLI for joining and managing the mesh (`bauxite join`, `bauxite run`).

### `bauxite-citadel` — The ROS 2 Discovery Extension

A library for integrating ROS 2 (Robot Operating System) with the Bauxite Mesh. It provides:

- **Discovery Server Management**: Spawns and manages the `fastdds` discovery server process, injecting `ROS_DISCOVERY_SERVER` and `ROS_DOMAIN_ID` (42) environment variables into the process tree.
- **Process Spawning Abstraction**: The `ProcessSpawner` trait allows full unit-testing with mock implementations without requiring FastDDS or Zenoh binaries.
- **Metrics Reporting**: Exports `ros_discovery_port` and `ros_discovery_status` gauges to the Hub via `ReportMetricsRequest`.
- **Zenoh Bridge**: Optionally spawns `zenoh-bridge-dds` (requires the `zenoh` feature flag) for scalable DDS bridging with metadata minification.

Citadel manages the external discovery server binary lifecycle; it does not perform network-level multicast tunneling.
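
The spawning abstraction described above could look roughly like the sketch below. The trait signature, the `MockSpawner` type, the `start_discovery_server` helper, and the `fastdds` arguments are all assumptions for illustration; only the trait name `ProcessSpawner`, the two environment variables, and the domain ID 42 come from this document. Check the argument list against the installed Fast DDS version before relying on it.

```rust
use std::collections::HashMap;
use std::io;

/// Hypothetical shape of the spawning abstraction; the real trait may differ.
pub trait ProcessSpawner {
    /// Launch `program` with `args` and `env`, returning a process ID.
    fn spawn(
        &mut self,
        program: &str,
        args: &[String],
        env: &HashMap<String, String>,
    ) -> io::Result<u32>;
}

/// Test double that records every spawn request instead of running a binary.
#[derive(Default)]
pub struct MockSpawner {
    pub calls: Vec<(String, Vec<String>, HashMap<String, String>)>,
}

impl ProcessSpawner for MockSpawner {
    fn spawn(
        &mut self,
        program: &str,
        args: &[String],
        env: &HashMap<String, String>,
    ) -> io::Result<u32> {
        self.calls.push((program.to_string(), args.to_vec(), env.clone()));
        // Fake PID: the 1-based index of this call.
        Ok(self.calls.len() as u32)
    }
}

/// Hypothetical helper: start the discovery server through any spawner.
pub fn start_discovery_server<S: ProcessSpawner>(
    spawner: &mut S,
    port: u16,
) -> io::Result<u32> {
    let mut env = HashMap::new();
    env.insert("ROS_DISCOVERY_SERVER".to_string(), format!("127.0.0.1:{port}"));
    env.insert("ROS_DOMAIN_ID".to_string(), "42".to_string());
    // Illustrative arguments only.
    let args = vec!["discovery".to_string(), "-p".to_string(), port.to_string()];
    spawner.spawn("fastdds", &args, &env)
}
```

Because the helper is generic over the trait, a unit test can pass a `MockSpawner` and assert on the recorded environment without FastDDS being installed.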

### `bauxite-forge` — Enterprise OTA Updates

Proprietary crate providing:

- **Resume-Safe Downloads**: Chunked binary transfers that checkpoint progress as `serde_json` so interrupted downloads can resume.
- **Ed25519 Package Verification**: Cryptographic signature verification of update packages before installation.
- **Warm Start Snapshots**: `StateSnapshot` (peer routing table + X25519 public keys) serialized with **postcard** for warm starts during OTA restarts. Session keys are excluded to preserve PFS.
- **Atomic Rollback**: Real binary backup via `std::env::current_exe()`, atomic staging path rename, and a `rollback.marker` file to prevent infinite reboot loops.
- **Heartbeat Watchdog**: Monitors telemetry check-ins; auto-triggers rollback if no heartbeat arrives within a configurable timeout.
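
The heartbeat watchdog in the list above can be pictured with a small timing sketch. The type and method names here are hypothetical, and passing the current time in explicitly is a choice made so the example is easy to test; the proprietary crate may be structured quite differently.

```rust
use std::time::{Duration, Instant};

/// Hypothetical watchdog: signals rollback when heartbeats stop arriving.
pub struct HeartbeatWatchdog {
    timeout: Duration,
    last_heartbeat: Instant,
}

impl HeartbeatWatchdog {
    /// Start the watchdog; `now` is treated as the first heartbeat.
    pub fn new(timeout: Duration, now: Instant) -> Self {
        Self { timeout, last_heartbeat: now }
    }

    /// Call whenever a telemetry check-in is observed.
    pub fn record_heartbeat(&mut self, now: Instant) {
        self.last_heartbeat = now;
    }

    /// True once the time since the last heartbeat exceeds the timeout.
    pub fn should_rollback(&self, now: Instant) -> bool {
        now.saturating_duration_since(self.last_heartbeat) > self.timeout
    }
}
```

A caller would poll `should_rollback(Instant::now())` after an update and, when it returns `true`, consult the `rollback.marker` file before restoring the backed-up binary.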

> **Note**: Post-quantum crypto (ML-KEM-768) in `bauxite-conduit/src/network/kyber.rs` is a **stub** — key generation uses `rand::thread_rng().fill_bytes()`. Hardware-bound identity ("Silicon Lock") is implemented in `bauxite-conduit/src/identity.rs` and `bauxite`'s `silicon-lock` feature.

### `bauxite-dispatch` — The Control Plane Hub

A self-hosted hub server providing:

- Node provisioning (issuing node certificates after join-token verification).
- ICE candidate signaling relay for P2P negotiation.
- Real-time mesh topology state sync.
- Audit logging, policy distribution, and admin UI.

---

## Data Flow

1. **Ingress**: Traffic enters through the `bauxite0` TUN interface (or duplex pipe fallback).
2. **Prioritization**: The `DataPlaneScheduler` inspects each packet and assigns it to one of three lanes:
    - **Critical** (`Priority::Critical`): Low-latency control packets via a bounded `VecDeque` (capacity 1024). On overflow, the oldest packet is evicted (`pop_front()`) to guarantee **data freshness**, so the newest packets always take priority. ICMP (protocol 1) and small packets (< 100 bytes) are automatically routed here.
    - **Telemetry** (`Priority::Telemetry`): Medium-priority state data via a `HeapRb` ring buffer (2048 slots). Traffic on port 5006 is routed here.
    - **Bulk** (`Priority::Bulk`): High-bandwidth data (video, logs, OTA chunks) via a larger `HeapRb` ring buffer (4096 slots).
3. **Routing**: The scheduler resolves the destination peer via the `PeerRegistry`.
4. **Encryption**: Packets are encrypted using the session's AEAD key (ChaCha20-Poly1305 default; AES-256-GCM in FIPS mode). Every outbound packet is tagged with `0x80` to multiplex mesh traffic and STUN on a single UDP port.
5. **Transport**: Encrypted packets are sent via UDP using the ICE agent (direct P2P tunnel) or, if P2P is unavailable, through an authenticated relay or the Hub gRPC data channel.
6. **Egress**: The receiving peer decrypts and writes the packet to its local TUN interface.
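
The prioritization step can be sketched as follows. This is a simplified illustration under several assumptions: packets are raw IPv4, "port 5006" means the TCP or UDP destination port, and the size check runs before the port check. The function and type names other than `Priority` are hypothetical, and the `HeapRb` lanes are omitted.

```rust
use std::collections::VecDeque;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Priority {
    Critical,
    Telemetry,
    Bulk,
}

const CRITICAL_CAPACITY: usize = 1024;
const SMALL_PACKET_BYTES: usize = 100;
const TELEMETRY_PORT: u16 = 5006;

/// Classify a raw IPv4 packet into a lane (assumed rule ordering).
pub fn classify(packet: &[u8]) -> Priority {
    if packet.len() < SMALL_PACKET_BYTES {
        return Priority::Critical;
    }
    // From here on the packet has at least 100 bytes, so the fixed
    // 20-byte IPv4 header fields can be indexed safely.
    if packet[0] >> 4 != 4 {
        return Priority::Bulk; // not IPv4: no further inspection in this sketch
    }
    let ihl = usize::from(packet[0] & 0x0f) * 4; // header length in bytes
    let protocol = packet[9];
    if protocol == 1 {
        return Priority::Critical; // ICMP
    }
    // TCP (6) and UDP (17) both store the destination port at offset 2..4.
    if (protocol == 6 || protocol == 17) && ihl >= 20 && packet.len() >= ihl + 4 {
        let dst_port = u16::from_be_bytes([packet[ihl + 2], packet[ihl + 3]]);
        if dst_port == TELEMETRY_PORT {
            return Priority::Telemetry;
        }
    }
    Priority::Bulk
}

/// Bounded Critical lane that evicts the oldest packet on overflow.
pub struct CriticalLane {
    queue: VecDeque<Vec<u8>>,
}

impl CriticalLane {
    pub fn new() -> Self {
        Self { queue: VecDeque::with_capacity(CRITICAL_CAPACITY) }
    }

    /// Enqueue a packet; returns the evicted (oldest) packet when full.
    pub fn push(&mut self, packet: Vec<u8>) -> Option<Vec<u8>> {
        let evicted = if self.queue.len() >= CRITICAL_CAPACITY {
            self.queue.pop_front()
        } else {
            None
        };
        self.queue.push_back(packet);
        evicted
    }

    pub fn len(&self) -> usize {
        self.queue.len()
    }
}
```

Evicting from the front rather than rejecting the new packet is what gives the lane its freshness guarantee: a stale control message is less useful than the one that just arrived.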

### Path Selection Logic

The scheduler uses the following path-selection order:

| Rank | Path | When Used |
| --- | --- | --- |
| 1 | **Direct P2P Tunnel** | An operational ICE session exists |
| 2 | **Relay / Hub Channel** | Direct P2P is blocked; traffic falls back to an authenticated TURN relay or the Hub gRPC data channel |
| 3 | **Drop** | No valid transit paths exist; packet counted in `tx_dropped` |
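
The ranking in the table above amounts to a short decision function. The sketch below is an assumption-laden illustration: the `Path` enum, the `Metrics` struct, and the boolean inputs are hypothetical names, and only the counter names `tx_packets`, `tx_bytes`, and `tx_dropped` come from this document.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Path {
    DirectP2p,
    Relay,
}

/// Hypothetical counters mirroring the MetricsCollector fields.
#[derive(Default)]
pub struct Metrics {
    pub tx_packets: AtomicU64,
    pub tx_bytes: AtomicU64,
    pub tx_dropped: AtomicU64,
}

/// Pick a path in rank order; `None` means the packet is dropped.
pub fn select_path(
    ice_ready: bool,
    relay_available: bool,
    packet_len: usize,
    metrics: &Metrics,
) -> Option<Path> {
    let path = if ice_ready {
        Some(Path::DirectP2p) // rank 1
    } else if relay_available {
        Some(Path::Relay) // rank 2
    } else {
        None // rank 3: drop
    };
    match path {
        Some(_) => {
            metrics.tx_packets.fetch_add(1, Ordering::Relaxed);
            metrics.tx_bytes.fetch_add(packet_len as u64, Ordering::Relaxed);
        }
        None => {
            metrics.tx_dropped.fetch_add(1, Ordering::Relaxed);
        }
    }
    path
}
```

`Ordering::Relaxed` is enough here because each counter is an independent statistic and nothing synchronizes on its value.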

The `CongestionStrategy` trait exists for future congestion-aware scheduling, but currently only the no-op `NullCongestionStrategy` is implemented.

---

## ICE & NAT Traversal

Powered by `webrtc-ice` and `webrtc-util`, Bauxite establishes direct P2P connections even behind restrictive NATs.

- **STUN**: Used for public IP discovery (default: `stun.l.google.com:19302`).
- **TURN**: Used as a fallback relay when direct P2P is impossible.
- **Signaling**: The `bauxite` gRPC signaling client exchanges ICE candidates and per-session ephemeral X25519 keys over `SignalingServiceClient::exchange_candidates`.
- **Active-Active Bonding**: Each packet carries a monotonically increasing sequence ID prepended to the encrypted payload. The receiver's ICE agent tracks `last_seen_seq` and discards duplicate or stale packets to deduplicate bonded paths.
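
On the receive side, the `0x80` tag and the sequence ID work together roughly as sketched below. The wire layout used here (one tag byte, then an 8-byte big-endian sequence ID, then ciphertext) is an assumption for illustration, as are the names `demux`, `Inbound`, and `SeqFilter`. The STUN check relies on the fact that the two most significant bits of a STUN message are always zero, which is why a first byte of `0x80` cannot be mistaken for STUN.

```rust
pub const MESH_TAG: u8 = 0x80;

#[derive(Debug, PartialEq, Eq)]
pub enum Inbound<'a> {
    Stun(&'a [u8]),
    Mesh { seq: u64, ciphertext: &'a [u8] },
    Unknown,
}

/// Split a datagram received on the shared UDP port (assumed layout).
pub fn demux(datagram: &[u8]) -> Inbound<'_> {
    match datagram.first() {
        Some(&MESH_TAG) if datagram.len() >= 9 => {
            let mut seq = [0u8; 8];
            seq.copy_from_slice(&datagram[1..9]);
            Inbound::Mesh {
                seq: u64::from_be_bytes(seq),
                ciphertext: &datagram[9..],
            }
        }
        // STUN messages start with two zero bits.
        Some(b) if (*b & 0xc0) == 0 => Inbound::Stun(datagram),
        _ => Inbound::Unknown,
    }
}

/// Hypothetical duplicate filter for bonded paths.
#[derive(Default)]
pub struct SeqFilter {
    last_seen_seq: Option<u64>,
}

impl SeqFilter {
    /// Accept only packets newer than the last accepted sequence ID.
    pub fn accept(&mut self, seq: u64) -> bool {
        match self.last_seen_seq {
            Some(last) if seq <= last => false, // duplicate or stale
            _ => {
                self.last_seen_seq = Some(seq);
                true
            }
        }
    }
}
```

A strict last-seen filter like this one also discards packets that arrive out of order across the bonded links; a sliding window would tolerate reordering at the cost of more state.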

---

## bauxite-dispatch (Control Plane) Integration

The data plane communicates with `bauxite-dispatch` via gRPC for:

- **Provisioning**: Joining the mesh and receiving sovereign credentials (`ca.crt`, `node.crt`, `node.key`) plus an assigned `virtual_ip`.
- **Signaling**: Exchanging ICE candidates and ephemeral X25519 keys for P2P connection establishment.
- **State Sync**: Maintaining a real-time map of mesh topology and peer reachability.

# End-to-End System Lifecycles

## Node Provisioning & Boot Flow

```mermaid
sequenceDiagram
    autonumber
    actor User as User CLI
    participant Agent as bauxite
    participant Hub as Hub ProvisioningService
    participant Disk as Local Storage (/root/.bauxite/)

    Note over User, Hub: Phase 1: Mesh Provisioning
    User->>Agent: bauxite join <token> <hub_url>
    Agent->>Hub: ProvisionNode(ProvisionRequest)
    Note over Hub: Verify join token &<br/>generate cryptographic keys
    Hub-->>Agent: ProvisionResponse (ca.crt, node.crt, node.key, virtual_ip)
    Agent->>Disk: Write assets to /certs/
    Agent->>Disk: Generate config.toml & static PSK
    Note over Agent, Disk: Node Identity Provisioned

    Note over User, Hub: Phase 2: Daemon Initialization
    User->>Agent: bauxite run
    Agent->>Disk: Load and validate certificates
    Agent->>Hub: Establish mTLS gRPC Channel
    activate Hub
    Agent->>Agent: Spawn background workers<br/>(Signaling, Policy Sync, IPC Bus)
    Agent->>Agent: Initialize BauxiteNode Data Plane & TUN
    deactivate Hub
```

## Peer Connection Hookup (Direct P2P Negotiation)

```mermaid
sequenceDiagram
    autonumber
    actor App as App / User
    participant AgentA as Peer A (bauxite)
    participant Hub as Hub Signaling Relay
    participant AgentB as Peer B (bauxite)

    App->>AgentA: IPC Command: "connect:<target_peer_id>"
    AgentA->>AgentA: Evaluate live active ICE agent records

    alt Peer ID A < Peer ID B (Initiator)
        AgentA->>Hub: ExchangeCandidates(INVITE + ephemeral X25519)
        Hub->>AgentB: Forward INVITE
        Note over AgentB: Awaits incoming handshake
        AgentB-->>Hub: ExchangeCandidates(OFFER + ANSWER + ephemeral X25519)
        Hub-->>AgentA: Forward OFFER + ANSWER
    else Peer ID A > Peer ID B (Responder)
        Note over AgentA: Halts and awaits remote INVITE
        AgentB->>Hub: ExchangeCandidates(INVITE)
        Hub->>AgentA: Forward INVITE
        AgentA-->>Hub: ExchangeCandidates(OFFER + ANSWER + ephemeral X25519)
        Hub-->>AgentB: Forward OFFER + ANSWER
    end

    Note over AgentA, AgentB: Trickle ICE Candidate Exchange via Hub Relay
    AgentA->>AgentB: STUN Hole Punching directly between peers
    AgentB->>AgentA: Direct STUN Connectivity Established
    
    Note over AgentA, AgentB: Execute DH with ephemeral X25519 keys (PFS)
    AgentA->>AgentB: Establish P2P ICE Data Channel
    Note over AgentA, AgentB: Direct Mesh Routing Active 
```
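
The `alt` branches in the diagram above use a peer-ID comparison as a tie-break so that only one side sends the INVITE. A minimal sketch of that rule follows, assuming peer IDs are compared as plain strings in byte order; the `Role` enum and `negotiate_role` function are hypothetical names.

```rust
use std::cmp::Ordering;

#[derive(Debug, PartialEq, Eq)]
pub enum Role {
    Initiator,
    Responder,
}

/// Decide which side sends the INVITE; `None` if both IDs are identical.
pub fn negotiate_role(local_id: &str, remote_id: &str) -> Option<Role> {
    match local_id.cmp(remote_id) {
        Ordering::Less => Some(Role::Initiator),
        Ordering::Greater => Some(Role::Responder),
        Ordering::Equal => None, // connecting to ourselves is not meaningful
    }
}
```

Since both peers evaluate the same comparison with the arguments swapped, they always reach complementary roles without an extra round trip through the Hub.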

## Egress Packet Flight Path

```mermaid
sequenceDiagram
    autonumber
    participant App as ROS Node / Application
    participant Dev as TUNController (Virtual Interface)
    participant Scheduler as DataPlaneScheduler
    participant SessionMgr as SessionManager
    participant Net as Physical Link (Wi-Fi/LTE/5G)

    App->>Dev: Write outbound packet payload
    Dev->>Scheduler: Enqueue into appropriate priority lane
    
    Note over Scheduler, SessionMgr: Evaluate priority lane (Critical, Telemetry, or Bulk)<br/>Resolve destination peer via PeerRegistry

    alt P2P Connection Open
        Note over SessionMgr: Target path: Direct ICE Tunnel
        Scheduler->>SessionMgr: Look up session encryptor
        SessionMgr-->>Scheduler: AEAD encrypt packet
        Scheduler->>Net: Transmit over ICE UDP link (tag 0x80)
    else Relay Fallback Only
        Note over SessionMgr: Target path: Mesh Topology Relay or Hub gRPC
        SessionMgr-->>Scheduler: Encrypt via relay session profile
        Scheduler->>Net: Stream through authenticated relay
    end

    alt No Valid Path
        Scheduler->>Scheduler: Increment tx_dropped counter
    end
```