Air-Gapped AI: Deploying LLMs in Disconnected Environments

Some environments cannot tolerate any network connection to the outside world. Classified government facilities, critical infrastructure control systems, secure manufacturing floors, and organizations bound by the strictest compliance regimes all operate in air-gapped networks where no packet crosses the security boundary. Deploying large language models in these environments is possible, but it demands a fundamentally different approach to every aspect of the lifecycle, from initial setup to ongoing operations. This guide covers the architecture, procedures, and operational patterns required to make LLMs functional in fully disconnected environments.

Why Air-Gapped Deployment

Organizations choose air-gapped deployment for several reasons, each imposing distinct constraints:

Classified and National Security

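Government and defense organizations handling information classified at Secret or Top Secret, including Sensitive Compartmented Information (SCI), must process it exclusively on authorized systems with no direct connectivity to unclassified networks. These systems fall under national security system policy such as ICD 503 and CNSSI 1253; NIST SP 800-171 and CMMC, which are often cited alongside them, cover Controlled Unclassified Information rather than classified data. There is no exception for AI workloads. If the LLM will process classified inputs or operate in a classified environment, it must run entirely inside that isolated boundary.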

Regulatory Compliance

Certain financial regulations, healthcare data rules (HIPAA in its strictest interpretations), and data sovereignty laws in some jurisdictions effectively require air-gapped processing for specific data categories. While not always a hard mandate, the compliance burden of proving that connected systems adequately protect the data sometimes exceeds the engineering cost of simply disconnecting.

Critical Infrastructure Protection

Industrial control systems (ICS), SCADA networks, power grid operations, and water treatment facilities increasingly operate in segmented or fully air-gapped environments following standards like IEC 62443 and NERC CIP. AI-assisted monitoring, anomaly detection, and operational support in these environments must run within the security boundary.

Intellectual Property Protection

Some commercial organizations, particularly in semiconductor design, pharmaceutical research, and advanced manufacturing, choose air-gapped environments for their most sensitive intellectual property. The threat model includes not just opportunistic external attackers but also sophisticated nation-state actors and insider threats.

Architecture for Disconnected Deployment

An air-gapped LLM deployment must be entirely self-contained. Every dependency, from the operating system to the model weights to the Python packages, must be present within the isolated network before the system can function.

Core Components

GPU server(s): Pre-configured with operating system, GPU drivers, CUDA toolkit, and all required libraries before being placed in the air-gapped environment
Model weights: Transferred via approved media (encrypted hard drives, optical media) through the facility's data transfer process
Model serving framework: vLLM, TGI (Hugging Face Text Generation Inference), or equivalent, installed with all dependencies from a pre-built container image or offline package repository
RAG infrastructure (if applicable): Vector database, embedding model, and document ingestion pipeline, all running locally
Local package mirror: An internal mirror of required package repositories (PyPI, apt/yum, container registry) for any software installation or update needs
Monitoring and logging: All observability infrastructure must run inside the air gap. No external monitoring services.

Network Architecture

Even within an air-gapped environment, internal network architecture matters. Segment the LLM infrastructure into zones:

Inference zone: GPU servers and model serving endpoints. Minimal attack surface, no unnecessary services.
Application zone: API gateway, application servers, and user-facing interfaces. Communicates with the inference zone over a dedicated VLAN.
Data zone: Vector databases, document stores, and knowledge bases. Accessed by the inference zone for RAG retrieval.
Management zone: Monitoring, logging, configuration management. Accessible to administrators but not to general users.
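
One way to keep the zone model honest is to write the permitted flows down as data and check observed or proposed connections against it. The sketch below is illustrative only: the application-to-inference and inference-to-data flows come from the zone descriptions above, while the management flows and the function names are assumptions to adapt to your own policy.

```python
# Allowed (source zone, destination zone) flows. The first two pairs follow the
# zone descriptions in the text; the management pairs are an assumption.
ALLOWED_FLOWS = {
    ("application", "inference"),
    ("inference", "data"),
    ("management", "inference"),
    ("management", "application"),
    ("management", "data"),
}


def is_flow_allowed(source_zone: str, destination_zone: str) -> bool:
    """Return True when traffic from source to destination is permitted."""
    return (source_zone, destination_zone) in ALLOWED_FLOWS


def find_violations(observed_flows):
    """Return the observed (source, destination) pairs that policy forbids."""
    return [flow for flow in observed_flows if not is_flow_allowed(*flow)]


if __name__ == "__main__":
    observed = [("application", "inference"), ("application", "data")]
    print(find_violations(observed))
```

A default-deny set like this mirrors how the firewall or VLAN rules between zones would normally be written: anything not listed is refused.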

Model Transfer Procedures

Getting model weights into an air-gapped environment is a logistics challenge. A 70B parameter model in FP16 is approximately 140 GB (70 billion parameters at 2 bytes each). Depending on how many quantized variants, supporting files, and serving framework images are included, a complete deployment package typically ranges from 50 GB to 500 GB.
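
The sizing arithmetic can be sketched in a few lines for planning purposes. This is a rough estimate under stated assumptions: the bytes-per-parameter table ignores the scale and zero-point metadata that quantized formats carry, and the function names are illustrative.

```python
import math

# Approximate bytes per parameter. Quantized formats also store scales and
# zero points, which this estimate ignores (assumption).
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weights_size_gb(num_params: float, dtype: str) -> float:
    """Return the approximate raw weight size in decimal gigabytes."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


def media_units_needed(total_gb: float, unit_capacity_gb: float) -> int:
    """Return how many discs or drives of the given capacity the payload needs."""
    return math.ceil(total_gb / unit_capacity_gb)


if __name__ == "__main__":
    size = weights_size_gb(70e9, "fp16")
    print(f"70B FP16 weights: about {size:.0f} GB")
    print(f"100 GB media units needed: {media_units_needed(size, 100.0)}")
```

Real packages run larger than the raw weight size because tokenizer files, container images, and checksums travel with the weights, so leave headroom when ordering media.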

Transfer Media

Encrypted removable drives: The most common approach. Use FIPS 140-2 (or 140-3) validated encrypted drives. Prepare the drive on a clean, trusted system on the unclassified side, transfer through the facility's data transfer officer or security review process, and import on the air-gapped side.
Data diodes: Hardware-enforced one-way data transfer devices that physically prevent any data from leaving the secure network. Data diodes allow streaming data into the air-gapped environment while guaranteeing no exfiltration. Used in defense and intelligence community environments.
Optical media: For smaller transfers or environments with the strictest policies. Write-once Blu-ray discs (up to 100 GB per disc in the triple-layer BDXL format) cannot be altered after burning, which makes them a tamper-resistant transfer medium.

Transfer Validation

Every file transferred into the air gap must be validated:

Cryptographic hash verification: Generate SHA-256 hashes of all files before transfer. Verify hashes after import to confirm integrity.
Malware scanning: Scan all transfer media with up-to-date antivirus on the air-gapped side (which requires its own signature update process).
Content review: For classified environments, a security review officer may need to approve the content of model weights and configuration files before they are admitted to the secure network.
Chain of custody: Maintain a documented chain of custody for all transfer media from creation through import and destruction.
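
The hash verification step can be sketched with the Python standard library alone, which matters because the import-side workstation may have nothing else installed. This is a minimal sketch, not a facility-approved tool: the function names are illustrative, and the manifest is assumed to be stored outside the directory it describes and carried through the same custody process as the media.

```python
import hashlib
from pathlib import Path

CHUNK_SIZE = 1024 * 1024  # read in 1 MiB chunks so large weight files never load fully


def sha256_file(path: Path) -> str:
    """Return the hex SHA-256 digest of a single file."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(CHUNK_SIZE), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        path.relative_to(root).as_posix(): sha256_file(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def verify_manifest(root: Path, manifest: dict) -> list:
    """Return a list of problems; an empty list means the import matches."""
    problems = []
    for relative_path, expected in manifest.items():
        target = root / relative_path
        if not target.is_file():
            problems.append(f"missing: {relative_path}")
        elif sha256_file(target) != expected:
            problems.append(f"hash mismatch: {relative_path}")
    # Files that were never listed are also suspicious on import.
    unexpected = set(build_manifest(root)) - set(manifest)
    problems.extend(f"unexpected: {name}" for name in sorted(unexpected))
    return problems
```

Run build_manifest on the connected side before the media is sealed and verify_manifest after import. Reporting unexpected files, not just mismatches, catches anything added to the media in transit.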

Dependency Management Without Internet

Modern software assumes internet connectivity. Package managers download from remote repositories. Container images pull from registries. Even build tools phone home. In an air-gapped environment, you must eliminate every external dependency before deployment.

Container-Based Deployment

The most reliable approach is to package the entire serving stack as container images on the connected side and transfer the images into the air gap. This eliminates runtime dependency resolution entirely.

Build a Docker or Podman image containing the serving framework (vLLM, TGI), all Python dependencies, CUDA runtime libraries, and configuration
Export the image as a tarball (docker save) and transfer via approved media
Load the image on the air-gapped side (docker load) and run
Model weights can be mounted as a volume from local storage rather than baked into the image to keep image sizes manageable
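
The export, load, and run steps above can be scripted so that both sides of the transfer use identical commands. The sketch below only builds and runs standard Docker CLI invocations. The image name, tarball path, and /models mount point are hypothetical, and the two offline environment variables assume the serving stack relies on the Hugging Face libraries, which honor them.

```python
import subprocess


def save_command(image: str, tarball: str) -> list:
    """Connected side: export an image to a tarball for transfer."""
    return ["docker", "save", "--output", tarball, image]


def load_command(tarball: str) -> list:
    """Air-gapped side: import the image from the transferred tarball."""
    return ["docker", "load", "--input", tarball]


def run_command(image: str, weights_dir: str) -> list:
    """Air-gapped side: run the server with weights mounted read-only."""
    return [
        "docker", "run", "--rm",
        "--gpus", "all",  # requires the NVIDIA container toolkit on the host
        "--volume", f"{weights_dir}:/models:ro",  # /models is a hypothetical mount point
        # Stop Hugging Face libraries from attempting downloads (assumes the
        # stack uses them).
        "--env", "HF_HUB_OFFLINE=1",
        "--env", "TRANSFORMERS_OFFLINE=1",
        image,
    ]


def execute(command: list) -> None:
    """Run a command and raise if it exits with a non-zero status."""
    subprocess.run(command, check=True)


if __name__ == "__main__":
    # Hypothetical names; print rather than execute so the sketch is safe to run.
    print(" ".join(save_command("llm-serving:1.0", "llm-serving-1.0.tar")))
    print(" ".join(load_command("llm-serving-1.0.tar")))
    print(" ".join(run_command("llm-serving:1.0", "/srv/models")))
```

Podman accepts the same save, load, and run subcommands, so swapping the executable name is usually the only change needed there, although GPU passthrough flags differ.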

Offline Package Mirror

For environments that need to install or update software within the air gap, maintain an internal package mirror:

PyPI mirror: Use tools like devpi or bandersnatch to create a local PyPI mirror containing the required packages. Transfer the mirror snapshot via approved media.
OS packages: Mirror the required apt or yum repositories. Only mirror packages actually needed to minimize the transfer size and security review burden.
Container registry: Run a local Harbor or Docker Registry instance within the air gap for storing and distributing container images.
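
Once a PyPI mirror exists, every host inside the air gap needs to be pointed at it instead of the public index. A minimal sketch of generating that pip configuration follows; the hostname is hypothetical, and the exact index path depends on whether the mirror is served by devpi, bandersnatch, or another tool.

```python
import configparser
import io
from typing import Optional


def render_pip_conf(index_url: str, trusted_host: Optional[str] = None) -> str:
    """Render pip configuration text that points pip at an internal index."""
    config = configparser.ConfigParser()
    config["global"] = {"index-url": index_url}
    if trusted_host is not None:
        # Only needed when the mirror's certificate is not trusted by pip.
        config["global"]["trusted-host"] = trusted_host
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical internal hostname and path.
    print(render_pip_conf("https://pypi.mirror.internal/simple/"))
```

The rendered text can be written to the system-wide or per-user pip configuration file by your configuration management tooling. Serving the mirror over TLS with a certificate from the internal CA avoids the need for trusted-host entirely.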

Dependency Pinning

Pin every dependency to an exact version. In an air-gapped environment, you cannot tolerate version resolution failures caused by a required version being absent from the local mirror. Generate lock files on the connected side (a requirements.txt produced by pip freeze, poetry.lock, or package-lock.json) and verify that they resolve entirely from the local mirror.
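
A simple pre-transfer check can flag requirements that are not pinned to an exact version. The sketch below is a deliberately strict heuristic rather than a full parser for the requirements file format: it accepts only name==version entries, skips option lines such as --hash, and treats everything else as unpinned. The package names in the example are placeholders.

```python
import re

# name, optional [extras], then == followed by a version with no wildcard.
PINNED = re.compile(
    r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[^\]]+\])?==[^=*\s;]+(?=\s|;|\\|$)"
)


def unpinned_requirements(text: str) -> list:
    """Return requirement lines that are not pinned with an exact == version."""
    offenders = []
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):
            continue  # blank lines and options such as --hash or -r
        if not PINNED.match(line):
            offenders.append(line)
    return offenders


if __name__ == "__main__":
    sample = """
    examplepkg==1.2.3
    otherpkg>=2.0     # a range, not a pin
    thirdpkg==4.*     # a wildcard, not a pin
    unversioned
    """
    print(unpinned_requirements(sample))
```

Passing this check does not prove the mirror contains every pinned version. A trial install against the mirror from a host with no other index configured remains the real test.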

Update and Patching Workflows

Air-gapped systems still need updates. Security patches, model upgrades, and software updates must flow through a controlled process.

Security Patching

Establish a regular cadence (monthly or quarterly) for security updates:

On the connected side, build an updated container image or package mirror snapshot incorporating the latest security patches
Test the updated stack in a connected staging environment that mirrors the air-gapped configuration
Transfer the tested update package through the data transfer process
Apply updates in the air-gapped environment during a scheduled maintenance window
Validate functionality with automated test suites after patching

Model Updates

When a new model version or fine-tuned variant needs deployment:

Evaluate the new model on the connected side using representative (non-classified) test data
Package the model weights and any updated serving configuration
Transfer via approved media with full hash verification
Deploy alongside the existing model (blue-green deployment within the air gap)
Run validation tests with domain-specific evaluation data
Cut over traffic only after validation passes
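
The final two steps amount to a gate: traffic moves only when the candidate model does not regress against the model currently serving. A minimal sketch of such a gate follows. The score format, the tolerance value, and all names are assumptions; the evaluation itself is whatever domain-specific suite runs inside the air gap.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalResult:
    name: str         # evaluation suite or task name
    baseline: float   # score of the model currently serving traffic
    candidate: float  # score of the newly imported model


def regressions(results, tolerance: float = 0.01) -> list:
    """Return names of evaluations where the candidate falls below baseline."""
    return [r.name for r in results if r.candidate < r.baseline - tolerance]


def ready_for_cutover(results, tolerance: float = 0.01) -> bool:
    """Allow cutover only if evaluations ran and none regressed."""
    return bool(results) and not regressions(results, tolerance)


if __name__ == "__main__":
    results = [
        EvalResult("summarization", baseline=0.82, candidate=0.85),
        EvalResult("retrieval-qa", baseline=0.74, candidate=0.70),
    ]
    print(regressions(results))
    print(ready_for_cutover(results))
```

Treating an empty result list as a failure is deliberate: a validation run that produced no scores should block the cutover rather than pass it by default.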

Antivirus and Threat Intelligence

Antivirus signature databases on the air-gapped network need periodic updates. Establish a one-way transfer process for signature files, typically via data diode or encrypted media. Many organizations maintain a weekly or bi-weekly signature update cadence for air-gapped systems.

Operational Procedures

Operating an LLM in an air-gapped environment requires procedural discipline that goes beyond typical IT operations.

Monitoring Without External Services

All monitoring and alerting must be self-contained. Deploy a local Prometheus and Grafana stack for metrics, a local Elasticsearch or Loki instance for logs, and configure alerting to internal channels (local email server, internal messaging, or even local pager systems in the most secure environments). There is no PagerDuty, no Datadog, no external SaaS monitoring.

Incident Response

Runbooks must be comprehensive because you cannot Google a solution during an outage. Document every known failure mode and its resolution:

GPU hardware failure: diagnostic steps, spare hardware procedures
Model serving crash: restart procedures, log analysis checklist
Memory exhaustion: KV cache tuning, batch size reduction, request queuing
Vector database corruption: backup restoration procedures
Disk space exhaustion: log rotation, temporary file cleanup procedures
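
Several of these failure modes can be caught early with small local checks that feed the internal alerting channel. As one example, a disk space check needs only the standard library. The thresholds and function names below are assumptions, and the path to monitor depends on where logs and model files live.

```python
import shutil

WARN_PERCENT = 80.0      # assumed threshold; tune to the volume's growth rate
CRITICAL_PERCENT = 90.0  # assumed threshold


def disk_used_percent(path: str) -> float:
    """Return the percentage of the filesystem containing path that is used."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def classify_usage(percent: float,
                   warn: float = WARN_PERCENT,
                   critical: float = CRITICAL_PERCENT) -> str:
    """Map a usage percentage to an alert level."""
    if percent >= critical:
        return "critical"
    if percent >= warn:
        return "warning"
    return "ok"


if __name__ == "__main__":
    percent = disk_used_percent("/")
    print(f"root filesystem: {percent:.1f}% used, status {classify_usage(percent)}")
```

The runbook entry for disk exhaustion can then reference the same thresholds, so the alert an operator receives maps directly to a documented procedure.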

Knowledge Management

In a connected environment, your team can reference online documentation, community forums, and vendor support portals in real time. In an air-gapped environment, all reference documentation must be available locally. Maintain an internal wiki or document repository containing:

Complete documentation for all deployed software (vLLM docs, CUDA docs, vector database docs)
Custom runbooks and operational procedures
Architecture diagrams and network maps
Change logs for every update applied to the environment
Troubleshooting guides compiled from experience and vendor documentation

Staffing and Training

Air-gapped environments require staff with deep expertise who can operate independently. Training must be completed before personnel enter the secure environment, as online courses and external training resources are not accessible from within. Budget for regular training refreshers and cross-training to avoid single points of human failure.

Testing and Validation

Maintain a connected staging environment that mirrors the air-gapped production setup as closely as possible. Use this environment to:

Test all software updates and patches before transfer to the air gap
Develop and refine operational procedures
Train new staff on system administration and troubleshooting
Benchmark new model versions against evaluation datasets
Validate container images and dependency packages build correctly from offline mirrors

The staging environment is your bridge between the connected world where development happens and the air-gapped world where production runs. It is not optional; it is essential infrastructure.

Common Pitfalls

Organizations new to air-gapped AI deployment frequently encounter these issues:

Undocumented internet dependencies: Software that works perfectly on a connected system may silently fail in an air gap because it tries to download a resource, check a license server, or send telemetry. Test every component in a simulated offline environment before deploying to the actual air gap.
Insufficient transfer bandwidth: Transferring hundreds of gigabytes through a security review process takes time. Plan for transfer windows of days, not hours, for initial deployment.
Clock drift: Without NTP access to external time servers, system clocks can drift. Use a local GPS-synchronized NTP server or a rubidium clock within the air-gapped network to maintain time accuracy for logging, certificate validation, and audit trails.
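Certificate management: TLS certificates cannot be issued or renewed by public CAs, and revocation checks against external OCSP or CRL endpoints will fail. Deploy an internal PKI with its own certificate authority inside the air gap and manage issuance, renewal, and revocation internally.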
Underestimating operational burden: Every task that takes a minute on a connected system (installing a package, checking documentation, downloading a patch) takes hours or days in an air-gapped environment. Staff and budget accordingly.

Air-gapped AI deployment is not a simplified version of standard deployment; it is a fundamentally different operational model. Every convenience that connected infrastructure provides, from package managers to monitoring services to web searches during troubleshooting, must be replaced with local alternatives and documented procedures. The organizations that succeed treat air-gapped deployment as a discipline with its own practices, tooling, and staffing requirements, not as an afterthought applied to a connected architecture.