This guide provides detailed instructions for deploying Inference.net nodes using Docker. If you’re new to Docker, please refer to the Docker documentation. Join our Discord community if you need assistance.

Requirements

  • Windows or Linux operating system
  • NVIDIA GPU from our supported hardware list
  • Docker Desktop (Windows) or Docker Engine (Linux)
  • NVIDIA drivers and container toolkit

Linux Installation

Prerequisites

  1. Install Docker Engine
    Follow the official Docker Engine installation guide for Linux.
  2. Install NVIDIA Drivers
    Option A: Automatic installation
    sudo apt update
    sudo apt install ubuntu-drivers-common
    sudo ubuntu-drivers autoinstall
    
    Option B: Manual installation
    sudo apt update
    sudo apt install ubuntu-drivers-common
    ubuntu-drivers devices
    sudo apt install nvidia-driver-XXX  # Replace XXX with recommended version
    
    Find your recommended driver version at NVIDIA’s driver download page.
  3. Install NVIDIA Container Toolkit
    Follow the NVIDIA Container Toolkit installation guide.
  4. Verify Installation (a combined check script is sketched after this list)
    # Check Docker installation
    docker --version
    docker info
    
    # Verify NVIDIA drivers
    nvidia-smi
    
    # Test NVIDIA Docker support
    docker run --rm --gpus all nvidia/cuda:12.4.0-base-ubuntu22.04 nvidia-smi
    
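If you prefer to run these checks in a single pass, a short script along the following lines can wrap them; it is only a sketch that reuses the commands from step 4 above, including the same CUDA test image.
#!/usr/bin/env bash
# Pre-flight check for the prerequisites above; stops at the first failure.
set -e

echo "Docker version:" && docker --version
echo "Docker daemon reachable:" && docker info > /dev/null && echo "OK"

echo "NVIDIA driver:" && nvidia-smi --query-gpu=name,driver_version --format=csv

echo "GPU access from a container:"
docker run --rm --gpus all nvidia/cuda:12.4.0-base-ubuntu22.04 nvidia-smi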

Start Node

  1. Register an account at https://devnet.inference.net/register
  2. Verify your email address
  3. Navigate to the Workers tab in your dashboard
  4. Click Create Worker in the top-right corner
  5. Enter a worker name, ensure Docker is selected, and click Create Worker
  6. On the Worker Details page, click Launch Worker
  7. Run the provided Docker command with your worker code:
    docker run \
      --pull=always \
      --restart=always \
      --runtime=nvidia \
      --gpus all \
      -v ~/.inference:/root/.inference \
      inferencedevnet/amd64-nvidia-inference-node:latest \
      --code <your-worker-code>
    
Once started, your node will enter the “Initializing” state on the dashboard. This initialization phase typically takes 1-2 minutes but may take up to 10 minutes depending on your GPU.
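
To confirm the node is running and to watch it leave the Initializing state, you can inspect the container from another terminal. The command above does not name the container, so look it up first (the filter below matches the node image):
# Find the running node container
docker ps --filter ancestor=inferencedevnet/amd64-nvidia-inference-node:latest

# Follow its logs while it initializes
docker logs -f <container_name_or_id>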

Windows Installation

Prerequisites

  1. Install Docker Desktop
    Download and install Docker Desktop for Windows.
  2. Install NVIDIA Drivers
    Download and install the appropriate NVIDIA driver for your GPU.
  3. Install NVIDIA Container Toolkit
    Follow the NVIDIA Container Toolkit guide for Windows. A quick verification, mirroring the Linux prerequisites, is sketched after this list.
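
The same checks used in the Linux prerequisites can be run from PowerShell once Docker Desktop is running; this mirrors the Linux verification step rather than adding any Windows-specific commands:
# Check Docker installation
docker --version

# Verify NVIDIA drivers
nvidia-smi

# Test NVIDIA Docker support
docker run --rm --gpus all nvidia/cuda:12.4.0-base-ubuntu22.04 nvidia-smi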

Start Node

  1. Register an account at https://devnet.inference.net/register
  2. Verify your email address
  3. Navigate to the Workers tab in your dashboard
  4. Click Create Worker in the top-right corner
  5. Enter a worker name, ensure Docker is selected, and click Create Worker
  6. On the Worker Details page, click Launch Worker
  7. Run the provided Docker command with your worker code in PowerShell:
    docker run `
      --pull=always `
      --restart=always `
      --runtime=nvidia `
      --gpus all `
      -v $HOME/.inference:/root/.inference `
      inferencedevnet/amd64-nvidia-inference-node:latest `
      --code <your-worker-code>
    
Once started, your node will enter the “Initializing” state on the dashboard. This initialization phase typically takes 1-2 minutes but may take up to 10 minutes depending on your GPU.

Advanced Configuration

Running Multiple Containers on Multi-GPU Systems

If you have multiple GPUs, you can run separate containers, each utilizing a different GPU. This maximizes your hardware utilization by running multiple workers simultaneously.

Understanding GPU Selection

GPUs are numbered starting from 0. You can specify which GPU(s) each container should use:
# Use all available GPUs
docker run --gpus all ...

# Use specific GPUs (e.g., first and second GPU)
docker run --gpus '"device=0,1"' ...

# Use a single GPU (e.g., first GPU)
docker run --gpus '"device=0"' ...
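
To see which index corresponds to which physical card before choosing device IDs, list the GPUs first:
# List GPUs with their indices and UUIDs
nvidia-smi -L

# Show index, model, and total memory per GPU
nvidia-smi --query-gpu=index,name,memory.total --format=csv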

Example: Multiple GPU Setup

Run a separate container for each GPU:
Container 1 (GPU 0):
docker run -d \
  --pull=always \
  --restart=always \
  --runtime=nvidia \
  --gpus '"device=0"' \
  -v ~/.inference:/root/.inference \
  --name inference-node-1 \
  inferencedevnet/amd64-nvidia-inference-node:latest \
  --code <your-worker-code>
Container 2 (GPU 1):
docker run -d \
  --pull=always \
  --restart=always \
  --runtime=nvidia \
  --gpus '"device=1"' \
  -v ~/.inference:/root/.inference \
  --name inference-node-2 \
  inferencedevnet/amd64-nvidia-inference-node:latest \
  --code <your-worker-code>
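
On machines with many GPUs, the two commands above can also be generated in a loop. The sketch below is illustrative rather than an official procedure: the CODES array is a placeholder you fill with one worker code per GPU index, and the rest mirrors the per-GPU commands shown above.
#!/usr/bin/env bash
# Launch one detached node container per GPU, one worker code per GPU index.
CODES=("<worker-code-gpu-0>" "<worker-code-gpu-1>")  # placeholder values

for i in "${!CODES[@]}"; do
  docker run -d \
    --pull=always \
    --restart=always \
    --runtime=nvidia \
    --gpus "\"device=${i}\"" \
    -v ~/.inference:/root/.inference \
    --name "inference-node-$((i + 1))" \
    inferencedevnet/amd64-nvidia-inference-node:latest \
    --code "${CODES[$i]}"
done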

Resource Management

Set memory and CPU limits to prevent resource contention:
# Limit the container to 30 GB RAM (the recommended minimum) and 8 CPU cores
docker run \
  --gpus '"device=0"' \
  --memory=30g \
  --cpus=8 \
  inferencedevnet/amd64-nvidia-inference-node:latest \
  --code <your-worker-code>
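
To confirm the limits took effect, or to adjust them without recreating the container, you can use docker stats and docker update; the values below are just examples matching the command above:
# Show live memory/CPU usage against the configured limits
docker stats <container_name>

# Change limits on a running container
docker update --memory=30g --memory-swap=30g --cpus=8 <container_name>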

Complete Multi-GPU Example

Running two containers with resource limits:
# Container 1 - GPU 0
docker run -d \
  --pull=always \
  --restart=always \
  --runtime=nvidia \
  --gpus '"device=0"' \
  --memory=30g \
  --cpus=8 \
  -v ~/.inference:/root/.inference \
  --name inference-node-1 \
  inferencedevnet/amd64-nvidia-inference-node:latest \
  --code <your-worker-code>

# Container 2 - GPU 1
docker run -d \
  --pull=always \
  --restart=always \
  --runtime=nvidia \
  --gpus '"device=1"' \
  --memory=30g \
  --cpus=8 \
  -v ~/.inference:/root/.inference \
  --name inference-node-2 \
  inferencedevnet/amd64-nvidia-inference-node:latest \
  --code <your-worker-code>

Docker Compose Configuration

For easier management of multiple containers, you can use Docker Compose. This is particularly useful when running multiple GPU instances or managing complex deployments. Create a docker-compose.yml file:
services:
  instance-0:
    image: inferencedevnet/amd64-nvidia-inference-node:latest
    command: --code <your-worker-code>
    restart: always
    environment:
      CONFIG_DIR: /root/.inference
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ["0"]
              capabilities: [gpu]
    volumes:
      - ~/.inference:/root/.inference

  instance-1:
    image: inferencedevnet/amd64-nvidia-inference-node:latest
    command: --code <your-worker-code>
    restart: always
    environment:
      CONFIG_DIR: /root/.inference
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ["1"]
              capabilities: [gpu]
    volumes:
      - ~/.inference:/root/.inference
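
Before starting the stack, it can help to confirm the file parses and that the GPU reservations and worker codes resolved as intended, and to pre-pull the image:
# Validate and print the fully resolved configuration
docker-compose config

# Pre-pull the node image for all services
docker-compose pull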
To deploy with Docker Compose:
# Start all services
docker-compose up -d

# View logs for all services
docker-compose logs -f

# Stop all services
docker-compose down

# Restart a specific instance
docker-compose restart instance-0
This configuration automatically handles:
  • GPU device assignment for each container
  • Persistent volume mounting for configuration
  • Automatic restart on failure
  • Proper environment variable setup

Troubleshooting

Container Startup Issues

When your Docker container fails to start or stops unexpectedly, these commands help diagnose and resolve Docker daemon-related issues. Use them to check if Docker is running properly, restart the service if needed, or examine system logs for error messages.
# Check Docker daemon status (Linux)
sudo systemctl status docker

# Restart Docker daemon (Linux)
sudo systemctl restart docker

# View Docker logs (Linux)
sudo journalctl -fu docker
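
If the daemon reports as active but the node container still will not start, a quick sanity check is to run a minimal container before digging further; if this fails, the problem is with Docker itself rather than the node image:
# Confirm the daemon can pull and run a basic container
docker run --rm hello-world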

GPU Access Problems

These commands help troubleshoot GPU visibility and accessibility issues within Docker containers. Use them to verify that the NVIDIA runtime is properly configured, list available GPUs on your system, or reset a GPU that may be in an error state.
# Verify NVIDIA runtime
docker info | grep nvidia

# List available GPUs
nvidia-smi -L

# Reset a specific GPU (if needed - Linux; fails if processes are still using the GPU)
sudo nvidia-smi --gpu-reset -i <gpu_id>
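
If docker info | grep nvidia returns nothing, the NVIDIA runtime may not be registered with Docker. Assuming the NVIDIA Container Toolkit from the prerequisites is installed, re-running its configuration step and restarting the daemon usually registers it:
# Register the NVIDIA runtime with Docker and restart the daemon (Linux)
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker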

GPU Monitoring

Monitor GPU performance and resource utilization in real-time. These commands are essential for tracking GPU memory usage, identifying bottlenecks, and ensuring your inference workloads are running efficiently. Use them to detect memory leaks or verify that your containers are properly utilizing GPU resources.
# Real-time GPU monitoring
watch -n 1 nvidia-smi

# Detailed memory usage
nvidia-smi --query-gpu=timestamp,name,memory.used,memory.total,memory.free --format=csv

# Monitor GPU processes
nvidia-smi --query-compute-apps=pid,process_name,used_memory --format=csv -l 1
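
The same query flags can also keep a lightweight log on disk, which helps when a problem only appears after hours of running; the file name and 5-second interval below are arbitrary choices:
# Append a GPU memory sample to a CSV file every 5 seconds
nvidia-smi --query-gpu=timestamp,name,memory.used,memory.total --format=csv -l 5 >> gpu-usage.csv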

Container Management

Essential commands for managing Docker containers running your inference nodes. Use these to check container health, view logs for debugging, follow real-time output during operations, or monitor resource consumption to ensure containers aren’t exceeding system limits.
# Check container status
docker ps -a

# View container logs
docker logs <container_name>

# Follow logs in real-time
docker logs -f <container_name>

# Monitor resource usage
docker stats <container_name>

Handling Failures

Commands for managing failed or problematic containers. Use these when you need to stop unresponsive containers, force-terminate hung processes, remove containers for a fresh start, or clean up disk space by removing unused Docker resources.
# Stop container gracefully
docker stop <container_name>

# Force stop
docker kill <container_name>

# Remove container
docker rm <container_name>

# Clean up unused resources
docker system prune
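
A typical fresh start for a misbehaving node combines these steps: stop and remove the container, then relaunch it with the original docker run command. The container name below is whichever you assigned with --name:
# Remove the old container, then relaunch
docker stop inference-node-1
docker rm inference-node-1
# ...then re-run the docker run command from "Start Node" or the multi-GPU examples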

Restart Policies

Configure automatic container restart behavior to ensure high availability of your inference nodes. These policies help maintain uptime by automatically restarting containers after crashes, system reboots, or failures, reducing the need for manual intervention.
# Always restart (including after reboot)
docker run -d --restart always ...

# Restart on failure (max 5 attempts)
docker run -d --restart on-failure:5 ...

# Restart unless manually stopped
docker run -d --restart unless-stopped ...
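
To check which policy an existing container uses, or to change it without recreating the container, inspect and update it in place:
# Show the current restart policy
docker inspect --format '{{.HostConfig.RestartPolicy.Name}}' <container_name>

# Change the policy on a running container
docker update --restart unless-stopped <container_name>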

Performance Monitoring

Advanced monitoring commands for tracking both container and GPU performance metrics. Use these to identify performance bottlenecks, monitor power consumption for efficiency optimization, or track temperature to prevent thermal throttling during intensive inference workloads.
# Monitor container metrics
docker stats --format "table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}\t{{.NetIO}}"

# Monitor GPU temperature and power
nvidia-smi --query-gpu=temperature.gpu,power.draw,power.limit --format=csv -l 1
Tip: Always check container logs first when troubleshooting issues. They often contain valuable error messages and diagnostic information.