This guide provides detailed instructions for deploying Inference.net nodes using Docker. If you’re new to Docker, please refer to the Docker documentation. Join our Discord community if you need assistance.

Requirements

  • Windows or Linux operating system
  • NVIDIA GPU from our supported hardware list
  • Docker Desktop (Windows) or Docker Engine (Linux)
  • NVIDIA drivers and container toolkit

Linux Installation

Prerequisites

  1. Install Docker Engine

    Follow the official Docker Engine installation guide for Linux.
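
    For a quick start, Docker's convenience script is one common path (the linked guide covers distribution-specific packages and post-install steps in more detail):

    # Install Docker Engine using Docker's convenience script
    curl -fsSL https://get.docker.com -o get-docker.sh
    sudo sh get-docker.sh
    
    # Optional: allow running docker without sudo (log out and back in afterwards)
    sudo usermod -aG docker $USER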

  2. Install NVIDIA Drivers

    Option A: Automatic installation

    sudo apt update
    sudo apt install ubuntu-drivers-common
    sudo ubuntu-drivers autoinstall
    

    Option B: Manual installation

    sudo apt update
    sudo apt install ubuntu-drivers-common
    ubuntu-drivers devices
    sudo apt install nvidia-driver-XXX  # Replace XXX with recommended version
    

    Find your recommended driver version at NVIDIA’s driver download page.
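
    Whichever option you use, reboot after the driver installs so the new kernel module loads, then confirm the GPU is visible:

    sudo reboot
    
    # After the machine comes back up:
    nvidia-smi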

  3. Install NVIDIA Container Toolkit

    Follow the NVIDIA Container Toolkit installation guide.
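
    On Ubuntu, the final steps typically look like the following once the NVIDIA apt repository has been added as described in that guide (a sketch; defer to the official instructions for your distribution):

    # Install the toolkit and register the NVIDIA runtime with Docker
    sudo apt update
    sudo apt install -y nvidia-container-toolkit
    sudo nvidia-ctk runtime configure --runtime=docker
    sudo systemctl restart docker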

  4. Verify Installation

    # Check Docker installation
    docker --version
    docker info
    
    # Verify NVIDIA drivers
    nvidia-smi
    
    # Test NVIDIA Docker support
    docker run --rm --gpus all nvidia/cuda:12.4.0-base-ubuntu22.04 nvidia-smi
    

Start Node

  1. Register an account at https://devnet.inference.net/register

  2. Verify your email address

  3. Navigate to the Workers tab in your dashboard

  4. Click Create Worker in the top-right corner

  5. Enter a worker name, ensure Docker is selected, and click Create Worker

  6. On the Worker Details page, click Launch Worker

  7. Run the provided Docker command with your worker code:

    docker run \
      --pull=always \
      --restart=always \
      --runtime=nvidia \
      --gpus all \
      -v ~/.inference:/root/.inference \
      inferencedevnet/amd64-nvidia-inference-node:latest \
      --code <your-worker-code>
    

Once started, your node will enter the “Initializing” state on the dashboard. This initialization phase typically takes 1-2 minutes but may take up to 10 minutes depending on your GPU.
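
If you prefer to run the node in the background, add the -d flag (and optionally --name) to the command above, then check on it with standard Docker commands (the container name here is whatever you chose):

# Confirm the container is running
docker ps

# Follow the node's startup logs
docker logs -f <container_name>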

Windows Installation

Prerequisites

  1. Install Docker Desktop

    Download and install Docker Desktop for Windows.

  2. Install NVIDIA Drivers

    Download and install the appropriate NVIDIA driver for your GPU.

  3. Install NVIDIA Container Toolkit

    Follow the NVIDIA Container Toolkit guide for Windows.
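
  4. Verify Installation

    As on Linux, you can confirm Docker and GPU access from PowerShell before launching the node (this mirrors the Linux verification step):

    # Check Docker installation
    docker --version
    
    # Test NVIDIA Docker support
    docker run --rm --gpus all nvidia/cuda:12.4.0-base-ubuntu22.04 nvidia-smi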

Start Node

  1. Register an account at https://devnet.inference.net/register

  2. Verify your email address

  3. Navigate to the Workers tab in your dashboard

  4. Click Create Worker in the top-right corner

  5. Enter a worker name, ensure Docker is selected, and click Create Worker

  6. On the Worker Details page, click Launch Worker

  7. Run the provided Docker command with your worker code in PowerShell:

    docker run `
      --pull=always `
      --restart=always `
      --runtime=nvidia `
      --gpus all `
      -v ~/.inference:/root/.inference `
      inferencedevnet/amd64-nvidia-inference-node:latest `
      --code <your-worker-code>
    

Once started, your node will enter the “Initializing” state on the dashboard. This initialization phase typically takes 1-2 minutes but may take up to 10 minutes depending on your GPU.

Advanced Configuration

Running Multiple Containers on Multi-GPU Systems

If you have multiple GPUs, you can run separate containers, each utilizing a different GPU. This maximizes your hardware utilization by running multiple workers simultaneously.

Understanding GPU Selection

GPUs are numbered starting from 0. You can specify which GPU(s) each container should use:

# Use all available GPUs
docker run --gpus all ...

# Use specific GPUs (e.g., first and second GPU)
docker run --gpus '"device=0,1"' ...

# Use a single GPU (e.g., first GPU)
docker run --gpus '"device=0"' ...
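
To see which index corresponds to which physical card, list the GPUs on your system:

# List GPUs with their indices and UUIDs
nvidia-smi -L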

Example: Multiple GPU Setup

Run separate containers for each GPU:

Container 1 (GPU 0):

docker run -d \
  --pull=always \
  --restart=always \
  --runtime=nvidia \
  --gpus '"device=0"' \
  -v ~/.inference:/root/.inference \
  --name inference-node-1 \
  inferencedevnet/amd64-nvidia-inference-node:latest \
  --code <your-worker-code>

Container 2 (GPU 1):

docker run -d \
  --pull=always \
  --restart=always \
  --runtime=nvidia \
  --gpus '"device=1"' \
  -v ~/.inference:/root/.inference \
  --name inference-node-2 \
  inferencedevnet/amd64-nvidia-inference-node:latest \
  --code <your-worker-code>

Resource Management

Set memory and CPU limits to prevent resource contention:

# Limit to 30 GB RAM (minimum recommended) and 8 CPU cores
docker run \
  --gpus '"device=0"' \
  --memory=30g \
  --cpus=8 \
  inferencedevnet/amd64-nvidia-inference-node:latest \
  --code <your-worker-code>

Complete Multi-GPU Example

Running two containers with resource limits:

# Container 1 - GPU 0
docker run -d \
  --pull=always \
  --restart=always \
  --runtime=nvidia \
  --gpus '"device=0"' \
  --memory=30g \
  --cpus=8 \
  -v ~/.inference:/root/.inference \
  --name inference-node-1 \
  inferencedevnet/amd64-nvidia-inference-node:latest \
  --code <your-worker-code>

# Container 2 - GPU 1
docker run -d \
  --pull=always \
  --restart=always \
  --runtime=nvidia \
  --gpus '"device=1"' \
  --memory=30g \
  --cpus=8 \
  -v ~/.inference:/root/.inference \
  --name inference-node-2 \
  inferencedevnet/amd64-nvidia-inference-node:latest \
  --code <your-worker-code>
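
If you have many GPUs, a small shell loop can launch one container per device. This is only a sketch built from the flags in the examples above; the WORKER_CODES entries are placeholders for worker codes you create in the dashboard.

#!/bin/bash
# Placeholder worker codes created in the dashboard -- one per container
WORKER_CODES=("<worker-code-0>" "<worker-code-1>")

# Launch one container per GPU index reported by nvidia-smi
GPU_COUNT=$(nvidia-smi -L | wc -l)
for i in $(seq 0 $((GPU_COUNT - 1))); do
  docker run -d \
    --pull=always \
    --restart=always \
    --runtime=nvidia \
    --gpus "device=$i" \
    --memory=30g \
    --cpus=8 \
    -v ~/.inference:/root/.inference \
    --name "inference-node-$((i + 1))" \
    inferencedevnet/amd64-nvidia-inference-node:latest \
    --code "${WORKER_CODES[$i]}"
done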

Troubleshooting

Container Startup Issues

When your Docker container fails to start or stops unexpectedly, these commands help diagnose and resolve Docker daemon-related issues. Use them to check if Docker is running properly, restart the service if needed, or examine system logs for error messages.

# Check Docker daemon status (Linux)
sudo systemctl status docker

# Restart Docker daemon (Linux)
sudo systemctl restart docker

# View Docker logs (Linux)
sudo journalctl -fu docker

GPU Access Problems

These commands help troubleshoot GPU visibility and accessibility issues within Docker containers. Use them to verify that the NVIDIA runtime is properly configured, list available GPUs on your system, or reset a GPU that may be in an error state.

# Verify NVIDIA runtime
docker info | grep nvidia

# List available GPUs
nvidia-smi -L

# Reset GPU (if needed - Linux)
sudo nvidia-smi --gpu-reset -i <gpu_id>

GPU Monitoring

Monitor GPU performance and resource utilization in real-time. These commands are essential for tracking GPU memory usage, identifying bottlenecks, and ensuring your inference workloads are running efficiently. Use them to detect memory leaks or verify that your containers are properly utilizing GPU resources.

# Real-time GPU monitoring
watch -n 1 nvidia-smi

# Detailed memory usage
nvidia-smi --query-gpu=timestamp,name,memory.used,memory.total,memory.free --format=csv

# Monitor GPU processes
nvidia-smi --query-compute-apps=pid,process_name,used_memory --format=csv -l 1
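
To keep a record over time instead of a live view, the same queries can be written to a file (the -l flag sets the sampling interval in seconds):

# Log GPU memory usage to a CSV file every 60 seconds
nvidia-smi --query-gpu=timestamp,name,memory.used,memory.total,memory.free --format=csv -l 60 >> gpu-usage.csv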

Container Management

Essential commands for managing Docker containers running your inference nodes. Use these to check container health, view logs for debugging, follow real-time output during operations, or monitor resource consumption to ensure containers aren’t exceeding system limits.

# Check container status
docker ps -a

# View container logs
docker logs <container_name>

# Follow logs in real-time
docker logs -f <container_name>

# Monitor resource usage
docker stats <container_name>
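
To spot a crash-looping container, its current state and restart count are also available through docker inspect:

# Show container state and how many times Docker has restarted it
docker inspect -f '{{.State.Status}} (restarts: {{.RestartCount}})' <container_name>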

Handling Failures

Commands for managing failed or problematic containers. Use these when you need to stop unresponsive containers, force-terminate hung processes, remove containers for a fresh start, or clean up disk space by removing unused Docker resources.

# Stop container gracefully
docker stop <container_name>

# Force stop
docker kill <container_name>

# Remove container
docker rm <container_name>

# Clean up unused resources
docker system prune

Restart Policies

Configure automatic container restart behavior to ensure high availability of your inference nodes. These policies help maintain uptime by automatically restarting containers after crashes, system reboots, or failures, reducing the need for manual intervention.

# Always restart (including after reboot)
docker run -d --restart always ...

# Restart on failure (max 5 attempts)
docker run -d --restart on-failure:5 ...

# Restart unless manually stopped
docker run -d --restart unless-stopped ...
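
You can also change the restart policy of a container that is already running, without recreating it:

# Update the restart policy of an existing container
docker update --restart unless-stopped <container_name>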

Performance Monitoring

Advanced monitoring commands for tracking both container and GPU performance metrics. Use these to identify performance bottlenecks, monitor power consumption for efficiency optimization, or track temperature to prevent thermal throttling during intensive inference workloads.

# Monitor container metrics
docker stats --format "table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}\t{{.NetIO}}"

# Monitor GPU temperature and power
nvidia-smi --query-gpu=temperature.gpu,power.draw,power.limit --format=csv -l 1

Tip: Always check container logs first when troubleshooting issues. They often contain valuable error messages and diagnostic information.