Requirements
- Windows or Linux operating system
- NVIDIA GPU from our supported hardware list
- Docker Desktop (Windows) or Docker Engine (Linux)
- NVIDIA drivers and container toolkit
Linux Installation
Prerequisites
- Install Docker Engine: Follow the official Docker Engine installation guide for Linux.
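On Ubuntu or Debian, one common path is Docker's convenience script; the official guide covers distribution-specific packages as well:

```bash
# Download and run Docker's installation script
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh

# Optional: allow running docker without sudo (log out and back in afterwards)
sudo usermod -aG docker $USER
```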
- Install NVIDIA Drivers
Option A: Automatic installation
Option B: Manual installation. Find your recommended driver version at NVIDIA’s driver download page.
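For example, on Ubuntu (the driver package version below is illustrative; use the version recommended for your GPU):

```bash
# Option A: let Ubuntu detect and install the recommended driver
sudo ubuntu-drivers autoinstall

# Option B: install a specific driver version manually
# (550 is a placeholder -- substitute the version from NVIDIA's download page)
sudo apt-get install -y nvidia-driver-550

# Reboot so the new driver loads
sudo reboot
```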
- Install NVIDIA Container Toolkit: Follow the NVIDIA Container Toolkit installation guide.
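Assuming an apt-based distribution with NVIDIA's package repository already configured per that guide:

```bash
# Install the toolkit
sudo apt-get install -y nvidia-container-toolkit

# Register the NVIDIA runtime with Docker and restart the daemon
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
```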
Verify Installation
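Confirm that containers can see your GPU by running nvidia-smi inside a throwaway container:

```bash
# Should print your GPU model, driver version, and memory
docker run --rm --gpus all ubuntu nvidia-smi
```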
Start Node
- Register an account at https://devnet.inference.net/register
- Verify your email address
- Navigate to the Workers tab in your dashboard
- Click Create Worker in the top-right corner
- Enter a worker name, ensure Docker is selected, and click Create Worker
- On the Worker Details page, click Launch Worker
- Run the provided Docker command with your worker code:
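The exact command, including the image name and your unique worker code, comes from the Worker Details page. It generally takes this shape (the image name and --code value below are illustrative placeholders, not canonical):

```bash
# Copy the real command from your dashboard; this is a sketch
docker run -d \
  --gpus all \
  --restart unless-stopped \
  inferencedevnet/amd64-nvidia-inference-node:latest \
  --code <your-worker-code>
```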
Windows Installation
Prerequisites
- Install Docker Desktop: Download and install Docker Desktop for Windows.
- Install NVIDIA Drivers: Download and install the appropriate NVIDIA driver for your GPU.
- Install NVIDIA Container Toolkit: Follow the NVIDIA Container Toolkit guide for Windows.
Start Node
- Register an account at https://devnet.inference.net/register
- Verify your email address
- Navigate to the Workers tab in your dashboard
- Click Create Worker in the top-right corner
- Enter a worker name, ensure Docker is selected, and click Create Worker
- On the Worker Details page, click Launch Worker
- Run the provided Docker command with your worker code in PowerShell:
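As on Linux, copy the exact command from the Worker Details page; the image name and worker code below are illustrative placeholders (PowerShell uses backticks for line continuation):

```powershell
# Copy the real command from your dashboard; this is a sketch
docker run -d `
  --gpus all `
  --restart unless-stopped `
  inferencedevnet/amd64-nvidia-inference-node:latest `
  --code <your-worker-code>
```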
Advanced Configuration
Running Multiple Containers on Multi-GPU Systems
If you have multiple GPUs, you can run separate containers, each utilizing a different GPU. This maximizes your hardware utilization by running multiple workers simultaneously.
Understanding GPU Selection
GPUs are numbered starting from 0. You can specify which GPU(s) each container should use:
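The <image> and <args> placeholders below stand in for the worker image and arguments from your dashboard:

```bash
# All GPUs
docker run --gpus all <image> <args>

# A single GPU by index
docker run --gpus '"device=0"' <image> <args>

# Two specific GPUs
docker run --gpus '"device=0,1"' <image> <args>
```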
Example: Multiple GPU Setup
Run separate containers for each GPU:
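Each worker gets its own container, name, and worker code (the image name and codes below are illustrative):

```bash
# Container 1 (GPU 0):
docker run -d --name worker-gpu0 --gpus '"device=0"' \
  inferencedevnet/amd64-nvidia-inference-node:latest --code <worker-code-1>

# Container 2 (GPU 1):
docker run -d --name worker-gpu1 --gpus '"device=1"' \
  inferencedevnet/amd64-nvidia-inference-node:latest --code <worker-code-2>
```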
Resource Management
Set memory and CPU limits to prevent resource contention:
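Docker's --memory and --cpus flags cap each container; the values below are examples to size against your own hardware:

```bash
# Cap this container at 16 GB of RAM and 4 CPU cores
docker run -d --gpus '"device=0"' \
  --memory=16g --cpus=4 \
  <image> <args>
```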
Complete Multi-GPU Example
Running two containers with resource limits:
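Putting the pieces together (image name, worker codes, and limits are all illustrative):

```bash
# Worker on GPU 0, capped at 16 GB RAM and 4 CPUs
docker run -d --name worker-gpu0 --gpus '"device=0"' \
  --memory=16g --cpus=4 --restart unless-stopped \
  inferencedevnet/amd64-nvidia-inference-node:latest --code <worker-code-1>

# Worker on GPU 1, same limits
docker run -d --name worker-gpu1 --gpus '"device=1"' \
  --memory=16g --cpus=4 --restart unless-stopped \
  inferencedevnet/amd64-nvidia-inference-node:latest --code <worker-code-2>
```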
Docker Compose Configuration
For easier management of multiple containers, you can use Docker Compose. This is particularly useful when running multiple GPU instances or managing complex deployments. Create a docker-compose.yml file (see the sketch after this list) that provides:
- GPU device assignment for each container
- Persistent volume mounting for configuration
- Automatic restart on failure
- Proper environment variable setup
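A minimal sketch of such a file, reusing the illustrative image and worker codes from above (service names, volume paths, and the WORKER_NAME variable are placeholders, not a documented schema):

```yaml
services:
  worker-gpu0:
    image: inferencedevnet/amd64-nvidia-inference-node:latest  # illustrative
    command: ["--code", "<worker-code-1>"]
    restart: unless-stopped
    environment:
      - WORKER_NAME=worker-gpu0            # hypothetical variable
    volumes:
      - ./config/worker0:/root/.inference  # hypothetical config path
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ["0"]
              capabilities: [gpu]

  worker-gpu1:
    image: inferencedevnet/amd64-nvidia-inference-node:latest
    command: ["--code", "<worker-code-2>"]
    restart: unless-stopped
    environment:
      - WORKER_NAME=worker-gpu1
    volumes:
      - ./config/worker1:/root/.inference
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ["1"]
              capabilities: [gpu]
```

Start both workers with docker compose up -d and stop them with docker compose down.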
Troubleshooting
Container Startup Issues
When your Docker container fails to start or stops unexpectedly, these commands help diagnose and resolve Docker daemon-related issues. Use them to check if Docker is running properly, restart the service if needed, or examine system logs for error messages.
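On a systemd-based Linux distribution, for example:

```bash
# Is the Docker daemon running?
sudo systemctl status docker

# Restart the daemon if it is stopped or misbehaving
sudo systemctl restart docker

# Inspect daemon logs for recent errors
journalctl -u docker --since "1 hour ago"
```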
GPU Access Problems
These commands help troubleshoot GPU visibility and accessibility issues within Docker containers. Use them to verify that the NVIDIA runtime is properly configured, list available GPUs on your system, or reset a GPU that may be in an error state.
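For example:

```bash
# Confirm containers can reach the GPU through the NVIDIA runtime
docker run --rm --gpus all ubuntu nvidia-smi

# List the GPUs the driver can see
nvidia-smi -L

# Reset a GPU stuck in an error state (no processes may be using it)
sudo nvidia-smi --gpu-reset -i 0
```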
GPU Monitoring
Monitor GPU performance and resource utilization in real time. These commands are essential for tracking GPU memory usage, identifying bottlenecks, and ensuring your inference workloads are running efficiently. Use them to detect memory leaks or verify that your containers are properly utilizing GPU resources.
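For example:

```bash
# Refresh nvidia-smi output every second
watch -n 1 nvidia-smi

# Stream per-GPU utilization and memory statistics
nvidia-smi dmon

# Script-friendly memory usage query
nvidia-smi --query-gpu=memory.used,memory.total --format=csv
```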
Container Management
Essential commands for managing Docker containers running your inference nodes. Use these to check container health, view logs for debugging, follow real-time output during operations, or monitor resource consumption to ensure containers aren’t exceeding system limits.
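For example (replace <container> with your container's name or ID):

```bash
# List running containers and their status
docker ps

# View a container's logs
docker logs <container>

# Follow log output in real time
docker logs -f <container>

# Live CPU, memory, and I/O usage per container
docker stats
```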
Handling Failures
Commands for managing failed or problematic containers. Use these when you need to stop unresponsive containers, force-terminate hung processes, remove containers for a fresh start, or clean up disk space by removing unused Docker resources.
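For example:

```bash
# Gracefully stop a container
docker stop <container>

# Force-terminate a hung container
docker kill <container>

# Remove a stopped container for a fresh start
docker rm <container>

# Reclaim disk space from unused images, containers, and networks
docker system prune
```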
Restart Policies
Configure automatic container restart behavior to ensure high availability of your inference nodes. These policies help maintain uptime by automatically restarting containers after crashes, system reboots, or failures, reducing the need for manual intervention.
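For example, unless-stopped restarts a container after crashes and reboots but respects a manual stop:

```bash
# Set the policy when starting a container
docker run -d --restart unless-stopped <image>

# Change the policy on an existing container
docker update --restart unless-stopped <container>
```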
Performance Monitoring
Advanced monitoring commands for tracking both container and GPU performance metrics. Use these to identify performance bottlenecks, monitor power consumption for efficiency optimization, or track temperature to prevent thermal throttling during intensive inference workloads.
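For example:

```bash
# Container-level resource usage
docker stats

# GPU power draw, temperature, and utilization, sampled every 5 seconds
nvidia-smi --query-gpu=power.draw,temperature.gpu,utilization.gpu --format=csv -l 5
```

Tip: Always check container logs first when troubleshooting issues. They often contain valuable error messages and diagnostic information.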