The Inference.net Staking Protocol is being tested on Solana Devnet with test
tokens. These tokens have no monetary value and should not be used for
real-world transactions or bought or sold by anyone.
Setting Up Your Solana Wallet for Devnet
All of these test features are deployed on Solana Devnet, so you’ll need to configure your wallet to use Solana Devnet. This step is very important. If it is not completed correctly, you will experience
issues when submitting transactions to our programs.
- Phantom Wallet: instructions
- Solflare Wallet: instructions
- Backpack Wallet: instructions
Preparing for Epoch 3
- Verify your hardware meets the minimum requirements
- Review the new quickstart documentation to deploy your Epoch 3 node.
- Link your Solana wallet on the dashboard to your Inference.net account before June 13th
- Join our Discord to monitor the Epoch 3 rollout and get support.
Epoch 3 Feature Timeline
Epoch 3 will be rolled out in phases, with each phase bringing new features and improvements to the network. Please follow Discord for updates.
Core Protocol Improvements
Automatic Node Updates
Based on overwhelming operator feedback from previous epochs, we’ve implemented a comprehensive auto-update system for the Inference.net node software. This system works across all deployment types (CLI, Docker, and Desktop applications) and handles minor version updates without operator intervention. The auto-update mechanism performs health checks before and after updates, ensuring nodes remain operational throughout the process. If an update fails, nodes automatically roll back to the previous stable version and report the issue to our monitoring systems. This significantly reduces the operational burden on GPU operators, who previously needed to update nodes manually during each release cycle.
Unified Inference Engine Management
Previous versions of Inference.net required operators to select and deploy specific Docker containers based on their intended inference engine (SGLang or vLLM). This created complexity for operators who needed to understand the technical differences between engines and make deployment decisions based on their hardware capabilities. In Epoch 3, we’ve consolidated these into a single container that automatically detects hardware specifications and selects the optimal inference engine. The selection algorithm considers:
- GPU architecture and compute capability
- Available VRAM
- CPU specifications
- System memory
Enhanced GPU Detection and Validation
Starting June 6th, the network will enforce strict GPU detection requirements. Only GPUs that can be properly identified by our detection system will be permitted to join the network. This change from Epoch 2’s permissive approach ensures:
- Accurate job routing based on hardware capabilities
- Proper performance benchmarking
- Prevention of misrepresented hardware specifications
Stake-Weighted Job Routing
The cornerstone of Epoch 3 is our new stake-weighted routing system, which fundamentally changes how inference jobs are distributed across the network. This system creates economic incentives for reliable operation while ensuring efficient resource utilization.
Priority Score Calculation
Each instance operating on the network receives a priority score that determines its probability of receiving inference jobs. The priority score is calculated from the following terms:
- Device VRAM / Total Operator VRAM: Normalizes stake across operators with different numbers of machines
- Total INT Stake: The sum of operator-owned and delegated tokens in their pool
- Reputation Weight: A multiplier between 0 and 1 based on performance metrics
- k: A network parameter that adjusts based on utilization
Understanding the k Parameter
The parameter k dynamically adjusts to optimize network efficiency:
- When k = 0: Routing becomes round-robin, giving equal probability to all instances
- When k is large: Routing heavily favors staked operators
- During low network utilization: k increases to reward staked operators
- During high network utilization: k decreases to leverage all available capacity
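The priority score formula itself is not reproduced in this text, so the sketch below assumes one hypothetical form, `1 + k * vram_share * total_stake * reputation`. It was chosen only because it matches the behavior described above: at k = 0 every instance scores the same, and as k grows, stake dominates. The function names and sample numbers are illustrative, not taken from the protocol.

```python
def priority_score(device_vram_gb, operator_vram_gb, total_stake, reputation, k):
    """Hypothetical score form; the network's actual formula may differ."""
    vram_share = device_vram_gb / operator_vram_gb
    return 1.0 + k * vram_share * total_stake * reputation


def routing_probabilities(instances, k):
    """Normalize scores so they sum to 1 and can be used as sampling weights."""
    scores = [priority_score(k=k, **inst) for inst in instances]
    total = sum(scores)
    return [s / total for s in scores]


instances = [
    # An unstaked single-GPU operator.
    {"device_vram_gb": 24, "operator_vram_gb": 24, "total_stake": 0, "reputation": 1.0},
    # A staked single-GPU operator.
    {"device_vram_gb": 80, "operator_vram_gb": 80, "total_stake": 100_000, "reputation": 0.9},
]

uniform = routing_probabilities(instances, k=0)       # equal probability for every instance
weighted = routing_probabilities(instances, k=0.001)  # the staked instance is favored
```

Under this assumed form, the constant term keeps unstaked instances eligible for jobs at any k, which fits the stated goal of using all available capacity when utilization is high.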
VRAM Normalization
Since operators manage varying numbers of GPUs with different VRAM capacities, the routing system normalizes stake based on VRAM. For example:
- Operator A: 100,000 INT staked, running 4x RTX 4090s (96GB total VRAM)
- Operator B: 100,000 INT staked, running 1x A100 (80GB VRAM)
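The example stops after listing the two operators, so here is the arithmetic implied by the Device VRAM / Total Operator VRAM term. It assumes the stake attributed to a device is simply that share multiplied by the pool's stake; `stake_per_device` is an illustrative name.

```python
def stake_per_device(total_stake, device_vram_gb, operator_vram_gb):
    """Stake attributed to one device: its share of the operator's VRAM times the pool stake."""
    return total_stake * device_vram_gb / operator_vram_gb


# Operator A: 4x RTX 4090 at 24 GB each (96 GB total), 100,000 INT staked.
operator_a = stake_per_device(100_000, device_vram_gb=24, operator_vram_gb=96)

# Operator B: 1x A100 with 80 GB, 100,000 INT staked.
operator_b = stake_per_device(100_000, device_vram_gb=80, operator_vram_gb=80)

print(operator_a, operator_b)  # 25000.0 100000.0
```

Each of Operator A's four instances carries a quarter of the stake, while Operator B's single instance carries all of it, so the same 100,000 INT is spread across A's machines rather than counted four times.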
Reputation Scoring System
We employ a reputation system to evaluate GPU operator quality and determine job routing and rewards. Reputation combines three components: Verification, Uptime, and Reliability. Together, these reflect integrity, availability, and request completion performance, and allow pool delegators to make more informed delegation decisions.
Verification Score
Measures operator honesty and is used to determine job routing.
- We use a proprietary inference verification system to ensure that operators are running inference jobs honestly.
- This system runs pass/fail verifications on all processed requests.
- A score is computed over a rolling window of verifications, and operators are promoted to or demoted from our trusted lane based on a verification failure threshold.
- Promotion: Untrusted operators with ≤5 failures in their last 500 verifications are promoted to the trusted lane.
- Demotion: Trusted operators with ≥10 failures in their last 500 verifications are demoted to the evaluation lane.
Operators that consistently fail verifications may be halted and slashed,
according to the network security
guidelines.
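The promotion and demotion thresholds can be sketched as a rolling window with two different cutoffs. This is a minimal illustration, not the proprietary verification system: `LaneTracker` is a hypothetical name, and waiting for a full window of 500 results before promoting is an assumption.

```python
from collections import deque

WINDOW = 500           # verifications kept in the rolling window
PROMOTE_MAX_FAILS = 5  # promote at or below this many failures
DEMOTE_MIN_FAILS = 10  # demote at or above this many failures


class LaneTracker:
    """Tracks one operator's recent pass/fail verifications and lane."""

    def __init__(self):
        self.results = deque(maxlen=WINDOW)  # True = pass, False = fail
        self.trusted = False

    def record(self, passed):
        self.results.append(passed)
        failures = sum(1 for r in self.results if not r)
        if self.trusted and failures >= DEMOTE_MIN_FAILS:
            self.trusted = False
        elif (not self.trusted and len(self.results) == WINDOW
              and failures <= PROMOTE_MAX_FAILS):
            # Assumption: promotion waits until the window is full.
            self.trusted = True
        return self.trusted


tracker = LaneTracker()
for _ in range(500):
    promoted = tracker.record(True)   # 500 clean verifications: promoted
for _ in range(10):
    demoted = not tracker.record(False)  # the 10th failure triggers demotion
```

The gap between the two thresholds means an operator with 6 to 9 failures in the window keeps its current lane, which avoids flapping between lanes on a single result.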
Reliability Score
Captures how consistently requests complete successfully and on time.
- Inference request completion rates are measured over a fast and slow window and combined into a single score.
- The score calculation uses a weighted average of the fast and slow window scores, e.g. 0.5 for fast (1 hour) and 0.5 for slow (72 hours).
- Scores are recomputed for all operators hourly.
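The weighted average can be written out directly. The sketch below uses the example weights from the text (0.5 for the 1 hour window, 0.5 for the 72 hour window); the function names, the sample request counts, and treating a window with no traffic as a rate of 1.0 are assumptions for illustration.

```python
def completion_rate(completed, total):
    """Fraction of requests completed in a window; assumed 1.0 when there was no traffic."""
    return completed / total if total else 1.0


def reliability_score(fast_rate, slow_rate, fast_weight=0.5, slow_weight=0.5):
    """Weighted average of the fast (1 hour) and slow (72 hour) completion rates."""
    return (fast_weight * fast_rate + slow_weight * slow_rate) / (fast_weight + slow_weight)


fast = completion_rate(completed=45, total=50)      # last hour: 0.9
slow = completion_rate(completed=3528, total=3600)  # last 72 hours: 0.98
reliability = reliability_score(fast, slow)         # approximately 0.94
```

Combining the two windows lets a recent outage lower the score quickly through the fast window, while the slow window keeps one bad hour from erasing a long record of good service.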
Uptime Score
Measures how consistently instances are online and responsive.
- The network periodically checks whether instances are online by sending a health check inference request to the instance.
- If the instance completes the request, it is considered online for the interval.
- If the instance does not complete the request, it is considered offline for the interval.
- To prevent gaming, checks begin only after an instance has been running for a short buffer period following startup.
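One plausible way to turn these interval checks into a score is the fraction of counted checks that the instance completed. The text does not state the exact aggregation or the buffer length, so the 300 second buffer, the `uptime_score` name, and the zero score for an instance with no counted checks are all assumptions.

```python
STARTUP_BUFFER_S = 300  # hypothetical buffer; the real value is not stated


def uptime_score(started_at, checks):
    """Fraction of counted health checks that the instance completed.

    `checks` is a list of (timestamp, completed) pairs; checks that fall
    inside the startup buffer are ignored.
    """
    counted = [ok for ts, ok in checks if ts >= started_at + STARTUP_BUFFER_S]
    if not counted:
        return 0.0  # assumption: no counted checks yet means no uptime credit
    return sum(counted) / len(counted)


checks = [(60, False), (360, True), (660, True), (960, False), (1260, True)]
uptime = uptime_score(started_at=0, checks=checks)  # 3 of 4 counted checks passed
```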
Uptime scores are incorporated into network reward emission calculations to
allow us to incentivize specific types of hardware to join the network. Rewards
are normalized across all operators by the number of points earned. More
hardware means more points, and more points means more rewards.
Solana-Based Staking Protocol
Technical Implementation
The Inference.net staking protocol is implemented as a Solana program using the SPL token standard for $INT-DEV tokens. The protocol manages:
- Operator pool creation and configuration
- Stake delegation and undelegation
- Commission rate management
- Reward distribution and revenue payout
- Slashing mechanisms (to be enabled in later phases)
Operator Pools
Each operator can create a staking pool with the following configurable parameters:
- Reward Commission Rate: Percentage of epoch rewards retained by the operator (0-100%)
- USDC Commission Rate: Percentage of USDC revenue retained by the operator (0-100%)
- Delegation Status: Whether to accept external delegations
- Minimum Self-Stake: Operators must maintain a globally-set minimum stake amount
Delegation Mechanics
Token holders can delegate $INT-DEV to operator pools to earn a share of rewards without running hardware. The delegation process operates as follows:
- Tokens can be delegated immediately without cooldown
- Delegators earn rewards proportional to their share of the pool
- Rewards are calculated after operator commission
- Undelegating requires a cooldown period before tokens and rewards can be withdrawn
- No rewards accrue during the cooldown period
Token delegators are not exposed to slashing risk if a slashing
event occurs.
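The reward split described above (commission first, then a pro rata share of the remainder) can be sketched off-chain as follows. This is not the Solana program's code: `split_rewards` and the sample figures are hypothetical, and counting the operator's self-stake as an ordinary share of the pool is an assumption.

```python
def split_rewards(pool_rewards, commission_rate, stakes):
    """Split a pool's rewards: operator commission first, the rest pro rata by stake.

    `stakes` maps each staker (the operator's self-stake included) to the amount
    staked; stakers in cooldown are assumed to be excluded by the caller.
    """
    commission = pool_rewards * commission_rate
    distributable = pool_rewards - commission
    total_stake = sum(stakes.values())
    payouts = {who: distributable * amount / total_stake for who, amount in stakes.items()}
    return commission, payouts


commission, payouts = split_rewards(
    pool_rewards=1_000,
    commission_rate=0.10,
    stakes={"operator": 50_000, "alice": 30_000, "bob": 20_000},
)
```

With a 10% commission on 1,000 tokens of rewards, the operator keeps 100 as commission and the remaining 900 is divided 450, 270, and 180 in proportion to the three stakes.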
Dual Token System
During Epoch 3, operators will interact with two distinct reward mechanisms:
$INT Points (Off-chain)
- Accumulated in real-time as jobs are processed
- Calculated based on computational work performed
- May be awarded for non-compute contributions (guides, community help)
- Serves as the primary performance metric during the testing phase
$INT-DEV Tokens (On-chain - Solana Devnet only)
- Distributed for testing purposes via airdrop and reward distributions
- Based on stake-weighted job completion
- Required for staking to receive job allocations
- Used for pool creation registration fees
USDC Revenue Sharing
- Operators earn USDC for processing inference jobs
- Operators can share USDC revenue with delegators by setting a USDC commission rate
- Delegators receive proportional USDC earnings based on their stake
- USDC accrues on-chain and can be withdrawn without affecting stake
Long-term Implications
The architectural changes in Epoch 3 establish the foundation for Inference.net’s evolution from a points-based test network to a fully decentralized, economically sustainable inference protocol. The stake-weighted routing system creates a market mechanism for quality assurance, while the delegation system enables broader participation beyond hardware operators. As we progress through Epoch 3, we’ll gather data on:
- Optimal k parameter values for different network conditions
- Stake distribution patterns and delegation preferences
- Performance improvements from unified engine management
- Economic efficiency of the dual-token model
Moving Forward
We encourage all operators to thoroughly review the new documentation, prepare their systems for the June 6th launch, and participate actively in testing these new features. Your feedback during this phase is crucial for refining these systems before our eventual mainnet deployment. For technical support, detailed documentation, and community discussions, please visit:
- Documentation: docs.devnet.inference.net
- Dashboard: devnet.inference.net
- Discord: discord.gg/kuzco