Hosting
Become a Hoster
Share your GPU with the world — or just your team. Create a free plan to share privately, or set up paid subscriptions and keep 80% of revenue.
🚧 LLMFinder is currently in beta. Access is invite-only.
Request an invite →
Requirements
- A GPU server with at least 4GB VRAM (or a CPU with 8GB+ RAM)
- A publicly reachable endpoint (not localhost or private IPs)
- An OpenAI-compatible server (setup guide)
- An invite code — register to request one
One-command setup
The fastest way to get started. The setup wizard detects your GPU, picks a model, configures Docker Compose, and registers with LLMFinder:
```shell
curl -O https://llmfinder.net/llmfinder-hoster.py && python3 llmfinder-hoster.py
```
The wizard:
- Detects your GPU and available VRAM
- Downloads the right model into `~/llmfinder-models/`
- Calculates optimal context size from GGUF metadata + VRAM
- Generates a `docker-compose.yml` with your backend (llama.cpp, Ollama, or vLLM) plus a Cloudflare tunnel for public access
- Registers your endpoint with LLMFinder and runs a verification test
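The context-size step above can be sketched roughly: the VRAM left after loading the model weights bounds the KV cache, and the KV cache grows linearly with context length. A minimal illustration under a simple KV-cache cost model — the function name, parameters, and the 32k cap are hypothetical, and the actual wizard's formula may differ:

```python
def estimate_context_size(vram_bytes, model_bytes, n_layers, n_kv_heads,
                          head_dim, kv_bytes_per_elem=2):
    """Rough context-window estimate from free VRAM (hypothetical sketch)."""
    # KV cache cost per token: keys + values, across every layer
    per_token = 2 * n_layers * n_kv_heads * head_dim * kv_bytes_per_elem
    free = vram_bytes - model_bytes
    if free <= 0:
        return 0  # model alone doesn't fit; no room for a KV cache
    # Round down to a multiple of 256 and cap at an assumed 32k maximum
    ctx = (free // per_token) // 256 * 256
    return min(ctx, 32768)
```

For example, an 8 GB GPU loading a ~5 GB GGUF with Llama-3-8B-like dimensions (32 layers, 8 KV heads, head dim 128, fp16 cache) leaves room for roughly a 24k-token context under this model.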
Top-level menu (7 options): 1) Setup wizard, 2) Add/update models, 3) Server verification test, 4) Update server URL, 5) Rotate bearer token, 6) Uninstall, 7) Exit.
💡 The Cloudflare tunnel URL changes on every restart (free tier). The script auto-syncs the URL with LLMFinder on each menu open. For a permanent URL, use a named Cloudflare tunnel.
Manual registration
If you already have a server running, register it directly:
```shell
curl -X POST https://api.llmfinder.net/hosters/register \
  -H "Content-Type: application/json" \
  -d '{
    "name": "My GPU Server",
    "email": "[email protected]",
    "endpoint_url": "https://my-server.example.com",
    "api_key": "my-bearer-token",
    "invite_code": "BETA2026",
    "models": [
      {
        "model_id": "llama-3-8b",
        "model_alias": "Llama 3 8B",
        "price_per_input_token": 100,
        "price_per_output_token": 300,
        "context_window": 8192,
        "max_tokens": 2048
      }
    ]
  }'
```
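The same request body can be assembled programmatically and validated before it ever hits the API. A hedged sketch that only builds the JSON from the fields shown in the curl example — `build_registration` is a hypothetical helper, not part of any official SDK:

```python
import json

# Field names taken from the curl example above
REQUIRED_FIELDS = ("name", "email", "endpoint_url", "api_key",
                   "invite_code", "models")

def build_registration(name, email, endpoint_url, api_key, invite_code, models):
    """Build the JSON body for POST /hosters/register (hypothetical helper)."""
    body = {
        "name": name,
        "email": email,
        "endpoint_url": endpoint_url,
        "api_key": api_key,
        "invite_code": invite_code,
        "models": models,
    }
    # Fail fast on empty fields instead of waiting for an API error
    missing = [f for f in REQUIRED_FIELDS if not body[f]]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    return json.dumps(body)
```

The returned string can then be sent with any HTTP client as the POST body, with `Content-Type: application/json`.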
Supported server software
| Software | Compatible | Notes |
|---|---|---|
| llama.cpp server | ✅ | Recommended. Supports GGUF models. |
| vLLM | ✅ | Best for large HuggingFace models. |
| Ollama | ✅ | Natively OpenAI-compatible. No bridge needed. |
| Any OpenAI-compatible server | ✅ | Must expose `/health` and `/v1/chat/completions`. |
Blocked endpoints
The following cannot be registered (ToS violation):
- Localhost or private IPs (127.x, 10.x, 192.168.x) — must be publicly reachable
- Commercial API providers (OpenAI, Anthropic, Google, etc.)
- Commercial model IDs (gpt-4, claude-*, gemini-*)
Verification
After registration, LLMFinder runs two checks:
- Health check — `GET /health` must return HTTP 200
- Inference test — sends a test prompt and expects a valid response
Once both pass, your server goes live and starts receiving traffic.
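The pass/fail logic of the two checks can be sketched as follows, assuming the inference test expects an OpenAI-style chat completion with at least one non-empty choice — `verification_passed` is a hypothetical illustration of that logic, not LLMFinder's actual code:

```python
def verification_passed(health_status: int, inference_reply: dict) -> bool:
    """Combine the two post-registration checks (logic inferred from the docs)."""
    # Check 1: GET /health must return HTTP 200
    if health_status != 200:
        return False
    # Check 2: the test prompt must yield an OpenAI-style response
    # with at least one choice carrying non-empty message content
    choices = inference_reply.get("choices", [])
    return bool(choices) and bool(choices[0].get("message", {}).get("content"))
```

Running the equivalent of these two requests against your own endpoint before registering is a quick way to avoid a failed verification.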