Open WebUI and GPU Trader

Open WebUI is an open-source, browser-based interface for interacting with large language models (LLMs) like those running on Ollama, LM Studio, or local GGUF models. Designed for speed, simplicity, and privacy, Open WebUI offers a clean chat interface with support for multi-user access, prompt history, system prompts, and customizable personas. It runs entirely on your own hardware or containerized environment, ensuring full control over data and model execution.

GPU Trader offers a pre-configured managed template to deploy Open WebUI in a single click—making it easy to run private, performant LLMs without needing cloud accounts or API keys.

Install Open WebUI on GPU Trader

1. Sign Up and Create an Account

Start by signing in or creating an account on GPU Trader. If you need help, watch the GPU Trader Quickstart or read the docs.

2. Find and Rent a Compatible Instance

Use the Find a Device page to browse available GPUs. Filter by model, price, or provider to select a system that meets Open WebUI’s requirements.

Once selected, rent the instance directly from the dashboard.

Remember that Open WebUI is only the interface; Ollama is the engine that actually runs the models, and each model has its own infrastructure requirements. Review the Model Compatibility table below if you are unsure which instance to pick.

Need more help? Read the docs on finding and renting an instance.

3. Configure Your Instance for Open WebUI

After renting your instance, configure it with GPU Trader’s pre-built Open WebUI template. To deploy it, click “Add Stack”, browse to and select the Open WebUI template, then click “Use Template.”

Template:

```yaml
services:
  ollama:
    volumes:
      - ./volume1/docker/ollama:/root/.ollama:rw
    pull_policy: always
    tty: true
    restart: always
    image: ollama/ollama:latest
    ports:
      - 11434-11534:11434
    labels:
      - io.gputrader.ports.name.11434=Ollama API
    environment:
      - NVIDIA_VISIBLE_DEVICES=all
      - NVIDIA_DRIVER_CAPABILITIES=compute,utility
  open-webui:
    image: ghcr.io/open-webui/open-webui:cuda
    restart: always
    volumes:
      - ./volume1/docker/open-webui:/app/backend/data:rw
    depends_on:
      - ollama
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
      - WEBUI_SECRET_KEY=
    ports:
      - 8080-8180:8080
    labels:
      - io.gputrader.ports.name.8080=Open WebUI
    extra_hosts:
      - host.docker.internal:host-gateway
```
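A note on the port mappings in the template: Docker’s short syntax `11434-11534:11434` publishes the container’s port 11434 on one free host port chosen from the 11434–11534 range, so the exact external port varies per deployment (check the dashboard’s port dropdown for the assigned value). A small sketch of how such a mapping reads:

```python
def parse_port_mapping(mapping: str):
    """Split a Docker short-syntax mapping like '11434-11534:11434'
    into (host_port_range, container_port)."""
    host, container = mapping.rsplit(":", 1)
    lo, _, hi = host.partition("-")
    return (int(lo), int(hi or lo)), int(container)

# The Ollama service publishes container port 11434 on one free host
# port from 11434-11534; Open WebUI does the same with 8080-8180.
host_range, container_port = parse_port_mapping("11434-11534:11434")
```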

Need more help? Read the docs on managing an instance or templates.

4. Launch Open WebUI and Start Conversing

Once Open WebUI is installed, open the port dropdown or kebab menu on your instance to launch the UI in your browser. From there you can create your account, add new users, and download the models of your choosing.
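Models can be downloaded from Open WebUI’s admin settings, or directly through Ollama’s REST API on the instance. A minimal sketch, assuming your instance’s Ollama endpoint is reachable (the `localhost:11434` address is a placeholder; substitute the IP and mapped port from your dashboard):

```python
import json
from urllib import request

# Assumption: replace with your instance's address and mapped Ollama port.
OLLAMA_BASE = "http://localhost:11434"

def pull_model(name: str) -> request.Request:
    # Ollama's /api/pull endpoint accepts a JSON body naming the model to fetch.
    body = json.dumps({"name": name}).encode()
    return request.Request(f"{OLLAMA_BASE}/api/pull", data=body,
                           headers={"Content-Type": "application/json"},
                           method="POST")

req = pull_model("mistral")
# To actually start the download (Ollama streams progress as JSON lines):
# with request.urlopen(req) as resp:
#     for line in resp:
#         print(line.decode().rstrip())
```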

Get Started with Open WebUI

We will be the first to tell you that we are not experts in Open WebUI or the models that run on it, even though we think it is genuinely useful and fun. If you are new to LLMs, it is best to refer to the documentation for each model. To help you understand model compatibility, we have compiled the list below; it is not comprehensive and should be used only as a reference. There is plenty of good material on the web dedicated to optimizing the performance of your LLM.

Model Compatibility

| LLM | Best For | Recommended GPU(s) | Notes |
| --- | --- | --- | --- |
| DeepSeek R1 | Logical reasoning, math, multilingual tasks | 2x A100 80GB or 1x H100 80GB | Efficient Mixture-of-Experts model; rivals GPT-4 in performance. |
| Qwen 2.5-72B | Coding, multilingual tasks, long-context understanding | 1x H100 80GB or 2x A100 80GB | Excels in coding and supports a 128K context window. |
| Mistral 7B | Chatbots, summarization, lightweight applications | 1x A10 24GB or better | Fast and efficient; ideal for real-time applications. |
| Gemma 2 27B | Multilingual QA, RAG, fine-tuning | 1x A100 80GB or 2x A40 | Optimized for performance on NVIDIA GPUs; good for RAG pipelines. |
| LLaMA 3.3 70B | General-purpose reasoning, coding, assistant tasks | 2x A100 80GB or 1x H100 80GB | Strong all-rounder; extensive community support. |
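As a rough rule of thumb when sizing an instance: resident weights take roughly parameters × bytes per parameter (2 bytes at FP16, around 0.5 at 4-bit quantization), plus headroom for the KV cache and activations. A back-of-the-envelope sketch (the 10% overhead figure is an assumption, not a measured value):

```python
def vram_estimate_gb(params_billion: float, bytes_per_param: float = 2.0,
                     overhead: float = 0.10) -> float:
    """Rough VRAM (GB) to serve a model: weights plus a headroom
    fraction for KV cache and activations (assumed ballpark)."""
    return params_billion * bytes_per_param * (1 + overhead)

# A 70B model at FP16 needs on the order of 150+ GB, which is why the
# table pairs LLaMA 3.3 70B with 2x A100 80GB; a 4-bit Mistral 7B fits
# comfortably on a single 24 GB card.
full_precision_70b = vram_estimate_gb(70)                  # ~154 GB
quantized_7b = vram_estimate_gb(7, bytes_per_param=0.5)    # ~3.9 GB
```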