Skip to main content
This guide walks through running both OpenClaw and ByteRover CLI on local LLMs using LM Studio.

Tested Configuration

ComponentVersion / Details
MachineMac M4 Pro, RAM 24GB
LM Studio0.4.9
OpenClaw2026.4.12
ByteRover CLI3.3.0
This is an experimental setup. ByteRover and autonomous agents can run with OpenClaw on a Apple RAM 24GB machine, but for production usage we recommend at least an Apple M4 with RAM 48GB.

Step 1 — Download the Models

Search for and download both GGUF files directly from LM Studio’s Discover tab.
1

Download Gemma 4 E4B for OpenClaw

Search for unsloth/gemma-4-E4B-it-GGUF and download gemma-4-E4B-it-UD-Q4_K_XL.gguf.Download Gemma 4 E4B in LM Studio
2

Download Qwen3.5-9B for ByteRover

Search for unsloth/Qwen3.5-9B-GGUF and download Qwen3.5-9B-Q4_K_S.gguf.Download Qwen3.5-9B in LM Studio
On a 24 GB machine, both models fit in memory simultaneously. Gemma 4 E4B at Q4 uses ~8.7 GB and Qwen3.5-9B at Q4 uses ~10.5 GB.

Step 2 — Load Both Models in LM Studio

LM Studio serves all loaded models from a single endpoint at http://localhost:1234/v1. Load both models before starting the server.
1

Open My Models

Go to the Models tab. You should see both downloaded models listed.LM Studio My Models screen
2

Load Gemma 4 E4B

Click on gemma-4-E4B-it-UD-Q4_K_XL.gguf and click Load. Note the API Identifier — LM Studio assigns it google/gemma-4-e4b. This is the model ID you will use in OpenClaw’s config.Gemma 4 E4B load configuration
3

Load Qwen3.5-9B

Click on Qwen3.5-9B-Q4_K_S.gguf.gguf and click Load. The API Identifier will be qwen3.5-9b.Qwen3.5-9B load configuration
4

Verify both models are ready

Open the Developer tab. Both models should show READY status, reachable at http://127.0.0.1:1234.Both models loaded and ready in LM Studio Developer tabConfirm with:
curl http://localhost:1234/v1/models
The response should list both google/gemma-4-e4b and qwen3.5-9b.

Step 3 — Configure Your Agent

Both OpenClaw and Hermes use the same local provider setup. Pick the agent you are using.
Run the OpenClaw onboard wizard:
openclaw onboard
1

Select Custom Provider

When prompted for Model/auth provider, scroll down and select Custom Provider.Select Custom Provider in OpenClaw onboard wizard
2

Enter the endpoint details

Fill in the following when prompted:
FieldValue
API Base URLhttp://localhost:1234/v1
API Key(leave blank)
Endpoint compatibilityOpenAI-compatible
Model IDgoogle/gemma-4-e4b
Model aliasgoogle-gemma-4-e4b
The wizard verifies the endpoint and reports Verification successful.OpenClaw Custom Provider setup and verification
3

Start OpenClaw and verify

Launch OpenClaw. It will use google/gemma-4-e4b served by LM Studio at localhost:1234.OpenClaw running with local Gemma model
Context limit — OpenClaw works normally 50,000 tokens and above. To update this, edit openclaw.json manually, then run openclaw gateway restart to apply changes.Context limit exceeded warning in OpenClaw
The wizard writes the following into ~/.openclaw/openclaw.json. You can also add this manually:
{
  "models": {
    "mode": "merge",
    "providers": {
      "custom-localhost-1234": {
        "baseUrl": "http://localhost:1234/v1",
        "api": "openai-completions",
        "models": [
          {
            "id": "google/gemma-4-e4b",
            "name": "gemma-4-E4B-it (Local)",
            "contextWindow": 50000,
            "maxTokens": 50000,
            "input": ["text"],
            "cost": { "input": 0, "output": 0, "cacheRead": 0, "cacheWrite": 0 },
            "reasoning": false
          }
        ]
      }
    }
  },
  "agents": {
    "defaults": {
      "model": {
        "primary": "custom-localhost-1234/google/gemma-4-e4b"
      },
      "models": {
        "custom-localhost-1234/google/gemma-4-e4b": {
          "alias": "google-gemma-4-e4b"
        }
      }
    }
  }
}

Step 4 — Configure ByteRover CLI

Connect ByteRover to the same local endpoint and select the Qwen model.
1

Open the providers command

In the ByteRover TUI, type /providers and press Enter.ByteRover TUI /providers command
2

Select OpenAI Compatible

Scroll to OpenAI Compatible and press Enter. This covers LM Studio, Ollama, and any other OpenAI-compatible local server.Select OpenAI Compatible provider
3

Enter the base URL

When prompted, enter http://localhost:1234/v1 and press Enter. Leave the API key blank.Enter LM Studio base URL
4

Select the Qwen model

From the model list, select qwen3.5-9b (128K ctx).Select qwen3.5-9b model

Step 5 — Verify ByteRover Is Working

Run a quick curate command to confirm ByteRover is using the local Qwen model.
1

Run a curate command

/curate "caching algorithm list: lru, lfu, fifo"
ByteRover curate command sample
2

Confirm it processes with the local model

ByteRover sends the request to Qwen3.5-9B on LM Studio. You can watch the LM Studio Developer tab update in real time.ByteRover curate working with Qwen local model
3

Review the results

ByteRover returns structured knowledge extracted from the curate request.ByteRover curate result in terminalThe context tree is updated with new memory files you can inspect directly.ByteRover memory file created in VS CodeChecking new memory file via ByteRover commands

Step 6 — Enable ByteRover Memory Integration

Connect your agent to ByteRover for persistent memory across sessions.

OpenClaw Integration

Configure ByteRover as the context engine for OpenClaw

Hermes Integration

Configure ByteRover as the memory provider for Hermes

Reference

LLM Providers

Connect an external provider or use the built-in LLM

Onboard Context

Learn how to seed your context tree with existing knowledge

Reference

Configuration details, troubleshooting, and advanced topics

Local & Cloud

Exploring local & cloud options