Autonomous Local Agents

This guide walks through running both OpenClaw and ByteRover CLI on local LLMs using LM Studio.

Tested Configuration

Component	Version / Details
Machine	Mac M4 Pro, RAM 24GB
LM Studio	0.4.9
OpenClaw	2026.4.12
ByteRover CLI	3.3.0

This is an experimental setup. ByteRover and autonomous agents can run with OpenClaw on a Apple RAM 24GB machine, but for production usage we recommend at least an Apple M4 with RAM 48GB.

Step 1 — Download the Models

Search for and download both GGUF files directly from LM Studio’s Discover tab.

Download Gemma 4 E4B for OpenClaw

Search for unsloth/gemma-4-E4B-it-GGUF and download gemma-4-E4B-it-UD-Q4_K_XL.gguf.

Download Qwen3.5-9B for ByteRover

Search for unsloth/Qwen3.5-9B-GGUF and download Qwen3.5-9B-Q4_K_S.gguf.

On a 24 GB machine, both models fit in memory simultaneously. Gemma 4 E4B at Q4 uses ~8.7 GB and Qwen3.5-9B at Q4 uses ~10.5 GB.

Step 2 — Load Both Models in LM Studio

LM Studio serves all loaded models from a single endpoint at http://localhost:1234/v1. Load both models before starting the server.

Open My Models

Go to the Models tab. You should see both downloaded models listed.

Load Gemma 4 E4B

Click on gemma-4-E4B-it-UD-Q4_K_XL.gguf and click Load. Note the API Identifier — LM Studio assigns it google/gemma-4-e4b. This is the model ID you will use in OpenClaw’s config.

Load Qwen3.5-9B

Click on Qwen3.5-9B-Q4_K_S.gguf.gguf and click Load. The API Identifier will be qwen3.5-9b.

Verify both models are ready

Open the Developer tab. Both models should show READY status, reachable at http://127.0.0.1:1234.

Both models loaded and ready in LM Studio Developer tab

Confirm with:

curl http://localhost:1234/v1/models

The response should list both google/gemma-4-e4b and qwen3.5-9b.

Step 3 — Configure Your Agent

Both OpenClaw and Hermes use the same local provider setup. Pick the agent you are using.

OpenClaw
Hermes

Run the OpenClaw onboard wizard:

openclaw onboard

Select Custom Provider

When prompted for Model/auth provider, scroll down and select Custom Provider.

Enter the endpoint details

Fill in the following when prompted:

Field	Value
API Base URL	`http://localhost:1234/v1`
API Key	(leave blank)
Endpoint compatibility	`OpenAI-compatible`
Model ID	`google/gemma-4-e4b`
Model alias	`google-gemma-4-e4b`

The wizard verifies the endpoint and reports Verification successful.

OpenClaw Custom Provider setup and verification

Start OpenClaw and verify

Launch OpenClaw. It will use google/gemma-4-e4b served by LM Studio at localhost:1234.

Context limit — OpenClaw works normally 50,000 tokens and above. To update this, edit openclaw.json manually, then run openclaw gateway restart to apply changes.

Context limit exceeded warning in OpenClaw

Resulting openclaw.json config

The wizard writes the following into ~/.openclaw/openclaw.json. You can also add this manually:

{
  "models": {
    "mode": "merge",
    "providers": {
      "custom-localhost-1234": {
        "baseUrl": "http://localhost:1234/v1",
        "api": "openai-completions",
        "models": [
          {
            "id": "google/gemma-4-e4b",
            "name": "gemma-4-E4B-it (Local)",
            "contextWindow": 50000,
            "maxTokens": 50000,
            "input": ["text"],
            "cost": { "input": 0, "output": 0, "cacheRead": 0, "cacheWrite": 0 },
            "reasoning": false
          }
        ]
      }
    }
  },
  "agents": {
    "defaults": {
      "model": {
        "primary": "custom-localhost-1234/google/gemma-4-e4b"
      },
      "models": {
        "custom-localhost-1234/google/gemma-4-e4b": {
          "alias": "google-gemma-4-e4b"
        }
      }
    }
  }
}

Run the Hermes model setup wizard:

hermes setup model

Select Custom Provider

When prompted for the model provider, scroll down and select Custom Provider (any OpenAI or Anthropic compatible endpoint).

Enter the endpoint details

Fill in the following when prompted:

Field	Value
API Base URL	`http://localhost:1234/v1`
API Key	(leave blank)
Endpoint compatibility	`OpenAI-compatible`
Model ID	`google/gemma-4-e4b`

The wizard confirms the endpoint and model are configured.

Configure context length

Hermes will prompt you to set the context length. Set it to match the model’s context window (65000 tokens for Gemma 4 E4B because hermes agent required minimum 64000 tokens).

Start Hermes and verify

Launch Hermes. It will use google/gemma-4-e4b served by LM Studio at localhost:1234.

Step 4 — Configure ByteRover CLI

Connect ByteRover to the same local endpoint and select the Qwen model.

Open the providers command

In the ByteRover TUI, type /providers and press Enter.

Select OpenAI Compatible

Scroll to OpenAI Compatible and press Enter. This covers LM Studio, Ollama, and any other OpenAI-compatible local server.

Enter the base URL

When prompted, enter http://localhost:1234/v1 and press Enter. Leave the API key blank.

Select the Qwen model

From the model list, select qwen3.5-9b (128K ctx).

Connect the local provider

brv providers connect openai-compatible --base-url http://localhost:1234/v1

Switch to the Qwen model

brv model switch qwen3.5-9b

Step 5 — Verify ByteRover Is Working

Run a quick curate command to confirm ByteRover is using the local Qwen model.

Run a curate command

/curate "caching algorithm list: lru, lfu, fifo"

Confirm it processes with the local model

ByteRover sends the request to Qwen3.5-9B on LM Studio. You can watch the LM Studio Developer tab update in real time.

ByteRover curate working with Qwen local model

Review the results

ByteRover returns structured knowledge extracted from the curate request.

The context tree is updated with new memory files you can inspect directly.

ByteRover memory file created in VS Code

Checking new memory file via ByteRover commands

Step 6 — Enable ByteRover Memory Integration

Connect your agent to ByteRover for persistent memory across sessions.

OpenClaw Integration

Configure ByteRover as the context engine for OpenClaw

Hermes Integration

Configure ByteRover as the memory provider for Hermes

Reference

LLM Providers

Connect an external provider or use the built-in LLM

Onboard Context

Learn how to seed your context tree with existing knowledge

Reference

Configuration details, troubleshooting, and advanced topics

Local & Cloud

Exploring local & cloud options

​Tested Configuration

​Step 1 — Download the Models

​Step 2 — Load Both Models in LM Studio

​Step 3 — Configure Your Agent

​Step 4 — Configure ByteRover CLI

​Step 5 — Verify ByteRover Is Working

​Step 6 — Enable ByteRover Memory Integration

OpenClaw Integration

Hermes Integration

​Reference

LLM Providers

Onboard Context

Reference

Local & Cloud

Tested Configuration

Step 1 — Download the Models

Step 2 — Load Both Models in LM Studio

Step 3 — Configure Your Agent

Step 4 — Configure ByteRover CLI

Step 5 — Verify ByteRover Is Working

Step 6 — Enable ByteRover Memory Integration

Reference