Architecture

Full deployment

All traffic goes through a single SPCS ingress endpoint on port 18789. Snowflake handles TLS and authentication. The Cortex proxy runs as an internal-only sidecar in the same service — it is not exposed to ingress.

Channels  <--socket/ws-->  SPCS Ingress  <-->  OpenClaw Gateway (:18789)
                                                  |
                                                  +-- Web UI
                                                  +-- WebSocket RPC
                                                  +-- /v1/* OpenAI-compat API
                                                  +-- Plugin HTTP routes
                                                  |
                                                Cortex Proxy Sidecar (:8080, internal)
                                                  +-- POST /v1/chat/completions
                                                  |       (OpenAI / Snowflake / Llama)
                                                  +-- POST /v1/messages
                                                  |       (Claude, native cache_control)
                                                  +-- Secret masking (per-shape walker)
                                                  +-- Request transforms
                                                  +-- SSE streaming passthrough
                                                  +-- Cache-stat logging
                                                  |
                                                Snowflake Cortex (outbound)
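The sidecar's secret masking walks the request/response JSON and scrubs secret values wherever they appear. A simplified sketch in Python (the value set and placeholder are hypothetical; the real proxy derives the values from the env vars named in SNOWCLAW_MASK_VARS and walks per-shape rather than blindly):

```python
# Hypothetical secret values; the real sidecar reads them from the
# env vars listed in SNOWCLAW_MASK_VARS.
SECRET_VALUES = {"sk-abc123", "pat-xyz789"}

def mask(obj):
    """Recursively replace secret substrings anywhere in a JSON-shaped value."""
    if isinstance(obj, str):
        for secret in SECRET_VALUES:
            obj = obj.replace(secret, "[MASKED]")
        return obj
    if isinstance(obj, dict):
        return {k: mask(v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [mask(v) for v in obj]
    return obj  # numbers, bools, None pass through unchanged
```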

Two containers in one SPCS service:

  • openclaw — the main gateway. 1–2 CPU, 2–4Gi RAM. Secrets injected as env vars. Volume mounted at /home/node/.openclaw and backed by the snowclaw_state_stage internal stage.
  • cortex-proxy — the FastAPI sidecar. 0.5–1 CPU, 512Mi–1Gi RAM. Same secret env vars for masking.
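The two-container layout maps onto a single SPCS service specification. An illustrative sketch, assuming placeholder image-repository paths; only the field names follow the SPCS spec format, and exact values should come from your deployment:

```yaml
# Illustrative sketch only; image paths are placeholders.
spec:
  containers:
    - name: openclaw
      image: /mydb/myschema/myrepo/openclaw:latest
      resources:
        requests: { cpu: 1, memory: 2Gi }
        limits: { cpu: 2, memory: 4Gi }
      volumeMounts:
        - name: state
          mountPath: /home/node/.openclaw
    - name: cortex-proxy
      image: /mydb/myschema/myrepo/cortex-proxy:latest
      resources:
        requests: { cpu: 500m, memory: 512Mi }
        limits: { cpu: 1, memory: 1Gi }
  endpoints:
    - name: gateway
      port: 18789
      public: true        # only the gateway is exposed at ingress
  volumes:
    - name: state
      source: "@snowclaw_state_stage"
```

Note that the proxy's port 8080 has no endpoint entry, which is what keeps it internal-only.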

Providers

The generated openclaw.json contains two providers, both backed by the same proxy on different endpoints:

Provider        Transport            Endpoint                Models                       Caching
cortex-claude   anthropic-messages   /v1/messages            Claude only                  Native cache_control markers, honored by Cortex
cortex-openai   openai-completions   /v1/chat/completions    OpenAI / Snowflake / Llama   Automatic, server-side (>1024 tokens)
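For cortex-claude, caching uses the Messages API's native markers: a content block tagged with cache_control is cached upstream. An illustrative request body (model name and prompt text are placeholders, not values from this deployment):

```json
{
  "model": "claude-sonnet-4-5",
  "max_tokens": 1024,
  "system": [
    {
      "type": "text",
      "text": "long, stable system prompt goes here",
      "cache_control": {"type": "ephemeral"}
    }
  ],
  "messages": [
    {"role": "user", "content": "Hello"}
  ]
}
```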

Routing happens at config-write time — Claude models go to cortex-claude, everything else to cortex-openai. You can add more providers manually.
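The config-write-time routing amounts to a check on the model name. A sketch of the idea (the function name and prefix test are assumptions, not the actual generator code):

```python
def provider_for(model: str) -> str:
    """Route Claude models to the Anthropic-style provider, everything else
    to the OpenAI-compatible one (assumed heuristic, for illustration)."""
    if model.lower().startswith("claude"):
        return "cortex-claude"
    return "cortex-openai"
```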

Standalone proxy

A lighter deployment shape: just the Cortex proxy as its own SPCS service with a public endpoint, which lets external OpenClaw agents reach Cortex without the full stack.

External OpenClaw Agent  -->  SPCS Ingress (auth + Sf-Context-Current-User)
                               |
                               +-- Cortex Proxy (:8080, PUBLIC endpoint)
                                     +-- X-Cortex-Token → Bearer auth
                                     +-- Request transforms
                                     +-- Rate limit retry (429 backoff)
                                     |
                                     +-- Snowflake Cortex LLMs

Single container, no secrets, no volumes, no masking. Each user authenticates with their own PAT via the X-Cortex-Token custom header (which SPCS passes through untouched; the Authorization header is stripped at ingress). See snowclaw proxy for the deployment commands.
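Since ingress strips Authorization, the proxy rebuilds it from the pass-through header. A sketch of that translation (the header names come from the flow above; the function itself is illustrative, not the proxy's actual code):

```python
def bearer_from_cortex_token(headers: dict) -> dict:
    """Turn the client's X-Cortex-Token PAT into the Bearer header Cortex expects."""
    token = headers.get("X-Cortex-Token")
    if token is None:
        raise PermissionError("missing X-Cortex-Token header")
    return {"Authorization": f"Bearer {token}"}
```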

Build-time vs. runtime

SnowClaw is careful about what lives where, because the split determines what a redeploy actually changes:

Lives at build time (baked into image)                  Lives at runtime (stage-backed volume)   Lives as Snowflake SECRET
Dockerfile + base image version                         openclaw.json                            All .env variables
Cortex Code skill definition                            skills/                                  connections.toml
Build-hook installations                                workspace/                               SNOWCLAW_MASK_VARS list
npm plugins (installed via openclaw plugins install)
Path-based plugins copied into build context

That split is why snowclaw push and snowclaw restart can hot-reload config without a full deploy: the gateway reads config from the volume, not from the image.