Architecture

Full deployment

All traffic goes through a single SPCS ingress endpoint on port 18789. Snowflake handles TLS and authentication. The Cortex proxy runs as an internal-only sidecar in the same service — it is not exposed to ingress.

Channels  <--socket/ws-->  SPCS Ingress  <-->  OpenClaw Gateway (:18789)
                                                  |
                                                  +-- Web UI
                                                  +-- WebSocket RPC
                                                  +-- /v1/* OpenAI-compat API
                                                  +-- Plugin HTTP routes
                                                  |
                                                Cortex Proxy Sidecar (:8080, internal)
                                                  +-- POST /v1/chat/completions
                                                  |       (OpenAI / Snowflake / Llama)
                                                  +-- POST /v1/messages
                                                  |       (Claude, native cache_control)
                                                  +-- Secret masking (per-shape walker)
                                                  +-- Request transforms
                                                  +-- SSE streaming passthrough
                                                  +-- Cache-stat logging
                                                  |
                                                Snowflake Cortex (outbound)
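The sidecar's secret masking walks the request/response JSON and scrubs secret values wherever they appear. A simplified sketch in Python (the value set and placeholder are hypothetical; the real proxy derives the values from the env vars named in SNOWCLAW_MASK_VARS and walks per-shape rather than blindly):

```python
# Hypothetical secret values; the real sidecar reads them from the
# env vars listed in SNOWCLAW_MASK_VARS.
SECRET_VALUES = {"sk-abc123", "pat-xyz789"}

def mask(obj):
    """Recursively replace secret substrings anywhere in a JSON-shaped value."""
    if isinstance(obj, str):
        for secret in SECRET_VALUES:
            obj = obj.replace(secret, "[MASKED]")
        return obj
    if isinstance(obj, dict):
        return {k: mask(v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [mask(v) for v in obj]
    return obj  # numbers, bools, None pass through unchanged
```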

Two containers in one SPCS service:

  • openclaw — the main gateway. 1–2 CPU, 2–4Gi RAM. Secrets injected as env vars. Volume mounted at /home/node/.openclaw and backed by the snowclaw_state_stage internal stage.
  • cortex-proxy — the FastAPI sidecar. 0.5–1 CPU, 512Mi–1Gi RAM. Same secret env vars for masking.
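The two-container layout maps onto a single SPCS service specification. An illustrative sketch, assuming placeholder image-repository paths; only the field names follow the SPCS spec format, and exact values should come from your deployment:

```yaml
# Illustrative sketch only; image paths are placeholders.
spec:
  containers:
    - name: openclaw
      image: /mydb/myschema/myrepo/openclaw:latest
      resources:
        requests: { cpu: 1, memory: 2Gi }
        limits: { cpu: 2, memory: 4Gi }
      volumeMounts:
        - name: state
          mountPath: /home/node/.openclaw
    - name: cortex-proxy
      image: /mydb/myschema/myrepo/cortex-proxy:latest
      resources:
        requests: { cpu: 500m, memory: 512Mi }
        limits: { cpu: 1, memory: 1Gi }
  endpoints:
    - name: gateway
      port: 18789
      public: true        # only the gateway is exposed at ingress
  volumes:
    - name: state
      source: "@snowclaw_state_stage"
```

Note that the proxy's port 8080 has no endpoint entry, which is what keeps it internal-only.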

Providers

The generated openclaw.json contains two providers, both backed by the same proxy on different endpoints:

Provider        Transport            Endpoint                Models                       Caching
cortex-claude   anthropic-messages   /v1/messages            Claude only                  Native cache_control markers, honored by Cortex
cortex-openai   openai-completions   /v1/chat/completions    OpenAI / Snowflake / Llama   Automatic, server-side (>1024 tokens)
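For cortex-claude, caching uses the Messages API's native markers: a content block tagged with cache_control is cached upstream. An illustrative request body (model name and prompt text are placeholders, not values from this deployment):

```json
{
  "model": "claude-sonnet-4-5",
  "max_tokens": 1024,
  "system": [
    {
      "type": "text",
      "text": "long, stable system prompt goes here",
      "cache_control": {"type": "ephemeral"}
    }
  ],
  "messages": [
    {"role": "user", "content": "Hello"}
  ]
}
```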

Routing happens at config-write time — Claude models go to cortex-claude, everything else to cortex-openai. You can add more providers manually.
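The config-write-time routing amounts to a check on the model name. A sketch of the idea (the function name and prefix test are assumptions, not the actual generator code):

```python
def provider_for(model: str) -> str:
    """Route Claude models to the Anthropic-style provider, everything else
    to the OpenAI-compatible one (assumed heuristic, for illustration)."""
    if model.lower().startswith("claude"):
        return "cortex-claude"
    return "cortex-openai"
```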

Standalone proxy

A lighter deployment shape: just the Cortex proxy as its own SPCS service with a public endpoint, which lets external OpenClaw agents reach Cortex without the full stack.

External OpenClaw Agent  -->  SPCS Ingress (auth + Sf-Context-Current-User)
                               |
                               +-- Cortex Proxy (:8080, PUBLIC endpoint)
                                     +-- X-Cortex-Token → Bearer auth
                                     +-- Request transforms
                                     +-- Rate limit retry (429 backoff)
                                     |
                                     +-- Snowflake Cortex LLMs

Single container, no secrets, no volumes, no masking. Each user authenticates with their own PAT via the X-Cortex-Token custom header (which SPCS passes through untouched; the Authorization header is stripped at ingress). See snowclaw proxy for the deployment commands.
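Since ingress strips Authorization, the proxy rebuilds it from the pass-through header. A sketch of that translation (the header names come from the flow above; the function itself is illustrative, not the proxy's actual code):

```python
def bearer_from_cortex_token(headers: dict) -> dict:
    """Turn the client's X-Cortex-Token PAT into the Bearer header Cortex expects."""
    token = headers.get("X-Cortex-Token")
    if token is None:
        raise PermissionError("missing X-Cortex-Token header")
    return {"Authorization": f"Bearer {token}"}
```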

Build-time vs. runtime

SnowClaw is careful about what lives where, because the split determines what a redeploy actually changes:

Lives at build time (baked into image)                  Lives at runtime (stage-backed volume)   Lives as Snowflake SECRET
Dockerfile + base image version                         openclaw.json                            All .env variables
Cortex Code skill definition                            skills/                                  connections.toml
Build-hook installations                                workspace/                               SNOWCLAW_MASK_VARS list
npm plugins (installed via openclaw plugins install)
Path-based plugins copied into build context

That split is why snowclaw push and snowclaw restart can hot-reload config without a full deploy: the gateway reads config from the volume, not from the image.