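Enterprise On-Premises Deployment · AI Gateway

Deploy the 无界模型云 AI Gateway as a standalone component inside your own infrastructure: the same /v1 API, with model routing, keys, billing, auditing, and policies all kept on the corporate intranet so that data never leaves your domain.

"AI Gateway / Tensor Gateway" is the enterprise on-premises edition of 无界模型云. Once deployed, you get an internal entry point for model calls plus an admin console: business systems talk only to the gateway, while administrators manage organizations, API Keys, model providers, usage, and audit records in one place.

This page provides a Docker Compose setup that can be copied and used as-is. By default it runs on the local machine, with the admin console at http://localhost:3000 and Grafana at http://localhost:3030.

On-premises deployment is an enterprise offering and requires capacity planning based on your cluster size, availability zones, and compliance requirements. Contact customer service from the console for deployment support or to enable the plan.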

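Why put the gateway inside your own infrastructure

By deploying this single component internally, you place a hub that you control between your business systems and the various upstream model providers:

Data stays in your domain: prompts, responses, keys, and call records all remain on the intranet, which supports compliance and data-security requirements.
Single entry point: all teams and applications share one Base URL and Key system instead of each integrating with multiple upstream providers.
Centralized governance: routing, rate limiting, billing, auditing, and alerting are configured in one place and apply globally; model providers are transparent to business code and can be swapped.
Smooth migration: the API is the same as the public-cloud 无界模型云, so existing code migrates by changing only the Base URL and Key.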

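What the gateway provides

The gateway consists of the modules below. Which models you can actually call depends on the providers and models you enable in the admin console:

Unified multi-model routing: a single interface covers six capabilities, namely llm (chat), image (image generation / editing), asr (speech recognition), tts (speech synthesis), ocr (text recognition), and vision-segment (visual segmentation). All capabilities share the same account, the same Key, and the same billing system.
Tenant API Keys and scopes: create a separate gk_-prefixed Key for each application or team in the console, and use scopes to control which capabilities and paths it may call, which simplifies auditing, quotas, and revocation.
Metering and billing: calls are recorded by capability, model, and billing unit (tokens / images / pages / duration / characters) for budgeting and review.
Quotas and rate limiting: RPM and quota control built on sliding-window hot counters, with counts kept consistent across replicas.
Call auditing: for every request, the request body / response, usage, billing, and failover chain are persisted to the database for security review and reconciliation.
Organization-level policies: routing policies, provider assignment, overflow policies, alerts, and circuit breaking are configured per organization and take effect without a restart.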
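To make the "sliding window" idea behind RPM limiting concrete, here is a minimal in-process sketch of a sliding-window log limiter. It is not the gateway's implementation: the class name SlidingWindowLimiter and the key gk_demo are hypothetical, and a real multi-replica deployment would need a shared store for the counters, whereas this sketch keeps them in local memory.

```python
import time
from collections import deque
from typing import Deque, Dict, Optional


class SlidingWindowLimiter:
    """Illustrative sliding-window log limiter (hypothetical, in-memory only)."""

    def __init__(self, limit: int, window_seconds: float = 60.0) -> None:
        self.limit = limit
        self.window = window_seconds
        self._hits: Dict[str, Deque[float]] = {}

    def allow(self, key: str, now: Optional[float] = None) -> bool:
        """Return True and record the request if `key` is under its limit."""
        now = time.monotonic() if now is None else now
        hits = self._hits.setdefault(key, deque())
        # Drop timestamps that have slid out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


# Two requests per 60 seconds; timestamps are passed in explicitly for the demo.
limiter = SlidingWindowLimiter(limit=2, window_seconds=60)
results = [limiter.allow("gk_demo", now=t) for t in (0, 10, 20, 61)]
print(results)  # [True, True, False, True]
```

The third call is rejected because two requests already fall inside the window; by t=61 the first request has aged out, so the fourth call is admitted.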

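Integration example

The gateway's API surface is identical to that of 无界模型云. The examples below use the local deployment address to demonstrate one OpenAI-compatible chat call. AI_GATEWAY_API_KEY is the gk_-prefixed gateway Key you created in the admin console.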

curl "http://localhost:3000/v1/chat/completions" \
  -H "Authorization: Bearer $AI_GATEWAY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen-plus",
    "messages": [
      { "role": "user", "content": "用一句话介绍你自己。" }
    ]
  }'

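import os

import requests

resp = requests.post(
    "http://localhost:3000/v1/chat/completions",
    headers={
        "Authorization": f"Bearer {os.environ['AI_GATEWAY_API_KEY']}",
        "Content-Type": "application/json",
    },
    json={
        "model": "qwen-plus",
        "messages": [
            {"role": "user", "content": "用一句话介绍你自己。"}
        ],
    },
    timeout=60,  # do not hang forever if the gateway or the upstream stalls
)
resp.raise_for_status()  # fail loudly on 4xx/5xx instead of printing an error body
print(resp.json())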

const resp = await fetch("http://localhost:3000/v1/chat/completions", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.AI_GATEWAY_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "qwen-plus",
    messages: [{ role: "user", content: "用一句话介绍你自己。" }],
  }),
});
console.log(await resp.json());

A successful response uses the standard OpenAI-compatible structure:

{
  "id": "chatcmpl-xxx",
  "object": "chat.completion",
  "model": "qwen-plus",
  "choices": [
    {
      "index": 0,
      "message": { "role": "assistant", "content": "你好，我是一个中文助手。" },
      "finish_reason": "stop"
    }
  ],
  "usage": { "prompt_tokens": 12, "completion_tokens": 9, "total_tokens": 21 }
}
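As a small illustration of consuming that structure, the sketch below pulls the assistant text and token usage out of a response body. The helper name summarize is hypothetical, and the sample body is the one shown above; a real response may carry additional fields.

```python
import json

# The sample response from this page, embedded so the example runs offline.
SAMPLE = """
{
  "id": "chatcmpl-xxx",
  "object": "chat.completion",
  "model": "qwen-plus",
  "choices": [
    {
      "index": 0,
      "message": { "role": "assistant", "content": "你好，我是一个中文助手。" },
      "finish_reason": "stop"
    }
  ],
  "usage": { "prompt_tokens": 12, "completion_tokens": 9, "total_tokens": 21 }
}
"""


def summarize(body: dict) -> dict:
    """Extract the first choice's text plus usage counters (hypothetical helper)."""
    choice = body["choices"][0]
    usage = body.get("usage", {})
    return {
        "model": body.get("model"),
        "text": choice["message"]["content"],
        "finish_reason": choice.get("finish_reason"),
        "total_tokens": usage.get("total_tokens"),
    }


summary = summarize(json.loads(SAMPLE))
print(summary["text"], summary["total_tokens"])
```

Reading usage with .get keeps the helper tolerant of responses that omit the usage object.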

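The gateway data plane requires an API Key in the form Authorization: Bearer gk_.... On-premises instances come with DashScope and DeepSeek preconfigured, both through OpenAI-compatible interfaces; other protocols or providers can be added in the admin console.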

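Capabilities and models

On-premises deployments use the same gateway capabilities as 无界模型云. On first start, the system preconfigures two LLM providers:

Alibaba Cloud Bailian (阿里云百炼) / DashScope: commonly used models include qwen-plus, qwen-max, and qwen-vl-plus.
DeepSeek: commonly used models include deepseek-v4-pro and deepseek-v4-flash.

If you set DASHSCOPE_API_KEY or DEEPSEEK_API_KEY in .env, the corresponding provider becomes available automatically after the first start. You can also add keys later in the admin console, add other OpenAI- or Anthropic-compatible providers, or connect a self-hosted inference service.

The console is the source of truth for available model IDs, versions, and prices. Treat the model ID as a configuration item in your business systems so that switching models later is a configuration change.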
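One way to keep the Base URL and model ID out of business code is to read them from the environment, as in the sketch below. The variable names AI_GATEWAY_BASE_URL and AI_GATEWAY_MODEL are assumptions made for this example (only AI_GATEWAY_API_KEY appears on this page), and the helpers are hypothetical, not part of any gateway SDK.

```python
import json
import os
import urllib.request
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class GatewayConfig:
    base_url: str
    api_key: str
    model: str


def load_config(env: Mapping[str, str] = os.environ) -> GatewayConfig:
    """Read gateway settings from the environment (variable names are assumed)."""
    return GatewayConfig(
        base_url=env.get("AI_GATEWAY_BASE_URL", "http://localhost:3000").rstrip("/"),
        api_key=env.get("AI_GATEWAY_API_KEY", ""),
        model=env.get("AI_GATEWAY_MODEL", "qwen-plus"),
    )


def build_chat_request(cfg: GatewayConfig, prompt: str) -> urllib.request.Request:
    """Build, but do not send, an OpenAI-compatible chat request."""
    payload = {"model": cfg.model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        f"{cfg.base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {cfg.api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Switching to DeepSeek is a configuration change, not a code change.
cfg = load_config({"AI_GATEWAY_API_KEY": "gk_xxx", "AI_GATEWAY_MODEL": "deepseek-v4-flash"})
req = build_chat_request(cfg, "hello")
print(req.full_url)
```

Passing the request to urllib.request.urlopen(req, timeout=60) would send it; the sketch stops short of that so it runs without a live gateway.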

For more integration details, see Overview, Quick Start, and Pricing & Billing.

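Deployment and dependencies

Below is a Docker Compose setup that can be copied as-is. It starts the AI Gateway, the admin console, PostgreSQL, Redis, GreptimeDB, Vector, and Grafana.

First confirm that Docker and Docker Compose v2 are installed on the machine: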

docker --version
docker compose version

Step 1: Prepare `.env`

Create a new empty directory and create a .env file in it:

# Super administrator used for the first login
ONPREM_ADMIN_EMAIL=admin@example.com
ONPREM_ADMIN_PASSWORD=ai-gateway-admin-2026

# Optional: system default provider keys. If set, the corresponding models are usable right after the first start.
DASHSCOPE_API_KEY=
DEEPSEEK_API_KEY=

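Environment variables:

Variable	Default	Description
`ONPREM_ADMIN_EMAIL`	`admin@example.com`	Email of the super administrator used for the first login to the admin console.
`ONPREM_ADMIN_PASSWORD`	`ai-gateway-admin-2026`	Password of that super administrator. Once the database has been initialized, a restart does not reset the password.
`DASHSCOPE_API_KEY`	empty	System default key for Alibaba Cloud Bailian (阿里云百炼) / DashScope.
`DEEPSEEK_API_KEY`	empty	System default key for DeepSeek.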

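The settings in the table below already have usable defaults, so you do not need to set them for a first deployment. To change a default password, the organization name, or the local usage-ingest token, add the corresponding variable to .env:

Variable	Default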
`BETTER_AUTH_SECRET`	`ai-gateway-onprem-better-auth-secret-change-after-deployment`
`POSTGRES_PASSWORD`	`ai-gateway-postgres-2026`
`GRAFANA_ADMIN_PASSWORD`	`ai-gateway-grafana-2026`
`AI_GATEWAY_LOCAL_USAGE_INGEST_TOKEN`	`ai-gateway-local-usage-ingest-token-2026`
`ONPREM_ORG_NAME`	`default-org`

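The sample default passwords exist only for a quick start. Before production use, replace the admin, Grafana, and database passwords with your own, along with BETTER_AUTH_SECRET, whose default value is itself a reminder to change it. Set POSTGRES_PASSWORD before the first start: the PostgreSQL image applies it only when it initializes an empty data directory, so editing .env later does not change the password already stored in pg-data. Provider Keys are stored only in the local on-premises instance; they are not baked into the image and are not synced to the 无界模型云 public cloud.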
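If you prefer generated secrets to hand-typed ones, a sketch along the following lines can emit the overrides for .env. The helper generate_env_lines is hypothetical. Hex output is chosen deliberately: the Compose file interpolates POSTGRES_PASSWORD into DATABASE_URL without URL-encoding, so characters such as @, / or : would break the connection string. Whether the admin console enforces a password policy that rejects plain hex is unknown here, so treat that as an assumption to verify.

```python
import secrets
from typing import List, Sequence

# Variables from this page that hold secrets and have insecure sample defaults.
SECRET_VARS = (
    "BETTER_AUTH_SECRET",
    "POSTGRES_PASSWORD",
    "ONPREM_ADMIN_PASSWORD",
    "GRAFANA_ADMIN_PASSWORD",
    "AI_GATEWAY_LOCAL_USAGE_INGEST_TOKEN",
)


def generate_env_lines(names: Sequence[str] = SECRET_VARS, nbytes: int = 32) -> List[str]:
    """Return NAME=value lines with random hex values (2 * nbytes characters each)."""
    # token_hex yields only [0-9a-f], which is safe inside a postgres:// URL
    # and needs no quoting in a .env file.
    return [f"{name}={secrets.token_hex(nbytes)}" for name in names]


lines = generate_env_lines()
print("\n".join(lines))
```

Append the printed lines to .env before the first docker compose up, then store the values in your secrets manager.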

Step 2: Prepare `docker-compose.yml`

In the same directory, create docker-compose.yml and copy in the complete contents below:

name: ai-gateway-onprem

x-ai-gateway-image: &ai-gateway-image uhub.service.ucloud.cn/tensorfusion/ai-gateway-onprem:0.3.166

services:
  postgres:
    image: uhub.service.ucloud.cn/tensorfusion/pgvector:pg16
    restart: unless-stopped
    environment:
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:-ai-gateway-postgres-2026}
      POSTGRES_DB: raas
    volumes:
      - pg-data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres -d raas"]
      interval: 5s
      timeout: 3s
      retries: 10

  postgres-init:
    image: uhub.service.ucloud.cn/tensorfusion/pgvector:pg16
    restart: "no"
    environment:
      PGPASSWORD: ${POSTGRES_PASSWORD:-ai-gateway-postgres-2026}
    command:
      - sh
      - -lc
      - |
        psql -h postgres -U postgres -d raas \
          -v ON_ERROR_STOP=1 \
          -c 'CREATE EXTENSION IF NOT EXISTS vector;'
    depends_on:
      postgres:
        condition: service_healthy

  redis:
    image: uhub.service.ucloud.cn/tensorfusion/redis:7-alpine
    restart: unless-stopped
    volumes:
      - redis-data:/data
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 5s
      timeout: 3s
      retries: 10

  greptimedb:
    image: uhub.service.ucloud.cn/tensorfusion/greptimedb:v1.0.1
    restart: unless-stopped
    command: standalone start --http-addr 0.0.0.0:4000 --rpc-addr 0.0.0.0:4001 --mysql-addr 0.0.0.0:4002 --postgres-addr 0.0.0.0:4003
    volumes:
      - greptime-data:/tmp/greptimedb
    healthcheck:
      test: ["CMD-SHELL", "curl -sf http://localhost:4000/health || exit 1"]
      interval: 5s
      timeout: 3s
      retries: 10

  migrate:
    image: *ai-gateway-image
    restart: "no"
    command: ["bun", "run", "migrate/migrate.js"]
    env_file: .env
    environment:
      NODE_ENV: production
      DEPLOYMENT_MODE: onprem
      BETTER_AUTH_SECRET: ${BETTER_AUTH_SECRET:-ai-gateway-onprem-better-auth-secret-change-after-deployment}
      DATABASE_URL: postgres://postgres:${POSTGRES_PASSWORD:-ai-gateway-postgres-2026}@postgres:5432/raas
    depends_on:
      postgres-init:
        condition: service_completed_successfully

  ai-gateway-app:
    image: *ai-gateway-image
    restart: unless-stopped
    env_file: .env
    environment:
      NODE_ENV: production
      PORT: "3000"
      DEPLOYMENT_MODE: onprem
      BETTER_AUTH_URL: http://localhost:3000
      CLIENT_ORIGIN: http://localhost:3000
      ONPREM_TRUST_ANY_ORIGIN: "false"
      BETTER_AUTH_SECRET: ${BETTER_AUTH_SECRET:-ai-gateway-onprem-better-auth-secret-change-after-deployment}
      ONPREM_ADMIN_EMAIL: ${ONPREM_ADMIN_EMAIL:-admin@example.com}
      ONPREM_ADMIN_PASSWORD: ${ONPREM_ADMIN_PASSWORD:-ai-gateway-admin-2026}
      ONPREM_ORG_NAME: ${ONPREM_ORG_NAME:-default-org}
      DATABASE_URL: postgres://postgres:${POSTGRES_PASSWORD:-ai-gateway-postgres-2026}@postgres:5432/raas
      REDIS_URL: redis://redis:6379
      GREPTIME_DATABASE_URL: postgres://greptimedb:4003/public
      GREPTIMEDB_REMOTE_WRITE_UPSTREAM_URL: http://greptimedb:4000/v1/prometheus/write?db=public
      GREPTIMEDB_INFLUX_WRITE_UPSTREAM_URL: http://greptimedb:4000/v1/influxdb/api/v2/write?bucket=public
      METRICS_QUERY_URL: http://greptimedb:4000/v1/prometheus
      METRICS_QUERY_DB: public
      AI_GATEWAY_USAGE_PIPELINE: vector
      AI_GATEWAY_LOCAL_USAGE_INGEST_TOKEN: ${AI_GATEWAY_LOCAL_USAGE_INGEST_TOKEN:-ai-gateway-local-usage-ingest-token-2026}
      AI_GATEWAY_LOGS_DIR: /var/log/ai-gateway
      IMAGE_DATA_DIR: /data/images
      TZ: Asia/Shanghai
      DASHSCOPE_API_KEY: ${DASHSCOPE_API_KEY:-}
      DEEPSEEK_API_KEY: ${DEEPSEEK_API_KEY:-}
    ports:
      - "127.0.0.1:3000:3000"
    volumes:
      - image-data:/data/images
      - usage-log:/var/log/ai-gateway
    depends_on:
      migrate:
        condition: service_completed_successfully
      redis:
        condition: service_healthy
      greptimedb:
        condition: service_healthy
    healthcheck:
      test: ["CMD", "bun", "-e", "fetch('http://localhost:3000/api/health').then(r=>process.exit(r.ok?0:1)).catch(()=>process.exit(1))"]
      interval: 10s
      timeout: 5s
      retries: 12
      start_period: 60s

  vector:
    image: uhub.service.ucloud.cn/tensorfusion/vector:0.40.0-alpine
    restart: unless-stopped
    environment:
      AI_GATEWAY_LOCAL_USAGE_INGEST_TOKEN: ${AI_GATEWAY_LOCAL_USAGE_INGEST_TOKEN:-ai-gateway-local-usage-ingest-token-2026}
    volumes:
      - usage-log:/var/log/ai-gateway:ro
      - vector-data:/var/lib/vector
    entrypoint: ["/bin/sh", "-c"]
    command:
      - |
        cat >/tmp/vector.yaml <<'YAML'
        api:
          enabled: true
          address: 0.0.0.0:8686
        data_dir: /var/lib/vector
        sources:
          usage_log:
            type: file
            include:
              - /var/log/ai-gateway/usage/*.log*
            read_from: beginning
          master_scrape:
            type: prometheus_scrape
            endpoints:
              - http://ai-gateway-app:3000/api/metrics/prom
            scrape_interval_secs: 15
        transforms:
          parse_usage:
            type: remap
            inputs: [usage_log]
            source: |
              parsed, err = parse_json(.message)
              if err != null {
                abort
              }
              . = parsed
        sinks:
          local_usage_batch:
            type: http
            inputs: [parse_usage]
            uri: http://ai-gateway-app:3000/api/local/usage-batch
            method: post
            encoding:
              codec: json
            framing:
              method: newline_delimited
            request:
              headers:
                content-type: application/x-ndjson
              retry_attempts: 4294967295
            auth:
              strategy: bearer
              token: "$${AI_GATEWAY_LOCAL_USAGE_INGEST_TOKEN}"
            batch:
              max_events: 50
              timeout_secs: 1
            buffer:
              type: disk
              when_full: block
              max_size: 268435488
          greptimedb_prom:
            type: prometheus_remote_write
            inputs: [master_scrape]
            endpoint: http://greptimedb:4000/v1/prometheus/write?db=public
            healthcheck:
              enabled: false
        YAML
        exec vector --config /tmp/vector.yaml
    depends_on:
      ai-gateway-app:
        condition: service_healthy
      greptimedb:
        condition: service_healthy
    healthcheck:
      test: ["CMD-SHELL", "wget -q -O- http://127.0.0.1:8686/health | grep '\"ok\":true' >/dev/null"]
      interval: 10s
      timeout: 5s
      retries: 12
      start_period: 30s

  grafana:
    image: uhub.service.ucloud.cn/tensorfusion/grafana:11.3.0
    restart: unless-stopped
    environment:
      GF_SECURITY_ADMIN_PASSWORD: ${GRAFANA_ADMIN_PASSWORD:-ai-gateway-grafana-2026}
      GF_USERS_DEFAULT_THEME: light
    entrypoint: ["/bin/sh", "-c"]
    command:
      - |
        mkdir -p /etc/grafana/provisioning/datasources
        cat >/etc/grafana/provisioning/datasources/greptime.yaml <<'YAML'
        apiVersion: 1
        datasources:
          - name: GreptimeDB PromQL
            uid: greptime-promql
            type: prometheus
            access: proxy
            url: http://greptimedb:4000/v1/prometheus
            isDefault: true
          - name: GreptimeDB Logs
            uid: greptime-logs
            type: postgres
            access: proxy
            url: greptimedb:4003
            database: public
            user: greptime
            jsonData:
              postgresVersion: 1200
              sslmode: disable
        YAML
        mkdir -p /etc/grafana/provisioning/dashboards /var/lib/grafana/dashboards
        cat >/etc/grafana/provisioning/dashboards/ai-gateway.yaml <<'YAML'
        apiVersion: 1
        providers:
          - name: ai-gateway-onprem
            orgId: 1
            type: file
            disableDeletion: false
            updateIntervalSeconds: 30
            options:
              path: /var/lib/grafana/dashboards
        YAML
        cat >/var/lib/grafana/dashboards/ai-gateway-onprem-overview.json <<'JSON'
        {
          "annotations": { "list": [] },
          "editable": true,
          "fiscalYearStartMonth": 0,
          "graphTooltip": 0,
          "id": null,
          "links": [],
          "panels": [
            {
              "datasource": { "type": "prometheus", "uid": "greptime-promql" },
              "gridPos": { "h": 6, "w": 8, "x": 0, "y": 0 },
              "id": 1,
              "targets": [
                {
                  "datasource": { "type": "prometheus", "uid": "greptime-promql" },
                  "expr": "sum(rate(ai_gateway_outbox_emits_total[5m]))",
                  "legendFormat": "events",
                  "refId": "A"
                }
              ],
              "title": "事件发布速率",
              "type": "stat"
            },
            {
              "datasource": { "type": "prometheus", "uid": "greptime-promql" },
              "gridPos": { "h": 6, "w": 16, "x": 8, "y": 0 },
              "id": 2,
              "targets": [
                {
                  "datasource": { "type": "prometheus", "uid": "greptime-promql" },
                  "expr": "sum(rate(ai_gateway_outbox_worker_published_total[5m]))",
                  "legendFormat": "published",
                  "refId": "A"
                },
                {
                  "datasource": { "type": "prometheus", "uid": "greptime-promql" },
                  "expr": "sum(rate(ai_gateway_outbox_worker_publish_failures_total[5m]))",
                  "legendFormat": "failures",
                  "refId": "B"
                }
              ],
              "title": "Outbox 处理",
              "type": "timeseries"
            },
            {
              "datasource": { "type": "postgres", "uid": "greptime-logs" },
              "gridPos": { "h": 8, "w": 24, "x": 0, "y": 6 },
              "id": 3,
              "targets": [
                {
                  "datasource": { "type": "postgres", "uid": "greptime-logs" },
                  "format": "table",
                  "rawQuery": true,
                  "rawSql": "SHOW TABLES",
                  "refId": "A"
                }
              ],
              "title": "Greptime 表",
              "type": "table"
            }
          ],
          "refresh": "30s",
          "schemaVersion": 40,
          "tags": ["ai-gateway", "onprem"],
          "templating": { "list": [] },
          "time": { "from": "now-1h", "to": "now" },
          "timepicker": {},
          "timezone": "browser",
          "title": "AI Gateway Onprem Overview",
          "uid": "ai-gateway-onprem-overview",
          "version": 1,
          "weekStart": ""
        }
        JSON
        exec /run.sh
    ports:
      - "127.0.0.1:3030:3000"
    volumes:
      - grafana-data:/var/lib/grafana
    depends_on:
      greptimedb:
        condition: service_healthy
    healthcheck:
      test: ["CMD-SHELL", "wget -q -O- http://127.0.0.1:3000/api/health | grep '\"database\": \"ok\"' >/dev/null"]
      interval: 10s
      timeout: 5s
      retries: 12
      start_period: 30s

volumes:
  pg-data:
  redis-data:
  greptime-data:
  image-data:
  usage-log:
  vector-data:
  grafana-data:

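Step 3: Start

docker compose pull
docker compose up -d
docker compose ps -a

Normally you will see the following (the -a flag is needed because docker compose ps hides exited one-shot containers such as postgres-init and migrate by default):

postgres           healthy
postgres-init      exited 0
redis              healthy
greptimedb         healthy
migrate            exited 0
ai-gateway-app     healthy
vector             healthy
grafana            healthy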

Step 4: Verify the services

curl http://localhost:3000/api/health
curl http://localhost:3030/api/health

The first endpoint confirms that the AI Gateway has started; the second confirms that Grafana is available.
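In a deployment script you may want to wait for both endpoints rather than check them once. The following is a minimal polling sketch using only the standard library; the function names, attempt count, and delay are arbitrary choices for illustration. The demo at the bottom uses a fake probe so that the example runs without a live deployment.

```python
import time
import urllib.error
import urllib.request
from typing import Callable, Iterable


def http_ok(url: str, timeout: float = 3.0) -> bool:
    """Return True if a GET to `url` answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeouts, and 4xx/5xx responses all count as "not ready".
        return False


def wait_until_healthy(
    urls: Iterable[str],
    probe: Callable[[str], bool] = http_ok,
    attempts: int = 30,
    delay: float = 2.0,
) -> bool:
    """Poll until every URL passes `probe`, or give up after `attempts` rounds."""
    pending = list(urls)
    for _ in range(attempts):
        pending = [u for u in pending if not probe(u)]
        if not pending:
            return True
        time.sleep(delay)
    return False


# Demo: a fake probe that starts succeeding on its third call.
calls = {"n": 0}


def fake_probe(url: str) -> bool:
    calls["n"] += 1
    return calls["n"] >= 3


ready = wait_until_healthy(["http://localhost:3000/api/health"], probe=fake_probe, delay=0)
print(ready, calls["n"])  # True 3
```

Against a real deployment, call wait_until_healthy with both health URLs above and the default probe, and exit non-zero if it returns False.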

Step 5: First login

Open:

http://localhost:3000

Log in with the ONPREM_ADMIN_EMAIL / ONPREM_ADMIN_PASSWORD values from .env. This account is the first super administrator of the on-premises instance.

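Step 6: Configure model providers

On first start, an on-premises instance preconfigures only two providers: Alibaba Cloud Bailian (阿里云百炼) / DashScope and DeepSeek.

If you already set DASHSCOPE_API_KEY or DEEPSEEK_API_KEY in .env, the corresponding provider is available automatically. Examples of commonly used models:

Provider	Example models
Alibaba Cloud Bailian (阿里云百炼) / DashScope	`qwen-plus`, `qwen-max`, `qwen-vl-plus`
DeepSeek	`deepseek-v4-pro`, `deepseek-v4-flash`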

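If you did not provide keys at startup, or need to update them:

Log in to the admin console and open 大模型对话 (LLM Chat) in the left sidebar.
Switch to the 配置 (Configuration) tab.
Find 阿里云百炼 or DeepSeek and click 切换到自有密钥 (Switch to own key) or the edit button.
Enter the provider API Key, select the models to expose, and click Save.
Go to the API密钥 (API Keys) page and create a gk_-prefixed gateway Key for your business system.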

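To connect another provider or a self-hosted inference service, go to 大模型对话 > 配置 and click 添加供应商 (Add provider):

Field	How to fill it in
供应商名称 (Provider name)	Display name in the admin console, for example `本地 Qwen`.
供应商标识 (Provider identifier)	A stable English identifier, for example `local-qwen`.
API 格式 (API format)	Choose the OpenAI- or Anthropic-compatible format according to the upstream interface.
站点地址 (Site address)	Upstream Base URL, for example `http://vllm:8000`.
API 路径 (API path)	For chat, usually `/v1/chat/completions`.
API 密钥 (API key)	The upstream Key; may be left empty for intranet services without authentication.
模型列表 (Model list)	The model IDs to expose, separated by ASCII commas.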
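Before typing these values into the form, it can help to sanity-check them in a script. The sketch below models the form as a plain data class; ProviderDraft, its field names, and its validation rules are all assumptions made for illustration and do not correspond to a gateway API. The model IDs in the demo are placeholders.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class ProviderDraft:
    """Hypothetical stand-in for the 'Add provider' form fields."""

    name: str
    slug: str
    api_format: str
    site: str
    api_path: str = "/v1/chat/completions"
    api_key: str = ""
    models: str = ""

    def endpoint(self) -> str:
        """Join site address and API path with exactly one slash."""
        return self.site.rstrip("/") + "/" + self.api_path.lstrip("/")

    def model_ids(self) -> List[str]:
        """Split the comma-separated model list, dropping blanks."""
        return [m.strip() for m in self.models.split(",") if m.strip()]

    def problems(self) -> List[str]:
        """Return human-readable issues; an empty list means the draft looks sane."""
        issues = []
        if not re.fullmatch(r"[a-z0-9][a-z0-9-]*", self.slug):
            issues.append("slug should use lowercase letters, digits, and hyphens")
        if self.api_format not in ("openai", "anthropic"):
            issues.append("api_format should be 'openai' or 'anthropic'")
        if not self.site.startswith(("http://", "https://")):
            issues.append("site should start with http:// or https://")
        if not self.model_ids():
            issues.append("at least one model ID is required")
        return issues


draft = ProviderDraft(
    name="Local Qwen",
    slug="local-qwen",
    api_format="openai",
    site="http://vllm:8000/",
    models="qwen-local-a, qwen-local-b",  # placeholder model IDs
)
print(draft.endpoint(), draft.model_ids(), draft.problems())
```

The endpoint helper mainly guards against a doubled or missing slash between the site address and the API path, which is an easy mistake when the two are entered in separate fields.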

Step 7: Connect business systems

Business systems only need the local gateway address and a gk_ Key created in the admin console:

curl "http://localhost:3000/v1/chat/completions" \
  -H "Authorization: Bearer gk_xxx" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen-plus",
    "messages": [
      { "role": "user", "content": "你好" }
    ]
  }'

To use DeepSeek, change model to deepseek-v4-flash or to another model ID enabled in the admin console.

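Step 8: View usage and runtime status

Open:

http://localhost:3030

The username is admin and the default password is ai-gateway-grafana-2026. If you overrode GRAFANA_ADMIN_PASSWORD in .env, use that password instead. The default dashboard shows gateway runtime status, usage collection, and basic metrics.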

Upgrading

Back up PostgreSQL before upgrading. For each upgrade, 同思 provides a new AI Gateway image tag; replace the tag of x-ai-gateway-image in docker-compose.yml with the new version, then restart.

docker compose exec -T postgres sh -lc \
  'pg_dump -U postgres raas' \
  | gzip > ai-gateway-before-upgrade-$(date +%F-%H%M).sql.gz

docker compose pull ai-gateway-app migrate
docker compose up -d
docker compose ps

migrate runs the database migrations first; the new gateway version starts only after they succeed.

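Backup

In production, back up PostgreSQL at a minimum. It stores users, organizations, API Keys, providers, model configuration, and billing configuration.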

docker compose exec -T postgres sh -lc \
  'pg_dump -U postgres raas' \
  | gzip > ai-gateway-backup-$(date +%F).sql.gz
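A backup is only useful if it can be read back. As a cheap first check, the sketch below decompresses the whole archive (which also exercises the gzip checksum) and looks for the header comment that plain-format pg_dump writes. The function name dump_looks_valid is hypothetical, and passing this check does not replace a real restore test.

```python
import gzip
import tempfile
import zlib
from pathlib import Path


def dump_looks_valid(path: Path, marker: str = "PostgreSQL database dump") -> bool:
    """True if the gzip stream decompresses fully and contains the pg_dump header."""
    found = False
    try:
        with gzip.open(path, "rt", encoding="utf-8", errors="replace") as fh:
            # Read to the end so a truncated or corrupt archive raises.
            for line in fh:
                if marker in line:
                    found = True
    except (OSError, EOFError, zlib.error):
        return False
    return found


# Demo with synthetic files so the example runs without a real backup.
with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "good.sql.gz"
    with gzip.open(good, "wt", encoding="utf-8") as fh:
        fh.write("--\n-- PostgreSQL database dump\n--\nSELECT 1;\n")
    bad = Path(tmp) / "bad.sql.gz"
    bad.write_bytes(good.read_bytes()[:-10])  # simulate an interrupted write
    good_ok = dump_looks_valid(good)
    bad_ok = dump_looks_valid(bad)

print(good_ok, bad_ok)  # True False
```

Running this against the file produced before an upgrade catches the common failure where the pipe was interrupted and the archive is truncated.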

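To restore into a fresh, empty data volume, stop the services, recreate the PostgreSQL volume, then import the backup. The --wait flag blocks until the postgres health check passes, so psql does not run against a database that is still starting:

docker compose down
docker volume rm ai-gateway-onprem_pg-data
docker compose up -d --wait postgres
gzip -dc ai-gateway-backup-YYYY-MM-DD.sql.gz \
  | docker compose exec -T postgres sh -lc 'psql -U postgres raas'
docker compose up -d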

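If you also need to keep runtime data, back up the volumes according to the following guidance:

Volume	Contents	Recommendation
`pg-data`	Users, organizations, API Keys, providers, model configuration, billing configuration	Must be backed up.
`greptime-data`	Usage, metrics, call logs, and dashboard query data	Back up if you need historical usage and audit data.
`image-data`	Files written to disk by image generation / editing	Back up if image capabilities are enabled.
`usage-log`	Local usage logs that Vector has not yet ingested	Keep before upgrading; once Vector is confirmed healthy, clean up according to your policy.
`grafana-data`	Grafana local user preferences and manually edited dashboards	Back up if you have edited dashboards in Grafana.
`redis-data` / `vector-data`	Cache and collection state	Usually rebuildable; back up as well in strict audit scenarios.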

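Production recommendations

Create a separate gk_ Key for each business system to simplify rate limiting, auditing, and revocation.
For production access, put your own HTTPS domain or reverse proxy in front and forward to 127.0.0.1:3000. The Compose file above hard-codes BETTER_AUTH_URL and CLIENT_ORIGIN to http://localhost:3000, so these values will likely need to match the external address as well.
Do not expose /api/local/usage-batch through the reverse proxy; that endpoint is for the internal collection component only.
Store BETTER_AUTH_SECRET, POSTGRES_PASSWORD, ONPREM_ADMIN_PASSWORD, GRAFANA_ADMIN_PASSWORD, AI_GATEWAY_LOCAL_USAGE_INGEST_TOKEN, and provider Keys in your own secrets management system.
For production, run PostgreSQL, Redis, and GreptimeDB on your existing high-availability services.
If you need multi-region or multi-replica deployment, or a self-hosted GPU inference cluster, contact customer service during planning for topology and capacity advice.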

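Integration recommendations

In production, keep API Keys on the server side; never expose them to browsers or client apps.
Use a different Key for each application to simplify auditing, quotas, and revocation.
Have business code treat the data-plane Base URL and the model ID as configuration items, so that migrating or switching deployments only requires a configuration change.
Plan capacity and high availability for PostgreSQL / Redis / GreptimeDB before deploying.
Contact customer service when you need deployment support.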