Vision Segmentation (SAM3 Image / Video)

POST /v1/vision-segment/predictions for image segmentation and POST /v1/vision-segment/video for video segmentation, both based on SAM3.

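Wujie Model Cloud (无界模型云) offers vision segmentation based on SAM3 (Segment Anything Model 3): it performs instance segmentation on an image or video from text, point, or box prompts, and returns masks, overlay images, bounding boxes, and confidence scores.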

Image segmentation: POST https://ai.tos.run/v1/vision-segment/predictions, application/json.
Video segmentation: POST https://ai.tos.run/v1/vision-segment/video, application/json.
Default model: facebook/sam3. Default provider: wujie (self-hosted).

Private deployments expose an identical API; only the Base domain needs to be replaced. See Enterprise Private Deployment.

Authentication and Base

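Host: vision segmentation is served on the master host https://ai.tos.run. The data-plane host api.tos.run has no vision-segment path, so point curl and SDK calls directly at ai.tos.run.
Auth header: Authorization: Bearer $TOS_API_KEY. Keys start with gk_.
Scope: ai:vision-segment is required (select it when creating the Key; the ai:* wildcard also works).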

See Authentication for details.
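As a minimal sketch of the rules above, the headers for a call could be assembled as follows. The helper name build_headers is hypothetical, and the gk_ prefix check is only a local sanity check: it cannot tell whether the key actually carries the ai:vision-segment scope.

```python
BASE_URL = "https://ai.tos.run"  # master host; api.tos.run has no vision-segment path


def build_headers(api_key: str) -> dict:
    """Return request headers for a vision-segment call (hypothetical helper)."""
    if not api_key.startswith("gk_"):
        # Local sanity check only; the server remains the authority on key validity.
        raise ValueError("expected an API key starting with 'gk_'")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```

In practice the key would typically come from the TOS_API_KEY environment variable, for example build_headers(os.environ["TOS_API_KEY"]).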

Request structure

Both endpoints share the same outer structure:

{
  "provider": "wujie",
  "model": "facebook/sam3",
  "input": { /* see below */ }
}

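provider: defaults to wujie (self-hosted SAM3).
model: defaults to facebook/sam3.
input: the segmentation parameter object; its fields differ between the image and video endpoints.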

input.image / input.video accept either an https://... URL or a base64 data URI of the form data:image/*;base64,... / data:video/*;base64,...; use whichever suits your case.
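For local files, the data URI form can be produced with the Python standard library. The following is a sketch under stated assumptions: to_data_uri is a hypothetical helper, and the MIME type is guessed from the file extension rather than from the file contents.

```python
import base64
import mimetypes
from pathlib import Path


def to_data_uri(path: str) -> str:
    """Encode a local image or video file as a base64 data URI (hypothetical helper)."""
    mime, _ = mimetypes.guess_type(path)  # guessed from the extension only
    if mime is None or not mime.startswith(("image/", "video/")):
        raise ValueError(f"cannot infer an image/video MIME type for {path!r}")
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:{mime};base64,{encoded}"
```

Base64 inflates the payload by roughly one third, so a URL is likely the more practical choice for large videos; any request size limit is not stated on this page.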

Image segmentation

Send JSON to /v1/vision-segment/predictions. Fields of input:

Field	Type	Required	Default	Description
`image`	string	Yes	—	https URL or `data:image/*` base64 URI
`prompt`	string	No	`person`	Text prompt describing the target to segment
`threshold`	number	No	`0.5`	Confidence threshold, 0–1
`point_prompts`	array	No	—	Point prompts; each element is `{ x, y, positive }`
`box_prompts`	array	No	—	Box prompts; each element is `{ x_min, y_min, x_max, y_max }`
`mask_color`	string	No	`green`	Mask color, one of `green`/`red`/`blue`/`yellow`/`cyan`/`magenta`
`mask_opacity`	number	No	`0.5`	Mask opacity, 0–1
`save_overlay`	boolean	No	`false`	Whether to output an overlay image
`mask_only`	boolean	No	`false`	Output the mask only
`return_zip`	boolean	No	`true`	Whether to bundle multi-file results into a zip
`include_scores`	boolean	No	`false`	Whether to return `scores`
`include_boxes`	boolean	No	`false`	Whether to return `boxes`

curl "https://ai.tos.run/v1/vision-segment/predictions" \
  -H "Authorization: Bearer $TOS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "provider": "wujie",
    "model": "facebook/sam3",
    "input": {
      "image": "https://example.com/street.jpg",
      "prompt": "person",
      "threshold": 0.5,
      "mask_color": "green",
      "include_scores": true,
      "include_boxes": true
    }
  }'

import os
import requests

resp = requests.post(
    "https://ai.tos.run/v1/vision-segment/predictions",
    headers={"Authorization": f"Bearer {os.environ['TOS_API_KEY']}"},
    json={
        "provider": "wujie",
        "model": "facebook/sam3",
        "input": {
            "image": "https://example.com/street.jpg",
            "prompt": "person",
            "threshold": 0.5,
            "mask_color": "green",
            "include_scores": True,
            "include_boxes": True,
        },
    },
    timeout=300,
)
resp.raise_for_status()
result = resp.json()
print(result["status"], result.get("output"))

const resp = await fetch("https://ai.tos.run/v1/vision-segment/predictions", {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${process.env.TOS_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    provider: "wujie",
    model: "facebook/sam3",
    input: {
      image: "https://example.com/street.jpg",
      prompt: "person",
      threshold: 0.5,
      mask_color: "green",
      include_scores: true,
      include_boxes: true,
    },
  }),
});
const result = await resp.json();
console.log(result.status, result.output);
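The examples above use only a text prompt. As a rough sketch, point and box prompts could be assembled from the element shapes listed in the field table. The helpers point, box, and image_request are hypothetical. This page does not state whether coordinates are pixels or normalized values, nor how text, point, and box prompts combine, so the numbers below are placeholders.

```python
import json


def point(x: float, y: float, positive: bool = True) -> dict:
    """One point prompt; reading positive=False as an exclusion point is an assumption."""
    return {"x": x, "y": y, "positive": positive}


def box(x_min: float, y_min: float, x_max: float, y_max: float) -> dict:
    """One box prompt; the ordering check is a client-side assumption."""
    if x_min >= x_max or y_min >= y_max:
        raise ValueError("box must satisfy x_min < x_max and y_min < y_max")
    return {"x_min": x_min, "y_min": y_min, "x_max": x_max, "y_max": y_max}


def image_request(image: str, points=(), boxes=(), **options) -> dict:
    """Build the request body for /v1/vision-segment/predictions."""
    input_obj = {"image": image, **options}
    if points:
        input_obj["point_prompts"] = list(points)
    if boxes:
        input_obj["box_prompts"] = list(boxes)
    return {"provider": "wujie", "model": "facebook/sam3", "input": input_obj}


body = image_request(
    "https://example.com/street.jpg",
    points=[point(320, 240), point(50, 60, positive=False)],
    boxes=[box(100, 80, 400, 360)],
    include_boxes=True,
)
print(json.dumps(body, indent=2))
```

The resulting body can be passed as the json= argument of the requests.post call shown earlier.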

Video segmentation

Send JSON to /v1/vision-segment/video. Fields of input (prompt is required on the video endpoint):

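Field	Type	Required	Default	Description
`video`	string	Yes	—	https URL or `data:video/*` base64 URI
`prompt`	string	Yes	—	Text prompt describing the target to segment and track
`output_format`	string	No	`mp4`	Output format, one of `mp4`/`frames`/`masks`/`coco_rle`
`threshold`	number	No	`0.5`	Confidence threshold, 0–1
`frame_stride`	integer	No	`1`	Frame sampling stride
`start_time`	number	No	`0`	Start time (seconds)
`end_time`	number	No	—	End time (seconds)
`max_frames`	integer	No	—	Maximum number of frames to process
`prompt_propagation`	boolean	No	`true`	Whether to propagate the prompt across frames
`re_detect_every`	integer	No	`30`	Re-run detection every N frames
`mask_color`	string	No	`green`	Mask color; same values as the image endpoint
`mask_opacity`	number	No	`0.5`	Mask opacity, 0–1
`fps`	number	No	—	Output frame rate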

curl "https://ai.tos.run/v1/vision-segment/video" \
  -H "Authorization: Bearer $TOS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "provider": "wujie",
    "model": "facebook/sam3",
    "input": {
      "video": "https://example.com/clip.mp4",
      "prompt": "the running dog",
      "output_format": "mp4",
      "frame_stride": 1,
      "max_frames": 300
    }
  }'
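A Python counterpart of the curl call above can be sketched as a small payload builder. The name video_request is hypothetical. The empty-prompt check mirrors the "required" column of the field table, while the start_time / end_time ordering check is a client-side assumption, not documented server behavior.

```python
def video_request(video: str, prompt: str, **options) -> dict:
    """Build the request body for /v1/vision-segment/video (hypothetical helper)."""
    if not prompt:
        raise ValueError("prompt is required for the video endpoint")
    start = options.get("start_time", 0)  # documented default is 0
    end = options.get("end_time")
    if end is not None and end <= start:
        # Assumption: a clip window only makes sense when end_time > start_time.
        raise ValueError("end_time must be greater than start_time")
    return {
        "provider": "wujie",
        "model": "facebook/sam3",
        "input": {"video": video, "prompt": prompt, **options},
    }


body = video_request(
    "https://example.com/clip.mp4",
    "the running dog",
    output_format="mp4",
    start_time=2.0,
    end_time=12.0,
    frame_stride=2,
    max_frames=300,
)
```

The body can then be sent with requests.post("https://ai.tos.run/v1/vision-segment/video", headers=..., json=body, timeout=300), in the same way as the image example.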

Response

Both endpoints return the same structure. status reports the task state, and output carries the result file:

{
  "id": "vs_8f3c...",
  "status": "succeeded",
  "input": { "...": "echo of the request input" },
  "output": "data:image/png;base64,iVBORw0KGgo...",
  "logs": "...",
  "error": null,
  "metrics": { "...": "timing and other metrics" },
  "scores": [0.97],
  "boxes": [[10, 20, 200, 400]]
}

status: starting / processing / succeeded / canceled / failed.
output: the result file. It may be a data: base64 URI or a same-origin download URL materialized by the gateway (under /api/vision-segment/outputs/<filename>).
error: an error description on failure (null on success).
scores / boxes: returned when include_scores / include_boxes, respectively, is true (image endpoint).

When output is a same-origin download URL, the file is served from /api/vision-segment/outputs/... and is protected by the same ai:vision-segment scope. Send the same gk_ Key in the Authorization: Bearer header to download it.
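Since output can take either form, client code has to branch on it. The sketch below uses only the standard library. The names decode_data_uri, download_request, and fetch_output are hypothetical, and resolving a path-only output against https://ai.tos.run is an assumption, because this page does not say whether the materialized URL is absolute.

```python
import base64
import urllib.request
from urllib.parse import urljoin

BASE_URL = "https://ai.tos.run"


def decode_data_uri(uri: str) -> tuple:
    """Split a base64 data URI into (mime_type, raw_bytes)."""
    header, _, payload = uri.partition(",")
    if not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("not a base64 data URI")
    mime = header[len("data:"):-len(";base64")]
    return mime, base64.b64decode(payload)


def download_request(output: str, api_key: str) -> urllib.request.Request:
    """Build (without sending) an authenticated request for a materialized output."""
    url = urljoin(BASE_URL, output)  # leaves absolute URLs unchanged, resolves bare paths
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {api_key}"})


def fetch_output(output: str, api_key: str) -> bytes:
    """Return the result bytes for either form of the output field."""
    if output.startswith("data:"):
        return decode_data_uri(output)[1]
    with urllib.request.urlopen(download_request(output, api_key), timeout=300) as resp:
        return resp.read()
```

A caller would typically write the returned bytes to disk, for example Path("mask.png").write_bytes(fetch_output(result["output"], api_key)).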

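Video segmentation is compute-heavy, and its running time depends strongly on video length, frame_stride, and max_frames. Set a generous client timeout (≥ 300s), and decide success from the status field in the response rather than from the HTTP status code alone.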
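The advice to check status rather than the HTTP code can be captured in a small guard. This is a sketch: require_success and SegmentationError are hypothetical names. Because this page documents no polling endpoint, a non-terminal status (starting / processing) is simply reported as "not finished".

```python
TERMINAL_FAILURES = {"canceled", "failed"}


class SegmentationError(RuntimeError):
    """Raised when a response does not describe a succeeded task."""


def require_success(result: dict) -> dict:
    """Return the response unchanged if status is 'succeeded', otherwise raise."""
    status = result.get("status")
    task_id = result.get("id")
    if status == "succeeded":
        return result
    if status in TERMINAL_FAILURES:
        raise SegmentationError(f"task {task_id} {status}: {result.get('error')}")
    # starting / processing (or an unknown value): the task has not finished.
    raise SegmentationError(f"task {task_id} not finished (status={status})")
```

It would be applied right after parsing the body, for example result = require_success(resp.json()), so that an HTTP 200 carrying status "failed" is not mistaken for success.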
