
06: Loading the model cost map from S3

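Overview

In addition to the existing "remote HTTP fetch + local backup" mechanism, this repository adds the ability to read model_prices_and_context_window.json through a signed request from S3-compatible object storage (AWS S3, MinIO, an internal OSS, and so on). This path shares the fetch_remote_model_cost_map() entry point with the HTTP path; the S3 path is taken only when all four required S3 environment variables are set.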


Load priority

flowchart TD
    A[import litellm\n__init__.py:438-440] --> B[get_model_cost_map url\nget_model_cost_map.py:221]
    B --> C{LITELLM_LOCAL_MODEL_COST_MAP\n=true?}
    C -- yes --> D[load_local_model_cost_map\nread backup JSON]
    C -- no --> E[fetch_remote_model_cost_map url]
    E --> F{_has_required_s3_env\nall 4 S3 variables set?}
    F -- yes --> G[fetch_s3_signed_model_cost_map\nboto3 SigV4 GET]
    F -- no --> H[httpx.get url\ndefault GitHub raw]
    G -- success --> I[validate_model_cost_map]
    H -- success --> I
    G -- failure --> J[fallback: load_local_model_cost_map]
    H -- failure --> J
    I -- passes --> K[litellm.model_cost]
    I -- validation fails --> J
    D --> K
    J --> K

Three priority levels

| Level | Trigger | Price source |
|---|---|---|
| 1 | LITELLM_LOCAL_MODEL_COST_MAP=true | litellm/model_prices_and_context_window_backup.json |
| 2 | LITELLM_LOCAL_MODEL_COST_MAP!=true and all four required S3 variables set | S3 object s3://{S3_BUCKET_NAME}/{MODEL_COST_MAP_S3_KEY} |
| 3 | Neither of the above | LITELLM_MODEL_COST_MAP_URL (default: GitHub raw) |

If the fetch fails at level 2 or 3, or the fetched map fails validation, the loader automatically falls back to the local backup.
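The three levels above can be sketched as a small pure function that only inspects environment variables and performs no network access. This is an illustration, not code from the repository: resolve_cost_map_source and its return strings are hypothetical names chosen for this example.

```python
import os
from typing import Mapping, Optional

# The four variables that _has_required_s3_env() checks.
REQUIRED_S3_VARS = (
    "S3_ENDPOINT_URL",
    "S3_ACCESS_KEY",
    "S3_SECRET_KEY",
    "S3_BUCKET_NAME",
)
DEFAULT_S3_KEY = "model_prices_and_context_window.json"


def resolve_cost_map_source(env: Optional[Mapping[str, str]] = None) -> str:
    """Return a label for the source the loader would try first (hypothetical helper)."""
    env = os.environ if env is None else env

    # Level 1: the local backup is forced.
    if env.get("LITELLM_LOCAL_MODEL_COST_MAP", "").lower() == "true":
        return "local backup"

    # Level 2: all four S3 variables are set and non-empty.
    if all(env.get(name) for name in REQUIRED_S3_VARS):
        key = env.get("MODEL_COST_MAP_S3_KEY", DEFAULT_S3_KEY)
        return f"s3://{env['S3_BUCKET_NAME']}/{key}"

    # Level 3: plain HTTP fetch of the configured URL.
    return "remote HTTP"
```

Calling resolve_cost_map_source() with no argument inspects the current process environment, which can help confirm a deployment's configuration before the proxy starts.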


Entry-point code

File: litellm/litellm_core_utils/get_model_cost_map.py

get_model_cost_map(): public entry point

# get_model_cost_map.py:221-261
def get_model_cost_map(url: str) -> dict:
    # 1. Force the local backup
    if os.getenv("LITELLM_LOCAL_MODEL_COST_MAP", "").lower() == "true":
        return GetModelCostMap.load_local_model_cost_map()

    # 2. Remote fetch (S3 or HTTP)
    try:
        content = GetModelCostMap.fetch_remote_model_cost_map(url)
    except Exception as e:
        verbose_logger.warning(...)
        return GetModelCostMap.load_local_model_cost_map()

    # 3. Integrity check (total model count)
    if not GetModelCostMap.validate_model_cost_map(
        fetched_map=content,
        backup_model_count=GetModelCostMap._get_backup_model_count(),
    ):
        return GetModelCostMap.load_local_model_cost_map()

    return content
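The body of validate_model_cost_map() is not shown in this section; the excerpt only indicates that it compares the fetched map against the backup's model count. The following is a minimal sketch of what such a check could look like. The function name looks_complete and the 0.5 ratio threshold are assumptions made for illustration, not the repository's actual rule.

```python
from typing import Any


def looks_complete(
    fetched_map: Any,
    backup_model_count: int,
    min_ratio: float = 0.5,  # assumed threshold, not taken from the source
) -> bool:
    """Reject maps that are not dicts, are empty, or are far smaller than the backup."""
    if not isinstance(fetched_map, dict) or not fetched_map:
        return False
    if backup_model_count <= 0:
        # No usable baseline, so only the type and non-empty checks apply.
        return True
    return len(fetched_map) >= backup_model_count * min_ratio
```

A count-based check of this kind catches truncated or wrong-object downloads (for example, an S3 key pointing at an unrelated JSON file) without validating individual price entries.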

fetch_remote_model_cost_map(): dispatch between S3 and HTTP

# get_model_cost_map.py:141-154
@staticmethod
def fetch_remote_model_cost_map(url: str, timeout: int = 5) -> dict:
    if GetModelCostMap._has_required_s3_env():
        return GetModelCostMap.fetch_s3_signed_model_cost_map(timeout=timeout)

    response = httpx.get(url, timeout=timeout)
    response.raise_for_status()
    return response.json()

S3 environment detection

# get_model_cost_map.py:156-165
@staticmethod
def _has_required_s3_env() -> bool:
    return all(
        [
            os.getenv("S3_ENDPOINT_URL"),
            os.getenv("S3_ACCESS_KEY"),
            os.getenv("S3_SECRET_KEY"),
            os.getenv("S3_BUCKET_NAME"),
        ]
    )
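Because _has_required_s3_env() applies all() to os.getenv() results, a variable set to an empty string counts as missing, exactly like an unset one, and the loader silently takes the HTTP path. A small diagnostic sketch that reports which variables are missing; missing_s3_env is a hypothetical helper, not part of the repository:

```python
import os
from typing import List, Mapping, Optional

REQUIRED_S3_VARS = (
    "S3_ENDPOINT_URL",
    "S3_ACCESS_KEY",
    "S3_SECRET_KEY",
    "S3_BUCKET_NAME",
)


def missing_s3_env(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """List required S3 variables that are unset or empty (hypothetical helper)."""
    env = os.environ if env is None else env
    # `not env.get(name)` is true for both None (unset) and "" (empty),
    # matching the truthiness test that all() performs in the loader.
    return [name for name in REQUIRED_S3_VARS if not env.get(name)]
```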

fetch_s3_signed_model_cost_map(): signed read

# get_model_cost_map.py:167-218
@staticmethod
def fetch_s3_signed_model_cost_map(timeout: int = 5) -> dict:
    import boto3
    from botocore.config import Config

    endpoint_url = os.getenv("S3_ENDPOINT_URL")
    access_key = os.getenv("S3_ACCESS_KEY")
    secret_key = os.getenv("S3_SECRET_KEY")
    bucket = os.getenv("S3_BUCKET_NAME")
    key = os.getenv("MODEL_COST_MAP_S3_KEY", "model_prices_and_context_window.json")
    region_name = os.getenv("S3_REGION_NAME", "us-east-1")

    if not endpoint_url or not access_key or not secret_key or not bucket:
        raise ValueError("Missing required S3 env for model cost map: ...")

    client_kwargs = {
        "service_name": "s3",
        "region_name": region_name,
        "config": Config(
            signature_version="s3v4",     # SigV4 signing
            connect_timeout=timeout,
            read_timeout=timeout,
        ),
        "endpoint_url": endpoint_url,
        "aws_access_key_id": access_key,
        "aws_secret_access_key": secret_key,
    }

    s3_client = boto3.client(**client_kwargs)
    response = s3_client.get_object(Bucket=bucket, Key=key)
    body = response["Body"].read()
    return json.loads(body)
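For the signed read above to succeed, the JSON object must already exist in the bucket. A hedged sketch of the matching upload step: upload_cost_map and make_client_from_env are hypothetical helper names, and the client is passed in as a parameter so the serialization logic can be exercised without a live S3 endpoint.

```python
import json
import os


def upload_cost_map(s3_client, bucket: str, key: str, cost_map: dict) -> int:
    """Serialize cost_map to JSON and store it as one object; returns the byte count."""
    body = json.dumps(cost_map, indent=2, sort_keys=True).encode("utf-8")
    s3_client.put_object(
        Bucket=bucket,
        Key=key,
        Body=body,
        ContentType="application/json",
    )
    return len(body)


def make_client_from_env():
    """Build a boto3 S3 client from the same variables the loader reads."""
    import boto3  # imported lazily, mirroring the loader
    from botocore.config import Config

    return boto3.client(
        service_name="s3",
        endpoint_url=os.environ["S3_ENDPOINT_URL"],
        aws_access_key_id=os.environ["S3_ACCESS_KEY"],
        aws_secret_access_key=os.environ["S3_SECRET_KEY"],
        region_name=os.getenv("S3_REGION_NAME", "us-east-1"),
        config=Config(signature_version="s3v4"),
    )
```

A typical call would be upload_cost_map(make_client_from_env(), os.environ["S3_BUCKET_NAME"], "model_prices_and_context_window.json", cost_map), assuming the bucket already exists and the credentials allow writes.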

Environment variables

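| Variable | Required | Default | Description |
|---|---|---|---|
| S3_ENDPOINT_URL | yes | (none) | S3-compatible endpoint URL (e.g. http://127.0.0.1:19000) |
| S3_ACCESS_KEY | yes | (none) | S3 access key |
| S3_SECRET_KEY | yes | (none) | S3 secret key |
| S3_BUCKET_NAME | yes | (none) | Bucket name (the cost map file must be in this bucket) |
| S3_REGION_NAME | no | us-east-1 | Region name, needed for SigV4 signing |
| MODEL_COST_MAP_S3_KEY | no | model_prices_and_context_window.json | Object key within the bucket |
| LITELLM_LOCAL_MODEL_COST_MAP | no | (none) | When set to true, skips S3/HTTP and forces the local backup |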

The first five S3 variables are shared with LiteLLM's S3 logging callback (the same bucket holds both the logs and the cost map file).

.env.local.example reference (litellm/.env.local.example:49-56):

S3_ENDPOINT_URL=http://127.0.0.1:19000
S3_ACCESS_KEY=minioadmin
S3_SECRET_KEY=minioadmin
S3_BUCKET_NAME=litellm-logs
S3_REGION_NAME=us-east-1
# The model cost map reuses the S3 connection settings above; the file must be in the S3_BUCKET_NAME bucket.
MODEL_COST_MAP_S3_KEY=model_prices_and_context_window.json

Startup summary

scripts/start-local.sh prints the cost map source before startup, which makes it easy to confirm which path is in effect (start-local.sh:251-266):

local model_cost_map_source
if [[ "${LITELLM_LOCAL_MODEL_COST_MAP:-}" == "true" || ... ]]; then
    model_cost_map_source="local backup (LITELLM_LOCAL_MODEL_COST_MAP=${LITELLM_LOCAL_MODEL_COST_MAP})"
elif [[ -n "${S3_ENDPOINT_URL:-}" && -n "${S3_ACCESS_KEY:-}" && -n "${S3_SECRET_KEY:-}" && -n "${S3_BUCKET_NAME:-}" ]]; then
    model_cost_map_source="s3://${S3_BUCKET_NAME}/${MODEL_COST_MAP_S3_KEY:-model_prices_and_context_window.json}"
else
    model_cost_map_source="remote HTTP fallback"
fi
echo "[start-local] model_cost_map=${model_cost_map_source}"

Example startup output:

[start-local] model_cost_map=s3://litellm-logs/model_prices_and_context_window.json
[start-local] model_cost_map=local backup (LITELLM_LOCAL_MODEL_COST_MAP=true)
[start-local] model_cost_map=remote HTTP fallback

Hot reload support for S3

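The POST /reload/model_cost_map endpoint (proxy_server.py:12081-12147) calls get_model_cost_map(url=litellm.model_cost_map_url) directly. This is the same function used at startup, so a reload goes through the full three-level priority check: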

# proxy_server.py:12107-12109
model_cost_map_url = litellm.model_cost_map_url
new_model_cost_map = get_model_cost_map(url=model_cost_map_url)
litellm.model_cost = new_model_cost_map

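Effect: a hot reload fetches the latest JSON from S3 again (provided the S3 environment variables are all set and LITELLM_LOCAL_MODEL_COST_MAP!=true), with no process restart required.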

Multi-pod sync: a hot reload writes force_reload=True to the litellm_config table; the other pods poll this flag in _check_and_reload_model_cost_map() and trigger their own reload when they see it (proxy_server.py:4519-4619).
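A hot reload can also be triggered from a script instead of by hand. The sketch below uses only the standard library. It assumes that the endpoint accepts a bearer token and that the proxy is served under a base URL such as http://127.0.0.1:4000/ai-gateway; both are assumptions, and the response body is returned as raw text because its shape is not documented here. build_reload_request and trigger_reload are hypothetical helper names.

```python
import urllib.request


def build_reload_request(base_url: str, master_key: str) -> urllib.request.Request:
    """Construct (but do not send) the POST request for the reload endpoint."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/reload/model_cost_map",
        data=b"",  # empty body; the request is sent as POST
        headers={"Authorization": f"Bearer {master_key}"},  # assumed auth scheme
        method="POST",
    )


def trigger_reload(base_url: str, master_key: str, timeout: float = 10.0) -> str:
    """Send the reload request and return the raw response body as text."""
    request = build_reload_request(base_url, master_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Keeping request construction separate from sending makes the URL and headers easy to inspect or log before any call reaches the proxy.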


Verifying the cost map currently in memory

Endpoint: GET /public/litellm_model_cost_map
Defined in: litellm/proxy/public_endpoints/public_endpoints.py:175-193

@router.get("/public/litellm_model_cost_map", ...)
async def get_litellm_model_cost_map():
    import litellm
    _model_cost_map = litellm.model_cost   # returns the in-memory dict directly
    return _model_cost_map

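The endpoint returns the process's full litellm.model_cost dict directly, so it can be used to verify:
- whether a new key on S3 has been loaded into memory after a hot reload
- how the local backup JSON differs from the S3/remote JSON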

Example:

curl -s "http://127.0.0.1:4000/ai-gateway/public/litellm_model_cost_map" \
  -H "Authorization: Bearer <master-key>" \
  | jq 'has("Vendor2/Claude-4.6-Opus")'
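To compare the in-memory map returned by this endpoint with another copy (for example, the local backup JSON), a key-level diff is usually enough. A minimal sketch; diff_cost_maps is a hypothetical helper, and both inputs are assumed to be already-parsed dicts keyed by model name:

```python
from typing import Any, Dict, List, Mapping


def diff_cost_maps(
    current: Mapping[str, Any],
    reference: Mapping[str, Any],
) -> Dict[str, List[str]]:
    """Report keys present on one side only, and shared keys whose entries differ."""
    current_keys, reference_keys = set(current), set(reference)
    return {
        "only_in_current": sorted(current_keys - reference_keys),
        "only_in_reference": sorted(reference_keys - current_keys),
        "changed": sorted(
            key
            for key in current_keys & reference_keys
            if current[key] != reference[key]
        ),
    }
```

The current argument could be the parsed response of GET /public/litellm_model_cost_map and reference the result of json.load() on litellm/model_prices_and_context_window_backup.json.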

Data flow diagram

sequenceDiagram
    participant Env as Environment variables
    participant Init as litellm.__init__
    participant GMCM as get_model_cost_map()
    participant S3 as S3-compatible storage
    participant MC as litellm.model_cost
    participant Reload as POST /reload/model_cost_map
    participant Public as GET /public/litellm_model_cost_map

    Note over Init: import phase (synchronous)
    Init->>GMCM: call get_model_cost_map(url)
    GMCM->>Env: read LITELLM_LOCAL_MODEL_COST_MAP
    alt =true
        GMCM-->>MC: write backup JSON contents
    else !=true
        GMCM->>Env: read S3_ENDPOINT_URL and the other 3 variables
        alt all S3 variables set
            GMCM->>S3: boto3 SigV4 GetObject
            S3-->>GMCM: JSON bytes
            GMCM-->>MC: write contents fetched from S3
        else any missing
            GMCM->>GMCM: httpx.get(url)
            GMCM-->>MC: write contents fetched over HTTP
        end
    end

    Note over Reload: runtime hot reload
    Reload->>GMCM: get_model_cost_map(url) (same logic)
    GMCM-->>MC: overwrite the in-memory dict

    Note over Public: verify current state
    Public->>MC: read litellm.model_cost directly
    Public-->>Client: return the full dict as JSON