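06 - Loading the model cost map from S3
Overview
In addition to the original "HTTP remote fetch + local backup" mechanism, this repository adds the ability to read model_prices_and_context_window.json from S3-compatible object storage (AWS S3, MinIO, an internal OSS, and so on) using signed requests. The S3 path shares the fetch_remote_model_cost_map() entry point with the HTTP path; which of the two runs depends on whether the required S3 environment variables are all set.
Load priority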
flowchart TD
A[import litellm\n__init__.py:438-440] --> B[get_model_cost_map url\nget_model_cost_map.py:221]
B --> C{LITELLM_LOCAL_MODEL_COST_MAP\n=true?}
C -- yes --> D[load_local_model_cost_map\nread backup JSON]
C -- no --> E[fetch_remote_model_cost_map url]
E --> F{_has_required_s3_env\nall 4 S3 variables set?}
F -- yes --> G[fetch_s3_signed_model_cost_map\nboto3 SigV4 GET]
F -- no --> H[httpx.get url\ndefault: GitHub raw]
G -- success --> I[validate_model_cost_map]
H -- success --> I
G -- failure --> J[fallback: load_local_model_cost_map]
H -- failure --> J
I -- pass --> K[litellm.model_cost]
I -- validation failed --> J
D --> K
J --> K
Three priority levels:
| Level | Trigger condition | Price source |
|---|---|---|
| 1 | LITELLM_LOCAL_MODEL_COST_MAP=true | litellm/model_prices_and_context_window_backup.json |
| 2 | LITELLM_LOCAL_MODEL_COST_MAP!=true and all four S3 variables set | S3 object s3://{S3_BUCKET_NAME}/{MODEL_COST_MAP_S3_KEY} |
| 3 | Neither of the above | LITELLM_MODEL_COST_MAP_URL (default: GitHub raw) |
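If a remote fetch (level 2 or 3) fails, or the fetched map fails validation, loading falls back to the local backup automatically. An S3 failure goes straight to the backup; it is not retried over HTTP.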
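The three levels can be sketched as a small pure function that reports which source would be chosen for a given environment. This is an illustration only: `describe_cost_map_source` and its return labels are hypothetical names, not part of the repository.

```python
import os
from typing import Mapping, Optional

# The four variables that must all be set for the S3 path to be chosen.
REQUIRED_S3_VARS = (
    "S3_ENDPOINT_URL",
    "S3_ACCESS_KEY",
    "S3_SECRET_KEY",
    "S3_BUCKET_NAME",
)
DEFAULT_S3_KEY = "model_prices_and_context_window.json"


def describe_cost_map_source(env: Optional[Mapping[str, str]] = None) -> str:
    """Return a label for the source the three-level priority would pick.

    Hypothetical helper for illustration; it performs no I/O.
    """
    env = os.environ if env is None else env
    # Level 1: the local backup is forced.
    if env.get("LITELLM_LOCAL_MODEL_COST_MAP", "").lower() == "true":
        return "local backup"
    # Level 2: all four S3 variables are set and non-empty.
    if all(env.get(name) for name in REQUIRED_S3_VARS):
        key = env.get("MODEL_COST_MAP_S3_KEY", DEFAULT_S3_KEY)
        return f"s3://{env['S3_BUCKET_NAME']}/{key}"
    # Level 3: plain HTTP fetch of the configured URL.
    return "remote HTTP"
```

Passing an explicit mapping instead of reading `os.environ` keeps the function easy to check in isolation.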
Entry-point code
File: litellm/litellm_core_utils/get_model_cost_map.py
get_model_cost_map(): public entry point
# get_model_cost_map.py:221-261
def get_model_cost_map(url: str) -> dict:
    # 1. Force the local backup
    if os.getenv("LITELLM_LOCAL_MODEL_COST_MAP", "").lower() == "true":
        return GetModelCostMap.load_local_model_cost_map()
    # 2. Remote fetch (S3 or HTTP)
    try:
        content = GetModelCostMap.fetch_remote_model_cost_map(url)
    except Exception as e:
        verbose_logger.warning(...)
        return GetModelCostMap.load_local_model_cost_map()
    # 3. Integrity check (total model count)
    if not GetModelCostMap.validate_model_cost_map(
        fetched_map=content,
        backup_model_count=GetModelCostMap._get_backup_model_count(),
    ):
        return GetModelCostMap.load_local_model_cost_map()
    return content
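The control flow above (fetch, validate, otherwise fall back) can be isolated from the real fetchers by injecting them as callables. The sketch below is an assumption-laden illustration: `load_with_fallback` and its parameters are hypothetical, and it does not reproduce the repository's validation rule.

```python
from typing import Callable, Dict


def load_with_fallback(
    fetch: Callable[[], Dict],
    validate: Callable[[Dict], bool],
    load_backup: Callable[[], Dict],
) -> Dict:
    """Return the fetched map if it was fetched and validated, else the backup.

    Hypothetical helper: the three callables stand in for the remote fetch,
    the integrity check, and the local backup loader.
    """
    try:
        fetched = fetch()
    except Exception:
        # Any fetch error (network, auth, malformed JSON) ends at the backup.
        return load_backup()
    if not validate(fetched):
        # A map that was fetched but fails the check is discarded.
        return load_backup()
    return fetched
```

Written this way, each of the three outcomes (fetch error, validation failure, success) can be exercised without network access.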
fetch_remote_model_cost_map(): routing
# get_model_cost_map.py:141-154
@staticmethod
def fetch_remote_model_cost_map(url: str, timeout: int = 5) -> dict:
    # S3 takes precedence over HTTP when all required variables are set
    if GetModelCostMap._has_required_s3_env():
        return GetModelCostMap.fetch_s3_signed_model_cost_map(timeout=timeout)
    # Otherwise fetch the URL over plain HTTP
    response = httpx.get(url, timeout=timeout)
    response.raise_for_status()
    return response.json()
S3 environment detection
# get_model_cost_map.py:156-165
@staticmethod
def _has_required_s3_env() -> bool:
    # True only if all four variables are set to non-empty values
    return all(
        [
            os.getenv("S3_ENDPOINT_URL"),
            os.getenv("S3_ACCESS_KEY"),
            os.getenv("S3_SECRET_KEY"),
            os.getenv("S3_BUCKET_NAME"),
        ]
    )
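Because the check is all-or-nothing, a single missing or empty variable silently routes loading to HTTP. A small diagnostic along the following lines can show which variable is responsible; `missing_s3_env` is a hypothetical helper, not something the repository provides.

```python
import os
from typing import List, Mapping, Optional

REQUIRED_S3_VARS = (
    "S3_ENDPOINT_URL",
    "S3_ACCESS_KEY",
    "S3_SECRET_KEY",
    "S3_BUCKET_NAME",
)


def missing_s3_env(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """List the required S3 variables that are unset or empty.

    Hypothetical helper; an empty string counts as missing, matching the
    truthiness test used by all() in the check above.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_S3_VARS if not env.get(name)]
```

An empty result means the S3 path would be selected.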
fetch_s3_signed_model_cost_map(): signed read
# get_model_cost_map.py:167-218
@staticmethod
def fetch_s3_signed_model_cost_map(timeout: int = 5) -> dict:
    import boto3
    from botocore.config import Config

    endpoint_url = os.getenv("S3_ENDPOINT_URL")
    access_key = os.getenv("S3_ACCESS_KEY")
    secret_key = os.getenv("S3_SECRET_KEY")
    bucket = os.getenv("S3_BUCKET_NAME")
    key = os.getenv("MODEL_COST_MAP_S3_KEY", "model_prices_and_context_window.json")
    region_name = os.getenv("S3_REGION_NAME", "us-east-1")
    if not endpoint_url or not access_key or not secret_key or not bucket:
        raise ValueError("Missing required S3 env for model cost map: ...")
    client_kwargs = {
        "service_name": "s3",
        "region_name": region_name,
        "config": Config(
            signature_version="s3v4",  # SigV4 signing
            connect_timeout=timeout,
            read_timeout=timeout,
        ),
        "endpoint_url": endpoint_url,
        "aws_access_key_id": access_key,
        "aws_secret_access_key": secret_key,
    }
    s3_client = boto3.client(**client_kwargs)
    # Fetch the object and parse its body as JSON
    response = s3_client.get_object(Bucket=bucket, Key=key)
    body = response["Body"].read()
    return json.loads(body)
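The final read-and-parse step can be exercised without a real endpoint by substituting a stand-in client. The sketch below is one way to do that under stated assumptions: `read_cost_map` and `FakeS3Client` are hypothetical names, and the fake only imitates the `Body` field of a `get_object` response.

```python
import io
import json
from typing import Any, Dict, Tuple


def read_cost_map(s3_client: Any, bucket: str, key: str) -> Dict:
    """Fetch one object and parse it as JSON, mirroring the read step above."""
    response = s3_client.get_object(Bucket=bucket, Key=key)
    return json.loads(response["Body"].read())


class FakeS3Client:
    """Test double that serves canned bytes through a get_object method."""

    def __init__(self, objects: Dict[Tuple[str, str], bytes]) -> None:
        self._objects = objects

    def get_object(self, Bucket: str, Key: str) -> Dict[str, Any]:
        # A file-like Body with .read(), as the reader function expects.
        return {"Body": io.BytesIO(self._objects[(Bucket, Key)])}


# Placeholder payload; the model name and field are illustrative only.
fake = FakeS3Client(
    {("litellm-logs", "model_prices_and_context_window.json"): b'{"example-model": {"max_tokens": 8}}'}
)
cost_map = read_cost_map(fake, "litellm-logs", "model_prices_and_context_window.json")
```

Any object exposing a compatible `get_object` method, including a real boto3 client, can be passed to `read_cost_map`.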
Environment variables
| Variable | Required | Default | Description |
|---|---|---|---|
| S3_ENDPOINT_URL | ✅ | (none) | S3-compatible endpoint URL (e.g. http://127.0.0.1:19000) |
| S3_ACCESS_KEY | ✅ | (none) | S3 access key |
| S3_SECRET_KEY | ✅ | (none) | S3 secret key |
| S3_BUCKET_NAME | ✅ | (none) | Bucket name (the cost map file must be in this bucket) |
| S3_REGION_NAME | ❌ | us-east-1 | Region name, required for SigV4 signing |
| MODEL_COST_MAP_S3_KEY | ❌ | model_prices_and_context_window.json | Object key inside the bucket |
| LITELLM_LOCAL_MODEL_COST_MAP | ❌ | (none) | When set to true, skips S3/HTTP and forces the local backup |
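The first five S3 variables (S3_ENDPOINT_URL through S3_REGION_NAME) are shared with LiteLLM's S3 logging callback, so the same bucket holds both the logs and the cost map file.
Reference from .env.local.example (litellm/.env.local.example:49-56):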
S3_ENDPOINT_URL=http://127.0.0.1:19000
S3_ACCESS_KEY=minioadmin
S3_SECRET_KEY=minioadmin
S3_BUCKET_NAME=litellm-logs
S3_REGION_NAME=us-east-1
# The model cost map reuses the S3 connection settings above; the file must be in the S3_BUCKET_NAME bucket.
MODEL_COST_MAP_S3_KEY=model_prices_and_context_window.json
Startup summary
scripts/start-local.sh prints the cost map source before startup, which makes it easy to confirm which path is in effect (start-local.sh:251-266):
local model_cost_map_source
if [[ "${LITELLM_LOCAL_MODEL_COST_MAP:-}" == "true" || ... ]]; then
  model_cost_map_source="local backup (LITELLM_LOCAL_MODEL_COST_MAP=${LITELLM_LOCAL_MODEL_COST_MAP})"
elif [[ -n "${S3_ENDPOINT_URL:-}" && -n "${S3_ACCESS_KEY:-}" && -n "${S3_SECRET_KEY:-}" && -n "${S3_BUCKET_NAME:-}" ]]; then
  model_cost_map_source="s3://${S3_BUCKET_NAME}/${MODEL_COST_MAP_S3_KEY:-model_prices_and_context_window.json}"
else
  model_cost_map_source="remote HTTP fallback"
fi
echo "[start-local] model_cost_map=${model_cost_map_source}"
Example startup output:
[start-local] model_cost_map=s3://litellm-logs/model_prices_and_context_window.json
[start-local] model_cost_map=local backup (LITELLM_LOCAL_MODEL_COST_MAP=true)
[start-local] model_cost_map=remote HTTP fallback
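Hot reload support for S3
The POST /reload/model_cost_map endpoint (proxy_server.py:12081-12147) calls get_model_cost_map(url=litellm.model_cost_map_url) directly. This is the same function used at startup, so a reload goes through the full three-level priority check: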
# proxy_server.py:12107-12109
model_cost_map_url = litellm.model_cost_map_url
new_model_cost_map = get_model_cost_map(url=model_cost_map_url)
litellm.model_cost = new_model_cost_map
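Effect: a hot reload fetches the latest JSON from S3 again (provided the S3 environment variables are all set and LITELLM_LOCAL_MODEL_COST_MAP!=true), with no process restart needed.
Multi-pod sync: the hot reload writes force_reload=True to the litellm_config table. Other pods poll this flag through _check_and_reload_model_cost_map() and trigger their own reload in turn (proxy_server.py:4519-4619).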
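A reload can be triggered from a script by sending a POST to the endpoint. The sketch below only builds the request; the base URL, the bearer-token header, and the helper name `build_reload_request` are assumptions for illustration, and the actual authentication requirements depend on the deployment.

```python
import urllib.request


def build_reload_request(base_url: str, api_key: str) -> urllib.request.Request:
    """Build a POST request for the hot-reload endpoint.

    Hypothetical helper: base_url is the proxy's root (including any path
    prefix the deployment uses) and api_key is sent as a bearer token.
    """
    url = base_url.rstrip("/") + "/reload/model_cost_map"
    return urllib.request.Request(
        url,
        method="POST",
        headers={"Authorization": f"Bearer {api_key}"},
    )


# Placeholder values; replace with the real proxy address and key.
request = build_reload_request("http://127.0.0.1:4000", "sk-placeholder")
```

Sending it is then a matter of calling `urllib.request.urlopen(request, timeout=10)` against a running proxy.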
Verifying the in-memory cost map
Endpoint: GET /public/litellm_model_cost_map
Defined in: litellm/proxy/public_endpoints/public_endpoints.py:175-193
@router.get("/public/litellm_model_cost_map", ...)
async def get_litellm_model_cost_map():
    import litellm

    _model_cost_map = litellm.model_cost  # returns the in-memory dict directly
    return _model_cost_map
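The endpoint returns the process's complete litellm.model_cost dict as-is, so it can be used to verify:
- whether new keys in the S3 object have been loaded into memory after a hot reload
- how the local backup JSON differs from the S3/remote JSON
Example: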
curl -s "http://127.0.0.1:4000/ai-gateway/public/litellm_model_cost_map" \
-H "Authorization: Bearer <master-key>" \
| jq 'has("Vendor2/Claude-4.6-Opus")'
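For the second check, comparing the backup JSON against what the endpoint returns, a key-level diff is usually enough. The following is a minimal sketch; `diff_cost_maps` and the result field names are hypothetical, and both inputs are assumed to be already-parsed dicts.

```python
from typing import Dict, List


def diff_cost_maps(backup: Dict, live: Dict) -> Dict[str, List[str]]:
    """Summarise how two cost maps differ, by model key.

    Hypothetical helper: 'backup' is the parsed local backup JSON and
    'live' is the parsed response from the public endpoint.
    """
    backup_keys, live_keys = set(backup), set(live)
    return {
        # Models present only in the live (S3/remote) map.
        "only_in_live": sorted(live_keys - backup_keys),
        # Models present only in the local backup.
        "only_in_backup": sorted(backup_keys - live_keys),
        # Models present in both but with different entries.
        "changed": sorted(
            k for k in backup_keys & live_keys if backup[k] != live[k]
        ),
    }
```

A non-empty `only_in_live` list after a reload indicates that entries from the remote source have reached memory.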
Data flow diagram
sequenceDiagram
participant Env as Environment variables
participant Init as litellm.__init__
participant GMCM as get_model_cost_map()
participant S3 as S3-compatible storage
participant MC as litellm.model_cost
participant Reload as POST /reload/model_cost_map
participant Public as GET /public/litellm_model_cost_map
Note over Init: import time (synchronous)
Init->>GMCM: call get_model_cost_map(url)
GMCM->>Env: read LITELLM_LOCAL_MODEL_COST_MAP
alt =true
GMCM-->>MC: write backup JSON contents
else !=true
GMCM->>Env: read S3_ENDPOINT_URL and the other 3 variables
alt all S3 variables set
GMCM->>S3: boto3 SigV4 GetObject
S3-->>GMCM: JSON bytes
GMCM-->>MC: write contents fetched from S3
else any variable missing
GMCM->>GMCM: httpx.get(url)
GMCM-->>MC: write contents fetched over HTTP
end
end
Note over Reload: runtime hot reload
Reload->>GMCM: get_model_cost_map(url) (same logic)
GMCM-->>MC: overwrite the in-memory dict
Note over Public: verify current state
Public->>MC: read litellm.model_cost directly
Public-->>Client: return the full dict as JSON