02 · Provider exception mapping: cross-provider matrix and known inconsistencies

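This document focuses on a single file, litellm/litellm_core_utils/exception_mapping_utils.py (2465 lines). It is the only hub where LiteLLM converts upstream errors into the LiteLLM exception classes catalogued in 01.

After reading it you should be able to answer:

- When a GPT-5 call fails with BadRequestError and a Claude call fails with BadRequestError, is the same code behind both?
- If the upstream returns 502, does LiteLLM always raise BadGatewayError?
- I need to integrate a new provider; which existing provider should I copy?
- Why do the cloudflare errors I see always come out as APIConnectionError?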

1. Two facts to set the stage

Fact 1: 18 branches actually serve 60+ providers

The file contains only 18 top-level if/elif custom_llm_provider == "xxx" branches, but far more than 18 providers are actually wired in:

# constants.py:711-770
openai_compatible_providers: List = [
    "anyscale", "groq", "nvidia_nim", "cerebras", "baseten", "sambanova",
    "ai21_chat", "ai21", "volcengine", "codestral", "deepseek", "deepinfra",
    "perplexity", "xinference", "xai", "zai", "together_ai", "fireworks_ai",
    "empower", "friendliai", "azure_ai", "github", "litellm_proxy",
    "hosted_vllm", "llamafile", "lm_studio", "galadriel", "github_copilot",
    "chatgpt", "novita", "meta_llama", "publicai", "synthetic", "apertis",
    "nano-gpt", "poe", "chutes", "featherless_ai", "nscale", "nebius",
    "dashscope", "moonshot", "v0", "helicone", "morph", "lambda_ai",
    "hyperbolic", "vercel_ai_gateway", "aiml",
    # ... 54 entries in total
]

exception_mapping_utils.py:355-361:

if (
    custom_llm_provider == "openai"
    or custom_llm_provider == "text-completion-openai"
    or custom_llm_provider == "custom_openai"
    or custom_llm_provider in litellm.openai_compatible_providers   # ← 54 entries
    or custom_llm_provider == "mistral"
):

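→ In total, 58 providers share the same openai mapping (lines 356-617). So when moonshot / deepseek / groq / dashscope report an error, it actually goes through the OpenAI branch: an upstream 500 is mapped to InternalServerError right there, exactly as it would be for OpenAI itself.

That leaves 17 standalone branches: anthropic, replicate, bedrock, sagemaker, vertex_ai, cloudflare, cohere, huggingface, ai21 (standalone), nlp_cloud, together_ai (standalone), aleph_alpha, ollama, vllm, azure, openrouter, plus the global fallback.

⚠️ ai21 and together_ai appear twice: each is listed in openai_compatible_providers and also has a standalone branch. The mapping code walks an if/elif chain in order, so ai21_chat takes the openai branch, while ai21 satisfies both the openai condition and the condition of its standalone branch. The earlier one wins: lines 356-617 come first, so elif custom_llm_provider == "ai21" at line 1673 is never entered. That is dead code. The same holds for together_ai (line 1839). If you grep your way to the standalone together_ai/ai21 branches, keep in mind that they never execute; every together_ai error is mapped by the openai rules.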
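The ordering effect described above can be reproduced in a few lines. This is a minimal sketch, not the real mapping code: `OPENAI_COMPATIBLE` is a hypothetical five-entry subset of the real list, and `pick_branch` / `reachable_branches` are illustrative helpers that only report which branch would be taken.

```python
from typing import List, Set

# Hypothetical subset of litellm.openai_compatible_providers (illustration only).
OPENAI_COMPATIBLE: List[str] = ["groq", "deepseek", "ai21_chat", "ai21", "together_ai"]


def pick_branch(custom_llm_provider: str) -> str:
    """Return the name of the first branch whose condition matches."""
    if (
        custom_llm_provider == "openai"
        or custom_llm_provider in OPENAI_COMPATIBLE
        or custom_llm_provider == "mistral"
    ):
        return "openai"
    elif custom_llm_provider == "anthropic":
        return "anthropic"
    elif custom_llm_provider == "ai21":
        # Unreachable: "ai21" already matched the first condition.
        return "ai21-standalone"
    elif custom_llm_provider == "together_ai":
        # Unreachable for the same reason.
        return "together_ai-standalone"
    return "global-fallback"


def reachable_branches(providers: List[str]) -> Set[str]:
    """Collect every branch name that at least one provider can reach."""
    return {pick_branch(p) for p in providers}
```

Running `reachable_branches` over any set of provider names never yields the two standalone labels, which is a quick way to confirm that an elif arm is shadowed by an earlier membership test.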

Fact 2: two-stage matching (keywords + status_code)

Most provider branches share this structure:

elif custom_llm_provider == "xxx":
    # ① Match on error_str keywords first (pre-status-code)
    if "context limit" in error_str:
        raise ContextWindowExceededError(...)
    elif "content policy" in error_str:
        raise ContentPolicyViolationError(...)
    ...

    # ② Then match on status_code
    if hasattr(original_exception, "status_code"):
        if original_exception.status_code == 400:
            raise BadRequestError(...)
        elif original_exception.status_code == 401:
            raise AuthenticationError(...)
        ...

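→ Keyword matching takes precedence over status_code. This is one of the root causes of counterintuitive mappings: for the same 502 error, a keyword hit on "context limit" gets it treated as a 400, and the 502 branch only gets its turn when no keyword matches.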
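A small sketch of that precedence rule follows. The three exception classes are local stand-ins that only share names with the LiteLLM classes, and `map_error` is a hypothetical helper that returns the exception instead of raising it so the result is easy to inspect.

```python
from typing import Optional


class ContextWindowExceededError(Exception):
    """Local stand-in, not the LiteLLM class."""


class BadGatewayError(Exception):
    """Local stand-in, not the LiteLLM class."""


class APIConnectionError(Exception):
    """Local stand-in, not the LiteLLM class."""


def map_error(error_str: str, status_code: Optional[int]) -> Exception:
    # Stage 1: keyword checks run first, regardless of status_code.
    if "context limit" in error_str:
        return ContextWindowExceededError(error_str)
    # Stage 2: status_code checks only run when no keyword matched.
    if status_code == 502:
        return BadGatewayError(error_str)
    # Stage 3: nothing matched, so the global fallback applies.
    return APIConnectionError(error_str)
```

With this shape, a 502 whose body mentions "context limit" is classified by the keyword, and a status code the branch does not list (for example 503 here) ends up in the fallback class.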

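2. Global paths: 3 entry points that bypass the provider branches

Before the provider if/elif chain is entered, three global fast paths exit directly:

2.1 Pass-through for already-wrapped LITELLM_EXCEPTION_TYPES (lines 240-244)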

if any(
    isinstance(original_exception, exc_type)
    for exc_type in litellm.LITELLM_EXCEPTION_TYPES
):
    return original_exception

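If the exception is already a LiteLLM exception (already mapped by a lower layer), it is returned as-is. This guards against double mapping: for example, when the Router retries and the previous attempt already produced a RateLimitError, passing it through exception_type again will not wrongly wrap it as an APIConnectionError.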
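The guard makes the mapping idempotent, which the following sketch illustrates. Everything here is a stand-in: the two classes, the two-element `LITELLM_EXCEPTION_TYPES` tuple, and the simplified `exception_type` that wraps anything else in the fallback class.

```python
class RateLimitError(Exception):
    """Local stand-in, not the LiteLLM class."""


class APIConnectionError(Exception):
    """Local stand-in, not the LiteLLM class."""


# Stand-in for litellm.LITELLM_EXCEPTION_TYPES.
LITELLM_EXCEPTION_TYPES = (RateLimitError, APIConnectionError)


def exception_type(original_exception: Exception) -> Exception:
    # Already-mapped exceptions pass through untouched.
    if any(isinstance(original_exception, t) for t in LITELLM_EXCEPTION_TYPES):
        return original_exception
    # Simplified: everything else goes straight to the fallback class.
    return APIConnectionError(str(original_exception))
```

Calling the function on its own output returns the same object, so a retry loop that re-maps a RateLimitError keeps the original class.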

2.2 Global Timeout string sniffing (lines 329-344)

if (
    "Request Timeout Error" in error_str
    or "Request timed out" in error_str
    or "Timed out generating response" in error_str
    or "The read operation timed out" in error_str
):
    raise Timeout(...)

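→ If any provider's error string contains one of these 4 substrings, it is converted to Timeout immediately, skipping the entire provider branch chain.

⚠️ Pitfall: if the upstream actually returned a 500 but the message happens to contain "Request timed out" (for example, the body of an nginx 504 error page), it is mapped to Timeout(status_code=408), not to 504/500. This changes the Router's cooldown and retry paths: as a 408 it passes the first-stage cooldown allowlist (✅) and passes _should_retry (✅), even though the upstream may in fact be permanently broken.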
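The pitfall is easy to reproduce with a plain substring check. In this sketch the four substrings are copied from the snippet above; `sniffed_as_timeout` is an illustrative helper and `gateway_page` is a made-up error body, not captured output from a real proxy.

```python
TIMEOUT_SUBSTRINGS = (
    "Request Timeout Error",
    "Request timed out",
    "Timed out generating response",
    "The read operation timed out",
)


def sniffed_as_timeout(error_str: str) -> bool:
    """True when the global sniffing rule would classify this string as a Timeout."""
    return any(s in error_str for s in TIMEOUT_SUBSTRINGS)


# Hypothetical HTML body returned by a gateway in front of a broken upstream.
gateway_page = "<html><body><h1>Server Error</h1><p>Request timed out</p></body></html>"
```

Because the match is case-sensitive and ignores the status code, `gateway_page` counts as a timeout while a body that says "request timed out" in lower case does not.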

2.3 litellm_proxy reverse lookup (lines 346-354)

if custom_llm_provider == "litellm_proxy":
    extract_and_raise_litellm_exception(
        response=getattr(original_exception, "response", None),
        error_str=error_str,
        model=model,
        custom_llm_provider=custom_llm_provider,
    )

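This calls extract_and_raise_litellm_exception (lines 194-229), which uses the regex litellm\.\w+Error to recover the class name from the error message and then raises that class.

This is the special path for LiteLLM nested behind LiteLLM: the upstream proxy has already written the exception name as a prefix of the message (litellm.RateLimitError: ...), and the downstream proxy extracts the class name with the regex and rebuilds the exception object, avoiding a second wrapping.

⚠️ Three things to note:
1. When the lookup succeeds, the function raises, which exits exception_type immediately; the openai branch below does not run.
2. litellm_proxy is also in openai_compatible_providers (constants.py:734), so if the lookup misses (the message has no litellm.XxxError prefix), control falls through to the openai branch.
3. If the upstream proxy uses an exception class introduced in a newer version (say, a hypothetical litellm.QuotaExceededError) that the downstream proxy does not have, getattr(litellm, "QuotaExceededError", None) returns None and control falls through instead of raising, so the error-name information is lost.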
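The lookup and its silent miss can be sketched without importing litellm. Here `fake_litellm` is a stand-in namespace that only knows one class, and `lookup_exception_class` is a hypothetical helper that mirrors the regex-plus-getattr idea described above; the real function's internals may differ.

```python
import re
from types import SimpleNamespace
from typing import Optional, Type


class RateLimitError(Exception):
    """Local stand-in, not the LiteLLM class."""


# Stand-in for the litellm module as seen by an older downstream proxy.
fake_litellm = SimpleNamespace(RateLimitError=RateLimitError)

PATTERN = re.compile(r"litellm\.\w+Error")


def lookup_exception_class(error_str: str) -> Optional[Type[Exception]]:
    """Recover the exception class named in the message prefix, or None on a miss."""
    match = PATTERN.search(error_str)
    if match is None:
        return None  # no prefix: the caller falls through
    class_name = match.group(0).split(".", 1)[1]
    # Unknown class names also yield None, so the name is silently dropped.
    return getattr(fake_litellm, class_name, None)
```

A message prefixed with `litellm.RateLimitError:` resolves to the class, while a prefix naming a class the namespace lacks returns None just like a message with no prefix at all, so the two miss cases are indistinguishable to the caller.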

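3. status_code × provider matrix (condensed)

The full 18 × 13 matrix is too wide to read comfortably. Providers are grouped here into 3 tiers by degree of standardization (S / M / L), each listing how that tier typically handles status codes.

Tier S: "complete providers" that handle 6 or more status codes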

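| Provider | 400 | 401 | 403 | 404 | 408 | 422 | 429 | 500 | 502 | 503 | 504 | Other |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| `openai` family (58 providers) | BR | Au | — | NF | TO | BR | RL | IS | BG | SU | TO | API |
| `anthropic` | BR | Au | — | NF | TO | BR | RL | IS | BG | SU | TO | — |
| `azure` | BR | Au | — | — | TO | BR | RL | — | BG | SU | TO | API |
| `vertex_ai` | BR | Au | PD | NF | TO | — | RL | IS | BG | SU | — | — |
| `bedrock` | BR | Au | PD | NF | TO | BR | RL | SU | — | SU | TO | — |
| `sagemaker` | BR | Au | — | NF | TO | BR | RL | SU | — | SU | TO | — |
| `openrouter` | BR | Au | — | NF | TO | BR | RL | — | — | SU | TO | API |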

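Tier M: 4-5 status codes

| Provider | 400 | 401 | 403 | 404 | 408 | 422 | 429 | 500 | 502 | 503 | 504 | Other |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| `replicate` | BR | Au | — | — | TO | UE | RL | SU | — | — | — | API |
| `cohere` | BR | — | — | — | TO | BR | (`*CCE`→RL) | IS | — | — | — | API/IS |
| `huggingface` | BR | Au | — | — | TO | — | RL | — | — | SU | — | API |
| `nlp_cloud` | BR | Au | Au | — | — | BR | RL | API | — | API | SU | API |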

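Tier L: "bare providers" with fewer than 4 status codes, with heavy fall-through to the global fallback

| Provider | status codes handled | Other behavior |
|---|---|---|
| `cloudflare` | String matching only: "Authentication error"→Au, "must have required property"→BR | Everything else goes to the global fallback → `APIConnectionError` (⚠️ see §4.1) |
| `ai21` (standalone branch) | dead code (see §1) | — |
| `together_ai` (standalone branch) | dead code (same as above) | — |
| `aleph_alpha` | 401, 400, 429, 500 | 404/408/422/502/503/504 missing → global fallback |
| `ollama` | String matching: connection errors / path errors / timeouts | No status_code handling at all → global fallback |
| `vllm` | Only status_code == 0 → `APIConnectionError` (line 2047) | All other statuses missing → global fallback |
| `litellm_proxy` | No status_code handling; relies entirely on reverse lookup of the message prefix | Falls through on a miss |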

Legend:
- BR = BadRequestError, Au = AuthenticationError, PD = PermissionDeniedError, NF = NotFoundError
- TO = Timeout, RL = RateLimitError, UE = UnprocessableEntityError
- IS = InternalServerError, BG = BadGatewayError, SU = ServiceUnavailableError
- API = APIError, *CCE = sniffing on the CohereConnectionError class name
- "—" = this status_code is not handled in the provider's branch (→ goes to the global fallback / upper-layer logic)
- Bold = differs from the "OpenAI style"; see §4

For complete line numbers, see the line-number appendix in §9 at the end.

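4. Known inconsistencies

When comparing across providers, these 7 inconsistencies genuinely change Router behavior. They are not differences in code style: picking the wrong exception class throws off cooldown, retry, and the client-facing status_code all at once.

4.1 ⚠️ cloudflare: almost no handling, everything falls through to APIConnectionError

Lines 1480-1496: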

elif custom_llm_provider == "cloudflare":
    if "Authentication error" in error_str:
        raise AuthenticationError(...)
    if "must have required property" in error_str:
        raise BadRequestError(...)
    # ← Nothing more. That is the entire branch.

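There is no status_code branch. If cloudflare returns 500/502/503 and error_str contains neither of the two keywords above, the error falls all the way through to the global fallback (lines 2361-2383) → APIConnectionError.

Consequences:
- The status_code the client sees is 500 (hard-coded in APIConnectionError), whether the upstream really returned 502, 503, or something else
- Router cooldown is skipped via ignored_strings = ["APIConnectionError"] (cooldown_handlers.py:57-63), so the faulty deployment is never cooled down
- The Router does not retry either (_should_retry(500) is True, but APIConnectionError itself has no response.status_code)

If you run cloudflare workers ai in prod, take this seriously. A reasonable step is to plot APIConnectionError + cloudflare as its own series in your monitoring.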
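One way to get that separate series is to count failures by (provider, exception class) at the call site. This is a minimal in-process sketch under stated assumptions: `record_error` and `fallback_hits` are hypothetical helpers, the local `APIConnectionError` only shares a name with the LiteLLM class, and a real deployment would export the counter to its metrics system rather than keep it in memory.

```python
from collections import Counter
from typing import Tuple


class APIConnectionError(Exception):
    """Local stand-in, not the LiteLLM class."""


# (provider, exception class name) -> number of occurrences
error_counts: Counter = Counter()


def record_error(provider: str, exc: BaseException) -> Tuple[str, str]:
    """Count one failure under its provider and exception class name."""
    key = (provider, type(exc).__name__)
    error_counts[key] += 1
    return key


def fallback_hits(provider: str) -> int:
    """How many failures for this provider ended up as APIConnectionError."""
    return error_counts[(provider, "APIConnectionError")]
```

Keying on the class name keeps fallback hits for cloudflare apart from real connection failures on other providers, which is the distinction the dashboard needs.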

4.2 ⚠️ together_ai standalone branch has an AttributeError bug (and is also dead code)

Lines 1908-1947:

if hasattr(original_exception, "status_code"):
    if original_exception.status_code == 408:
        raise Timeout(...)
    elif original_exception.status_code == 422:
        raise BadRequestError(...)
    elif original_exception.status_code == 429:
        raise RateLimitError(...)
    elif original_exception.status_code == 524:
        raise Timeout(...)
    # ← 401/403/500/503 are not handled, and there is no else fallback
else:
    raise APIError(
        status_code=original_exception.status_code,   # ← AttributeError!
        ...
    )

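Two problems:

- The else branch is only reached when there is no status_code attribute, yet its first line accesses original_exception.status_code → AttributeError. This is a real bug.
- The if hasattr ... block handles none of 401/403/500/503. For those status codes it silently falls through to the global fallback → APIConnectionError.

⚠️ But the whole block is dead code (§1, Fact 1): together_ai is in openai_compatible_providers and always takes the openai branch. So this bug never actually fires, but leaving it in place is legacy baggage.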
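The first problem can be shown in isolation. In this sketch `APIError` is a local stand-in, `buggy_tail` reproduces only the hasattr/else shape, and `fixed_tail` is one possible repair; its default of 500 is an assumption made for the example, not something the source prescribes.

```python
class APIError(Exception):
    """Local stand-in, not the LiteLLM class."""

    def __init__(self, status_code: int, message: str) -> None:
        super().__init__(message)
        self.status_code = status_code


def buggy_tail(original_exception: Exception) -> None:
    if hasattr(original_exception, "status_code"):
        return None  # status-specific handling elided
    else:
        # Only reached when the attribute is missing, so this access fails.
        raise APIError(
            status_code=original_exception.status_code,
            message=str(original_exception),
        )


def fixed_tail(original_exception: Exception) -> None:
    if hasattr(original_exception, "status_code"):
        return None  # status-specific handling elided
    # Assumed default: treat a missing status_code as 500.
    status_code = getattr(original_exception, "status_code", 500)
    raise APIError(status_code=status_code, message=str(original_exception))
```

Passing a plain ValueError to `buggy_tail` surfaces an AttributeError instead of the intended APIError, while `fixed_tail` raises APIError with the assumed default.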

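4.3 cohere uses `InternalServerError` for 500; replicate/bedrock/sagemaker use `ServiceUnavailableError`

cohere 1554 vs replicate 819 vs bedrock 1059 vs sagemaker 1175:

| Provider | Upstream really returns | LiteLLM exception raised |
|---|---|---|
| openai family / anthropic / vertex_ai / cohere | 500 | `InternalServerError` (status_code=500) |
| replicate / bedrock / sagemaker / aleph_alpha | 500 | `ServiceUnavailableError` (status_code=503) |

Consequences:
- The client sees different status codes (500 vs 503)
- _should_retry returns True for both (both are 5xx), so retry behavior is the same
- Both are on the cooldown allowlist, so cooldown behavior is the same
- But monitoring and alerting grouped by status_code will be split: the same failure shows up on two different panels

The sensible choice: 500 should map to InternalServerError. The bedrock/sagemaker line is probably a historical habit of treating every 5xx as "temporarily unavailable".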

4.4 nlp_cloud uses `APIError` for both 500 and 503, and `ServiceUnavailableError` for 504/520

Lines 1807-1829:

elif original_exception.status_code == 500 or original_exception.status_code == 503:
    raise APIError(status_code=..., ...)   # ← not InternalServerError/SU
elif original_exception.status_code == 504 or original_exception.status_code == 520:
    raise ServiceUnavailableError(...)     # ← not Timeout (other providers map 504→TO)

Comparison:

| status | nlp_cloud | Other providers |
|---|---|---|
| 500 | `APIError(500)` | `InternalServerError(500)` |
| 503 | `APIError(503)` | `ServiceUnavailableError(503)` |
| 504 | `ServiceUnavailableError(503)` (note that the status_code is replaced) | `Timeout(408)` or `Timeout(504)` |
| 520 | `ServiceUnavailableError(503)` | Not handled (falls to the global fallback) |

Most serious: 504 is a Timeout for other providers but becomes ServiceUnavailableError for nlp_cloud, so the RetryPolicy.TimeoutErrorRetries setting has no effect at all on nlp_cloud 504s.

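4.5 vertex_ai: the string "403" and status_code 403 raise different exceptions

Lines 1319-1333 vs line 1411:

elif "403" in error_str:
    raise BadRequestError(...)   # ← string path
...
elif original_exception.status_code == 403:
    raise PermissionDeniedError(...)   # ← status_code path

The string path takes precedence (§1, Fact 2), so in most cases a vertex 403 raises BadRequestError.

Consequence: the client receives 400 instead of 403. _should_retry(400)=False and _should_retry(403)=False, so neither retries; neither is on the cooldown allowlist either, so actual behavior is unchanged. But the status_code is wrong, and monitoring or client logic may misjudge it.

PermissionDenied is what is actually wanted: a genuine upstream 403 (the key has no vertex quota enabled) is a PermissionDeniedError, but the mere appearance of the three characters "403" in the string intercepts it first.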
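How loose the bare substring test is can be seen by comparing it with a stricter matcher. Both functions are illustrative: `hits_403_string_path` mirrors the `"403" in error_str` check quoted above, while `mentions_403_status` is a hypothetical alternative that only accepts 403 as a standalone number; neither is taken from the real file, and the sample message is made up.

```python
import re


def hits_403_string_path(error_str: str) -> bool:
    """Mirror of the bare substring check on the string path."""
    return "403" in error_str


def mentions_403_status(error_str: str) -> bool:
    """Stricter variant: 403 must not be part of a longer digit run."""
    return re.search(r"(?<!\d)403(?!\d)", error_str) is not None


# Hypothetical message where "403" only appears inside a project number.
quota_message = "Quota exceeded for project 1403927"
```

The substring check fires on `quota_message` even though no 403 status is involved, whereas the stricter matcher accepts "403 Forbidden" and rejects the project number.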

4.6 sagemaker: credential errors are mapped to `BadRequestError`

Lines 1142-1149:

if "Unable to locate credentials" in error_str:
    raise BadRequestError(...)

Compare bedrock, lines 1008-1019:

if (
    "Unable to locate credentials" in error_str
    or "The security token included in the request is invalid"
    in error_str
):
    raise AuthenticationError(...)

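The same AWS error message is mapped to AuthenticationError (401) by bedrock and to BadRequestError (400) by sagemaker. The consequences differ: 401 is on the cooldown allowlist and is cooled down immediately via path 1.4; 400 is neither retried nor cooled down.

4.7 vllm: status_code == 0 → APIConnectionError (correct by design, but not obvious from the documentation)

Lines 2043-2052: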

elif custom_llm_provider == "vllm":
    if hasattr(original_exception, "status_code"):
        if original_exception.status_code == 0:
            raise APIConnectionError(...)

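This is correct: the vllm client (litellm/llms/vllm/completion/handler.py) has a convention of setting status_code to 0 when the local server cannot be reached. Converting that to APIConnectionError here is semantically right (a local network problem).

But the vllm branch contains only this one rule. Any other status (500, 502, 503) falls through to the global fallback → APIConnectionError (also a network-layer error), so the Router never cools down a vllm deployment. If vllm really does OOM and return 500, your LiteLLM will keep hammering it.

5. Global fallback: `ensure generic errors always return APIConnectionError`

Lines 2350-2410: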

if "BadRequestError.__init__() missing 1 required positional argument: 'param'" in str(original_exception):
    # Handles an edge-case bug in the openai-python sdk
    raise BadRequestError(...)
else:
    # ensure generic errors always return APIConnectionError
    if hasattr(original_exception, "request"):
        raise APIConnectionError(
            message="{} - {}".format(exception_provider, error_str),
            ...
        )
    else:
        raise APIConnectionError(
            message="{}\n{}".format(str(original_exception), traceback.format_exc()),
            ...
        )

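This is the last line of defense: when no provider branch matches, an APIConnectionError is raised uniformly.

⚠️ Design intent vs actual consequences:
- Design intent: LiteLLM does not want the client to see raw exceptions such as httpx.HTTPError; everything must be wrapped in an OpenAI-compatible format. APIConnectionError serves as the catch-all class for "generic unknown errors".
- Actual consequence: because of this fallback, any status_code a provider branch does not handle is mapped to APIConnectionError. Combined with the string-based skip in the cooldown logic, an incomplete provider branch means the Router's health judgment for that provider stops working.

This is why cloudflare (§4.1) and vllm (§4.7) are so dangerous: their provider branches are too short.

The very last fallback, lines 2400-2410: if the try block above raised any exception (including a crash in the mapping logic itself), it is caught and run through the LITELLM_EXCEPTION_TYPES check once more; if nothing matches, another APIConnectionError is raised with traceback.format_exc() appended to the message. This is the fallback of the fallback.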

6. Quick reference: message-based keyword matching

Listed by provider: every place where the exception type is decided by keywords in the error string (not by status_code).

openai family (58 providers, lines 387-502)

| Keyword | Raises | Line |
|---|---|---|
| `ExceptionCheckers.is_error_str_rate_limit()` (matches "rate limit"/"429"/"service tier capacity exceeded", etc.) | `RateLimitError` | 387 |
| `is_error_str_context_window_exceeded()` ("exceed context limit" and others, 8 substrings in total) | `ContextWindowExceededError` | 395 |
| `"invalid_request_error"` + `"model_not_found"` | `NotFoundError` | 404 |
| `"A timeout occurred"` | `Timeout` | 416 |
| Content moderation triple (`content_policy_violation` / `Invalid prompt ... violating our usage policy` / `request was rejected as a result of the safety system`) | `ContentPolicyViolationError` | 424 |
| `"invalid_request_error"` && NOT `"Incorrect API key provided"` | `BadRequestError` | 446 |
| `"Web server is returning an unknown error"` / `"The server had an error processing your request"` | `InternalServerError` | 460 |
| `"Request too large"` | `RateLimitError` ⚠️ (not BR) | 469 |
| `"The api_key client option must be set ..."` | `AuthenticationError` | 479 |
| `"Mistral API raised a streaming error"` | `APIError` | 490 |

anthropic (lines 621-651)

| Keyword | Raises |
|---|---|
| `"prompt is too long"` / `"prompt: length"` | `ContextWindowExceededError` |
| `"overloaded_error"` / `"Overloaded"` | `InternalServerError` |
| `"Invalid API Key"` | `AuthenticationError` |
| `"content filtering policy"` | `ContentPolicyViolationError` |
| `"Client error '400 Bad Request'"` | `BadRequestError` |

vertex_ai (lines 1262-1385)

| Keyword | Raises |
|---|---|
| `"Vertex AI API has not been used in project"` / `"Unable to find your project"` | `BadRequestError` |
| `"400 Request payload size exceeds"` | `ContextWindowExceededError` |
| `is_error_str_context_window_exceeded()` | `ContextWindowExceededError` |
| `"None Unknown Error."` / `"Content has no parts."` | `InternalServerError` |
| `"API key not valid."` | `AuthenticationError` |
| `"403"` | `BadRequestError` ⚠️ (not PD, see §4.5) |
| `"The response was blocked."` / `"Output blocked by content filtering policy"` | `ContentPolicyViolationError` |
| Five 429-family keywords (`"429 Quota exceeded"` / `"Quota exceeded for"` / `"Resource exhausted"` / `"IndexError: list index out of range"` ⚠️ / `"429 Unable to submit ... temporarily out of capacity"`) | `RateLimitError` |
| `"500 Internal Server Error"` / `"The model is overloaded."` | `InternalServerError` |

⚠️ Mapping "IndexError: list index out of range" to RateLimitError is suspicious: it looks like a Python bug message and has nothing to do with rate limiting. Possibly the SDK raises IndexError on some specific upstream path when quota is exhausted, and the author mapped it empirically. Do not rely on it until verified.

bedrock (lines 968-1049)

| Keyword | Raises |
|---|---|
| Six context window keywords (`"too many tokens"` / `"expected maxLength:"` / `"Input is too long"` / `"prompt is too long"` / `"prompt: length: 1.."` / `"Too many input tokens"`) | `ContextWindowExceededError` |
| `"Conversation blocks and tool result blocks cannot be provided in the same turn."` | `BadRequestError` |
| `"Malformed input request"` / `"A conversation must start with a user message."` | `BadRequestError` |
| `"Unable to locate credentials"` / `"The security token included in the request is invalid"` | `AuthenticationError` |
| `"AccessDeniedException"` | `PermissionDeniedError` |
| `"throttlingException"` / `"ThrottlingException"` | `RateLimitError` |
| `"Connect timeout on endpoint URL"` / `"timed out"` | `Timeout` |
| `"Could not process image"` | `InternalServerError` |

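Keyword summary for the other providers

| Provider | Keyword count | Notes |
|---|---|---|
| sagemaker | 4 | `"Unable to locate credentials"` is mapped to `BadRequestError` (§4.6) ⚠️ |
| cloudflare | 2 | These two rules are the entire branch (§4.1) |
| cohere | 7 | Includes class-name sniffing: `"CohereConnectionError" in exception_type` |
| huggingface | 3 | |
| nlp_cloud | 6+ (parses JSON detail) | |
| together_ai | Many (parses JSON error_response) | dead code |
| aleph_alpha | 2 | |
| ollama | 4 (string sniffing on dict.get("error")) | |
| azure | 6+ | Includes deep structural checks such as `body["error"]["inner_error"]["code"] == "ResponsibleAIPolicyViolation"` |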

7. Historical bug: what commit `2bee019` (2026-05-29) fixed

commit 2bee019 (TCLOUD-11688 fix subscription-mode failover) changed the vertex_ai 502 mapping from APIConnectionError to BadGatewayError:

                     if original_exception.status_code == 502:
                         exception_mapping_worked = True
-                        raise APIConnectionError(
-                            message=f"{custom_llm_provider.capitalize()}Exception - {error_str}",
+                        raise BadGatewayError(
+                            message=f"BadGatewayError: {custom_llm_provider.capitalize()} - {error_str}",
                             llm_provider=custom_llm_provider,
                             model=model,
                         )

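Before the fix: vertex upstream 502 → APIConnectionError(status_code=500) → Router does not cool down → keeps hitting the broken instance.
After the fix: vertex upstream 502 → BadGatewayError(status_code=502) → on the cooldown allowlist → cooled down via path 1.3 when the failure rate is high → fallback switches away.

→ This is the same class of bug that §4.1 cloudflare / §4.7 vllm still have today; the legacy is being cleaned up one provider at a time.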

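8. In practice: which provider should you copy when integrating a new one?

In descending order of completeness:

| Copy from | Strengths | Suitable when |
|---|---|---|
| openai family (lines 356-617) | 11 status codes + 10 keyword rules, the most complete; reusable in one step via `openai_compatible_providers` | Your provider exposes an OpenAI-compatible API |
| anthropic (lines 618-732) | 7 status codes including 502/503/504; message matching covers content moderation | The format resembles Anthropic's messages |
| vertex_ai (lines 1262-1479) | Rich string matching; the 502 pitfall is fixed by the recent commit; the "403" string path still has the §4.5 pitfall | Google family |
| azure (lines 2053-2253) | Content moderation uses structured matching on `body["error"]["inner_error"]["code"]`; complete status_code handling | Azure derivatives |

Do not copy:
- ❌ Short branches such as cloudflare, vllm, aleph_alpha: you would inherit the fall-through to APIConnectionError
- ❌ nlp_cloud: it uses the wrong classes for status_code 500/503/504
- ❌ The 500→ServiceUnavailableError mapping of replicate/bedrock/sagemaker: the client status_code would be wrong
- ❌ The standalone together_ai branch: it is dead code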

Minimal checklist for integrating a new provider

elif custom_llm_provider == "your_provider":
    # 1. Extract the message
    message = get_error_message(error_obj=original_exception)
    if message is None:
        message = getattr(original_exception, "message", str(original_exception))

    # 2. Keywords first (optional): capture whatever semantics the upstream string offers
    if "context length" in error_str:
        raise ContextWindowExceededError(message=..., model=model, llm_provider="your_provider")
    if "content policy" in error_str:
        raise ContentPolicyViolationError(...)

    # 3. Cover every status_code
    if hasattr(original_exception, "status_code"):
        sc = original_exception.status_code
        if sc == 400: raise BadRequestError(...)
        elif sc == 401: raise AuthenticationError(...)
        elif sc == 403: raise PermissionDeniedError(...)
        elif sc == 404: raise NotFoundError(...)
        elif sc == 408: raise Timeout(...)
        elif sc == 422: raise UnprocessableEntityError(...)
        elif sc == 429: raise RateLimitError(...)
        elif sc == 500: raise InternalServerError(...)
        elif sc == 502: raise BadGatewayError(...)           # ← do not take the shortcut of APIConnectionError
        elif sc == 503: raise ServiceUnavailableError(...)
        elif sc == 504: raise Timeout(exception_status_code=504, ...)
        else: raise APIError(status_code=sc, ...)            # ← do not let it fall through to the global fallback

The else: raise APIError(...) in step 3 is the key: it guarantees that non-standard status codes leave with the correct status_code instead of turning into APIConnectionError.
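Step 3 of the checklist can also be written as a lookup table, which makes a missing status code harder to overlook. This is a sketch under stated assumptions: every class here is a local stand-in that simply stores the status code it was given (the real LiteLLM classes take different constructor arguments), and `STATUS_TO_EXCEPTION` / `map_status` are hypothetical names.

```python
from typing import Dict, Type


class MappedError(Exception):
    """Base for the local stand-ins; keeps the upstream status code."""

    def __init__(self, message: str, status_code: int) -> None:
        super().__init__(message)
        self.status_code = status_code


class BadRequestError(MappedError): pass
class AuthenticationError(MappedError): pass
class PermissionDeniedError(MappedError): pass
class NotFoundError(MappedError): pass
class Timeout(MappedError): pass
class UnprocessableEntityError(MappedError): pass
class RateLimitError(MappedError): pass
class InternalServerError(MappedError): pass
class BadGatewayError(MappedError): pass
class ServiceUnavailableError(MappedError): pass
class APIError(MappedError): pass


STATUS_TO_EXCEPTION: Dict[int, Type[MappedError]] = {
    400: BadRequestError,
    401: AuthenticationError,
    403: PermissionDeniedError,
    404: NotFoundError,
    408: Timeout,
    422: UnprocessableEntityError,
    429: RateLimitError,
    500: InternalServerError,
    502: BadGatewayError,
    503: ServiceUnavailableError,
    504: Timeout,
}


def map_status(status_code: int, message: str) -> MappedError:
    """Pick the class for a status code; unknown codes become APIError, never the global fallback."""
    exc_cls = STATUS_TO_EXCEPTION.get(status_code, APIError)
    return exc_cls(message, status_code)
```

The `.get(..., APIError)` default plays the role of the final else in the checklist: a status code such as 418 still leaves with its own number attached.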

9. Line-number appendix: all `raise` locations

Quick lookup of main status_code → exception class → line number (sorted by status_code, then provider):

400 BadRequestError

openai 507 / anthropic 672 / replicate 780 / bedrock 1080 / sagemaker 1196 / vertex_ai 1389 / cloudflare 1491 / cohere 1541 / huggingface 1635 / nlp_cloud 1769 / aleph_alpha 1981 / azure 2164 / openrouter 2260

401 AuthenticationError

openai 516 / anthropic 662 / replicate 769 / bedrock 1072 / sagemaker 1188 / vertex_ai 1404 / huggingface 1627 / ai21 1694 / nlp_cloud 1780 / aleph_alpha 1974 / azure 2174 / openrouter 2269

403 PermissionDeniedError

vertex_ai 1411 (the only place that uses the correct class)

404 NotFoundError

openai 525 / anthropic 679 / bedrock 1088 / sagemaker 1204 / vertex_ai 1425 / openrouter 2278

408 Timeout

openai 534 / anthropic 686 / replicate 796 / bedrock 1096 / sagemaker 1212 / vertex_ai 1432 / cohere 1549 / huggingface 1643 / ai21 1702 / together_ai 1911 / azure 2183 / openrouter 2287

422 UnprocessableEntityError

replicate 788 (the only one that uses UE; all others map to BadRequestError)

429 RateLimitError

openai 552 / anthropic 693 / replicate 811 / bedrock 1113 / sagemaker 1232 / vertex_ai 1440 / huggingface 1650 / ai21 1717 / nlp_cloud 1801 / together_ai 1926 / aleph_alpha 1989 / azure 2200 / openrouter 2304

500 InternalServerError / ServiceUnavailableError (mixed usage)

openai 561 IS / anthropic 703 IS / vertex_ai 1455 IS / cohere 1556 IS (§4.3)
replicate 819 SU ⚠️ / bedrock 1059 SU ⚠️ / sagemaker 1175 SU ⚠️ / aleph_alpha 1997 SU ⚠️
nlp_cloud 1812 APIError ⚠️ (§4.4)

502 BadGatewayError

openai 570 / anthropic 711 / vertex_ai 1468 (fixed in 2bee019) / azure 2209

503 ServiceUnavailableError

openai 579 / anthropic 719 / bedrock 1122 / sagemaker 1241 / vertex_ai 1475 / huggingface 1658 / azure 2218 / openrouter 2313
nlp_cloud 1812 uses APIError ⚠️

504 Timeout / other

openai 588 TO / anthropic 727 TO / bedrock 1131 TO / sagemaker 1250 TO / azure 2227 TO / openrouter 2322 TO
nlp_cloud 1824 ServiceUnavailableError ⚠️ (§4.4)

Where APIConnectionError appears

openai fallback 608 / vllm status==0 2047 (§4.7) / azure "Connection error" 2154 / azure no-status fallback 2248 / openrouter no-status fallback 2341 / global fallback 2367 + 2374 + 2404

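Next steps

- How the Router reacts to each exception (cooldown / retry / fallback) → 03-router-behavior.md
- What the client can see / where to find the details of a specific failure → 04-where-to-see.md
- Working backward from symptom to root cause → 05-troubleshooting-by-symptom.md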