Skill Description

Routes LLM requests to a local model (Ollama, LM Studio, or llamafile) first and falls back to cloud APIs only when needed. Tracks token savings and avoided cost in a persistent dashboard. Use when: (1) the user asks to run a task with a local model first, (2) the user wants to reduce cloud API costs or keep requests private, (3) the user asks to see their token savings or LLM routing dashboard, (4) any request where local-vs-cloud routing should be decided automatically.
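The local-first routing with cloud fallback described above can be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: `route`, `SavingsTracker`, and the per-1k-token price are hypothetical names and values, providers are passed in as plain callables rather than real Ollama, LM Studio, or llamafile clients, and persisting the dashboard to disk is left out.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# Assumption: a provider is any callable that takes a prompt and
# returns (reply_text, tokens_used). Real clients would wrap an HTTP call.
Provider = Callable[[str], Tuple[str, int]]


@dataclass
class SavingsTracker:
    """Hypothetical in-memory tally of tokens served locally vs. by the cloud."""
    cloud_price_per_1k_tokens: float  # assumed price, supplied by the caller
    local_tokens: int = 0
    cloud_tokens: int = 0

    @property
    def cost_avoided(self) -> float:
        # Tokens handled locally, priced at what the cloud would have charged.
        return self.local_tokens / 1000 * self.cloud_price_per_1k_tokens


def route(prompt: str, local: Provider, cloud: Provider,
          tracker: SavingsTracker) -> Tuple[str, str]:
    """Try the local provider first; fall back to the cloud if it raises."""
    try:
        reply, tokens = local(prompt)
    except Exception:
        # Local model unreachable or failed: use the cloud API instead.
        reply, tokens = cloud(prompt)
        tracker.cloud_tokens += tokens
        return "cloud", reply
    tracker.local_tokens += tokens
    return "local", reply
```

Passing providers in as callables keeps the routing decision separate from any one backend, so the same function could sit in front of whichever local server happens to be running.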



Copy the prompt below and send it to your AI assistant to complete the installation.

Help me download and install this SKILL: https://skillhub.cstcloud.cn/download/local-first-llm

Alternatively, click the Download SKILL button in the top-right corner.

Metadata

Category: Tool
Downloads: 4
Views: 4
Tags:
local-llm-routing token-savings-tracker cloud-fallback