feat: add openllm connect docs (#3)

fix/app-link-error
takatost 2023-08-21 00:10:45 +08:00 committed by GitHub
parent 3c2a76fdfe
commit dfe189754f
8 changed files with 90 additions and 2 deletions


@ -46,6 +46,7 @@
* [Hugging Face](advanced/model-configuration/hugging-face.md)
* [Replicate](advanced/model-configuration/replicate.md)
* [Xinference](advanced/model-configuration/xinference.md)
* [OpenLLM](advanced/model-configuration/openllm.md)
* [More Integration](advanced/more-integration.md)
## Use Cases


@ -7,6 +7,8 @@ Dify currently supports major model providers such as OpenAI's GPT series. Here
* Anthropic
* Hugging Face Hub
* Replicate
* Xinference
* OpenLLM
* iFLYTEK SPARK
* WENXINYIYAN
* TONGYI
@ -76,6 +78,7 @@ There are many third-party models on hosting type providers. Access models need
* [Hugging Face](hugging-face.md).
* [Replicate](replicate.md).
* [Xinference](xinference.md).
* [OpenLLM](openllm.md).
### Use model


@ -0,0 +1,40 @@
# Connecting to Locally Deployed OpenLLM Models
> 🚧 WIP
With [OpenLLM](https://github.com/bentoml/OpenLLM), you can run inference with any open-source large language model, deploy to the cloud or on-premises, and build powerful AI apps.
Dify supports connecting to the inference capabilities of locally deployed OpenLLM models.
## Deploy OpenLLM Model
Each OpenLLM Server can deploy one model, and you can deploy it in the following way:
1. First, install OpenLLM through PyPI:
```bash
$ pip install openllm
```
2. Locally deploy and start the OpenLLM model:
```bash
$ openllm start opt --model_id facebook/opt-125m -p 3333
2023-08-20T23:49:59+0800 [INFO] [cli] Prometheus metrics for HTTP BentoServer from "_service:svc" can be accessed at http://localhost:3333/metrics.
2023-08-20T23:50:00+0800 [INFO] [cli] Starting production HTTP BentoServer from "_service:svc" listening on http://0.0.0.0:3333 (Press CTRL+C to quit)
```
After OpenLLM starts, it serves an API on local port `3333`; the endpoint is `http://127.0.0.1:3333`. Since OpenLLM's default port 3000 conflicts with Dify's web service, the port is changed to 3333 here.
If you need to change the host or port, see the help for the start command: `openllm start opt --model_id facebook/opt-125m -h`.
> Note: The `facebook/opt-125m` model is used here only for demonstration, and its output quality may be poor. Please choose a model appropriate for your use case. For more models, see the [Supported Model List](https://github.com/bentoml/OpenLLM#-supported-models).
3. Once the model is deployed, connect to it in Dify.
Under `Settings > Model Providers > OpenLLM`, fill in:
- Model Name: `facebook/opt-125m`
- Server URL: `http://127.0.0.1:3333`
Click "Save" and the model can be used in the application.
These instructions are only a quick-connection example. For more OpenLLM features and usage information, see [OpenLLM](https://github.com/bentoml/OpenLLM).
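Since Dify's web service already occupies port 3000, it can help to confirm a port is actually free before passing it to `-p`. A minimal sketch (the `port_is_free` helper and the fallback port are illustrative, not part of OpenLLM or Dify):

```python
import socket

def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if nothing is listening on (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(0.5)
        # connect_ex returns 0 when a connection succeeds,
        # i.e. something is already listening on the port.
        return s.connect_ex((host, port)) != 0

# Use 3333 (as in the example above) only if it is actually free.
port = 3333 if port_is_free(3333) else 3334
print(f"start with: openllm start opt --model_id facebook/opt-125m -p {port}")
```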


@ -1,6 +1,6 @@
# Connecting to Xinference Local Deployed Models
> WIP 🚧
> 🚧 WIP
[Xorbits inference](https://github.com/xorbitsai/inference) is a powerful and versatile library designed to serve language, speech recognition, and multimodal models, and can even be used on laptops. It supports various models compatible with GGML, such as chatglm, baichuan, whisper, vicuna, orca, etc.
Dify supports connecting to the inference and embedding capabilities of locally deployed Xinference models.


@ -45,6 +45,7 @@
* [Connecting to Open-Source Models on Hugging Face](advanced/model-configuration/hugging-face.md)
* [Connecting to Open-Source Models on Replicate](advanced/model-configuration/replicate.md)
* [Connecting to Locally Deployed Xinference Models](advanced/model-configuration/xinference.md)
* [Connecting to Locally Deployed OpenLLM Models](advanced/model-configuration/openllm.md)
* [More Integration](advanced/more-integration.md)
## Use Cases <a href="#use-cases" id="use-cases"></a>


@ -7,6 +7,8 @@ Dify currently supports major model providers such as OpenAI's GPT series.
* Anthropic
* Hugging Face Hub
* Replicate
* Xinference
* OpenLLM
* iFLYTEK SPARK
* WENXINYIYAN
* TONGYI
@ -79,6 +81,7 @@ Dify uses [PKCS1_OAEP](https://pycryptodome.readthedocs.io/en/latest/src/ci
* [Hugging Face](hugging-face.md).
* [Replicate](replicate.md).
* [Xinference](xinference.md).
* [OpenLLM](openllm.md).


@ -0,0 +1,40 @@
# Connecting to Locally Deployed OpenLLM Models
> 🚧 WIP
With [OpenLLM](https://github.com/bentoml/OpenLLM), you can run inference with any open-source large language model, deploy to the cloud or on-premises, and build powerful AI apps.
Dify supports connecting to the inference capabilities of locally deployed OpenLLM models.
## Deploy OpenLLM Model
Each OpenLLM server can deploy one model. You can deploy it as follows:
1. First, install OpenLLM through PyPI:
```bash
$ pip install openllm
```
2. Locally deploy and start the OpenLLM model:
```bash
$ openllm start opt --model_id facebook/opt-125m -p 3333
2023-08-20T23:49:59+0800 [INFO] [cli] Prometheus metrics for HTTP BentoServer from "_service:svc" can be accessed at http://localhost:3333/metrics.
2023-08-20T23:50:00+0800 [INFO] [cli] Starting production HTTP BentoServer from "_service:svc" listening on http://0.0.0.0:3333 (Press CTRL+C to quit)
```
After OpenLLM starts, it serves an API on local port `3333`; the endpoint is `http://127.0.0.1:3333`. Since OpenLLM's default port 3000 conflicts with Dify's web service, the port is changed to 3333 here.
If you need to change the host or port, see the help for the start command: `openllm start opt --model_id facebook/opt-125m -h`.
> Note: The `facebook/opt-125m` model is used here only for demonstration, and its output quality may be poor. Please choose a model appropriate for your use case. For more models, see the [Supported Model List](https://github.com/bentoml/OpenLLM#-supported-models).
3. Once the model is deployed, connect to it in Dify.
Under `Settings > Model Providers > OpenLLM`, fill in:
- Model Name: `facebook/opt-125m`
- Server URL: `http://127.0.0.1:3333`
Click "Save" and the model can be used in the application.
These instructions are only a quick-connection example. For more OpenLLM features and usage information, see [OpenLLM](https://github.com/bentoml/OpenLLM).
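Before filling in the Server URL in Dify, you can confirm the endpoint is actually reachable. The startup log above shows that the BentoServer exposes a Prometheus `/metrics` route; a minimal sketch that probes it (the `openllm_reachable` helper name is illustrative, not part of OpenLLM):

```python
import urllib.request
import urllib.error

def openllm_reachable(server_url: str, timeout: float = 2.0) -> bool:
    """Probe the /metrics route shown in the OpenLLM startup log."""
    try:
        with urllib.request.urlopen(f"{server_url}/metrics", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx response.
        return False

if openllm_reachable("http://127.0.0.1:3333"):
    print("OpenLLM server is up; configure Dify with this URL.")
else:
    print("OpenLLM server is not reachable; check the host and port.")
```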


@ -1,6 +1,6 @@
# Connecting to Locally Deployed Xinference Models
> WIP 🚧
> 🚧 WIP
[Xorbits inference](https://github.com/xorbitsai/inference) is a powerful and versatile distributed inference framework designed to serve large language models, speech recognition models, and multimodal models, and can even be used on a laptop. It supports a variety of GGML-compatible models such as chatglm, baichuan, whisper, vicuna, and orca.
Dify supports connecting to the inference and embedding capabilities of locally deployed Xinference models.