feat: add localai
parent 555382ce4b
commit f38144de4a
@@ -48,6 +48,7 @@
* [Replicate](advanced/model-configuration/replicate.md)
* [Xinference](advanced/model-configuration/xinference.md)
* [OpenLLM](advanced/model-configuration/openllm.md)
* [LocalAI](advanced/model-configuration/localai.md)
* [More Integration](advanced/more-integration.md)

## use cases
@@ -0,0 +1,89 @@
# Integrating with LocalAI for Local Model Deployment

[LocalAI](https://github.com/go-skynet/LocalAI) is a drop-in replacement REST API compatible with the OpenAI API specification for local inferencing. It lets you run LLMs (and other models) locally or on-prem on consumer-grade hardware, supporting multiple model families that are compatible with the ggml format. It does not require a GPU.

Dify allows integration with LocalAI for local deployment of large language model inference and embedding capabilities.

## Deploying LocalAI

You can refer to the official [Getting Started](https://localai.io/basics/getting_started/) guide for deployment, or quickly integrate by following the steps below:

(These steps are derived from the [LocalAI Data query example](https://github.com/go-skynet/LocalAI/blob/master/examples/langchain-chroma/README.md).)

1. First, clone the LocalAI code repository and navigate to the specified directory.

```bash
$ git clone https://github.com/go-skynet/LocalAI
$ cd LocalAI/examples/langchain-chroma
```

2. Download the example LLM and Embedding models.

```bash
$ wget https://huggingface.co/skeskinen/ggml/resolve/main/all-MiniLM-L6-v2/ggml-model-q4_0.bin -O models/bert
$ wget https://gpt4all.io/models/ggml-gpt4all-j.bin -O models/ggml-gpt4all-j
```

Here, we choose two smaller models that are compatible across all platforms: `ggml-gpt4all-j` serves as the default LLM model and `all-MiniLM-L6-v2` as the default Embedding model, for quick local deployment.
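If you want to confirm the downloads landed where LocalAI expects them, a quick listing of the models directory is enough:

```bash
# both example models should now be present under models/
$ ls -lh models/
```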
3. Configure the `.env` file.

```shell
$ mv .env.example .env
```

NOTE: Ensure that the `THREADS` variable value in `.env` doesn't exceed the number of CPU cores on your machine.
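One way to keep `THREADS` in line with the hardware is to derive it from the machine itself. A minimal sketch, assuming a Linux host where `nproc` is available:

```shell
# set THREADS in .env to this machine's CPU core count
$ sed -i "s/^THREADS=.*/THREADS=$(nproc)/" .env
```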
4. Start LocalAI.

```shell
# start with docker-compose
$ docker-compose up -d --build

# tail the logs & wait until the build completes
$ docker logs -f langchain-chroma-api-1
7:16AM INF Starting LocalAI using 4 threads, with models path: /models
7:16AM INF LocalAI version: v1.24.1 (9cc8d9086580bd2a96f5c96a6b873242879c70bc)
```

The LocalAI request API endpoint will be available at http://127.0.0.1:8080.
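Since LocalAI exposes an OpenAI-compatible API, you can optionally verify the service is reachable by listing the models it serves (assuming `curl` is available on the host):

```shell
# list the models the endpoint serves
$ curl http://127.0.0.1:8080/v1/models
```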
It provides two models, namely:

- LLM Model: `ggml-gpt4all-j`

  External access name: `gpt-3.5-turbo` (this name is customizable and can be configured in `models/gpt-3.5-turbo.yaml`).

- Embedding Model: `all-MiniLM-L6-v2`

  External access name: `text-embedding-ada-002` (this name is customizable and can be configured in `models/embeddings.yaml`).
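Before wiring these into Dify, it can be useful to smoke-test both external names against the OpenAI-compatible endpoints. The requests below assume the default access names from this example:

```shell
# chat completion served by ggml-gpt4all-j under its external name
$ curl http://127.0.0.1:8080/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{"model": "gpt-3.5-turbo", "messages": [{"role": "user", "content": "Hello"}]}'

# embedding request served by all-MiniLM-L6-v2 under its external name
$ curl http://127.0.0.1:8080/v1/embeddings \
    -H "Content-Type: application/json" \
    -d '{"model": "text-embedding-ada-002", "input": "A sentence to embed"}'
```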
5. Integrate the models into Dify.

Go to `Settings > Model Providers > LocalAI` and fill in:

Model 1: `ggml-gpt4all-j`

- Model Type: Text Generation
- Model Name: `gpt-3.5-turbo`
- Server URL: http://127.0.0.1:8080

If Dify is deployed via Docker, fill in the host domain instead: `http://<your-LocalAI-endpoint-domain>:8080`. This can be a LAN IP address, e.g. `http://192.168.1.100:8080`, since `127.0.0.1` inside the Dify container refers to the container itself, not the host running LocalAI.

Click "Save" to use the model in the application.

Model 2: `all-MiniLM-L6-v2`

- Model Type: Embeddings
- Model Name: `text-embedding-ada-002`
- Server URL: http://127.0.0.1:8080

If Dify is deployed via Docker, fill in the host domain instead: `http://<your-LocalAI-endpoint-domain>:8080`. This can be a LAN IP address, e.g. `http://192.168.1.100:8080`.

Click "Save" to use the model in the application.

For more information about LocalAI, please refer to: https://github.com/go-skynet/LocalAI
@@ -47,6 +47,7 @@
* [Connecting to open-source models on Replicate](advanced/model-configuration/replicate.md)
* [Connecting to local models deployed with Xinference](advanced/model-configuration/xinference.md)
* [Connecting to local models deployed with OpenLLM](advanced/model-configuration/openllm.md)
* [Connecting to local models deployed with LocalAI](advanced/model-configuration/localai.md)
* [More Integrations](advanced/more-integration.md)

## Use Cases <a href="#use-cases" id="use-cases"></a>
@@ -0,0 +1,97 @@
# Connecting to Local Models Deployed with LocalAI

[LocalAI](https://github.com/go-skynet/LocalAI) is a local inference framework that provides a REST API compatible with the OpenAI API specification. It lets you run LLMs (and other models) on consumer-grade hardware, locally or on your own servers, and supports multiple model families compatible with the ggml format. It does not require a GPU.

Dify supports connecting to the LLM inference and embedding capabilities of a locally deployed LocalAI instance.

## Deploying LocalAI

You can follow the official [Getting Started](https://localai.io/basics/getting_started/) guide for deployment, or use the steps below for a quick integration:

(The following steps are taken from the [LocalAI Data query example](https://github.com/go-skynet/LocalAI/blob/master/examples/langchain-chroma/README.md).)

1. First, clone the LocalAI code repository and navigate to the specified directory.

```bash
$ git clone https://github.com/go-skynet/LocalAI
$ cd LocalAI/examples/langchain-chroma
```

2. Download the example LLM and Embedding models.

```bash
$ wget https://huggingface.co/skeskinen/ggml/resolve/main/all-MiniLM-L6-v2/ggml-model-q4_0.bin -O models/bert
$ wget https://gpt4all.io/models/ggml-gpt4all-j.bin -O models/ggml-gpt4all-j
```

Here we choose two smaller models that work on every platform: `ggml-gpt4all-j` serves as the default LLM model and `all-MiniLM-L6-v2` as the default Embedding model, for quick local deployment.

3. Configure the `.env` file.

```shell
$ mv .env.example .env
```

NOTE: Make sure the `THREADS` variable value in `.env` does not exceed the number of CPU cores on your machine.
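One way to keep `THREADS` within that limit is to derive it from the machine itself. A minimal sketch, assuming a Linux host where `nproc` is available:

```shell
# set THREADS in .env to this machine's CPU core count
$ sed -i "s/^THREADS=.*/THREADS=$(nproc)/" .env
```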
4. Start LocalAI.

```shell
# start with docker-compose
$ docker-compose up -d --build

# tail the logs & wait until the build completes
$ docker logs -f langchain-chroma-api-1
7:16AM INF Starting LocalAI using 4 threads, with models path: /models
7:16AM INF LocalAI version: v1.24.1 (9cc8d9086580bd2a96f5c96a6b873242879c70bc)

 ┌───────────────────────────────────────────────────┐
 │                   Fiber v2.48.0                   │
 │               http://127.0.0.1:8080               │
 │       (bound on host 0.0.0.0 and port 8080)       │
 │                                                   │
 │ Handlers ............ 55  Processes ........... 1 │
 │ Prefork ....... Disabled  PID ................ 14 │
 └───────────────────────────────────────────────────┘
```

The local endpoint http://127.0.0.1:8080 is now exposed as the LocalAI request API endpoint.
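Because the API is OpenAI-compatible, you can optionally confirm the service is reachable by listing the models it serves (assuming `curl` is available on the host):

```shell
# list the models the endpoint serves
$ curl http://127.0.0.1:8080/v1/models
```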
It serves two models:

- LLM model: `ggml-gpt4all-j`

  External access name: `gpt-3.5-turbo` (this name is customizable and can be configured in `models/gpt-3.5-turbo.yaml`).

- Embedding model: `all-MiniLM-L6-v2`

  External access name: `text-embedding-ada-002` (this name is customizable and can be configured in `models/embeddings.yaml`).
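As a quick check before moving on, both external names can be exercised against the OpenAI-compatible endpoints; the requests below assume the default names from this example:

```shell
# chat completion served by ggml-gpt4all-j under its external name
$ curl http://127.0.0.1:8080/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{"model": "gpt-3.5-turbo", "messages": [{"role": "user", "content": "Hello"}]}'

# embedding request served by all-MiniLM-L6-v2 under its external name
$ curl http://127.0.0.1:8080/v1/embeddings \
    -H "Content-Type: application/json" \
    -d '{"model": "text-embedding-ada-002", "input": "A sentence to embed"}'
```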
5. The LocalAI API service is now deployed; next, connect the models in Dify.

In `Settings > Model Providers > LocalAI`, fill in:

Model 1: `ggml-gpt4all-j`

- Model Type: Text Generation
- Model Name: `gpt-3.5-turbo`
- Server URL: http://127.0.0.1:8080

If Dify is deployed via Docker, fill in the host domain instead: `http://<your-LocalAI-endpoint-domain>:8080`. This can be a LAN IP address, e.g. `http://192.168.1.100:8080`.

Click "Save"; the model can then be used in applications.

Model 2: `all-MiniLM-L6-v2`

- Model Type: Embeddings
- Model Name: `text-embedding-ada-002`
- Server URL: http://127.0.0.1:8080

If Dify is deployed via Docker, fill in the host domain instead: `http://<your-LocalAI-endpoint-domain>:8080`. This can be a LAN IP address, e.g. `http://192.168.1.100:8080`.

Click "Save"; the model can then be used in applications.

For more information about LocalAI, please refer to: https://github.com/go-skynet/LocalAI