feat: add localai

pull/12/head
takatost 2023-09-01 17:55:44 +08:00
parent 555382ce4b
commit f38144de4a
4 changed files with 188 additions and 0 deletions

View File

@ -48,6 +48,7 @@
* [Replicate](advanced/model-configuration/replicate.md)
* [Xinference](advanced/model-configuration/xinference.md)
* [OpenLLM](advanced/model-configuration/openllm.md)
* [LocalAI](advanced/model-configuration/localai.md)
* [More Integration](advanced/more-integration.md)
## use cases

View File

@ -0,0 +1,89 @@
# Integrating with LocalAI for Local Model Deployment
[LocalAI](https://github.com/go-skynet/LocalAI) is a drop-in replacement REST API that's compatible with OpenAI API specifications for local inferencing. It allows you to run LLMs (and not only) locally or on-prem with consumer grade hardware, supporting multiple model families that are compatible with the ggml format. Does not require GPU.
Dify allows integration with LocalAI for local deployment of large language model inference and embedding capabilities.
## Deploying LocalAI
You can refer to the official [Getting Started](https://localai.io/basics/getting_started/) guide for deployment, or quickly integrate following the steps below:
(These steps are derived from [LocalAI Data query example](https://github.com/go-skynet/LocalAI/blob/master/examples/langchain-chroma/README.md))
1. First, clone the LocalAI code repository and navigate to the specified directory.
```bash
$ git clone https://github.com/go-skynet/LocalAI
$ cd LocalAI/examples/langchain-chroma
```
2. Download example LLM and Embedding models.
```bash
$ wget https://huggingface.co/skeskinen/ggml/resolve/main/all-MiniLM-L6-v2/ggml-model-q4_0.bin -O models/bert
$ wget https://gpt4all.io/models/ggml-gpt4all-j.bin -O models/ggml-gpt4all-j
```
Here, we choose two smaller models that are compatible across all platforms. `ggml-gpt4all-j` serves as the default LLM model, and `all-MiniLM-L6-v2` serves as the default Embedding model, for quick local deployment.
3. Configure the .env file.
```shell
$ mv .env.example .env
```
NOTE: Ensure that the THREADS variable value in `.env` doesn't exceed the number of CPU cores on your machine.
4. Start LocalAI.
```shell
# start with docker-compose
$ docker-compose up -d --build
# tail the logs & wait until the build completes
$ docker logs -f langchain-chroma-api-1
7:16AM INF Starting LocalAI using 4 threads, with models path: /models
7:16AM INF LocalAI version: v1.24.1 (9cc8d9086580bd2a96f5c96a6b873242879c70bc)
```
The LocalAI request API endpoint will be available at http://127.0.0.1:8080.
And it provides two models, namely:
- LLM Model: `ggml-gpt4all-j`
External access name: `gpt-3.5-turbo` (This name is customizable and can be configured in `models/gpt-3.5-turbo.yaml`).
- Embedding Model: `all-MiniLM-L6-v2`
External access name: `text-embedding-ada-002` (This name is customizable and can be configured in `models/embeddings.yaml`).
5. Integrate the models into Dify.
Go to `Settings > Model Providers > LocalAI` and fill in:
Model 1: `ggml-gpt4all-j`
- Model Type: Text Generation
- Model Name: `gpt-3.5-turbo`
- Server URL: http://127.0.0.1:8080
If Dify is deployed via docker, fill in the host domain: `http://<your-LocalAI-endpoint-domain>:8080`, which can be a LAN IP address, like: `http://192.168.1.100:8080`
Click "Save" to use the model in the application.
Model 2: `all-MiniLM-L6-v2`
- Model Type: Embeddings
- Model Name: `text-embedding-ada-002`
- Server URL: http://127.0.0.1:8080
If Dify is deployed via docker, fill in the host domain: `http://<your-LocalAI-endpoint-domain>:8080`, which can be a LAN IP address, like: `http://192.168.1.100:8080`
Click "Save" to use the model in the application.
For more information about LocalAI, please refer to: https://github.com/go-skynet/LocalAI

View File

@ -47,6 +47,7 @@
* [接入 Replicate 上的开源模型](advanced/model-configuration/replicate.md)
* [接入 Xinference 部署的本地模型](advanced/model-configuration/xinference.md)
* [接入 OpenLLM 部署的本地模型](advanced/model-configuration/openllm.md)
* [接入 LocalAI 部署的本地模型](advanced/model-configuration/localai.md)
* [更多集成](advanced/more-integration.md)
## 使用案例 <a href="#use-cases" id="use-cases"></a>

View File

@ -0,0 +1,97 @@
# 接入 LocalAI 部署的本地模型
[LocalAI](https://github.com/go-skynet/LocalAI) 是一个本地推理框架,提供了 RESTFul API与 OpenAI API 规范兼容。它允许你在消费级硬件上本地或者在自有服务器上运行 LLM和其他模型支持与 ggml 格式兼容的多种模型家族。不需要 GPU。
Dify 支持以本地部署的方式接入 LocalAI 部署的大型语言模型推理和 embedding 能力。
## 部署 LocalAI
可参考官方 [Getting Started](https://localai.io/basics/getting_started/) 进行部署,也可参考下方步骤进行快速接入:
(以下步骤来自 [LocalAI Data query example](https://github.com/go-skynet/LocalAI/blob/master/examples/langchain-chroma/README.md)
1. 首先拉取 LocalAI 代码仓库,并进入指定目录
```bash
$ git clone https://github.com/go-skynet/LocalAI
$ cd LocalAI/examples/langchain-chroma
```
2. 下载范例 LLM 和 Embedding 模型
```bash
$ wget https://huggingface.co/skeskinen/ggml/resolve/main/all-MiniLM-L6-v2/ggml-model-q4_0.bin -O models/bert
$ wget https://gpt4all.io/models/ggml-gpt4all-j.bin -O models/ggml-gpt4all-j
```
这里选用了较小且全平台兼容的两个模型,`ggml-gpt4all-j` 作为默认 LLM 模型,`all-MiniLM-L6-v2` 作为默认 Embedding 模型,方便在本地快速部署使用。
3. 配置 .env 文件
```shell
$ mv .env.example .env
```
NOTE请确保 `.env` 中的 THREADS 变量值不超过您本机的 CPU 核心数。
4. 启动 LocalAI
```shell
# start with docker-compose
$ docker-compose up -d --build
# tail the logs & wait until the build completes
$ docker logs -f langchain-chroma-api-1
7:16AM INF Starting LocalAI using 4 threads, with models path: /models
7:16AM INF LocalAI version: v1.24.1 (9cc8d9086580bd2a96f5c96a6b873242879c70bc)
┌───────────────────────────────────────────────────┐
│ Fiber v2.48.0 │
│ http://127.0.0.1:8080 │
│ (bound on host 0.0.0.0 and port 8080) │
│ │
│ Handlers ............ 55 Processes ........... 1 │
│ Prefork ....... Disabled PID ................ 14 │
└───────────────────────────────────────────────────┘
```
开放了本机 `http://127.0.0.1:8080` 作为 LocalAI 请求 API 的端点。
并提供了两个模型,分别为:
- LLM 模型:`ggml-gpt4all-j`
对外访问名称:`gpt-3.5-turbo`(该名称可自定义,在 `models/gpt-3.5-turbo.yaml` 中配置。
- Embedding 模型:`all-MiniLM-L6-v2`
对外访问名称:`text-embedding-ada-002`(该名称可自定义,在 `models/embeddings.yaml` 中配置。
5. LocalAI API 服务部署完毕,在 Dify 中使用接入模型
`设置 > 模型供应商 > LocalAI` 中填入:
模型 1`ggml-gpt4all-j`
- 模型类型:文本生成
- 模型名称:`gpt-3.5-turbo`
- 服务器 URLhttp://127.0.0.1:8080
若 Dify 为 docker 部署,请填入 host 域名:`http://<your-LocalAI-endpoint-domain>:8080`,可填写局域网 IP 地址,如:`http://192.168.1.100:8080`
"保存" 后即可在应用中使用该模型。
模型 2`all-MiniLM-L6-v2`
- 模型类型Embeddings
- 模型名称:`text-embedding-ada-002`
- 服务器 URLhttp://127.0.0.1:8080
若 Dify 为 docker 部署,请填入 host 域名:`http://<your-LocalAI-endpoint-domain>:8080`,可填写局域网 IP 地址,如:`http://192.168.1.100:8080`
"保存" 后即可在应用中使用该模型。
如需获取 LocalAI 更多信息请参考https://github.com/go-skynet/LocalAI