feat: add localai

2023-09-01 17:55:44 +08:00 · 2023-09-01 17:55:44 +08:00 · f38144de4a
parent 555382ce4b
commit f38144de4a
4 changed files with 188 additions and 0 deletions
--- a/en/SUMMARY.md
+++ b/en/SUMMARY.md
@ -48,6 +48,7 @@
  * [Replicate](advanced/model-configuration/replicate.md)
  * [Xinference](advanced/model-configuration/xinference.md)
  * [OpenLLM](advanced/model-configuration/openllm.md)
+  * [LocalAI](advanced/model-configuration/localai.md)
 * [More Integration](advanced/more-integration.md)

 ## use cases
--- a/en/advanced/model-configuration/localai.md
+++ b/en/advanced/model-configuration/localai.md
@ -0,0 +1,89 @@
+# Integrating with LocalAI for Local Model Deployment
+
+[LocalAI](https://github.com/go-skynet/LocalAI) is a drop-in replacement REST API that's compatible with OpenAI API specifications for local inferencing. It allows you to run LLMs (and not only) locally or on-prem with consumer grade hardware, supporting multiple model families that are compatible with the ggml format. Does not require GPU.
+
+Dify allows integration with LocalAI for local deployment of large language model inference and embedding capabilities.
+
+## Deploying LocalAI
+
+You can refer to the official [Getting Started](https://localai.io/basics/getting_started/) guide for deployment, or quickly integrate following the steps below:
+
+(These steps are derived from [LocalAI Data query example](https://github.com/go-skynet/LocalAI/blob/master/examples/langchain-chroma/README.md))
+
+1. First, clone the LocalAI code repository and navigate to the specified directory.
+
+    ```bash
+    $ git clone https://github.com/go-skynet/LocalAI
+    $ cd LocalAI/examples/langchain-chroma
+    ```
+
+2. Download example LLM and Embedding models.
+
+    ```bash
+    $ wget https://huggingface.co/skeskinen/ggml/resolve/main/all-MiniLM-L6-v2/ggml-model-q4_0.bin -O models/bert
+    $ wget https://gpt4all.io/models/ggml-gpt4all-j.bin -O models/ggml-gpt4all-j
+    ```
+
+    Here, we choose two smaller models that are compatible across all platforms. `ggml-gpt4all-j` serves as the default LLM model, and `all-MiniLM-L6-v2` serves as the default Embedding model, for quick local deployment.
+
+3. Configure the .env file.
+
+   ```shell
+   $ mv .env.example .env
+   ```
+   
+   NOTE: Ensure that the THREADS variable value in `.env` doesn't exceed the number of CPU cores on your machine.
+
+4. Start LocalAI.
+
+    ```shell
+    # start with docker-compose
+    $ docker-compose up -d --build
+
+    # tail the logs & wait until the build completes
+    $ docker logs -f langchain-chroma-api-1
+    7:16AM INF Starting LocalAI using 4 threads, with models path: /models
+    7:16AM INF LocalAI version: v1.24.1 (9cc8d9086580bd2a96f5c96a6b873242879c70bc)
+    ```
+
+	The LocalAI request API endpoint will be available at http://127.0.0.1:8080.
+
+    And it provides two models, namely:
+
+    - LLM Model: `ggml-gpt4all-j`
+
+      External access name: `gpt-3.5-turbo` (This name is customizable and can be configured in `models/gpt-3.5-turbo.yaml`).
+
+    - Embedding Model: `all-MiniLM-L6-v2`
+
+      External access name: `text-embedding-ada-002` (This name is customizable and can be configured in `models/embeddings.yaml`).
+
+5. Integrate the models into Dify.
+
+   Go to `Settings > Model Providers > LocalAI` and fill in:
+
+   Model 1: `ggml-gpt4all-j`
+
+   - Model Type: Text Generation
+
+   - Model Name: `gpt-3.5-turbo`
+
+   - Server URL: http://127.0.0.1:8080
+
+     If Dify is deployed via docker, fill in the host domain: `http://<your-LocalAI-endpoint-domain>:8080`, which can be a LAN IP address, like: `http://192.168.1.100:8080`
+
+   Click "Save" to use the model in the application.
+
+   Model 2: `all-MiniLM-L6-v2`
+
+   - Model Type: Embeddings
+
+   - Model Name: `text-embedding-ada-002`
+
+   - Server URL: http://127.0.0.1:8080
+
+     If Dify is deployed via docker, fill in the host domain: `http://<your-LocalAI-endpoint-domain>:8080`, which can be a LAN IP address, like: `http://192.168.1.100:8080`
+
+   Click "Save" to use the model in the application.
+
+For more information about LocalAI, please refer to: https://github.com/go-skynet/LocalAI
--- a/zh_CN/SUMMARY.md
+++ b/zh_CN/SUMMARY.md
@ -47,6 +47,7 @@
  * [接入 Replicate 上的开源模型](advanced/model-configuration/replicate.md)
  * [接入 Xinference 部署的本地模型](advanced/model-configuration/xinference.md)
  * [接入 OpenLLM 部署的本地模型](advanced/model-configuration/openllm.md)
+  * [接入 LocalAI 部署的本地模型](advanced/model-configuration/localai.md)
 * [更多集成](advanced/more-integration.md)

 ## 使用案例 <a href="#use-cases" id="use-cases"></a>
--- a/zh_CN/advanced/model-configuration/localai.md
+++ b/zh_CN/advanced/model-configuration/localai.md
@ -0,0 +1,97 @@
+# 接入 LocalAI 部署的本地模型
+
+[LocalAI](https://github.com/go-skynet/LocalAI) 是一个本地推理框架，提供了 RESTFul API，与 OpenAI API 规范兼容。它允许你在消费级硬件上本地或者在自有服务器上运行 LLM（和其他模型），支持与 ggml 格式兼容的多种模型家族。不需要 GPU。
+Dify 支持以本地部署的方式接入 LocalAI 部署的大型语言模型推理和 embedding 能力。
+
+## 部署 LocalAI
+
+可参考官方 [Getting Started](https://localai.io/basics/getting_started/) 进行部署，也可参考下方步骤进行快速接入：
+
+（以下步骤来自 [LocalAI Data query example](https://github.com/go-skynet/LocalAI/blob/master/examples/langchain-chroma/README.md)）
+
+1. 首先拉取 LocalAI 代码仓库，并进入指定目录
+
+    ```bash
+    $ git clone https://github.com/go-skynet/LocalAI
+    $ cd LocalAI/examples/langchain-chroma
+    ```
+
+2. 下载范例 LLM 和 Embedding 模型
+
+    ```bash
+    $ wget https://huggingface.co/skeskinen/ggml/resolve/main/all-MiniLM-L6-v2/ggml-model-q4_0.bin -O models/bert
+    $ wget https://gpt4all.io/models/ggml-gpt4all-j.bin -O models/ggml-gpt4all-j
+    ```
+
+    这里选用了较小且全平台兼容的两个模型，`ggml-gpt4all-j` 作为默认 LLM 模型，`all-MiniLM-L6-v2` 作为默认 Embedding 模型，方便在本地快速部署使用。
+
+3. 配置 .env 文件
+
+   ```shell
+   $ mv .env.example .env
+   ```
+
+   NOTE：请确保 `.env` 中的 THREADS 变量值不超过您本机的 CPU 核心数。
+
+4. 启动 LocalAI
+
+   ```shell
+   # start with docker-compose
+   $ docker-compose up -d --build
+   
+   # tail the logs & wait until the build completes
+   $ docker logs -f langchain-chroma-api-1
+   7:16AM INF Starting LocalAI using 4 threads, with models path: /models
+   7:16AM INF LocalAI version: v1.24.1 (9cc8d9086580bd2a96f5c96a6b873242879c70bc)
+   
+    ┌───────────────────────────────────────────────────┐ 
+    │                   Fiber v2.48.0                   │ 
+    │               http://127.0.0.1:8080               │ 
+    │       (bound on host 0.0.0.0 and port 8080)       │ 
+    │                                                   │ 
+    │ Handlers ............ 55  Processes ........... 1 │ 
+    │ Prefork ....... Disabled  PID ................ 14 │ 
+    └───────────────────────────────────────────────────┘ 
+   ```
+
+   开放了本机 `http://127.0.0.1:8080` 作为 LocalAI 请求 API 的端点。
+
+   并提供了两个模型，分别为：
+
+   - LLM 模型：`ggml-gpt4all-j`
+
+     对外访问名称：`gpt-3.5-turbo`（该名称可自定义，在 `models/gpt-3.5-turbo.yaml` 中配置。
+
+   - Embedding 模型：`all-MiniLM-L6-v2`
+
+     对外访问名称：`text-embedding-ada-002`（该名称可自定义，在 `models/embeddings.yaml` 中配置。
+
+5. LocalAI API 服务部署完毕，在 Dify 中使用接入模型
+
+   在 `设置 > 模型供应商 > LocalAI` 中填入：
+
+   模型 1：`ggml-gpt4all-j`
+
+   - 模型类型：文本生成
+
+   - 模型名称：`gpt-3.5-turbo`
+
+   - 服务器 URL：http://127.0.0.1:8080
+
+     若 Dify 为 docker 部署，请填入 host 域名：`http://<your-LocalAI-endpoint-domain>:8080`，可填写局域网 IP 地址，如：`http://192.168.1.100:8080`
+
+   "保存" 后即可在应用中使用该模型。
+
+   模型 2：`all-MiniLM-L6-v2`
+
+   - 模型类型：Embeddings
+
+   - 模型名称：`text-embedding-ada-002`
+
+   - 服务器 URL：http://127.0.0.1:8080
+
+     若 Dify 为 docker 部署，请填入 host 域名：`http://<your-LocalAI-endpoint-domain>:8080`，可填写局域网 IP 地址，如：`http://192.168.1.100:8080`
+
+   "保存" 后即可在应用中使用该模型。
+
+如需获取 LocalAI 更多信息，请参考：https://github.com/go-skynet/LocalAI