From f38144de4a51a7c9b4fcceae794623d2a0825ecb Mon Sep 17 00:00:00 2001
From: takatost
Date: Fri, 1 Sep 2023 17:55:44 +0800
Subject: [PATCH] feat: add localai

---
 en/SUMMARY.md                                 |  1 +
 en/advanced/model-configuration/localai.md    | 89 +++++++++++++++++
 zh_CN/SUMMARY.md                              |  1 +
 zh_CN/advanced/model-configuration/localai.md | 97 +++++++++++++++++++
 4 files changed, 188 insertions(+)
 create mode 100644 en/advanced/model-configuration/localai.md
 create mode 100644 zh_CN/advanced/model-configuration/localai.md

diff --git a/en/SUMMARY.md b/en/SUMMARY.md
index 8fb99e7..eef6e21 100644
--- a/en/SUMMARY.md
+++ b/en/SUMMARY.md
@@ -48,6 +48,7 @@
   * [Replicate](advanced/model-configuration/replicate.md)
   * [Xinference](advanced/model-configuration/xinference.md)
   * [OpenLLM](advanced/model-configuration/openllm.md)
+  * [LocalAI](advanced/model-configuration/localai.md)
   * [More Integration](advanced/more-integration.md)

## use cases

diff --git a/en/advanced/model-configuration/localai.md b/en/advanced/model-configuration/localai.md
new file mode 100644
index 0000000..115a8ee
--- /dev/null
+++ b/en/advanced/model-configuration/localai.md
@@ -0,0 +1,89 @@
# Integrating with LocalAI for Local Model Deployment

[LocalAI](https://github.com/go-skynet/LocalAI) is a drop-in replacement REST API compatible with the OpenAI API specification for local inferencing. It lets you run LLMs (and other models) locally or on-premises on consumer-grade hardware, supports multiple model families compatible with the ggml format, and does not require a GPU.

Dify supports integrating with LocalAI for locally deployed large language model inference and embedding capabilities.
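Because LocalAI is wire-compatible with the OpenAI API, any OpenAI-style HTTP client can talk to it once it is running. The sketch below builds such requests against LocalAI's standard `/v1/chat/completions` and `/v1/embeddings` endpoints; the base URL and the model names are assumptions that match the example deployment described below, not fixed values.

```python
import json
import urllib.request

# Assumption: LocalAI listens on its default address, as in the example
# deployment below, and exposes the models under their OpenAI-style names.
BASE_URL = "http://127.0.0.1:8080"


def chat_request(model, prompt):
    """Build an OpenAI-format chat completion request for LocalAI."""
    url = BASE_URL + "/v1/chat/completions"
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return url, body


def embeddings_request(model, text):
    """Build an OpenAI-format embeddings request for LocalAI."""
    return BASE_URL + "/v1/embeddings", {"model": model, "input": text}


def post(url, body):
    """Send a request; this only succeeds once LocalAI is actually running."""
    req = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())


if __name__ == "__main__":
    url, body = chat_request("gpt-3.5-turbo", "Hello!")
    print(url)
    print(json.dumps(body))
```

The same request shapes work for any OpenAI-compatible server, which is exactly why Dify can treat LocalAI as a model provider.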

## Deploying LocalAI

You can refer to the official [Getting Started](https://localai.io/basics/getting_started/) guide for deployment, or quickly integrate by following the steps below (derived from the [LocalAI data query example](https://github.com/go-skynet/LocalAI/blob/master/examples/langchain-chroma/README.md)).

1. First, clone the LocalAI code repository and navigate to the example directory.

    ```bash
    $ git clone https://github.com/go-skynet/LocalAI
    $ cd LocalAI/examples/langchain-chroma
    ```

2. Download the example LLM and embedding models.

    ```bash
    $ wget https://huggingface.co/skeskinen/ggml/resolve/main/all-MiniLM-L6-v2/ggml-model-q4_0.bin -O models/bert
    $ wget https://gpt4all.io/models/ggml-gpt4all-j.bin -O models/ggml-gpt4all-j
    ```

    Here we choose two smaller models that are compatible across all platforms: `ggml-gpt4all-j` serves as the default LLM and `all-MiniLM-L6-v2` as the default embedding model, for quick local deployment.

3. Configure the `.env` file.

    ```shell
    $ mv .env.example .env
    ```

    NOTE: Make sure the `THREADS` variable in `.env` does not exceed the number of CPU cores on your machine.

4. Start LocalAI.

    ```shell
    # start with docker-compose
    $ docker-compose up -d --build

    # tail the logs & wait until the build completes
    $ docker logs -f langchain-chroma-api-1
    7:16AM INF Starting LocalAI using 4 threads, with models path: /models
    7:16AM INF LocalAI version: v1.24.1 (9cc8d9086580bd2a96f5c96a6b873242879c70bc)
    ```

    The LocalAI request API endpoint is now available at `http://127.0.0.1:8080`, and it serves two models:

    - LLM model: `ggml-gpt4all-j`

      External access name: `gpt-3.5-turbo` (this name is customizable and can be configured in `models/gpt-3.5-turbo.yaml`).

    - Embedding model: `all-MiniLM-L6-v2`

      External access name: `text-embedding-ada-002` (this name is customizable and can be configured in `models/embeddings.yaml`).

5. Integrate the models into Dify.

    Go to `Settings > Model Providers > LocalAI` and fill in:

    Model 1: `ggml-gpt4all-j`

    - Model Type: Text Generation

    - Model Name: `gpt-3.5-turbo`

    - Server URL: `http://127.0.0.1:8080`

    If Dify is deployed via Docker, fill in the host's address instead: `http://<host-domain-or-LAN-IP>:8080`, for example `http://192.168.1.100:8080`.

    Click "Save" to use the model in the application.

    Model 2: `all-MiniLM-L6-v2`

    - Model Type: Embeddings

    - Model Name: `text-embedding-ada-002`

    - Server URL: `http://127.0.0.1:8080`

    If Dify is deployed via Docker, fill in the host's address instead: `http://<host-domain-or-LAN-IP>:8080`, for example `http://192.168.1.100:8080`.

    Click "Save" to use the model in the application.

For more information about LocalAI, please refer to: https://github.com/go-skynet/LocalAI
\ No newline at end of file

diff --git a/zh_CN/SUMMARY.md b/zh_CN/SUMMARY.md
index df33fd6..6334208 100644
--- a/zh_CN/SUMMARY.md
+++ b/zh_CN/SUMMARY.md
@@ -47,6 +47,7 @@
   * [接入 Replicate 上的开源模型](advanced/model-configuration/replicate.md)
   * [接入 Xinference 部署的本地模型](advanced/model-configuration/xinference.md)
   * [接入 OpenLLM 部署的本地模型](advanced/model-configuration/openllm.md)
+  * [接入 LocalAI 部署的本地模型](advanced/model-configuration/localai.md)
   * [更多集成](advanced/more-integration.md)

## 使用案例

diff --git a/zh_CN/advanced/model-configuration/localai.md b/zh_CN/advanced/model-configuration/localai.md
new file mode 100644
index 0000000..e1d7e99
--- /dev/null
+++ b/zh_CN/advanced/model-configuration/localai.md
@@ -0,0 +1,97 @@
# 接入 LocalAI 部署的本地模型

[LocalAI](https://github.com/go-skynet/LocalAI) 是一个本地推理框架，提供了与 OpenAI API 规范兼容的 RESTful API。它允许你在消费级硬件上本地或在自有服务器上运行 LLM（及其他模型），支持与 ggml 格式兼容的多种模型家族，且无需 GPU。

Dify 支持以本地部署的方式接入 LocalAI 部署的大型语言模型推理和 Embedding 能力。
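接入后，Dify 会分别调用 LocalAI 的推理接口与 Embedding 接口。Embedding 接口返回的向量通常用于语义检索，其核心计算是余弦相似度。下面是一个示意性的 Python 片段（向量为占位数据，并非来自真实接口调用）：

```python
import math


def cosine_similarity(a, b):
    """计算两个向量的余弦相似度，返回值范围为 [-1, 1]。"""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


# 示意：实际使用中向量由 Embedding 接口返回，这里用占位向量演示
v1 = [0.1, 0.2, 0.3]
v2 = [0.1, 0.2, 0.3]
print(cosine_similarity(v1, v2))  # 相同向量的相似度约为 1.0
```

相似度越接近 1，两段文本的语义越接近；Dify 的数据集检索正是基于这类向量比较实现的。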

## 部署 LocalAI

可参考官方 [Getting Started](https://localai.io/basics/getting_started/) 进行部署，也可参考下方步骤快速接入（以下步骤来自 [LocalAI Data query example](https://github.com/go-skynet/LocalAI/blob/master/examples/langchain-chroma/README.md)）。

1. 首先拉取 LocalAI 代码仓库，并进入示例目录

    ```bash
    $ git clone https://github.com/go-skynet/LocalAI
    $ cd LocalAI/examples/langchain-chroma
    ```

2. 下载示例 LLM 和 Embedding 模型

    ```bash
    $ wget https://huggingface.co/skeskinen/ggml/resolve/main/all-MiniLM-L6-v2/ggml-model-q4_0.bin -O models/bert
    $ wget https://gpt4all.io/models/ggml-gpt4all-j.bin -O models/ggml-gpt4all-j
    ```

    这里选用了较小且全平台兼容的两个模型：`ggml-gpt4all-j` 作为默认 LLM 模型，`all-MiniLM-L6-v2` 作为默认 Embedding 模型，方便在本地快速部署使用。

3. 配置 .env 文件

    ```shell
    $ mv .env.example .env
    ```

    NOTE：请确保 `.env` 中的 `THREADS` 变量值不超过您本机的 CPU 核心数。

4. 启动 LocalAI

    ```shell
    # start with docker-compose
    $ docker-compose up -d --build

    # tail the logs & wait until the build completes
    $ docker logs -f langchain-chroma-api-1
    7:16AM INF Starting LocalAI using 4 threads, with models path: /models
    7:16AM INF LocalAI version: v1.24.1 (9cc8d9086580bd2a96f5c96a6b873242879c70bc)

    ┌───────────────────────────────────────────────────┐
    │                   Fiber v2.48.0                   │
    │               http://127.0.0.1:8080               │
    │       (bound on host 0.0.0.0 and port 8080)       │
    │                                                   │
    │ Handlers ............ 55  Processes ........... 1 │
    │ Prefork ....... Disabled  PID ................ 14 │
    └───────────────────────────────────────────────────┘
    ```

    此时本机的 `http://127.0.0.1:8080` 即为 LocalAI 请求 API 的端点，并提供了两个模型，分别为：

    - LLM 模型：`ggml-gpt4all-j`

      对外访问名称：`gpt-3.5-turbo`（该名称可自定义，在 `models/gpt-3.5-turbo.yaml` 中配置）。

    - Embedding 模型：`all-MiniLM-L6-v2`

      对外访问名称：`text-embedding-ada-002`（该名称可自定义，在 `models/embeddings.yaml` 中配置）。

5. 
LocalAI API 服务部署完毕，在 Dify 中接入上述模型

    在 `设置 > 模型供应商 > LocalAI` 中填入：

    模型 1：`ggml-gpt4all-j`

    - 模型类型：文本生成

    - 模型名称：`gpt-3.5-turbo`

    - 服务器 URL：`http://127.0.0.1:8080`

    若 Dify 为 Docker 部署，请填入宿主机地址：`http://<宿主机域名或局域网 IP>:8080`，如：`http://192.168.1.100:8080`

    "保存" 后即可在应用中使用该模型。

    模型 2：`all-MiniLM-L6-v2`

    - 模型类型：Embeddings

    - 模型名称：`text-embedding-ada-002`

    - 服务器 URL：`http://127.0.0.1:8080`

    若 Dify 为 Docker 部署，请填入宿主机地址：`http://<宿主机域名或局域网 IP>:8080`，如：`http://192.168.1.100:8080`

    "保存" 后即可在应用中使用该模型。

如需获取 LocalAI 更多信息，请参考：https://github.com/go-skynet/LocalAI
\ No newline at end of file
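上文提到模型的对外访问名称可在 `models/*.yaml` 中自定义。下面是一个示意性的配置片段，字段写法基于 LocalAI 示例仓库中的模型配置，具体字段请以所用 LocalAI 版本的文档为准：

```yaml
# models/gpt-3.5-turbo.yaml（示意配置，字段以实际 LocalAI 版本为准）
name: gpt-3.5-turbo        # 对外访问名称，可自定义
backend: gpt4all-j         # 推理后端
parameters:
  model: ggml-gpt4all-j    # 对应 models/ 目录下的模型文件名
  temperature: 0.2
context_size: 1024
```

修改 `name` 字段后重启 LocalAI，即可用新名称访问该模型，并在 Dify 的模型名称中填入同名配置。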