Integrating Local Models Deployed with Ollama
Ollama is a local inference framework that enables one-click deployment of LLMs such as Llama 2, Mistral, and Llava. Dify supports integrating the LLM and Text Embedding capabilities of models deployed with Ollama.
Quick Integration
Download and Launch Ollama
- Download Ollama

  Visit https://ollama.ai/download to download the Ollama client for your system.
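  If you are on Linux, the download page also points to an install script; the commands below are a rough sketch for installing and verifying the CLI (the exact script URL may differ from what the download page currently shows):

  ```bash
  # Linux install via the script linked from the download page
  # (macOS/Windows users can simply run the downloaded installer instead).
  curl -fsSL https://ollama.ai/install.sh | sh

  # Confirm the CLI is available
  ollama --version
  ```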
- Run Ollama and Chat with Llava

  ollama run llava

  After a successful launch, Ollama starts an API service on local port 11434, which can be accessed at http://localhost:11434. For other models, visit Ollama Models for more details.
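  To confirm the API service is actually listening, you can query it directly with curl. The snippet below is a minimal sketch against Ollama's /api/generate endpoint, assuming the llava model pulled above:

  ```bash
  # A plain GET should answer with "Ollama is running"
  curl http://localhost:11434

  # Minimal non-streaming generation request using the llava model
  curl http://localhost:11434/api/generate -d '{
    "model": "llava",
    "prompt": "Describe this setup in one sentence.",
    "stream": false
  }'
  ```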
- Integrate Ollama in Dify

  In Settings > Model Providers > Ollama, fill in:

  - Model Name: llava
  - Base URL: http://<your-ollama-endpoint-domain>:11434
    Enter the base URL where the Ollama service is accessible. If Dify is deployed using Docker, consider using the local network IP address, e.g., http://192.168.1.100:11434, or the Docker host machine IP address, e.g., http://172.17.0.1:11434. For local source code deployment, use http://localhost:11434. A quick way to verify that the address is reachable is sketched after this list.
  - Model Type: Chat
  - Model Context Length: 4096
    The maximum context length of the model. If unsure, use the default value of 4096.
  - Maximum Token Limit: 4096
    The maximum number of tokens returned by the model. If there are no specific requirements, this can be set to the same value as the model context length.
  - Support for Vision: Yes
    Check this option if the model supports image understanding (multimodal), like llava.
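  Before saving, it is worth checking that the Base URL you entered is reachable from where Dify actually runs. The sketch below assumes a Docker-based Dify deployment using the default bridge gateway 172.17.0.1 and a hypothetical API container named docker-api-1; substitute your own address and container name (found via docker ps):

  ```bash
  # From the host: Ollama should answer "Ollama is running" on the chosen address
  curl http://172.17.0.1:11434

  # From inside the Dify API container (container name is hypothetical; check `docker ps`).
  # This also assumes curl is available in the container image.
  docker exec -it docker-api-1 curl http://172.17.0.1:11434
  ```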
Click "Save" to use the model in the application after verifying that there are no errors.
The integration method for Embedding models is similar to that for LLMs; just change the model type to Text Embedding.
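Before registering an Embedding model, you can check that it returns vectors through Ollama's embeddings endpoint. A minimal sketch, assuming an embedding-capable model such as nomic-embed-text has been pulled (the model name is only an example):

```bash
# Pull an embedding-capable model (nomic-embed-text is just an example)
ollama pull nomic-embed-text

# Request an embedding vector for a short piece of text
curl http://localhost:11434/api/embeddings -d '{
  "model": "nomic-embed-text",
  "prompt": "Dify integrates local models served by Ollama."
}'
```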
- Use Ollama Models

  Enter the Prompt Eng. page of the App that needs to be configured, select the llava model under the Ollama provider, and use it after configuring the model parameters.
For more information on Ollama, please refer to: https://github.com/jmorganca/ollama


