feat(llm): Support for Google Gemini LLMs and Embeddings (#1965)
* Support for Google Gemini LLMs and Embeddings

  Initial support for Gemini; enables usage of Google LLMs and embedding models (see settings-gemini.yaml).

  Install via `poetry install --extras "llms-gemini embeddings-gemini"`.

  Notes:
  * had to bump llama-index-core to a later version that supports Gemini
  * `poetry --no-update` did not work: Gemini/llama_index seem to require more (transitive) updates to make it work

* fix: crash when gemini is not selected

* docs: add gemini llm

---------

Co-authored-by: Javier Martinez <javiermartinezalvarez98@gmail.com>
This commit is contained in:
parent: 19a7c065ef
commit: fc13368bc7
9 changed files with 382 additions and 59 deletions
@@ -199,3 +199,36 @@ Navigate to http://localhost:8001 to use the Gradio UI or to http://localhost:80

For a fully private setup on Intel GPUs (such as a local PC with an iGPU, or discrete GPUs like Arc, Flex, and Max), you can use [IPEX-LLM](https://github.com/intel-analytics/ipex-llm).

To deploy Ollama and pull models using IPEX-LLM, please refer to [this guide](https://ipex-llm.readthedocs.io/en/latest/doc/LLM/Quickstart/ollama_quickstart.html). Then, follow the same steps outlined in the [Using Ollama](#using-ollama) section to create a `settings-ollama.yaml` profile and run the private-GPT server.
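Following the same pattern as the Gemini profile below, a minimal `settings-ollama.yaml` sketch might look like this (illustrative only; the [Using Ollama](#using-ollama) section is the authoritative reference, and the model names here are assumptions):

```yaml
# Sketch of a settings-ollama.yaml profile, mirroring the Gemini profile below.
# Keys under `ollama:` are assumptions; see the Using Ollama docs for the real schema.
llm:
  mode: ollama

embedding:
  mode: ollama

ollama:
  llm_model: llama3                  # assumed: whichever model you pulled with `ollama pull`
  embedding_model: nomic-embed-text  # assumed embedding model name
```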
### Using Gemini

If you cannot run a local model (because you don't have a GPU, for example) or for testing purposes, you may decide to run PrivateGPT using Gemini as the LLM and Embeddings model. In addition, you will benefit from multimodal inputs, such as text and images, in a very large contextual window.
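Gemini support is packaged as optional extras. As noted in the commit message above, install them before enabling the profile:

```bash
# Pull in the optional Gemini LLM and embeddings dependencies
poetry install --extras "llms-gemini embeddings-gemini"
```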
In order to do so, create a profile `settings-gemini.yaml` with the following contents:
```yaml
llm:
  mode: gemini

embedding:
  mode: gemini

gemini:
  api_key: <your_gemini_api_key> # You could skip this configuration and use the GEMINI_API_KEY env var instead
  model: <gemini_model_to_use> # Optional model to use. Default is "models/gemini-pro"
  embedding_model: <gemini_embeddings_to_use> # Optional model to use. Default is "models/embedding-001"
```
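If you would rather not write the key into the profile, a minimal sketch of the environment-variable route mentioned in the comment above (assuming a POSIX shell):

```bash
# Export the key instead of storing it in settings-gemini.yaml
export GEMINI_API_KEY="<your_gemini_api_key>"
```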
And run PrivateGPT loading that profile you just created:

`PGPT_PROFILES=gemini make run`

or

`PGPT_PROFILES=gemini poetry run python -m private_gpt`
When the server is started it will print a log *Application startup complete*.
Navigate to http://localhost:8001 to use the Gradio UI or to http://localhost:8001/docs (API section) to try the API.
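As a quick smoke test once startup completes, you could hit the server from the command line; a sketch assuming the default port, with the exact endpoints to be confirmed against http://localhost:8001/docs:

```bash
# Check that the server is up (health endpoint assumed; verify in the API docs)
curl http://localhost:8001/health

# Send a hypothetical completion request through the Gemini-backed LLM
curl -X POST http://localhost:8001/v1/completions \
  -H "Content-Type: application/json" \
  -d '{"prompt": "What is PrivateGPT?"}'
```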