feat(llm): Support for Google Gemini LLMs and Embeddings (#1965)

* Support for Google Gemini LLMs and Embeddings

  Initial support for Gemini, enables usage of Google LLMs and embedding models (see settings-gemini.yaml)

  Install via poetry install --extras "llms-gemini embeddings-gemini"

  Notes:
  * had to bump llama-index-core to later version that supports Gemini
  * poetry --no-update did not work: Gemini/llama_index seem to require more (transient) updates to make it work...

* fix: crash when gemini is not selected

* docs: add gemini llm

---------

Co-authored-by: Javier Martinez <javiermartinezalvarez98@gmail.com>
2025-12-22 04:30:11 +01:00 · 2024-07-08 11:47:36 +02:00
commit fc13368bc7
parent 19a7c065ef
9 changed files with 382 additions and 59 deletions
--- a/fern/docs/pages/manual/llms.mdx
+++ b/fern/docs/pages/manual/llms.mdx
@@ -199,3 +199,36 @@ Navigate to http://localhost:8001 to use the Gradio UI or to http://localhost:80
 For a fully private setup on Intel GPUs (such as a local PC with an iGPU, or discrete GPUs like Arc, Flex, and Max), you can use [IPEX-LLM](https://github.com/intel-analytics/ipex-llm).

 To deploy Ollama and pull models using IPEX-LLM, please refer to [this guide](https://ipex-llm.readthedocs.io/en/latest/doc/LLM/Quickstart/ollama_quickstart.html). Then, follow the same steps outlined in the [Using Ollama](#using-ollama) section to create a `settings-ollama.yaml` profile and run the private-GPT server.
+
+### Using Gemini
+
+If you cannot run a local model (for example, because you don't have a GPU), or you only want to test the setup,
+you can run PrivateGPT with Gemini as both the LLM and the embeddings model. Gemini models also accept
+multimodal input, such as text and images, within a very large context window.
+
+To do so, create a `settings-gemini.yaml` profile with the following contents:
+
+```yaml
+llm:
+  mode: gemini
+
+embedding:
+  mode: gemini
+
+gemini:
+  api_key: <your_gemini_api_key>                # Can be omitted if the GEMINI_API_KEY env var is set
+  model: <gemini_model_to_use>                  # Optional model to use. Default is "models/gemini-pro"
+  embedding_model: <gemini_embeddings_to_use>   # Optional model to use. Default is "models/embedding-001"
+```
+
+Then run PrivateGPT with the profile you just created:
+
+`PGPT_PROFILES=gemini make run`
+
+or
+
+`PGPT_PROFILES=gemini poetry run python -m private_gpt`
+
+Once the server has started, it logs *Application startup complete*.
+Navigate to http://localhost:8001 to use the Gradio UI or to http://localhost:8001/docs (API section) to try the API.
+
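Reviewer note: the comments in the `gemini:` block describe a fallback order (explicit `api_key`, else the `GEMINI_API_KEY` env var) and two default model names. The sketch below only illustrates that resolution logic as the docs describe it; it is not the code in this commit. `resolve_gemini_settings` and its `env` parameter are hypothetical names, and the assumption that a missing key should raise an error is mine, not something the diff states.

```python
import os
from typing import Mapping, Optional

# Defaults as stated in the comments of the documented settings-gemini.yaml.
DEFAULT_MODEL = "models/gemini-pro"
DEFAULT_EMBEDDING_MODEL = "models/embedding-001"


def resolve_gemini_settings(
    gemini: Mapping[str, Optional[str]],
    env: Mapping[str, str] = os.environ,
) -> dict:
    """Hypothetical helper: fill in the documented fallbacks for the `gemini:` block."""
    # An explicit api_key wins; otherwise fall back to the env var named in the docs.
    api_key = gemini.get("api_key") or env.get("GEMINI_API_KEY")
    if not api_key:
        # Assumption: failing fast is preferable to starting without credentials.
        raise ValueError("Set gemini.api_key or the GEMINI_API_KEY env var")
    return {
        "api_key": api_key,
        "model": gemini.get("model") or DEFAULT_MODEL,
        "embedding_model": gemini.get("embedding_model") or DEFAULT_EMBEDDING_MODEL,
    }
```

For example, calling `resolve_gemini_settings({}, env={"GEMINI_API_KEY": "k"})` would yield the key from the environment together with both default model names, which matches a profile that sets only `llm.mode` and `embedding.mode`.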