feat(llm - embed): Add support for Azure OpenAI (#1698)

* Add support for Azure OpenAI
* fix: wrong default api_version
  Should be dashes instead of underscores.
  see: https://learn.microsoft.com/en-us/azure/ai-services/openai/reference
* fix: code styling
  applied "make check" changes
* refactor: extend documentation
  * mention azopenai as available option and extras
  * add recommended section
  * include settings-azopenai.yaml configuration file
* fix: documentation
2025-12-22 10:45:42 +01:00 · 2024-03-15 16:49:50 +01:00
commit 1efac6a3fe
parent 258d02d87c
9 changed files with 415 additions and 6 deletions
--- a/fern/docs/pages/manual/llms.mdx
+++ b/fern/docs/pages/manual/llms.mdx
@@ -98,6 +98,43 @@ to run an OpenAI compatible server. Then, you can run PrivateGPT using the `sett

 `PGPT_PROFILES=vllm make run`

+### Using Azure OpenAI
+
+If you cannot run a local model (for example, because you don't have a GPU), or you just want to test, you can
+run PrivateGPT with Azure OpenAI providing both the LLM and the embeddings model.
+
+To do so, create a profile `settings-azopenai.yaml` with the following contents:
+
+```yaml
+llm:
+  mode: azopenai
+
+embedding:
+  mode: azopenai
+
+azopenai:
+  api_key: <your_azopenai_api_key>  # You could skip this configuration and use the AZ_OPENAI_API_KEY env var instead
+  azure_endpoint: <your_azopenai_endpoint> # You could skip this configuration and use the AZ_OPENAI_ENDPOINT env var instead
+  api_version: <api_version> # The API version to use. Default is "2023-05-15"
+  embedding_deployment_name: <your_embedding_deployment_name> # You could skip this configuration and use the AZ_OPENAI_EMBEDDING_DEPLOYMENT_NAME env var instead
+  embedding_model: <openai_embeddings_to_use> # Optional model to use. Default is "text-embedding-ada-002"
+  llm_deployment_name: <your_model_deployment_name> # You could skip this configuration and use the AZ_OPENAI_LLM_DEPLOYMENT_NAME env var instead
+  llm_model: <openai_model_to_use> # Optional model to use. Default is "gpt-35-turbo"
+```
+
+Then run PrivateGPT with the profile you just created:
+
+`PGPT_PROFILES=azopenai make run`
+
+or
+
+`PGPT_PROFILES=azopenai poetry run python -m private_gpt`
+
+Once the server has started, it logs *Application startup complete*.
+Navigate to http://localhost:8001 to use the Gradio UI or to http://localhost:8001/docs (API section) to try the API.
+Responses are typically faster and of higher quality than with a local model, because the heavy
+computation runs on Azure OpenAI's servers.
+
 ### Using AWS Sagemaker

 For a fully private & performant setup, you can choose to have both your LLM and Embeddings model deployed using Sagemaker.
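
Note on the Azure OpenAI hunk above: the documented profile lets four values come either from `settings-azopenai.yaml` or from an `AZ_OPENAI_*` environment variable, and gives defaults for three others. The following is a minimal Python sketch of that precedence, useful as a pre-flight check before `make run`. It is not PrivateGPT's actual settings loader: the function name `resolve_azopenai_settings` and the "profile value first, then env var" order are assumptions made for illustration, and the `api_version` default uses dashes as the commit message requires.

```python
import os
from typing import Mapping, Optional

# Settings the documentation says may be supplied by an env var instead.
ENV_FALLBACKS = {
    "api_key": "AZ_OPENAI_API_KEY",
    "azure_endpoint": "AZ_OPENAI_ENDPOINT",
    "embedding_deployment_name": "AZ_OPENAI_EMBEDDING_DEPLOYMENT_NAME",
    "llm_deployment_name": "AZ_OPENAI_LLM_DEPLOYMENT_NAME",
}

# Defaults named in the documented profile's comments.
DEFAULTS = {
    "api_version": "2023-05-15",
    "embedding_model": "text-embedding-ada-002",
    "llm_model": "gpt-35-turbo",
}


def resolve_azopenai_settings(
    profile: Mapping[str, Optional[str]],
    env: Optional[Mapping[str, str]] = None,
) -> dict:
    """Hypothetical helper: merge the `azopenai:` section with env vars and defaults."""
    env = os.environ if env is None else env
    resolved = dict(DEFAULTS)
    # Non-empty values from the profile override the defaults.
    resolved.update({key: value for key, value in profile.items() if value})

    missing = []
    for key, var in ENV_FALLBACKS.items():
        if resolved.get(key):
            continue  # already set in the profile
        value = env.get(var)
        if value:
            resolved[key] = value
        else:
            missing.append(f"{key} (or ${var})")

    if missing:
        raise ValueError("missing Azure OpenAI settings: " + ", ".join(missing))
    return resolved
```

Calling `resolve_azopenai_settings({})` with the four environment variables exported returns a complete settings dictionary, while leaving any of them unset raises a `ValueError` that names the missing key and its variable.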