feat(llm - embed): Add support for Azure OpenAI (#1698)

* Add support for Azure OpenAI

* fix: wrong default api_version

Should be dashes instead of underscores.
see: https://learn.microsoft.com/en-us/azure/ai-services/openai/reference

* fix: code styling

applied "make check" changes

* refactor: extend documentation

* mention azopenai as available option and extras
* add recommended section
* include settings-azopenai.yaml configuration file

* fix: documentation
Otto L 2024-03-15 16:49:50 +01:00 committed by GitHub
parent 258d02d87c
commit 1efac6a3fe
9 changed files with 415 additions and 6 deletions

@@ -98,6 +98,43 @@ to run an OpenAI compatible server. Then, you can run PrivateGPT using the `sett
`PGPT_PROFILES=vllm make run`
### Using Azure OpenAI
If you cannot run a local model (for example, because you don't have a GPU), or if you only want to test, you can
run PrivateGPT with Azure OpenAI serving as both the LLM and the embeddings model.
To do so, create a profile `settings-azopenai.yaml` with the following contents:
```yaml
llm:
  mode: azopenai
embedding:
  mode: azopenai
azopenai:
  api_key: <your_azopenai_api_key>  # You could skip this configuration and use the AZ_OPENAI_API_KEY env var instead
  azure_endpoint: <your_azopenai_endpoint>  # You could skip this configuration and use the AZ_OPENAI_ENDPOINT env var instead
  api_version: <api_version>  # The API version to use. Default is "2023-05-15"
  embedding_deployment_name: <your_embedding_deployment_name>  # You could skip this configuration and use the AZ_OPENAI_EMBEDDING_DEPLOYMENT_NAME env var instead
  embedding_model: <openai_embeddings_to_use>  # Optional model to use. Default is "text-embedding-ada-002"
  llm_deployment_name: <your_model_deployment_name>  # You could skip this configuration and use the AZ_OPENAI_LLM_DEPLOYMENT_NAME env var instead
  llm_model: <openai_model_to_use>  # Optional model to use. Default is "gpt-35-turbo"
```
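As a rough pre-flight check for the profile above, the following Python sketch reports which required values are still missing and whether `api_version` is written with dashes, as the commit message says it should be. The helper name `check_azopenai_config`, the precedence (profile value first, then environment variable), and the version pattern are assumptions made for illustration; they are not part of PrivateGPT.

```python
import os
import re

# Required keys and the env vars the profile comments name as alternatives.
ENV_FALLBACKS = {
    "api_key": "AZ_OPENAI_API_KEY",
    "azure_endpoint": "AZ_OPENAI_ENDPOINT",
    "embedding_deployment_name": "AZ_OPENAI_EMBEDDING_DEPLOYMENT_NAME",
    "llm_deployment_name": "AZ_OPENAI_LLM_DEPLOYMENT_NAME",
}

# Assumed format: YYYY-MM-DD, optionally followed by "-preview".
API_VERSION_RE = re.compile(r"\d{4}-\d{2}-\d{2}(-preview)?")


def check_azopenai_config(azopenai, env=None):
    """Return a list of human-readable problems (empty if none were found)."""
    env = os.environ if env is None else env
    problems = []
    for key, env_var in ENV_FALLBACKS.items():
        # A value is considered present if either source provides it.
        if not azopenai.get(key) and not env.get(env_var):
            problems.append(f"missing {key} (or env var {env_var})")
    api_version = azopenai.get("api_version", "2023-05-15")
    if not API_VERSION_RE.fullmatch(api_version):
        problems.append(f"api_version {api_version!r} should look like 2023-05-15")
    return problems


if __name__ == "__main__":
    # Hypothetical, incomplete profile section used only to show the output shape.
    example = {"api_key": "secret", "api_version": "2023_05_15"}
    for problem in check_azopenai_config(example, env={}):
        print(problem)
```

The function takes the `azopenai` section as a plain dictionary, so it works with whatever YAML loader you already use, and returns an empty list when nothing looks wrong.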
Then run PrivateGPT with the profile you just created:
`PGPT_PROFILES=azopenai make run`
or
`PGPT_PROFILES=azopenai poetry run python -m private_gpt`
Once the server has started, it prints the log line *Application startup complete*.
Navigate to http://localhost:8001 to use the Gradio UI or to http://localhost:8001/docs (API section) to try the API.
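If you want to script the wait for startup instead of watching the log, a minimal sketch along these lines polls the local URL until something answers. The function names `is_server_up` and `wait_for_server`, the retry counts, and the choice to bypass any configured HTTP proxy (the target is local) are assumptions for illustration, not PrivateGPT features.

```python
import time
import urllib.error
import urllib.request

# The server is local, so skip any proxy configured in the environment.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def is_server_up(url, timeout=2.0):
    """Return True if anything answers HTTP at `url`, False otherwise."""
    try:
        with _opener.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just not with a success status.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, and similar.
        return False


def wait_for_server(url, attempts=30, delay=1.0):
    """Poll `url` until it answers or the attempts run out."""
    for _ in range(attempts):
        if is_server_up(url):
            return True
        time.sleep(delay)
    return False
```

Calling `wait_for_server("http://localhost:8001/docs")` returns `True` as soon as the API docs page responds and `False` if it never does within the allotted attempts.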
Compared with a local model, you'll notice that responses are faster and of higher quality, because Azure OpenAI's
servers handle the heavy computation.
### Using AWS Sagemaker
For a fully private & performant setup, you can choose to have both your LLM and Embeddings model deployed using Sagemaker.