mirror of
https://github.com/zylon-ai/private-gpt.git
synced 2025-12-22 10:45:42 +01:00
feat(llm - embed): Add support for Azure OpenAI (#1698)
* Add support for Azure OpenAI * fix: wrong default api_version Should be dashes instead of underscores. see: https://learn.microsoft.com/en-us/azure/ai-services/openai/reference * fix: code styling applied "make check" changes * refactor: extend documentation * mention azopenai as available option and extras * add recommended section * include settings-azopenai.yaml configuration file * fix: documentation
This commit is contained in:
parent
258d02d87c
commit
1efac6a3fe
9 changed files with 415 additions and 6 deletions
|
|
@ -98,6 +98,43 @@ to run an OpenAI compatible server. Then, you can run PrivateGPT using the `sett
|
|||
|
||||
`PGPT_PROFILES=vllm make run`
|
||||
|
||||
### Using Azure OpenAI
|
||||
|
||||
If you cannot run a local model (because you don't have a GPU, for example) or for testing purposes, you may
|
||||
decide to run PrivateGPT using Azure OpenAI as the LLM and Embeddings model.
|
||||
|
||||
In order to do so, create a profile `settings-azopenai.yaml` with the following contents:
|
||||
|
||||
```yaml
|
||||
llm:
|
||||
mode: azopenai
|
||||
|
||||
embedding:
|
||||
mode: azopenai
|
||||
|
||||
azopenai:
|
||||
api_key: <your_azopenai_api_key> # You could skip this configuration and use the AZ_OPENAI_API_KEY env var instead
|
||||
azure_endpoint: <your_azopenai_endpoint> # You could skip this configuration and use the AZ_OPENAI_ENDPOINT env var instead
|
||||
api_version: <api_version> # The API version to use. Default is "2023_05_15"
|
||||
embedding_deployment_name: <your_embedding_deployment_name> # You could skip this configuration and use the AZ_OPENAI_EMBEDDING_DEPLOYMENT_NAME env var instead
|
||||
embedding_model: <openai_embeddings_to_use> # Optional model to use. Default is "text-embedding-ada-002"
|
||||
llm_deployment_name: <your_model_deployment_name> # You could skip this configuration and use the AZ_OPENAI_LLM_DEPLOYMENT_NAME env var instead
|
||||
llm_model: <openai_model_to_use> # Optional model to use. Default is "gpt-35-turbo"
|
||||
```
|
||||
|
||||
And run PrivateGPT loading that profile you just created:
|
||||
|
||||
`PGPT_PROFILES=azopenai make run`
|
||||
|
||||
or
|
||||
|
||||
`PGPT_PROFILES=azopenai poetry run python -m private_gpt`
|
||||
|
||||
When the server is started it will print a log *Application startup complete*.
|
||||
Navigate to http://localhost:8001 to use the Gradio UI or to http://localhost:8001/docs (API section) to try the API.
|
||||
You'll notice the speed and quality of response is higher, given you are using Azure OpenAI's servers for the heavy
|
||||
computations.
|
||||
|
||||
### Using AWS Sagemaker
|
||||
|
||||
For a fully private & performant setup, you can choose to have both your LLM and Embeddings model deployed using Sagemaker.
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue