mirror of
https://github.com/zylon-ai/private-gpt.git
synced 2025-12-22 04:30:11 +01:00
Support n_batch to improve inference performance
This commit is contained in:
parent
52eb020256
commit
ad661933cb
3 changed files with 5 additions and 2 deletions
@@ -26,6 +26,7 @@ MODEL_TYPE: supports LlamaCpp or GPT4All
PERSIST_DIRECTORY: is the folder you want your vectorstore in
MODEL_PATH: Path to your GPT4All or LlamaCpp supported LLM
MODEL_N_CTX: Maximum token limit for the LLM model
MODEL_N_BATCH: Number of tokens in the prompt that are fed into the model at a time. Optimal value differs a lot depending on the model (8 works well for GPT4All, and 1024 is better for LlamaCpp)
EMBEDDINGS_MODEL_NAME: SentenceTransformers embeddings model name (see https://www.sbert.net/docs/pretrained_models.html)
TARGET_SOURCE_CHUNKS: The amount of chunks (sources) that will be used to answer a question
```
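As a rough illustration of how settings like these are typically consumed, the sketch below reads the documented environment variables and picks a per-backend default for `MODEL_N_BATCH` (8 for GPT4All, 1024 for LlamaCpp, as the notes above suggest). The function name, defaults, and returned dict shape are illustrative assumptions, not the project's actual loading code.

```python
import os

def load_llm_settings(env=None):
    """Collect the LLM settings documented above from the environment.

    Hypothetical helper: the key names match the documented .env
    variables, but the defaults and structure are only a sketch.
    """
    if env is None:
        env = os.environ
    model_type = env.get("MODEL_TYPE", "GPT4All")
    # Per-backend default batch size, following the guidance above.
    default_batch = 1024 if model_type == "LlamaCpp" else 8
    return {
        "model_type": model_type,
        "model_path": env.get("MODEL_PATH"),
        "model_n_ctx": int(env.get("MODEL_N_CTX", "1000")),
        "model_n_batch": int(env.get("MODEL_N_BATCH", str(default_batch))),
        "target_source_chunks": int(env.get("TARGET_SOURCE_CHUNKS", "4")),
    }
```

Passing a larger `MODEL_N_BATCH` lets the backend feed more prompt tokens per evaluation step, which is where the inference speedup in this commit comes from.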