This commit introduces several improvements to the prompt formatting logic in `private_gpt/components/llm/prompt_helper.py`:
1. **Llama3PromptStyle**:
* Implemented tool handling capabilities, allowing for the formatting of tool call and tool result messages within the Llama 3 prompt structure.
* Ensured correct usage of BOS, EOT, and other Llama 3-specific tokens.
2. **MistralPromptStyle**:
* Refactored the `_messages_to_prompt` method for more robust handling of various conversational scenarios, including consecutive user messages and initial assistant messages.
* Ensured correct application of `<s>`, `</s>`, and `[INST]` tags.
3. **ChatMLPromptStyle**:
* Corrected the logic for handling system messages to prevent duplication and ensure accurate ChatML formatting (`<|im_start|>role\ncontent<|im_end|>`).
4. **TagPromptStyle**:
* Addressed a FIXME comment by incorporating `<s>` (BOS) and `</s>` (EOS) tokens, making it more suitable for Llama-based models like Vigogne.
* Fixed a minor bug related to enum string conversion.
5. **Unit Tests**:
* Added a new test suite in `tests/components/llm/test_prompt_helper.py`.
* These tests provide comprehensive coverage for all modified prompt styles, verifying correct prompt generation for various inputs, edge cases, and special token placements.
These changes improve the correctness, robustness, and feature set of the supported prompt styles, leading to better compatibility and interaction with the respective language models. The sketches below illustrate the formats each style targets and the kind of test added.
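For item 1, a minimal sketch of the Llama 3 chat layout. The token names follow the public Llama 3 chat template; the function name and message shape are illustrative, not the project's actual `Llama3PromptStyle` implementation.

```python
# Minimal sketch of the Llama 3 chat layout (illustrative, not the real
# Llama3PromptStyle implementation in prompt_helper.py).
BOS = "<|begin_of_text|>"  # beginning-of-sequence token
EOT = "<|eot_id|>"         # end-of-turn token

def llama3_prompt(messages: list[dict[str, str]]) -> str:
    """Render role/content messages into the Llama 3 chat format."""
    prompt = BOS
    for m in messages:
        prompt += f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n{m['content']}{EOT}"
    # Leave an open assistant header for the model to complete.
    return prompt + "<|start_header_id|>assistant<|end_header_id|>\n\n"

print(llama3_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]))
```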
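For item 2, a sketch of the `<s>`/`[INST]` layout. Merging consecutive user (and system) turns into a single `[INST]` block is an assumption based on the behavior described above, not a copy of the refactored `_messages_to_prompt`.

```python
# Illustrative Mistral instruct layout: <s>[INST] user [/INST] answer</s>...
def mistral_prompt(messages: list[dict[str, str]]) -> str:
    prompt = "<s>"
    pending: list[str] = []  # buffers consecutive user/system turns
    for m in messages:
        if m["role"] == "assistant":
            user_text = "\n".join(pending)
            prompt += f"[INST] {user_text} [/INST] {m['content']}</s>"
            pending = []
        else:
            pending.append(m["content"])
    if pending:  # conversation ends on a user turn awaiting an answer
        user_text = "\n".join(pending)
        prompt += f"[INST] {user_text} [/INST]"
    return prompt
```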
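For item 3, the ChatML shape the corrected logic is expected to produce, with the system message emitted exactly once; a sketch only.

```python
# Illustrative ChatML rendering: <|im_start|>role\ncontent<|im_end|> per turn.
def chatml_prompt(messages: list[dict[str, str]]) -> str:
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>" for m in messages]
    parts.append("<|im_start|>assistant\n")  # open turn for the model's reply
    return "\n".join(parts)
```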
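For item 4, a sketch of the tag layout with the newly added BOS token; the `<|role|>: ` tag shape and the EOS placement after completed assistant turns are assumptions, not the actual `TagPromptStyle` code.

```python
# Illustrative tag-style rendering with BOS/EOS (token placement assumed).
def tag_prompt(messages: list[dict[str, str]]) -> str:
    prompt = "<s>"  # BOS
    for m in messages:
        prompt += f"<|{m['role']}|>: {m['content']}\n"
        if m["role"] == "assistant":
            prompt += "</s>"  # EOS closes a completed assistant turn (assumed)
    return prompt + "<|assistant|>: "
```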
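And for item 5, a hypothetical pytest-style test (reusing the `chatml_prompt` sketch above) showing the kind of assertion involved; it is not an actual test from the suite.

```python
# Hypothetical test in the spirit of tests/components/llm/test_prompt_helper.py
# (names and expected strings are illustrative, not copied from the file).
def test_chatml_prompt_emits_system_block_once():
    prompt = chatml_prompt([  # chatml_prompt: sketch defined above
        {"role": "system", "content": "Be brief."},
        {"role": "user", "content": "Hi"},
    ])
    assert prompt.count("<|im_start|>system") == 1  # no duplicated system block
    assert prompt.endswith("<|im_start|>assistant\n")  # ready for generation
```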
* feat: change ollama default model to llama3.1
* chore: bump versions
* feat: Change default model in local mode to llama3.1
* chore: make sure the latest Poetry version is used
* fix: mypy
* fix: do not add BOS (with the latest llama-cpp-python version)
* Extract optional dependencies
* Separate local mode into llms-llama-cpp and embeddings-huggingface for clarity
* Support Ollama embeddings
* Upgrade to LlamaIndex 0.10.14. Remove legacy use of ServiceContext in ContextChatEngine
* Fix vector retriever filters
As discussed on Discord, the decision was made to remove the default system prompts in order to better separate API and UI usage.
A concurrent PR (#1353) enables setting a system prompt dynamically in the UI.
Therefore, UI users who want a custom system prompt can specify one directly in the UI.
API users who want a custom system prompt can include it directly in the messages they send to the API, as sketched below.
In light of these two use cases, it becomes clear that a default `system_prompt` is unnecessary.
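For example, a minimal sketch of an API call carrying its own system prompt; the endpoint path and default port assume a typical local PrivateGPT deployment and should be adjusted to your setup.

```python
# Sketch: an API consumer supplies its own system prompt per request.
# Host/port and endpoint path are assumptions for a local deployment.
import requests

response = requests.post(
    "http://localhost:8001/v1/chat/completions",
    json={
        "messages": [
            {"role": "system", "content": "Answer only from the ingested documents."},
            {"role": "user", "content": "Summarize the latest report."},
        ],
    },
    timeout=60,
)
print(response.json())
```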