Description
Environment
🐧 Linux
System
Firefox 132, Arch Linux
Version
1.12.7 'staging' (9b38e3f)
Desktop Information
- Node v23.1.0
- KoboldCPP
- Staging
- Mistral Small 22B
Describe the problem
Conditions to replicate:
- KoboldCPP or other CPU backend with Context Shift that avoids prompt reprocessing unless something changes in the context
- Context filled up with messages
- Instruct format with non-blank User Filler Message
- Two or more consecutive assistant replies present in the context
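When SillyTavern removes old messages from the chat history and the first remaining message happens to be from the assistant rather than the user, it inserts the User Filler Message before it. That insertion changes the start of the chat history, so the backend cannot reuse its cached context and reprocesses the full chat history.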
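To illustrate why the filler defeats the cache, here is a toy model at message granularity. It is only a sketch under assumptions: `build_context`, `can_shift`, and the `[filler]` text are hypothetical names made up for this example, the budget is counted in messages rather than tokens, and the actual logic in SillyTavern and KoboldCPP may well differ.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Message:
    role: str  # "user" or "assistant"
    text: str


# Hypothetical stand-in for the configured User Filler Message.
FILLER = Message("user", "[filler]")


def build_context(history: List[Message], max_messages: int,
                  filler: Optional[Message]) -> List[Message]:
    """Keep the newest messages; prepend the filler if an assistant message comes first."""
    kept = list(history[-max_messages:])
    if filler is not None and kept and kept[0].role == "assistant":
        kept.insert(0, filler)
    return kept


def can_shift(old: List[Message], new: List[Message]) -> bool:
    """Assumed cache rule: `new` must equal a shared prefix of `old`, then a
    non-empty tail of `old`, then any newly appended messages."""
    p = 0
    while p < len(old) and p < len(new) and old[p] == new[p]:
        p += 1
    rest = new[p:]
    for j in range(p, len(old)):
        tail = old[j:]
        if rest[:len(tail)] == tail:
            return True
    return p == len(old)  # `new` only extends `old`


u1, a1, a2 = Message("user", "u1"), Message("assistant", "a1"), Message("assistant", "a2")
u2, a3 = Message("user", "u2"), Message("assistant", "a3")

# Context is exactly full with four messages, starting with a user message.
old = build_context([u1, a1, a2, u2], 4, FILLER)

# One more reply arrives, u1 is dropped, and a1 (assistant) is now first.
new_plain = build_context([u1, a1, a2, u2, a3], 4, None)
new_filled = build_context([u1, a1, a2, u2, a3], 4, FILLER)

print(can_shift(old, new_plain))   # True: old messages were only dropped from the front
print(can_shift(old, new_filled))  # False: the filler is new content ahead of cached messages
```

In this model the problem only appears on the turn where the filler first shows up, which matches the condition above that consecutive assistant replies must be present.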
I understand this might be intended behavior, since that's the purpose of the User Filler Message. However, it can cause trouble for users with slow prompt processing, and the cause is not obvious. I suggest a workaround: when the context is full, ignore the User Filler Message and simply place the assistant message first. That does break the instruct format, but I don't think a small discrepancy at the beginning of the chat history will be significant. Otherwise, if the instruct format is to be strictly followed, the mechanism for removing old messages could be modified to ensure a User message is always first. Or, the simplest option would be to add a warning in the documentation/app telling users to avoid the User Filler Message unless they can afford frequent prompt reprocessing.
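The second option (always keeping a User message first) could look roughly like the sketch below. `trim_user_first` is a hypothetical helper written for this issue, not SillyTavern's code, and it counts messages instead of tokens to stay short.

```python
from typing import List, Tuple

# (role, text) pairs; role is "user" or "assistant".
Message = Tuple[str, str]


def trim_user_first(history: List[Message], max_messages: int) -> List[Message]:
    """Keep the newest messages, then drop leading assistant messages
    so the kept history starts with a user message."""
    kept = list(history[-max_messages:])
    while kept and kept[0][0] == "assistant":
        kept.pop(0)
    return kept


history = [
    ("user", "u1"),
    ("assistant", "a1"),
    ("assistant", "a2"),
    ("user", "u2"),
    ("assistant", "a3"),
]

trimmed = trim_user_first(history, 4)
print(trimmed)  # [('user', 'u2'), ('assistant', 'a3')]
```

The result is always a plain suffix of the chat history, so old messages are only ever removed from the front and no filler is needed. The cost is that a few extra assistant messages get dropped earlier than the context limit strictly requires.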
Additional info
No response
Please tick the boxes
- I have explained the issue clearly, and I included all relevant info
- I have checked that this issue hasn't already been raised
- I have checked the docs