How are you running AnythingLLM?
Docker (local)
What happened?
We use Whisper (Whisper-WebUI) for transcription. We do not use the integrated transcription feature because users cannot upload mp3, wav, or mp4 files, and the integrated transcriber cannot be configured with Condition On Previous Text disabled or a Repetition Penalty of 2 or 3.
We are facing the following issue with AnythingLLM (Docker) using Azure OpenAI with GPT-4o and a Chat Model Token Limit of 128,000:
When we paste a long transcription (plain text) directly into the chat window and send it, the chat stays at "generating response" forever, with no error whatsoever. We can stop the response generation, but the text should not exceed the 128,000-token limit in the first place.
Splitting the text into multiple parts makes no difference, and neither does removing all CR-LF/LF line breaks.
Some examples:
Full text: 89,219 characters, 16,336 words, 81 sentences, 7,240 paragraphs, 9,096 spaces
Split part 1: 15,604 characters, 2,743 words, 21 sentences, 886 paragraphs, 1,858 spaces
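For reference, a rough token estimate for the sizes above, using the common ~4 characters-per-token heuristic for GPT-4-class models (an approximation; exact counts would require the model's actual tokenizer). Even the full text lands far below the 128,000-token limit:

```python
def estimate_tokens(char_count: int) -> int:
    """Approximate token count from character count (~4 chars per token)."""
    return char_count // 4

# Character counts from the examples above
full_text = estimate_tokens(89219)    # full transcription
split_part = estimate_tokens(15604)   # split part 1

print(full_text)   # ~22304 tokens
print(split_part)  # ~3901 tokens
```

So even with generous overhead for the system prompt and chat history, the input should fit comfortably within the model's context window.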
Does anyone have a clue what is going on here?
Are there known steps to reproduce?
Paste a long text directly into the chat window and send it to the LLM.