[BUG]: Token limit should not be exceeded (Azure OpenAI, GPT-4o, Chat Model Token Limit 128,000) but the LLM keeps generating a response forever #2709

@DediCATeD88

Description

How are you running AnythingLLM?

Docker (local)

What happened?

We use Whisper (Whisper-WebUI) for transcription. We do not use the integrated transcription feature because users cannot upload mp3, wav, or mp4 files, and the integrated transcription cannot be configured with Condition On Previous Text disabled and a Repetition Penalty of 2 or 3.

We are facing the following issue with AnythingLLM, Docker, Azure OpenAI, GPT-4o, Chat Model Token Limit 128,000:

If we paste a long transcription (text) directly into the chat window and send it, the chat stays at "generating response" forever. No error whatsoever. We can stop the generation, yes. But the text should not exceed the token limit of 128,000.

Even if we split the text into multiple parts, the same thing happens.

We also tried removing all CR-LF/LF line breaks; same thing.

Some examples:
Full text: characters 89219, words 16336, sentences 81, paragraphs 7240, spaces 9096
Split part 1: characters 15604, words 2743, sentences 21, paragraphs 886, spaces 1858
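As a sanity check on those numbers: even a crude estimate using the common rule of thumb of roughly 4 characters per token for English text (the exact count depends on the GPT-4o tokenizer, so this is only an approximation) puts the full transcript far below the 128,000-token limit:

```python
def estimate_tokens(char_count: int, chars_per_token: float = 4.0) -> int:
    """Rough token estimate using the ~4 characters/token rule of thumb.

    This is a heuristic, not the real GPT-4o tokenizer; actual counts
    may differ, but not by the ~6x needed to hit the 128,000 limit.
    """
    return int(char_count / chars_per_token)


# Character counts from the report above:
print(estimate_tokens(89219))  # full text -> 22304, well under 128000
print(estimate_tokens(15604))  # split part 1 -> 3901
```

So even with a generous error margin, the pasted text should fit comfortably in the model's context window, which is why the endless "generating response" looks like a bug rather than an actual token overflow.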

Does anyone have a clue what's going on there?

Are there known steps to reproduce?

Paste a long text directly into the chat window and send it to the LLM.

Labels: possible bug (Bug was reported but is not confirmed or is unable to be replicated.)