[BUG] Auto-generated titles don't match app language #1531

crypdick · 2024-09-13T19:45:53Z

Bug Description
Sometimes, auto-generated titles for English conversations are in Chinese.

Expected Results
According to the nameConversation prompt, the title should always be in English.

Screenshots

Desktop:

Operating System: Ubuntu 23.10
Application Version: 1.4.1

Additional Context
This issue makes me suspect that my conversations in the closed source app are not truly private and are being sent to a custom model.


creesch · 2024-09-18T08:12:23Z

I just noticed that the releases listed on GitHub stop at 1.3.10, whereas if you download from the website you are served 1.4.2. The readme does mention this:

This is the repository for the Chatbox Community Edition, open-sourced under the GPLv3 license. For most users, I recommend using the Chatbox Official Edition (closed-source).

But it does not clearly state that the download buttons below actually point to this closed-source version, which I think is a bit of a dark pattern and not cool at all.

I tried decompiling it and looking at the source code, but because Terser is used during packaging, the code is obfuscated, which makes it really difficult to see whether anything shady is going on in this closed-source version.

creesch · 2024-09-18T08:54:26Z

So I found this previous discussion which gives a bit of context: #803
Having read the discussion I find it less likely some malice is involved, although I can't of course rule it out entirely.

I still think that the below section of the readme should be clarified:

At the very least, it should say "Closed source download for ".

creesch · 2024-09-18T16:27:18Z

Alright, one last reply. I had a look with tcpview open and when you do open up chatbox there is some traffic visible. This is to be expected given the update check and all that.

The traffic goes to 170.106.175.29 which turns out to simply be chatboxai.app.

When I click “new chat” I see activity to that address as well. Oddly enough that seems to be the only UI element causing traffic, I suspect some sort of analytics is going on here. Mind you, at this point I have only clicked the button, not typed in any prompt.

When I actually type something in the chat and send it off towards the LLM, I do not see any activity towards chatboxai.app. The only other traffic I see is towards the LLM provider I use, which is what I would expect.

So it looks like no data about your chats is being sent while chatting. The traffic I see on application startup is also not enough to indicate that previous chats are being sent somewhere. The traffic when clicking new chat is still a bit odd to me.

Overall, it looks like your data is safe. The behavior with the generated titles might simply be due to a bug in the closed-source version.

crypdick · 2024-09-20T14:12:09Z

Thank you for the detective work @creesch !

Bin-Huang · 2024-10-07T14:31:05Z

Don't worry, your data is safe—Chatbox really values your privacy. As for why the closed-source edition's code is obfuscated, it's because I need to protect it. Honestly, with Electron, there's almost no way to safeguard the source code besides code obfuscation. Thanks to @creesch for the review and confirmation!

Getting back to the original issue with title generation, I don't think that should be happening. Which model are you using? Does your system prompt or context include any Chinese text? I'm really curious about this issue. If you could provide more details, that'd be great! @crypdick

crypdick · 2024-10-08T02:46:04Z

@Bin-Huang My system prompts and context are always written in English. I use a mix of OpenAI and Anthropic endpoints and I have seen this issue across both model providers.

Bin-Huang · 2024-10-08T06:23:51Z

@crypdick Thanks for the extra detail. Are the endpoints you mentioned official APIs from OpenAI and Anthropic? Also, which version of the Chatbox app are you using, and on what OS?

crypdick · 2024-10-11T14:03:07Z

That's right, nothing custom, official endpoints only.

Operating System: Ubuntu 23.10
Application Version: 1.4.1

Bin-Huang · 2024-10-12T03:05:13Z

This is indeed a very interesting bug, thanks for bringing it to my attention. I think I've found the root cause. After multiple tests, I've found that the title generation prompt Chatbox ultimately sends to the model is fine: it doesn't contain any hints to generate Chinese titles. However, the model (gpt-4o) itself has a tendency to generate Chinese titles. In my case, I asked gpt-4o to generate a title for a purely English conversation, and it suddenly produced a Chinese title. After detailed testing, I found that this happens with some probability (less than roughly 10%). It's pretty clear this is a case of the model hallucinating.

For anyone interested in this issue, you can reproduce my findings with the following code:

import openai

# Uses the module-level client of the openai package (v1 or later); replace with your own key.
openai.api_key = 'your-api-key'

# The title-generation prompt, with a purely English conversation embedded in it.
content = "Name the conversation based on the chat records.\nPlease provide a concise name, within 10 characters and without quotation marks.\nPlease use the speak language in the conversation.\nYou only need to answer with the name.\nThe following is the conversation:\n\n```\nis there any npm packages that can help me make a auto-resized textarea\n\n---------\n\n\n```\n\nPlease provide a concise name, within 10 characters and without quotation marks.\nPlease use the speak language in the conversation.\nYou only need to answer with the name.\nThe conversation is named:"

# Request a title 40 times and print only the responses that contain non-ASCII characters.
for i in range(40):
    response = openai.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "user", "content": content}
        ]
    )
    response_content = response.choices[0].message.content or ''
    # Any code point above 127 is treated as a sign of a non-English title.
    if any(ord(char) > 127 for char in response_content):
        print(response_content)

To fix this issue, I've tweaked the prompt for auto-generating titles, making sure it uses the language set in the app. This fix will be rolled out with the next update.

Thanks again for bringing this bug to my attention! It's hands down the most interesting bug I've fixed lately.
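
A note on the reproduction script above: the `ord(char) > 127` check flags any non-ASCII character, so an accented letter or an emoji in a title would also be printed. For anyone who wants to count Chinese titles specifically, here is a minimal sketch of a narrower check. The helper names `contains_cjk` and `cjk_rate` are made up for illustration and are not part of Chatbox or the `openai` package, and the two code point ranges cover only the main CJK ideograph blocks.

```python
def contains_cjk(text: str) -> bool:
    """Return True if text has at least one character in the main CJK ideograph blocks."""
    return any(
        0x4E00 <= ord(ch) <= 0x9FFF      # CJK Unified Ideographs
        or 0x3400 <= ord(ch) <= 0x4DBF   # CJK Unified Ideographs Extension A
        for ch in text
    )


def cjk_rate(titles: list[str]) -> float:
    """Fraction of titles that contain at least one CJK ideograph."""
    if not titles:
        return 0.0
    return sum(contains_cjk(t) for t in titles) / len(titles)


# Hypothetical sample titles, not real model output.
samples = ["Auto Resize", "自动调整文本框", "Café tips", "npm 包推荐"]
print([contains_cjk(t) for t in samples])
print(cjk_rate(samples))
```

Collecting each `response_content` from the loop into a list and passing it to `cjk_rate` would give an estimate of how often the model switches language, without counting other non-ASCII output.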

crypdick · 2024-10-15T19:13:33Z

This is an interesting bug. I think that this is caused by how the prompt is phrased. For example, the sentence "please use the speak language in the conversation" is not how a native speaker would write it; a more natural phrasing might be "please use the same language used in the conversation." This phrasing is a subtle signal to the model that the prompter is Chinese, which is why the summary sometimes includes Chinese characters, even though the prompt specifies to use the conversation's language.

Bin-Huang · 2024-10-16T03:18:46Z

Thanks for your insights! I think you're right. This prompt was probably shared by someone else online, and I didn't really look at its tone or style. I've now tried writing a new prompt myself, which will fix those issues.

Based on the chat history, give this conversation a name.
Keep it short - 10 characters max, no quotes.
Use ${language}.
Just provide the name, nothing else.

Here's the conversation:
{history}

Name this conversation in 10 characters or less.
Use ${language}.
Only give the name, nothing else.

The name is:

@crypdick Could you take a look and let me know what you think?

crypdick · 2024-10-16T21:44:00Z

@Bin-Huang much better, although it is redundant. I would delete everything after "Here's the conversation: {history}".
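
To make the suggestion concrete, here is a rough sketch of the trimmed prompt being filled in. It is only an illustration: the thread does not show how Chatbox actually substitutes values, `TITLE_PROMPT` and `build_title_prompt` are hypothetical names, and the `{history}` placeholder from the draft is written as `${history}` here so that both placeholders work with Python's `string.Template`.

```python
from string import Template

# Trimmed version of the draft prompt: everything after the conversation is dropped.
TITLE_PROMPT = Template(
    "Based on the chat history, give this conversation a name.\n"
    "Keep it short - 10 characters max, no quotes.\n"
    "Use ${language}.\n"
    "Just provide the name, nothing else.\n"
    "\n"
    "Here's the conversation:\n"
    "${history}"
)


def build_title_prompt(language: str, history: str) -> str:
    """Fill in the app language and the chat history (hypothetical helper)."""
    return TITLE_PROMPT.substitute(language=language, history=history)


history = "is there any npm packages that can help me make a auto-resized textarea"
prompt = build_title_prompt("English", history)
print(prompt)
```

Passing the language explicitly means the title language no longer depends on the model inferring it from the conversation or from the phrasing of the prompt.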

Bin-Huang changed the title ~~[BUG] Autogenerated titles are sometimes in Chinese. Are my conversations actually private?~~ [BUG] Auto-generated titles don't match app language Oct 8, 2024

Bin-Huang closed this as completed Oct 12, 2024
