
Wrong generated prompt length #75

Open
PietroFerr opened this issue Oct 18, 2024 · 0 comments
I think there is an error in how the number of input tokens is calculated at prompt-generation time in the function llmperf.utils.randomly_sample_sonnet_lines_prompt(). Line 112: line_to_add = line_to_add[: int(math.ceil(remaining_prompt_tokens))]. If I understand correctly, you want to make sure the generated prompt has a specific length in terms of tokens, so if the next line to add would make the prompt too long, you want to cut it and get rid of the extra tokens. The code checks for this case and then attempts to trim line_to_add. The way this is done seems wrong to me, because it reasons in terms of characters rather than tokens, which is a mistake. The correct approach would be to count the number of tokens in line_to_add and keep only the prefix whose length equals remaining_prompt_tokens in tokens, not in characters as it is now.

Does this make sense to you?
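A token-based truncation could look like the sketch below. Note this is only an illustration of the fix I am proposing, not the actual llmperf code: the `tokenize` helper here is a hypothetical stand-in (a plain whitespace split, just to keep the example self-contained), whereas the real fix should use whatever tokenizer llmperf counts tokens with.

```python
def tokenize(text: str) -> list[str]:
    # Placeholder tokenizer: a real fix should use the same tokenizer
    # llmperf uses to count prompt tokens, not a whitespace split.
    return text.split()


def truncate_to_token_budget(line_to_add: str, remaining_prompt_tokens: int) -> str:
    """Keep only the prefix of `line_to_add` that fits within
    `remaining_prompt_tokens` *tokens* (instead of slicing characters,
    as the current line 112 does)."""
    tokens = tokenize(line_to_add)
    if len(tokens) <= remaining_prompt_tokens:
        return line_to_add
    # Drop the extra tokens, then reassemble the kept prefix.
    return " ".join(tokens[:remaining_prompt_tokens])
```

With a real tokenizer the reassembly step would decode the kept token IDs back to text rather than joining on spaces, but the idea is the same: the slice index must be a token count, not a character count.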
