Not work Phrase Query using non english retrieval #298

Closed
po3rin opened this issue Apr 17, 2022 · 0 comments · Fixed by #340
Labels
bug Something isn't working

po3rin commented Apr 17, 2022

Describe the bug
I ran a phrase search using a Japanese tokenizer (Sudachi), but the phrase search doesn't work as intended.
As far as I can tell, if a phrase query includes a term that is not in the index, that term is silently ignored and the remaining terms are still matched.

Example code is below.
https://github.com/po3rin/python_playground/blob/master/try-pyterrier/sudachi.ipynb

dependencies

[tool.poetry.dependencies]
python = ">=3.8,<3.11"
pandas = "^1.4.0"
SudachiDict-core = ">=20210802"
SudachiPy = ">=0.6.2,<0.7.0"

I followed the notebook below to set up Japanese retrieval.
https://colab.research.google.com/github/terrier-org/pyterrier/blob/master/examples/notebooks/non_en_retrieval.ipynb

Expected behavior
If the phrase query contains a term that is not indexed, I expect the search to return no hits.
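
To make the expectation concrete, here is a minimal sketch of the two behaviours using a toy in-memory positional index. It is not PyTerrier or Terrier code: `build_positional_index`, `phrase_search`, and the `drop_unindexed` flag are hypothetical names for illustration, and the token lists are hand-written stand-ins for tokenizer output. With `drop_unindexed=True` the function mimics what I suspect is happening; the default is the behaviour I expect.

```python
from collections import defaultdict


def build_positional_index(docs):
    """Map each term to {doc_id: [positions]} for pre-tokenized documents."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, tokens in docs.items():
        for pos, term in enumerate(tokens):
            index[term][doc_id].append(pos)
    return index


def phrase_search(index, phrase_terms, drop_unindexed=False):
    """Return sorted doc ids where phrase_terms occur at consecutive positions.

    drop_unindexed is a hypothetical flag: True mimics the suspected bug
    (unindexed terms are silently removed from the phrase), False is the
    expected behaviour (an unindexed term means the phrase cannot match).
    """
    if drop_unindexed:
        phrase_terms = [t for t in phrase_terms if t in index]
    if not phrase_terms or any(t not in index for t in phrase_terms):
        return []
    hits = []
    for doc_id, starts in index[phrase_terms[0]].items():
        for start in starts:
            # Every term must appear at its offset from the phrase start.
            if all(start + i in index[t].get(doc_id, [])
                   for i, t in enumerate(phrase_terms)):
                hits.append(doc_id)
                break
    return sorted(hits)


# Hand-written token lists (assumed, not real Sudachi output).
docs = {
    "d1": ["東京", "都", "に", "住む"],
    "d2": ["京都", "に", "住む"],
}
index = build_positional_index(docs)

# "タワー" is not in the index.
expected = phrase_search(index, ["東京", "タワー"])
suspected = phrase_search(index, ["東京", "タワー"], drop_unindexed=True)
print("expected:", expected)
print("suspected:", suspected)
```

In this sketch the strict version returns nothing for the phrase containing the unindexed term, while the lenient version falls back to matching the remaining term alone, which is the kind of unexpected hit I am seeing.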

Documentation and Issues

po3rin added the bug label Apr 17, 2022
po3rin changed the title from "Now work Phrase Query" to "Not work Phrase Query using non english retrieval" Apr 17, 2022
cmacdonald linked a pull request Nov 2, 2022 that will close this issue