[Zerochan] JSONDecodeError: Unescaped " inside strings cause failure #6632

Open
dominikmau opened this issue Dec 9, 2024 · 2 comments

@dominikmau

My verbose output:

E:\gallery>dl.exe "https://www.zerochan.net/4354955" --verbose
[gallery-dl][debug] Version 1.28.1:2024.12.07 - Executable (dev/windows)
[gallery-dl][debug] Python 3.12.7 - Windows-11-10.0.22631-SP0
[gallery-dl][debug] requests 2.32.3 - urllib3 2.2.3
[gallery-dl][debug] Configuration Files ['E:\gallery\gallery-dl.conf']
[gallery-dl][debug] Starting DownloadJob for 'https://www.zerochan.net/4354955'
[zerochan][debug] Using ZerochanImageExtractor for 'https://www.zerochan.net/4354955'
[cookies][debug] Extracting cookies from C:\Users\Dominik\AppData\Roaming\Mozilla\Firefox\Profiles\kxk6mo99.default-release-1728442392740\cookies.sqlite
[cookies][debug] Only loading cookies not belonging to any container
[cookies][info] Extracted 43 cookies from Firefox
[urllib3.connectionpool][debug] Starting new HTTPS connection (1): www.zerochan.net:443
[urllib3.connectionpool][debug] https://www.zerochan.net:443 "GET /4354955 HTTP/11" 200 None
[zerochan][debug] Sleeping 1.27 seconds (request)
[urllib3.connectionpool][debug] https://www.zerochan.net:443 "GET /4354955?json HTTP/11" 200 None
[zerochan][error] An unexpected error occurred: JSONDecodeError - Expecting ',' delimiter: line 12 column 22 (char 500). Please run gallery-dl again with the --verbose flag, copy its output and report this issue on https://github.com/mikf/gallery-dl/issues .
[zerochan][debug]
Traceback (most recent call last):
File "gallery_dl\job.py", line 151, in run
File "gallery_dl\extractor\booru.py", line 38, in items
File "gallery_dl\extractor\zerochan.py", line 257, in posts
File "gallery_dl\extractor\zerochan.py", line 98, in _parse_entry_api
File "json\decoder.py", line 337, in decode
File "json\decoder.py", line 353, in raw_decode
json.decoder.JSONDecodeError: Expecting ',' delimiter: line 12 column 22 (char 500)

The cause is an unescaped " inside a string value. Similar cases occur quite often:

{
"primary": "Miles "Tails" Prower",
}
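
For reference, the failure can be reproduced outside gallery-dl with Python's standard `json` module. This is only a minimal sketch: the `raw` string is a trimmed-down, hand-written stand-in for the API response, not the actual payload.

```python
import json

# Hand-written stand-in for the response excerpt; the quotes around
# Tails are not escaped, so this is not valid JSON.
raw = '{"primary": "Miles "Tails" Prower"}'

error = None
try:
    json.loads(raw)
except json.JSONDecodeError as exc:
    # Keep a reference, since `exc` is cleared when the except block ends.
    error = exc
    print(f"{exc.msg} at line {exc.lineno}, column {exc.colno}")
```

The parser treats `"Miles "` as the complete value and then expects a `,` or `}`, which is why the message talks about a missing delimiter rather than a bad quote.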

@mikf
Owner

mikf commented Dec 9, 2024

There was a previous issue regarding ASCII control characters in "JSON" API responses (#5892), but that was quite easy to fix with some regex. It should be possible to somehow replace the offending " characters with \" using a more complicated regex pattern, but maybe it would be simpler to just extract the values manually.
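
One way the regex idea could look is sketched below. This is not gallery-dl's actual fix; `escape_inner_quotes` is a hypothetical helper, and it assumes the response is pretty-printed with one `"key": "value"` pair per line (the "line 12" in the error message suggests a multi-line response, but the full layout is not shown in this issue).

```python
import json
import re

# Matches one pretty-printed `"key": "value"` line. The greedy (.*) runs to
# the LAST quote on the line, so quotes inside the value are captured too.
_STRING_LINE = re.compile(r'^(\s*"[^"]+":\s*")(.*)("\s*,?\s*)$')
# A double quote that is not directly preceded by a backslash.
_BARE_QUOTE = re.compile(r'(?<!\\)"')


def escape_inner_quotes(text):
    """Escape bare double quotes inside single-line string values."""
    fixed = []
    for line in text.splitlines():
        match = _STRING_LINE.match(line)
        if match:
            head, value, tail = match.groups()
            # Replace each bare " in the value with \"
            line = head + _BARE_QUOTE.sub(r'\\"', value) + tail
        fixed.append(line)
    return "\n".join(fixed)


# Hand-written sample, not the real API response.
raw = '{\n    "id": 4354955,\n    "primary": "Miles "Tails" Prower"\n}'
data = json.loads(escape_inner_quotes(raw))
print(data["primary"])
```

The sketch would break if several string pairs shared one line or if a value ended in an escaped backslash followed by a quote, which is an argument for extracting the needed values manually instead.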

@dominikmau
Author

For my specific use case, it is sufficient to use

text = re.sub(r'"((?:[^"\\]|\.)*?)"', lambda m: '"{}"'.format(m.group(1).replace('"', '\"')), text)

But it does not work in all cases and is therefore not a real solution. I am also not a programmer, sorry.
