Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use Piped's fork of nanojson #981

Open
wants to merge 2 commits into
base: dev
Choose a base branch
from
Open

Conversation

Stypox
Copy link
Member

@Stypox Stypox commented Nov 24, 2022

  • I carefully read the contribution guidelines and agree to them.
  • I have tested the API against NewPipe.
  • I agree to create a pull request for NewPipe as soon as possible to make it compatible with the changed API.

Our fork of nanojson was initially introduced in #317. That PR contains no description, but if I remember correctly our own fork was created in order to be able to parse some particular Json format returned by some service.

Anyway, I have run the tests with Piped's / @FireMasterK's fork and there does not seem to be any problem. The upstream library (i.e. com.grack.nanojson) cannot be used directly since it returns null instead of new JsonObject() in some methods, causing NPEs (fixed by 7056f30 in our fork). In any case, @FireMasterK's PR is up-to-date with upstream (last upstream commit is Aug 9), contains 7056f30, and contains one more commit related to performance (https://github.com/FireMasterK/nanojson/commits/master). Since our fork changed some things related to how the JSON is parsed, it turned out to be thousands of times slower than upstream: see this analysis by @FireMasterK.

These are the only tests that fail: they are unrelated to the changes in this PR, since they are the same as here.

@Stypox Stypox mentioned this pull request Nov 24, 2022
3 tasks
Also, our fork of nanojson was outdated, while FireMasterK's is not
@TobiGr
Copy link
Member

TobiGr commented Nov 24, 2022

You also need to update

implementation "com.github.TeamNewPipe:nanojson:$nanojsonVersion"

@Stypox
Copy link
Member Author

Stypox commented Nov 24, 2022

Yeah sorry, I was about to do that

@FireMasterK
Copy link
Member

I also include fastutil (which is not a problem for Piped since I use all over in its code), but is an additional dependency, maybe we don't want to include that commit here?

build.gradle Outdated Show resolved Hide resolved
@TacoTheDank
Copy link
Member

TacoTheDank commented Nov 25, 2022

Dang bro that's pretty slow. I'm not really sure why I was requested though.

But anyway, if it runs fine and better, then it looks good to me 👍

It'll probably also need to be changed here too, right?

@TobiGr
Copy link
Member

TobiGr commented Nov 25, 2022

I don't understand why the mocks are missing. That's an indicator for changes in the requests, most likely those which have a JSON body. We need to update the mocks and should have a brief look on how the requests changed exactly

TobiGr
TobiGr previously requested changes Nov 25, 2022
Copy link
Member

@TobiGr TobiGr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see above

@FireMasterK
Copy link
Member

I don't understand why the mocks are missing. That's an indicator for changes in the requests, most likely those which have a JSON body. We need to update the mocks and should have a brief look on how the requests changed exactly

I had a look into this, and now keys for the objects generated just aren't sorted:
Old:

{"cpn":"uGaaNCVq9tAJ9K8j","contentCheckOk":true,"playbackContext":{"contentPlaybackContext":{"referer":"https://www.youtube.com/watch?v=INVALID_ID_","signatureTimestamp":"19215"}},"context":{"request":{"internalExperimentFlags":[],"useSsl":true},"client":{"hl":"en-GB","gl":"GB","clientName":"WEB","originalUrl":"https://www.youtube.com","clientVersion":"2.20220809.02.00","platform":"DESKTOP"},"user":{"lockedSafetyMode":false}},"racyCheckOk":true,"videoId":"INVALID_ID_"}

New:

{"context":{"client":{"hl":"en-GB","gl":"GB","clientName":"WEB","clientVersion":"2.20220809.02.00","originalUrl":"https://www.youtube.com","platform":"DESKTOP"},"request":{"internalExperimentFlags":[],"useSsl":true},"user":{"lockedSafetyMode":false}},"playbackContext":{"contentPlaybackContext":{"signatureTimestamp":"19215","referer":"https://www.youtube.com/watch?v=INVALID_ID_"}},"cpn":"uGaaNCVq9tAJ9K8j","videoId":"INVALID_ID_","contentCheckOk":true,"racyCheckOk":true}

YouTube doesn't order fields, so I feel this may indeed be a good thing

I think we should regenerate the mocks

No clue what they changed for this to happen tho: TeamNewPipe/nanojson@0185616...FireMasterK:nanojson:master~2

@AudricV
Copy link
Member

AudricV commented Nov 27, 2022

I had a look into this, and now keys for the objects generated just aren't sorted

Wouldn't this make mocks useless?

No clue what they changed for this to happen tho: TeamNewPipe/nanojson@0185616...FireMasterK:nanojson:master~2

The only change which could make a difference in my opinion is FireMasterK/nanojson@6b8c1ef.

Also, why not updating our fork instead of using a fork of our fork?

@FireMasterK
Copy link
Member

FireMasterK commented Nov 27, 2022

Wouldn't this make mocks useless?

No, because now they're in the order that we add them in the builder I believe

The only change which could make a difference in my opinion is FireMasterK/nanojson@6b8c1ef.

Also, why not updating our fork instead of using a fork of our fork?

That could be done as well, but how would you remove the old commits? I'm not sure what's the team's policy on force pushing...

Edit: Or actually, we could create a new branch and rename this one

@AudricV
Copy link
Member

AudricV commented Nov 27, 2022

No, because now they're in the order that we add them in the builder I believe

Then, that's fine for me!

@Stypox
Copy link
Member Author

Stypox commented Nov 27, 2022

Also, why not updating our fork instead of using a fork of our fork?

That would be fine, but since FireMasterK's fork is actively maintained and serves almost the same purpose, I think it would make more sense to just use that. It wouldn't be difficult to switch away if the needs ever diverged. We could archive our fork in the meantime.

@TobiGr TobiGr dismissed their stale review March 20, 2023 18:41

Question ansered

@fynngodau
Copy link
Member

if I remember correctly our own fork was created in order to be able to parse some particular Json format returned by some service

Hi, the context is here: #232 (comment). I believe that Bandcamp has switched to valid JSON in the meantime, without any comments. :)

@wb9688
Copy link
Contributor

wb9688 commented May 28, 2023

I would also be in favor of updating our fork instead. I also think using lazy parsing of e.g. numbers would be great.


That PR contains no description, but if I remember correctly our own fork was created in order to be able to parse some particular Json format returned by some service.

There were 2 reasons for our fork of nanojson:

  1. To return an empty JsonObject instead of null to prevent NullPointerExceptions
  2. Bandcamp returned what I call "semi-JSON" i.e. { what: 'ever' } (instead of { "what": "ever" }) and we needed to be able to parse that

@FireMasterK
Copy link
Member

FireMasterK commented May 29, 2023

I would also be in favor of updating our fork instead. I also think using lazy parsing of e.g. numbers would be great.

That PR contains no description, but if I remember correctly our own fork was created in order to be able to parse some particular Json format returned by some service.

There were 2 reasons for our fork of nanojson:

  1. To return an empty JsonObject instead of null to prevent NullPointerExceptions
  2. Bandcamp returned what I call "semi-JSON" i.e. { what: 'ever' } (instead of { "what": "ever" }) and we needed to be able to parse that
  1. You're right, I have cherry-picked that one commit from the TeamNewPipe repository.
  2. This is no longer the case and is no longer necessary. I don't think modifying the JSON parser should have been the solution for this, as in this case, we created significant performance regressions. I think using regexes would have been a better solution here.

I'd like to upstream as many changes as possible from my fork! It becomes lesser work for me to maintain my fork then anyway.

A lot of my changes could probably be useful for TeamNewPipe's fork, but not upstream nanojson due to some limitations by my changes.

I've made a lot of performance-oriented changes in my fork in comparison to upstream.

I've also added support for what I call "Lazy String" parsing in my fork. This could probably even be upstreamed to the original repository even. (I don't have the bandwidth for that currently)

I'd like to express my dislike for the nanojson library for doing all the parsing eagerly. If feasible, I would even like to replace it in the extractor with a more performant and well-built library like Jackson.

@TobiGr TobiGr added the dependencies Pull requests that update a dependency file label May 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dependencies Pull requests that update a dependency file
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants