Skip to content

Conversation

@afontenot
Copy link

@afontenot afontenot commented Nov 2, 2024

When the subscriber count extraction fails to find the Regex pattern, a RegexException is thrown. This is not a fatal error in most cases, for example when downloading the channel page in order to update the user's subscriptions, and so the correct behavior is to return UNKNOWN_SUBSCRIBER_COUNT.

Related issue: TeamNewPipe/NewPipe#11353.

This bug comprises two issues: subscribers cannot be extracted for channels with pronoun tags, and when the RegexException is thrown, channel subscriptions will fail to update because the exception is uncaught. This commit fixes the latter aspect of the issue.

Please see my comment on the issue for suggestions on how the former issue with scraping the pronoun tags might be fixed.

  • I carefully read the contribution guidelines and agree to them.
  • I have tested the API against NewPipe.
  • I agree to create a pull request for NewPipe as soon as possible to make it compatible with the changed API.

Note that the last point above is not applicable to this PR, as the API does not change.

When the subscriber count extraction fails to find the Regex pattern,
a RegexException is thrown. This is not a fatal error in most cases,
for example when downloading the channel page in order to update the
user's subscriptions, and so the correct behavior is to return
UNKNOWN_SUBSCRIBER_COUNT.

Related issue: TeamNewPipe/NewPipe#11353.

This bug comprises two issues: subscribers cannot be extracted for
channels with pronoun tags, and when the RegexException is thrown,
channel subscriptions will fail to update because the exception is
uncaught. This commit fixes the latter aspect of the issue.
@afontenot
Copy link
Author

My comment where I identify the probable source of the parsing issue is here: TeamNewPipe/NewPipe#11353 (comment)

@ShareASmile ShareASmile added bug Issue is related to a bug youtube service, https://www.youtube.com/ labels Nov 3, 2024
@afontenot
Copy link
Author

I had a look at the CI failures and they seem clearly unrelated to my changes. Probably just flaky tests.

@afontenot afontenot changed the title Catch RegexException in subscriber count extractor [YouTube] Catch RegexException in subscriber count extractor Nov 22, 2024
@ShareASmile ShareASmile added the ready for review Most of the work is done, PR is now ready for a review label Dec 22, 2024
@afontenot
Copy link
Author

Is there anything I can do to move this PR forward? It's a straightforward fix for a crash that continues to exist in the latest versions of NewPipe.

@Stypox
Copy link
Member

Stypox commented Feb 16, 2025

Sorry for not looking at this earlier and thank you for the ping. Thank you for looking into this; I am closing this PR for the two reasons below, though you are welcome to open a new PR for the first one :-)

subscribers cannot be extracted for channels with pronoun tags

This should be fixed by detecting pronoun tags and returning UNKNOWN_SUBSCRIBER_COUNT in that case. Please open a separate PR for this, if it has not already been fixed in the meantime :-)

channel subscriptions will fail to update because the exception is uncaught

The exception is not actually uncaught, but is caught here. The error is added to a list of non-critical errors. NewPipe already handles those errors nicely:

  • collecting all of the errors from a single channel in a list: see here
  • then saving in the database the obtained items independently of whether there were errors or not: see here
  • finally, if there are errors, the subscription is marked as outdated as one of the errors could indicate that not all channel tabs or not all items were extracted: see here

Therefore I think this is already working as intended, but maybe we should change the UI for when a subscription was partially loaded (i.e. non-critical errors) vs when it was not loaded at all (i.e. critical error).

@Stypox Stypox closed this Feb 16, 2025
@ShareASmile ShareASmile removed the ready for review Most of the work is done, PR is now ready for a review label Feb 17, 2025
@afontenot
Copy link
Author

Hey @Stypox, thanks for the review. I think there are several misunderstandings to correct though.

subscribers cannot be extracted for channels with pronoun tags

This should be fixed by detecting pronoun tags and returning UNKNOWN_SUBSCRIBER_COUNT in that case.

It's not that it's impossible to extract the subscriber count for channels with pronoun tags, it's that the extraction code is broken as I detailed on the bug report page. It only works for most channels right now by coincidence.

The error is added to a list of non-critical errors. NewPipe already handles those errors nicely:

Maybe I'm misunderstanding you, because the handling doesn't seem nice at all. The result is not just that the subscription is marked as outdated, but also there's a full crash log pop up that appears in the user's notifications as if a major error has occurred. This is annoying and probably results in useless issues being sent.

@Stypox
Copy link
Member

Stypox commented Feb 20, 2025

Sorry, I didn't see your comments in #11353, thank you for the analysis!


it's that the extraction code is broken as I detailed on the bug report page

Then it needs to be fixed, even if fixing it can still lead to something that is flaky. There is already a mechanism that doesn't make the whole extraction fail if just the subscriber count extraction fails, as I said, so it's not considered a critical error.


the handling doesn't seem nice at all

As I said:

we should change the UI for when a subscription was partially loaded (i.e. non-critical errors) vs when it was not loaded at all (i.e. critical error)

I agree it is not a good user experience. What I meant with "nice" was that the error is already properly handled in the correct way (i.e. the subscriptions, even the broken channel, do load fine, since the code handles the non-critical error as non-critical and proceeds).


So to sum up, this is not the way to solve this bug, but you can do this instead:

  • fixing the bug in the extractor (e.g. by looking for the word "subscriptions" when there is more than one array with two items)
  • adding a third state to a subscription in NewPipe: "partially fetched". When non-critical errors occur, the subscriptions should be shown as partially fetched. And in any case, any non-critical error unrelated to the channel's getTabs() should be ignored (you should be able to distinguish this by checking if getTabs() returns something or not).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Issue is related to a bug youtube service, https://www.youtube.com/

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants