[YouTube] Catch RegexException in subscriber count extractor #1238

afontenot · 2024-11-02T17:43:38Z

When the subscriber count extraction fails to find the Regex pattern, a RegexException is thrown. This is not a fatal error in most cases, for example when downloading the channel page in order to update the user's subscriptions, and so the correct behavior is to return UNKNOWN_SUBSCRIBER_COUNT.

Related issue: TeamNewPipe/NewPipe#11353.

This bug comprises two issues: subscribers cannot be extracted for channels with pronoun tags, and when the RegexException is thrown, channel subscriptions will fail to update because the exception is uncaught. This commit fixes the latter aspect of the issue.

Please see my comment on the issue for suggestions on how the former issue with scraping the pronoun tags might be fixed.

I carefully read the contribution guidelines and agree to them.
I have tested the API against NewPipe.
I agree to create a pull request for NewPipe as soon as possible to make it compatible with the changed API.

Note that the last point above is not applicable to this PR, as the API does not change.

When the subscriber count extraction fails to find the Regex pattern, a RegexException is thrown. This is not a fatal error in most cases, for example when downloading the channel page in order to update the user's subscriptions, and so the correct behavior is to return UNKNOWN_SUBSCRIBER_COUNT. Related issue: TeamNewPipe/NewPipe#11353. This bug comprises two issues: subscribers cannot be extracted for channels with pronoun tags, and when the RegexException is thrown, channel subscriptions will fail to update because the exception is uncaught. This commit fixes the latter aspect of the issue.

afontenot · 2024-11-02T17:49:25Z

My comment where I identify the probable source of the parsing issue is here: TeamNewPipe/NewPipe#11353 (comment)

afontenot · 2024-11-04T21:58:16Z

I had a look at the CI failures and they seem clearly unrelated to my changes. Probably just flaky tests.

afontenot · 2025-02-13T22:56:41Z

Is there anything I can do to move this PR forward? It's a straightforward fix for a crash that continues to exist in the latest versions of NewPipe.

Stypox · 2025-02-16T21:25:36Z

Sorry for not looking at this earlier and thank you for the ping. Thank you for looking into this; I am closing this PR for the two reasons below, though you are welcome to open a new PR for the first one :-)

subscribers cannot be extracted for channels with pronoun tags

This should be fixed by detecting pronoun tags and returning UNKNOWN_SUBSCRIBER_COUNT in that case. Please open a separate PR for this, if it has not already been fixed in the meantime :-)

channel subscriptions will fail to update because the exception is uncaught

The exception is not actually uncaught, but is caught here. The error is added to a list of non-critical errors. NewPipe already handles those errors nicely:

collecting all of the errors from a single channel in a list: see here
then saving in the database the obtained items independently of whether there were errors or not: see here
finally, if there are errors, the subscription is marked as outdated as one of the errors could indicate that not all channel tabs or not all items were extracted: see here

Therefore I think this is already working as intended, but maybe we should change the UI for when a subscription was partially loaded (i.e. non-critical errors) vs when it was not loaded at all (i.e. critical error).

afontenot · 2025-02-20T21:05:34Z

Hey @Stypox, thanks for the review. I think there are several misunderstandings to correct though.

subscribers cannot be extracted for channels with pronoun tags

This should be fixed by detecting pronoun tags and returning UNKNOWN_SUBSCRIBER_COUNT in that case.

It's not that it's impossible to extract the subscriber count for channels with pronoun tags, it's that the extraction code is broken as I detailed on the bug report page. It only works for most channels right now by coincidence.

The error is added to a list of non-critical errors. NewPipe already handles those errors nicely:

Maybe I'm misunderstanding you, because the handling doesn't seem nice at all. The result is not just that the subscription is marked as outdated, but also there's a full crash log pop up that appears in the user's notifications as if a major error has occurred. This is annoying and probably results in useless issues being sent.

Stypox · 2025-02-20T22:11:25Z

Sorry, I didn't see your comments in #11353, thank you for the analysis!

it's that the extraction code is broken as I detailed on the bug report page

Then it needs to be fixed, even if fixing it can still lead to something that is flaky. There is already a mechanism that doesn't make the whole extraction fail if just the subscriber count extraction fails, as I said, so it's not considered a critical error.

the handling doesn't seem nice at all

As I said:

we should change the UI for when a subscription was partially loaded (i.e. non-critical errors) vs when it was not loaded at all (i.e. critical error)

I agree it is not a good user experience. What I meant with "nice" was that the error is already properly handled in the correct way (i.e. the subscriptions, even the broken channel, do load fine, since the code handles the non-critical error as non-critical and proceeds).

So to sum up, this is not the way to solve this bug, but you can do this instead:

fixing the bug in the extractor (e.g. by looking for the word "subscriptions" when there is more than one array with two items)
adding a third state to a subscription in NewPipe: "partially fetched". When non-critical errors occur, the subscriptions should be shown as partially fetched. And in any case, any non-critical error unrelated to the channel's getTabs() should be ignored (you should be able to distinguish this by checking if getTabs() returns something or not).

ShareASmile added bug Issue is related to a bug youtube service, https://www.youtube.com/ labels Nov 3, 2024

afontenot changed the title ~~Catch RegexException in subscriber count extractor~~ [YouTube] Catch RegexException in subscriber count extractor Nov 22, 2024

ShareASmile added the ready for review Most of the work is done, PR is now ready for a review label Dec 22, 2024

Stypox closed this Feb 16, 2025

ShareASmile removed the ready for review Most of the work is done, PR is now ready for a review label Feb 17, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

[YouTube] Catch RegexException in subscriber count extractor #1238

[YouTube] Catch RegexException in subscriber count extractor #1238

Uh oh!

afontenot commented Nov 2, 2024 •

edited

Loading

Uh oh!

afontenot commented Nov 2, 2024

Uh oh!

afontenot commented Nov 4, 2024

Uh oh!

afontenot commented Feb 13, 2025

Uh oh!

Stypox commented Feb 16, 2025 •

edited

Loading

Uh oh!

afontenot commented Feb 20, 2025

Uh oh!

Stypox commented Feb 20, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

[YouTube] Catch RegexException in subscriber count extractor #1238

[YouTube] Catch RegexException in subscriber count extractor #1238

Uh oh!

Conversation

afontenot commented Nov 2, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

afontenot commented Nov 2, 2024

Uh oh!

afontenot commented Nov 4, 2024

Uh oh!

afontenot commented Feb 13, 2025

Uh oh!

Stypox commented Feb 16, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

afontenot commented Feb 20, 2025

Uh oh!

Stypox commented Feb 20, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

afontenot commented Nov 2, 2024 •

edited

Loading

Stypox commented Feb 16, 2025 •

edited

Loading