[BUG] PRAW Error Code 429 #910
Comments
Please supply the logs. They are not optional |
I got this error as well while downloading a certain subreddit. The program quit prematurely right after it showed the error. Command I used: bdfr-download . -s "traps" |
I'm dealing with this new error too. |
same error here: D:\Videos>bdfr download D:\Videos --opts fortnite.yaml |
I run BDFR within a docker container. I have 'fixed' some issues with the docker version, very poorly (as a newbie) by injecting some commands in. Those commands being: sudo docker exec bdfr apt-get update -y and most importantly, this one: Sadly, I am still getting errors.
To be clear, I get those errors WITH or WITHOUT upgrading praw. I suspect something has broken. |
This is because of the new Reddit policy, 429 means 'Too Many Requests', it's the rate limiting error. |
Well that sounds like we'd need some way to reduce the requests. I'm ok with that too, I can just leave it run overnight. |
Essentially the same thing happens if you try to process a bunch of links with JDownloader. Limiting each batch to <100 and spacing them out fixes the problem. Similarly, a bulk download of a subreddit with lots of videos never hit the 429 error, unlike one filled with smaller files, I assume because the time spent downloading the videos spaced things out enough to not hit the request limit. It looks like with the new API changes bdfr needs to slow down how much it downloads per minute, either with a delay between each request or by pausing after 90 or so requests with a wait timer before doing the next 90 requests. |
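The "pause after 90 or so requests" idea above can be sketched roughly as follows. This is only an illustration of the batching pattern, not how the BDFR works: `process_in_batches`, `handler`, and the default values for `batch_size` and `pause_seconds` are all hypothetical, and the right numbers depend on whatever quota Reddit actually applies.

```python
import time
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def process_in_batches(
    items: Iterable[T],
    handler: Callable[[T], R],
    batch_size: int = 90,         # assumed batch size, taken from the comment above
    pause_seconds: float = 60.0,  # assumed wait between batches
    sleep: Callable[[float], None] = time.sleep,
) -> List[R]:
    """Run handler on every item, pausing once after each full batch."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    results: List[R] = []
    for index, item in enumerate(items):
        # Pause before starting a new batch, but never before the first item.
        if index > 0 and index % batch_size == 0:
            sleep(pause_seconds)
        results.append(handler(item))
    return results
```

Passing `sleep` as a parameter keeps the sketch easy to try without actually waiting; in real use the default `time.sleep` applies.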
I am also having this issue, but limiting doesn't help. Even authenticated, I get extremely rate limited, with downloading only a couple text posts (no comments) giving the error. C:\Users\piguy.DESKTOP-73IAHP5>py -m bdfr download D:\HFY-Stories\NoP\A_Recipe_For_Disaster --user YakiTapioca --submitted --authenticate --file-scheme {TITLE}{REDDITOR}{POSTID}_{DATE} -L 10 -v I did do a massive archive in the past, but it has been several days with no downloads since. I have tried waiting hours between downloads of 10 posts, but it always fails at least once, usually at the first post. I can load posts on my account and computer fine, but this application has lots of trouble. Would changing User agent or API access work? Providing an option to add time between post downloads could also help, but I don't know how extreme it would have to be. |
Yes, I'm going to work on this tomorrow. Basically Reddit has instituted rate limits that are per app. That means that while you're using the provided client ID and secret, every single user of the BDFR shares one combined quota. All of you using it at once are making it extremely hard for anyone to use it at all. I'll be working on a message that displays when this happens, but you can change the client ID on your own. The configuration file has that. If you register your own application with Reddit, the BDFR will work much better, though there will still be quotas on your number of requests. I'll be adding logic to wait for the reset and handle these errors better, and tightening up the number of requests where I can, but for now registering your own application would be best. |
This is the same problem Apollo and other 3rd party apps are having, right? Do we use "https://old.reddit.com/prefs/apps/" or "https://support.reddithelp.com/hc/en-us/requests/new?ticket_form_id=14868593862164" to register an app? Also, thank you for working on this. With how bad reddit is shutting down access, I am glad people are still helping others get data out. |
You'll need to go to https://old.reddit.com/prefs/apps/ - I used a localhost URL for the about and redirect, and selected web app. Then go to the default_config.cfg file, which for me is in c:\users[username]\appdata\local\bdfr\bdfr, and replace the id, which is at the top of the app page in reddit under the name, and the secret, which is visible when you create the app. |
@markl181 which exact URL did you use for the redirect and about? |
The BDFR can do the registration itself. You just need to get the client ID and secret here and then put that into the configuration file. For the website, set http://localhost:7634/ as the URL, that's what the BDFR will return to. The implementation is in bdfr/oauth2.py. |
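For anyone who prefers to script the configuration change described above, here is a minimal sketch using Python's standard `configparser`. It assumes the configuration file keeps its credentials under a `[DEFAULT]` section with keys named `client_id` and `client_secret`; check your own file before relying on that. The helper name `set_reddit_credentials` is made up for this example.

```python
import configparser
from pathlib import Path
from typing import Union


def set_reddit_credentials(
    config_path: Union[str, Path],
    client_id: str,
    client_secret: str,
) -> None:
    """Overwrite the client ID and secret in an INI-style config file.

    Assumption: the keys live in the [DEFAULT] section and are named
    client_id and client_secret.
    """
    # interpolation=None keeps any '%' characters in values untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(config_path, encoding="utf-8")
    parser["DEFAULT"]["client_id"] = client_id
    parser["DEFAULT"]["client_secret"] = client_secret
    with open(config_path, "w", encoding="utf-8") as handle:
        parser.write(handle)
```

Note that `configparser` drops comments when it rewrites a file, so it is safer to run this on a copy of the configuration and pass that copy with `--config`.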
Thanks a lot it works now! |
Can you tell me the exact command you used that's working now? I am getting the same error |
This effort is very much appreciated, honestly I expected with all this work you might just drop it. |
Hopefully it won't be too much work, but due to the way the repository is set up, it'll limit when I can make a new master release. I don't have the power to change the secrets we use for tests, nor do I have the power to override a merge block into master because of said tests. I'll fix the problem and then see if I can get Ali to change it over. I did bring up moving the repository to an organisational account, but that's its own set of problems, not least that it would effectively create a new repository that this one would have to link to, so it's not an immediate fix anyway. |
So...fixing this will take a fair bit of work. The way PRAW works is that it uses lazy variables. That means any time that the code calls a variable, it could potentially call Reddit. Any call could then return a 429 error and, of course, PRAW doesn't offer any way to deal with this. I'll have to go through all of the core code and refactor it into more atomic parts. Then I'll have to wrap all of them in code to make it wait and not lose too much work so...hopefully it won't take too long. |
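The "wrap them in code to make it wait" idea could look roughly like the sketch below. It is not the BDFR's implementation and does not use PRAW: `RateLimited` is a hypothetical stand-in for whatever exception signals an HTTP 429, `call_with_retry` is an invented helper, and the exponential backoff values are assumptions.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical stand-in for an exception raised on an HTTP 429."""

    def __init__(self, retry_after: Optional[float] = None) -> None:
        super().__init__("received 429 HTTP response")
        self.retry_after = retry_after


def call_with_retry(
    action: Callable[[], T],
    max_attempts: int = 5,    # assumed limit on attempts
    base_delay: float = 1.0,  # assumed first wait, doubled on each retry
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call action, waiting and retrying whenever it raises RateLimited."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    attempt = 1
    while True:
        try:
            return action()
        except RateLimited as error:
            if attempt == max_attempts:
                raise  # out of attempts, let the caller see the error
            # Prefer a server-provided wait time, otherwise back off exponentially.
            if error.retry_after is not None:
                delay = error.retry_after
            else:
                delay = base_delay * 2 ** (attempt - 1)
            sleep(delay)
            attempt += 1
```

Because any lazy attribute access can trigger a request, each such access would need to go through something like `call_with_retry(lambda: submission.title)` (where `submission` is illustrative), which is why the refactoring into smaller atomic parts is needed first.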
many thanks everyone. I condensed the steps for anyone like me having issues:
|
This still isn't working for me. I get this response in cmd: The link on reddit then says: bad request (reddit.com) you sent an invalid request I have done everything in your post correctly. What's the problem? |
For those having problems with using their own How to use bdfr with your own
|
Remove the trailing '/' in the redirect URI section of the Reddit web app - see the screenshots above from @Gavriik and note that there is no slash at the end of the redirect and about URIs |
I get the error below during the authentication part. The auth part seems to work just fine on the Reddit side; it just gives this error 500 on the return. I am running bdfr on a separate server and doing the auth on my desktop. I've done the same previously with other programs that use the Reddit API without issue though. [2023-07-17 16:26:00,363 - bdfr.oauth2 - WARNING] - Authentication action required before the program can proceed |
I'm still getting 429s pretty quickly after setting that up and using --authenticate. |
follow this : #910 (comment) |
Today I was getting 429s. Then I logged into Reddit in a browser (on a different PC, but same public IP) and the 429s stopped. Think this is intended by Reddit? |
Maybe they did not follow your instructions correctly, or they are not getting the my_config.cfg file correctly when running the script. In my case the error 429 stopped appearing with your fix, so I confirm that your fix works perfect :D I only get errors when the script encounters a twitter or youtube post I always get an error and it never downloads those videos, but I think that has always happened to me. |
I'm seeing the same thing. Stuck here |
This did fix the 429 error for me, but I was still running into the endless hang issues reported in #911 (now closed which is why I'm commenting here). Ended up just giving up on what I was doing but just in case it helps anyone else. |
Before the fix, the script was getting approximately 100 videos (I run it every 24 hours); with your fix I am getting only 20 videos or fewer. Is there something in the custom .cfg file that can cause this? |
I don't think the issue is related to the custom .cfg, as the only thing that changes from the default .cfg used by bdfr is the |
for those who still see issues with BDFR after making an app in prefs
|
I still get some 429 errors when running as a personal use script (#910 (comment), using |
@Root-FTW that's not the command that you used. The command that you used is |
I'm sorry maybe I didn't explain correctly or rather, I didn't put everything in a complete way: Months ago I created a .bat file which is executed daily in my windows every 24 hours, this is the .bat when I finish running the script is when I get this message: |
@Root-FTW you don't seem to be specifying the location of the configuration file. I think you've put the file into BDFR's working dir, but I'd still specify the path. Apologies if this doesn't help |
Yep, the issue is definitely how you are executing your bdfr command. I'm not familiar with bat scripts, but as @WARl0-01 suggested, try passing the absolute path to your .cfg; the relative path doesn't seem to work in your case. If that does not solve your problem then most likely the issue is your command syntax, |
@Gavriik @WARl0-01 Thank you very much, your solution solved my problem perfectly, thank you. |
I've pushed changes to the development branch that should hopefully fix this issue. There's an option in praw to extend the timeout. Hopefully that will fix the problem well enough. |
hello, is it possible that this script can download videos only if they last 15 seconds or more? I can't find a variable for that. |
No, that isn't possible. |
I know this has nothing to do with this project, but it's the best project to get quality content fast and you are the only option I know of that could do something like what I'm looking for. Is there any chance that this script will support TikTok in the future? I mean I could put in, for example, the hashtag #fortnite and it would bring me the latest videos from TikTok with the hashtag #fortnite, like it does with Reddit but with TikTok. Or if it will never support something like that, do you know of any project that can do that? |
No, that isn't something that we can do. The BDFR is for Reddit specifically. It's possible that we could download Tiktok videos that are linked on Reddit but that's it. Check out yt-dlp. That's what we use for most of the backend video links, they might be able to download specific Tiktok videos. |
Thanks for your answer, one last question, is it possible that I don't get this error? It seems that sometimes it is found with twitter links that contain video like this: And I get this error: |
That's simply the logging module doing its job; that part of the code is in bdfr/downloader.py. If you don't want to see that error you can try using the |
Thanks to all the great comments I am back in business with subreddit downloads using BDFR. Many thanks to all. Since I am now authenticating, I have not found a way to download posts from users. Does anyone have any tips? Thanks in advance! See below for example when I try to download posts from Limp-Plankton5931 C:\BDFR>c:\users\me\appdata\local\programs\python\python310\python -m bdfr download v:\bdfr\Limp-Plankton5931\ -L 65 --authenticate --config "c:\bdfr\my_config.cfg" --user Limp-Plankton5931 --no-dupes --verbose [2023-08-14 13:42:13,502 - bdfr.connector - DEBUG] - Setting maximum download wait time to 120 seconds |
Authentication shouldn't disrupt the process in any way of downloading other users. You need to specify what from them that you want to download, such as using the |
Thanks Serene-Arc! That did the trick! |
Hopefully the development branch has the 429 error fixed, but I won't be able to merge it into master until Ali changes the secrets. I don't have the power to override the master branch protections, so the tests HAVE to be passing before it can be merged. At the moment, the repository secrets have the old client ID, which means that the tests take too long and time out. I don't have the power to change that either. For everyone coming here, pull and use the development branch for the moment and I'll get it merged to master as soon as possible. |
Not to sound like an idiot, but I've never pulled from a development branch with pip before. |
You have to specify the link and then pip will handle it. The command that you want is this: pip install -U git+https://github.com/aliparlakci/bulk-downloader-for-reddit.git@development For anyone wondering about updates, I've contacted Ali and am waiting for him to respond. Unfortunately this is as much as I can do at the moment. |
Updated to the development branch and now I get |
Any luck getting ahold of Ali? |
Not as of yet unfortunately |
Description
I am trying to rip a Reddit user's submitted posts but keep getting blocked by PRAW. I've also tried updating, with no success. It does work sometimes, but it is much slower than usual.
Command
py -3 -m bdfr download C:\bdfrips\tti --user TameTheseImpalas --submitted
Environment (please complete the following information)
Logs
py -3 -m bdfr download C:\bdfrips\tti --user TameTheseImpalas --submitted
[2023-07-11 19:55:00,020 - bdfr.connector - ERROR] - User TameTheseImpalas failed to be retrieved due to a PRAW exception: received 429 HTTP response