Timeout api.colabfold.com server #606
Comments
Could you share (or email me) the IP from where you are sending? Generally it should not time-out but either return a 403 or 429 HTTP error (instantly) if you are banned or temporarily banned. |
yes certainly, the IP out from us should be: |
I don’t think I have had to ban a Danish IP before, so I don’t think that’s the problem (I'm not in front of a computer to check right now, though). what does it’s most likely a DNS error, no idea why though |
[rtk@vader9 ~]$ dig api.colabfold.com
; <<>> DiG 9.11.36-RedHat-9.11.36-11.el8_9 <<>> api.colabfold.com
;; OPT PSEUDOSECTION:
;; ANSWER SECTION:
;; Query time: 473 msec |
I don't see any reason why it should time-out. The DNS response also looks fine. Does |
[rtk@vader9 ~]$ curl https://api.colabfold.com/queue
So yes, it seems to work fine. It has nothing to do with your local IT department. Does it look like a potential DDoS attack or something like that when I submit 50 jobs at once? Or is that maybe standard practice, or even a low number of runs compared to others? |
If you submit 50 jobs at once, you should start getting HTTP 429 errors, which ColabFold understands as a signal to retry automatically later. It should never time out. That behavior is very puzzling. I have not asked our network management team, but I would not expect this to be an issue, since there are heavier API users than this. |
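A minimal sketch of how a client could react to the status codes mentioned above (403 for a ban, 429 for a temporary rate limit). This is not ColabFold's actual retry code: `submit_with_retry`, the `send` callable, `max_attempts`, and the 90-second default wait are all assumptions made for illustration.

```python
import time
from typing import Callable


def submit_with_retry(
    send: Callable[[], int],
    max_attempts: int = 5,
    wait_seconds: float = 90.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call send() until it succeeds, reports a ban, or attempts run out.

    send is a hypothetical callable that performs one submission and
    returns the HTTP status code of the response.
    """
    for attempt in range(1, max_attempts + 1):
        status = send()
        if status == 200:
            return "submitted"
        if status == 403:
            # A ban will not clear by itself, so retrying is pointless.
            return "banned"
        if status == 429:
            # Rate limited: wait, then try again (no wait after the last try).
            if attempt < max_attempts:
                sleep(wait_seconds)
            continue
        return f"unexpected status {status}"
    return "gave up"
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting. Note that a connection timeout never reaches this logic at all, because no HTTP status is returned, which is why the timeouts reported in this thread point to a network-level problem rather than rate limiting.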
We normally saw this: 0%| | 0/150 [elapsed: 00:00 remaining: ?] But if 50 calls do not put us among the heaviest API users, then I will try increasing from 5 runs to maybe 10 and see if that works. 10 should be more than enough for now. |
Ah, that makes more sense. That's not a timeout, but a rate limit and intended behavior. The system currently works like this: you get 20 "tokens" for job submissions, and the tokens are replenished at a rate of 0.01111111111111 per second (or 1 per 90s); each replenished token lets you submit another job. The balance doesn't replenish above 20. Thus you can use the API for 40-60 MSAs per hour: 40 from replenishment alone, and up to 60 if you start with all 20 tokens. We have the |
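The 40-60 figure can be checked with a small token-bucket simulation. This is only a sketch built from the numbers in the comment above (capacity 20, one token per 90 s); the class and function names are hypothetical, it assumes each submission costs one token, and the server's real implementation may differ.

```python
from fractions import Fraction
from typing import Optional


class TokenBucket:
    """Token bucket using exact fractions; times are whole seconds."""

    def __init__(self, capacity: int = 20, refill_seconds: int = 90,
                 tokens: Optional[int] = None) -> None:
        self.capacity = Fraction(capacity)
        self.refill_per_second = Fraction(1, refill_seconds)
        self.tokens = self.capacity if tokens is None else Fraction(tokens)
        self.last = 0

    def try_submit(self, now: int) -> bool:
        """Refill for the elapsed time, then spend one token if available."""
        elapsed = now - self.last
        self.last = now
        self.tokens = min(self.capacity,
                          self.tokens + elapsed * self.refill_per_second)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


def accepted_in_window(bucket: TokenBucket, seconds: int) -> int:
    """Attempt one submission per second from t=0 through t=seconds."""
    return sum(bucket.try_submit(t) for t in range(seconds + 1))


full = accepted_in_window(TokenBucket(), 3600)            # starts with 20 tokens
empty = accepted_in_window(TokenBucket(tokens=0), 3600)   # starts with none
print(full, empty)  # 60 40
```

Starting from a full bucket, the first 20 submissions go through immediately and one more is accepted every 90 s after that, which is why a burst of 50 jobs stalls after the first 20.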
So what I wrote in my last comment was what we normally saw when submitting 50 runs at one time. But what we got recently was what I wrote in the original post, which was a timeout on a run that was left idle for a long time. Sorry for the confusion! What you just wrote about the 20 tokens and their replenishment makes a lot of sense for what we normally see. For now, the timeout problem is not an issue as long as we don't go too high in run numbers. |
I've actually been running into a similar issue myself. When I try to run ColabFold, I get a timeout error when trying to contact the MSA server. The text I see in the log for each job is "Timeout while submitting to the MSA server. Retrying...." This problem started for me abruptly a couple of weeks ago and I've been trying to figure out what the issue is. I've done a number of the troubleshooting steps suggested above and in other similar threads.
When I try "curl https://api.colabfold.com" I also get a timeout error: "curl: (7) Failed connect to api.colabfold.com:443; connection timed out". I see this behavior when I'm on the compute node that has been running the jobs (IP 134.174.140.55). When I run this curl command from other locations on the same network, the command works properly, so it's not a general problem.
Here's the output of dig api.colabfold.com:
dig api.colabfold.com
; <<>> DiG 9.11.4-P2-RedHat-9.11.4-26.P2.el7_9.16.tuxcare.els1 <<>> api.colabfold.com
;; OPT PSEUDOSECTION:
;; ANSWER SECTION:
;; Query time: 0 msec
I'm not seeing an obvious problem. I do note that the IP address in the SERVER field is different from the public IP I find for the compute node I'm on - I assume this has something to do with how the cluster I'm using has been configured.
Any suggestions you might have would be appreciated! |
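The dig and curl checks above test two different things: name resolution and the TCP connection. A minimal sketch that separates the two outcomes in one call is shown below; `probe` and its return labels are hypothetical names, and it uses only the Python standard library `socket` module.

```python
import socket


def probe(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Classify a connection attempt as a DNS, timeout, refusal, or success case."""
    try:
        addresses = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return "dns_error"      # the name could not be resolved
    family, socktype, proto, _, sockaddr = addresses[0]
    with socket.socket(family, socktype, proto) as sock:
        sock.settimeout(timeout)
        try:
            sock.connect(sockaddr)
        except socket.timeout:
            return "timeout"    # no answer at all, e.g. packets dropped
        except ConnectionRefusedError:
            return "refused"    # host reachable, nothing listening
        except OSError:
            return "unreachable"
    return "open"


if __name__ == "__main__":
    print(probe("api.colabfold.com", 443))
```

A "timeout" result with a working DNS lookup would match the curl error quoted above (name resolves, connection times out), which suggests packets from that particular node are being dropped somewhere between it and the server rather than a DNS problem.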
Hi. I also have this problem using colabfold_batch. Now:
2024-08-29 11:41:26,919 Error while fetching result from MSA server. Retrying... (1/5)
; <<>> DiG 9.18.28-0ubuntu0.20.04.1-Ubuntu <<>> api.colabfold.com
;; OPT PSEUDOSECTION:
;; ANSWER SECTION:
;; Query time: 0 msec
curl https://api.colabfold.com/queue
Any help gratefully received! |
Same here... I reinstalled and it is still the same. curl https://api.colabfold.com/queue Any suggestion will be highly appreciated, |
Something is definitely wrong on our side, I get single-digit kilobyte/s download speeds from the server currently. I will try to resolve this with our IT. |
Ok thanks a lot for your very quick answer!!
Fabian
Fabian Glaser, PhD
Technion Center for Structural Biology (TCSB) - Computational Section, Head
Technion Human Health Initiative (THHI)
Technion - Israel Institute of Technology, Haifa, Israel
+972 733783701
|
Hi all, since yesterday, I have the same issue. I am running localcolabfold on one of our clusters. I started 5 predictions, 4 failed and one finished as expected. Today, all 3 predictions failed with the time-out error. This is the error log: I tried the following suggestions, to identify the issue:
But based on the information found in this thread, this looks "normal". Is there any issue on the API side and I just have to wait? Best |
Lately, when we try to submit multiple jobs (max 50 per run) to api.colabfold.com (via the alphapulldown package using mmseqs2) we are hit with:
W0416 18:39:07.143900 139828968597312 colabfold.py:86] Timeout while submitting to MSA server. Retrying...
for all of the runs and none of them are able to connect within hours (I canceled the run after 10 hours).
While such a job is running, if I type "nmap -Pn -p 80 (or 443) api.colabfold.com", it shows that ports 80 and 443 are filtered.
PORT STATE SERVICE
80/tcp filtered http
Our IT department has informed us that they are not filtering port 443 or 80. That matches what we see when the above job is not running, when we get the following (shown here for 443, but the same holds for 80):
PORT STATE SERVICE
443/tcp open https
Today I tried submitting 50 jobs again and hit the same problem, but when I instead submitted one job at a time, the server did not throw the timeout error.
So is there a maximum number of jobs we can submit simultaneously? If so, what is that number?
Would it be possible to have our IP whitelisted so that we can submit more jobs than the limit allows?
Please let me know if you need any other information from me.