Hello, guys!
I have 20 nginx upstreams with 60 php-fpm backends each. After I started using upsync, CPU utilization by the php-fpm processes on the servers became high. I tested nginx before and after enabling upsync and got a striking result.
With the standard upstreams configuration:
upstream test {
server 10.0.134.116:8080 weight=15 max_fails=0;
server 10.0.134.121:8080 weight=13 max_fails=0;
server 10.0.137.52:8080 weight=14 max_fails=0;
server 10.0.137.51:8080 weight=14 max_fails=0;
server 10.0.135.159:8080 weight=14 max_fails=0;
server 10.0.136.45:8080 weight=14 max_fails=0;
server 10.0.137.57:8080 weight=15 max_fails=0;
server 10.0.137.58:8080 weight=15 max_fails=0;
server 10.0.136.28:8080 weight=14 max_fails=0;
server 10.0.134.192:8080 weight=15 max_fails=0;
server 10.0.134.179:8080 weight=13 max_fails=0;
server 10.0.137.60:8080 weight=15 max_fails=0;
server 10.0.135.139:8080 weight=14 max_fails=0;
server 10.0.137.36:8080 weight=14 max_fails=0;
server 10.0.136.65:8080 weight=14 max_fails=0;
server 10.0.134.212:8080 weight=13 max_fails=0;
server 10.0.137.92:8080 weight=14 max_fails=0;
server 10.0.134.118:8080 weight=15 max_fails=0;
server 10.0.137.61:8080 weight=14 max_fails=0;
server 10.0.137.122:8080 weight=14 max_fails=0;
server 10.0.137.243:8080 weight=13 max_fails=0;
server 10.0.136.39:8080 weight=14 max_fails=0;
server 10.0.137.195:8080 weight=13 max_fails=0;
server 10.0.134.122:8080 weight=13 max_fails=0;
server 10.0.137.171:8080 weight=13 max_fails=0;
server 10.0.134.123:8080 weight=13 max_fails=0;
server 10.0.137.54:8080 weight=15 max_fails=0;
server 10.0.137.168:8080 weight=13 max_fails=0;
server 10.0.136.51:8080 weight=14 max_fails=0;
server 10.0.137.31:8080 weight=14 max_fails=0;
server 10.0.137.156:8080 weight=14 max_fails=0;
server 10.0.135.158:8080 weight=14 max_fails=0;
server 10.0.137.23:8080 weight=14 max_fails=0;
server 10.0.134.127:8080 weight=13 max_fails=0;
server 10.0.137.170:8080 weight=13 max_fails=0;
server 10.0.137.173:8080 weight=13 max_fails=0;
server 10.0.134.98:8080 weight=14 max_fails=0;
server 10.0.137.71:8080 weight=14 max_fails=0;
server 10.0.135.140:8080 weight=14 max_fails=0;
server 10.0.137.77:8080 weight=14 max_fails=0;
server 10.0.136.49:8080 weight=14 max_fails=0;
server 10.0.137.73:8080 weight=14 max_fails=0;
server 10.0.136.38:8080 weight=14 max_fails=0;
server 10.0.137.35:8080 weight=14 max_fails=0;
server 10.0.137.138:8080 weight=14 max_fails=0;
server 10.0.137.162:8080 weight=14 max_fails=0;
server 10.0.136.43:8080 weight=14 max_fails=0;
server 10.0.137.144:8080 weight=14 max_fails=0;
server 10.0.134.124:8080 weight=13 max_fails=0;
server 10.0.134.128:8080 weight=13 max_fails=0;
server 10.0.136.48:8080 weight=14 max_fails=0;
server 10.0.137.32:8080 weight=14 max_fails=0;
server 10.0.137.169:8080 weight=13 max_fails=0;
server 10.0.136.26:8080 weight=14 max_fails=0;
server 10.0.136.68:8080 weight=14 max_fails=0;
server 10.0.137.74:8080 weight=14 max_fails=0;
server 10.0.137.81:8080 weight=14 max_fails=0;
server 10.0.137.254:8080 weight=14 max_fails=0;
server 10.0.137.172:8080 weight=13 max_fails=0;
server 10.0.136.64:8080 weight=14 max_fails=0;
}
server {
listen 80;
location / {
include fastcgi_params;
fastcgi_pass test;
fastcgi_param SCRIPT_FILENAME $document_root/index.php;
}
}
I get this distribution of requests across the backends:
for t in $(for i in {1..10}; do date "+%d/%b/%Y:%H:%M" --date "-$i min"; done); do grep -r "*$t.*GET \/ " /var/log/nginx/access.log | sed -r 's/.*upstream_addr:\s(.*):8080.*/\1/g'; done | sort | uniq -c | sort -nr
10 10.0.137.60
10 10.0.137.58
10 10.0.137.57
9 10.0.137.54
9 10.0.134.192
9 10.0.134.118
9 10.0.134.116
8 10.0.137.77
8 10.0.137.74
8 10.0.137.73
8 10.0.137.71
8 10.0.137.61
8 10.0.137.52
8 10.0.137.51
8 10.0.137.35
8 10.0.137.32
8 10.0.137.23
8 10.0.135.159
8 10.0.135.158
8 10.0.135.140
8 10.0.135.139
8 10.0.134.98
7 10.0.137.92
7 10.0.137.81
7 10.0.137.36
7 10.0.137.31
7 10.0.137.254
7 10.0.136.43
7 10.0.136.28
6 10.0.137.162
6 10.0.137.156
6 10.0.137.144
6 10.0.137.138
6 10.0.137.122
6 10.0.136.68
6 10.0.136.64
6 10.0.136.51
6 10.0.136.49
6 10.0.136.48
6 10.0.136.39
6 10.0.136.38
6 10.0.136.26
6 10.0.134.124
6 10.0.134.123
5 10.0.137.243
5 10.0.137.195
5 10.0.137.173
5 10.0.137.172
5 10.0.137.171
5 10.0.137.170
5 10.0.137.169
5 10.0.137.168
5 10.0.136.45
5 10.0.134.212
5 10.0.134.179
5 10.0.134.128
5 10.0.134.127
5 10.0.134.122
5 10.0.134.121
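For anyone who wants to reproduce the count without the shell pipeline, here is a minimal Python sketch of the same idea. It assumes, as the sed expression above does, that each log line contains `upstream_addr: <ip>:8080`. It does not filter by the last ten minutes, and the sample lines are simplified stand-ins, not real log output.

```python
import re
from collections import Counter

# Assumption: each access-log line contains "upstream_addr: <ip>:8080",
# mirroring the pattern used in the sed expression above.
UPSTREAM_RE = re.compile(r"upstream_addr:\s(\S+?):8080")


def count_backends(lines):
    """Count `GET / ` requests per backend IP."""
    counts = Counter()
    for line in lines:
        if "GET / " not in line:
            continue  # only count requests for the root location
        match = UPSTREAM_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical, simplified log lines used only to exercise the function.
sample = [
    "GET / HTTP/1.1 upstream_addr: 10.0.137.60:8080",
    "GET / HTTP/1.1 upstream_addr: 10.0.137.60:8080",
    "GET / HTTP/1.1 upstream_addr: 10.0.134.121:8080",
    "GET /other HTTP/1.1 upstream_addr: 10.0.134.121:8080",
]

# Print in the same "count ip" layout as `uniq -c | sort -nr`.
for ip, n in count_backends(sample).most_common():
    print(n, ip)
```

To run it against a real log, pass an open file object instead of `sample`.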
Now enable upsync:
upstream test {
upsync 127.0.0.1:2379/v2/keys/upsync/test upsync_interval=5s upsync_timeout=5m upsync_type=etcd strong_dependency=off;
upsync_dump_path /etc/nginx/conf.d/upsync/test.inc;
include /etc/nginx/conf.d/upsync/test.inc;
}
server {
listen 80;
location / {
include fastcgi_params;
fastcgi_pass test;
fastcgi_param SCRIPT_FILENAME $document_root/index.php;
}
}
After adding the upstream entries to etcd, I ran my test again and saw the following result:
for t in $(for i in {1..10}; do date "+%d/%b/%Y:%H:%M" --date "-$i min"; done); do grep -r "*$t.*GET \/ " /var/log/nginx/access.log | sed -r 's/.*upstream_addr:\s(.*):8080.*/\1/g'; done | sort | uniq -c | sort -nr
45 10.0.137.54
30 10.0.137.60
29 10.0.137.57
23 10.0.134.192
21 10.0.134.118
19 10.0.134.116
17 10.0.137.58
12 10.0.137.36
12 10.0.137.35
10 10.0.137.61
8 10.0.137.32
8 10.0.137.23
7 10.0.137.81
7 10.0.137.77
7 10.0.137.74
7 10.0.137.162
7 10.0.136.28
6 10.0.137.92
6 10.0.135.159
6 10.0.135.139
5 10.0.137.144
5 10.0.135.158
4 10.0.137.73
4 10.0.137.71
4 10.0.137.254
4 10.0.137.156
4 10.0.137.122
4 10.0.136.51
4 10.0.136.43
4 10.0.136.26
3 10.0.137.138
3 10.0.136.68
3 10.0.136.48
3 10.0.136.39
3 10.0.136.38
2 10.0.137.52
2 10.0.137.51
2 10.0.137.31
2 10.0.137.173
2 10.0.137.170
2 10.0.137.168
2 10.0.136.65
2 10.0.136.64
2 10.0.136.49
2 10.0.136.45
2 10.0.135.140
2 10.0.134.98
2 10.0.134.212
2 10.0.134.127
2 10.0.134.124
1 10.0.137.171
1 10.0.137.169
1 10.0.134.179
1 10.0.134.123
1 10.0.134.122
1 10.0.134.121
As you can see, nginx with upsync forwards far more requests to a few servers than to the others. If I specify weight=1 for every backend, the load is approximately equal. But that does not suit me, because my servers have different CPU and RAM configurations and run under high load. I need exactly the weights that I had without upsync. I suspect that upsync does not handle weights correctly and needs a fix.
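To put a number on the skew, here is a rough Python sketch that compares the observed counts with what a purely weight-proportional split would give. It uses only three of the backends (weights from the static config, counts from the upsync run), which is a simplification for illustration. The helper name `expected_counts` is my own, and proportional-to-weight is an assumption about the intended behaviour, not a statement about how upsync is implemented.

```python
def expected_counts(weights, observed):
    """Split the observed total across servers in proportion to weight."""
    total_requests = sum(observed.values())
    total_weight = sum(weights.values())
    return {
        server: total_requests * weight / total_weight
        for server, weight in weights.items()
    }


# Three servers taken from the configuration and the upsync run above.
# A full check would include all 60 backends.
weights = {"10.0.137.54": 15, "10.0.137.52": 14, "10.0.134.121": 13}
observed = {"10.0.137.54": 45, "10.0.137.52": 2, "10.0.134.121": 1}

expected = expected_counts(weights, observed)
for server in weights:
    print(
        f"{server}: observed={observed[server]} "
        f"expected={expected[server]:.1f}"
    )
```

With weights of 13 to 15, a proportional split would keep every backend within roughly 15% of the others, which is far from the 45-to-1 spread in the upsync run.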