535 PGCon 2019 Borodin Odyssey

Download as pdf or txt
Download as pdf or txt
You are on page 1of 66

Odyssey

Advanced multi-threaded PostgreSQL connection pooler and request router

Andrey Borodin, software engineer


Andrey Borodin

› Contributing to Postgres since 2016


› Yekaterinburg database meetup organizer

▌ Working on
› disaster recovery system WAL-G
› connection pooler Odyssey
› interested in anything related to indexing

3
Yandex and PostgreSQL

▌ Yandex.Mail
› some hundreds of millions of users
› 1+ trillion rows, 1+ million requests per second
▌ Yandex.Cloud
› ~2Pb of Postgres (May 2019)
And many other services like taxi, maps, weather forecast, carsharing,
food delivery etc.

4
Cluster in the cloud
Network
WAL Object
RW Queries Storage

Backup

Sync replication Async replication

RO Queries

5
Node in a cluster

Connection Pooler

Client Queries HA Control


Monitoring

6
Why should we pool
connections?
Why should we pool connections?

▌ 1 backend == 1 process

8
Why should we pool connections?

▌ 1 backend == 1 process
▌ Caches per backend
› Relations cache
› Compiled PL\pgSQL
› Plans

9
Why should we pool connections?

▌ 1 backend == 1 process
▌ Caches per backend
› Relations cache
› Compiled PL\pgSQL
› Plans
▌ HA node fencing

10
OLTP throughput

11
Where can we pool connections?
1 Application-side pool

2 Between app and DB

4 DB built-in pooling

7 Combinations
Application-side connection pooler

App backend

13
Backend under load balancer

14
In multiple availability zones

DC1

DC2

15
Proxy poolers

▌ Pgpool II
▌ Crunchy-Proxy
› Diverse functionality beyond pooling
› Only session pooling
▌ PgBouncer
› Lightweight tool
› Transaction pooling

16
▎PgBouncer FTW

on our workload
Houston,

we have a problem
Diagnostics is complicated
miscdb01d/postgres M # SELECT client_addr, count(*)

FROM pg_stat_activity GROUP BY client_addr;

client_addr | count

-------------+-------

127.0.0.1 | 127

::1 | 136

(2 rows)

Time: 2.209 ms

miscdb01d/postgres M #
19
Diagnostics is complicated

Hard to trace

› Network problems
› Client driver problems
Hard to trace events of single session

20
application_name_add_host

miscdb01d/postgres M # SELECT client_addr, client_port, application_name

FROM pg_stat_activity LIMIT 1;

-[ RECORD 1 ]----+------------------------------------------------

client_addr | 127.0.0.1

client_port | 42051

application_name | app - [2a02:6b8:0:f12:225:90ff:fe94:155c]:50184

Time: 2.716 ms

miscdb01d/postgres M #
21
application_name_add_host

22
max_client_pool_conn

No way to limit connection count for specific database+user

key | value

-------------------+-------

max_client_conn | 20000

default_pool_size | 500

min_pool_size | 0

reserve_pool_size | 0
23
max_client_pool_conn

One client is opening max_client_conn connections and others will wait

2017-03-13 10:36:11.671 28152 LOG C-0x1350dd0:


(nodb)/(nouser)@[2a02:6b8:0:1a71::21a0]:55760 closing because: no more
connections allowed (max_client_conn) (age=0)

2017-03-13 10:36:11.671 28152 WARNING C-0x1350dd0:


(nodb)/(nouser)@[2a02:6b8:0:1a71::21a0]:55760 Pooler Error: no more
connections allowed (max_client_conn)

24
max_client_pool_conn

So, we patched PgBouncer


key | value

----------------------+-------

max_client_conn | 20000

max_client_pool_conn | 4000

default_pool_size | 500

min_pool_size | 0

reserve_pool_size | 0
25
Pgbouncer cannot connect to server

We can limit user in PostgreSQL:

› ALTER ROLE XXX WITH CONNECTION LIMIT 200;


› ALTER ROLE YYY WITH CONNECTION LIMIT 10;

26
Pgbouncer cannot connect to server

2017-03-13 10:48:23.995 24408 ERROR S: login failed: FATAL: too many connections for role
"YYY"

psycopg2.OperationalError: ERROR: pgbouncer cannot connect to server

>>> try:

... conn = psycopg2.connect("port=6432 ...")

... except psycopg2.Error as e:

... print(e.pgcode)

...

None

>>>

27
What’s going on?

29
▎We need more gold PgBouncers
HAProxy

Client

HAProxy PgBouncer
PgBouncer PostgreSQL
PgBouncer

31
HAProxy

+ Pros - Cons

› Transparent for client › No client IP again


› Existing tools › One more moving part
› HAProxy does not speak proto3
› Problems with depleted
sockets
SO_REUSEPORT

https://lwn.net/Articles/542629/

+ if (af != AF_UNIX && cf_listen_reuseport == 1) {


+ int val = 1;
+ errpos = "setsockopt";
+ res = setsockopt(sock, SOL_SOCKET, SO_REUSEPORT, &val, sizeof(val));
+ if (res < 0)
+ goto failed;
+ }

33
SO_REUSEPORT

Clients PgBouncer
PgBouncer
PgBouncer PostgreSQL

34
SO_REUSEPORT

› Transparent for clients


› No extra moving parts

› Fragmentation of idle connections among PgBouncers

35
TLS

36
TLS

$ pgbench -C -T 30 -j 300 -c 300 -S


postgresql://127.0.0.1:6432/pgbench?sslmode=disable
<…>
latency average: 26.101 ms
tps = 11484.521542 (including connections establishing)

$ pgbench -C -T 30 -j 300 -c 300 -S


postgresql://127.0.0.1:6432/pgbench?sslmode=require
<…>
latency average: 523.895 ms
tps = 566.809760 (including connections establishing)

37
TLS

38
TLS

When the node is opened – connections startups are coordinated


› TLS hadshake explosion
Some clients have small connect_timeout
› Clients retry, pgbouncer burns CPU

39
TLS

40
Cascading PgBouncers

Client

TLS

PgBouncer :6432
PgBouncer
PgBouncer PgBouncer :7432
PgBouncer PostgreSQL
PgBouncer
PgBouncer
PgBouncer
PgBouncer
PgBouncer
PgBouncer
PgBouncer

41
Cascading PgBouncers

› Still transparent for client


› Withstand any load peak
› Control over idle connection count
› Smooth restart

› Maintenance is difficult
› No control over distribution of load by instances of PgBouncers

42
Looks OK.
How to open source this?
Cancel running query

Client of healthy user


› Opens new connection w\o auth
› Call PQcancel, with secret token from backend
› postgresql.org/docs/current/static/libpq-cancel.html
Smoker’s client
› Just send TCP reset

github.com/pgbouncer/pgbouncer/pull/79

44
What do we want?

› Controllable CPU scaling


› Flexible tuning
› Tracing client session
› Mixed pooling types
› Better error codes forwarding

45
Odyssey
Compilation

› Linux x86, x86_64


› С99
› cmake, gcc/clang
› Depends only on openssl
› One config file
./odyssey <config_file>

47
Internal architecture
main()
thread accept()
worker pool
system

worker
router servers
thread

attach/detach
console cron server
client
coroutine
coroutine
48
Multithreading

› Machinarium: workers and coroutines


› Independent epoll(7) context for each
worker

49
Multithreading details

› Accept(2) in separate thread


› Pipelining small packets
› Cache-friendly pipelining
› Optimization for special case workers = 1

50
Odyssey features

› Enhanced transaction pooling


CANCEL queries that no one waits

51
Enhanced transaction pooling

› Trying to keep server connection


› Automatic ROLLBACK
› Automatic CANCEL
› Optimization of parameter setup (SET, DISCARD)

52
Odyssey features

› Replication support
Clients can migrate FROM your cloud managed services

53
Odyssey features

› PgBouncer console compatibility


Does your monitoring look into ‘SHOW SERVERS’?

54
Odyssey features

› Error forwarding
Easier to handle overload

55
Logging and error forwarding
client_fwd_error off
$ psql "dbname=test host=localhost port=6432"
psql: ERROR: odyssey: c9259d96414b9: failed to connect to
remote server sce469f2305d9

client_fwd_error on
$ psql "dbname=test host=localhost port=6432"
psql: FATAL: odyssey: cbde3e23d9aa2: database "test"
does not exist
56
Logging and error forwarding

log_format "%p %t %l [%i %s] (%c) %m \n"

4249 17 Jun 17:32:27.604 info [cbde3e23d9aa2 none] (startup) new client connection [::1]:50676

4249 17 Jun 17:32:27.604 info [cbde3e23d9aa2 none] (startup) route 'test.pmwkaa' to 'default.default'

4249 17 Jun 17:32:27.604 info [cbde3e23d9aa2 sa6a53e6ec6d7] (setup) new server connection

127.0.0.1:5432

4249 17 Jun 17:32:27.607 error [cbde3e23d9aa2 sa6a53e6ec6d7] (startup) FATAL 3D000 database "test"

does not exist

57
Logging and error forwarding
client_fwd_error off
$ psql "dbname=test host=localhost port=6432"
psql: ERROR: odyssey: c9259d96414b9: failed to connect to
remote server sce469f2305d9

client_fwd_error on
$ psql "dbname=test host=localhost port=6432"
psql: FATAL: odyssey: cbde3e23d9aa2: database "test"
does not exist
58
Route settings

storage "postgres_server" {

type "remote"
host "127.0.0.1"
port 5432
tls "disable"

59
Route settings
database "test" {
user "test" {
storage "postgres_server"
authentication "none"
client_max 100
pool "transaction"
pool_size 10
pool_cancel yes
pool_rollback yes
}
user default {
authentication "block"
}
}

60
Route settings

database default {
user default {
authentication "block"
}
}

61
pgbench Odyssey RPS PgBouncer RPS

160000 149 622

120000 117 616


95 023
80000 73 269
58 391
46 711
40000 37 105
23 296

0
1 worker 2 workers 4 workers 8 workers

*Benchmark results depend on software, hardware and weather on the moon. Do not trust them.
**We optimized scaling, not throughput. 62
How we test

› PostgreSQL make install-check


› Drivers tests: pq, node-postgres, pgjdbc, psycopg2
› Unit-tests

63
How we test

› make install-check -> Odyssey -> PostgreSQL


› make install-check -> PgBouncer -> Odyssey -> PostgreSQL

64
Roadmap

› SCRAM authentication
› Forward read-only queries to replica
› Online restart
› Pause server
› …
› Pull requests are welcome!

65
Andrey Borodin

Waiting for questions J

[email protected]
x4mmm

You might also like