-
Notifications
You must be signed in to change notification settings - Fork 36.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
tracing: network connection tracepoints #25832
base: master
Are you sure you want to change the base?
Conversation
nice, utACK 7252c60
|
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. Code Coverage & BenchmarksFor details see: https://corecheck.dev/bitcoin/bitcoin/pulls/25832. ReviewsSee the guideline for information on the review process.
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
Note to self: might be worth checking if we can also pass AS information and not only the net_group. |
7252c60
to
8528781
Compare
This is possible by accessing the CNodeStats, but don't think we should do that in the tracepoint. We have the IP and can lookup the AS in a tracing scirpt, if necessary. Fixed up the linter complains and rebased. This is ready for further review. |
Concept ACK |
8528781
to
ac438b1
Compare
ac438b1
to
6de4fc7
Compare
The linter started complaining about "E275 missing whitespace after keyword". Fixed this up in the individual commits. ac438b1 -> 6de4fc7 (2022-05-connection-tracepoints.pr.1 -> 2022-05-connection-tracepoints.pr.2; compare) |
ACK 6de4fc7 Code reviewed tracepoints, tests, and demo script. Tracepoint placement looks good, so do tests. Successfully ran functional tests and the demo script. Some points:
|
re-ACK 5b59cfc 🌞 Show signatureSignature:
|
tested ACK 🔥 Successfully ran demo scripts and functional tests!! I have been using these tracepoints for 3-4 months from now and I find it very useful. They help you keep track of your peers, also I am working on a tool that detects spy peers and for that these tracepoints are very useful. Great work @0xB10C. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approach ACK 5b59cfc
int trace_inbound_connection(struct pt_regs *ctx) { | ||
struct NewConnection inbound = {}; | ||
bpf_usdt_readarg(1, ctx, &inbound.conn.id); | ||
bpf_usdt_readarg_p(2, ctx, &inbound.conn.addr, MAX_PEER_ADDR_LENGTH); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I couldn't find the doc of bpf_usdt_readarg_p()
. What would it do if the argument is longer than the supplied buffer? I presume the last arg is the length of the buffer, so it is not going to overflow, I guess. But would it add a trailing '\0'
? That is - could it copy MAX_PEER_ADDR_LENGTH
from the input to the output and leave it like that? If yes, would that be a problem - a string without a terminating '\0'
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What would it do if the argument is longer than the supplied buffer? I presume the last arg is the length of the buffer, so it is not going to overflow, I guess.
Correct, it would cut it off.
But would it add a trailing '\0'?
No. However there is bpf_probe_read_user_str() which does that. This generally seems to be the better suited methodology here. I'll use that for the changes in this PR. There are other places in our code we can update in a potential follow up.
That is - could it copy MAX_PEER_ADDR_LENGTH from the input to the output and leave it like that? If yes, would that be a problem - a string without a terminating '\0'?
I think, for us, in this test here, this isn't a problem. The Python ctype mapping for Connection
defined below has the same size for an address (("addr", ctypes.c_char * MAX_PEER_ADDR_LENGTH)
).
🚧 At least one of the CI tasks failed. HintsTry to run the tests locally, according to the documentation. However, a CI failure may still
Leave a comment here, if you need help tracking down a confusing failure. |
A bpftrace script that logs information from the net:*_connection tracepoints. I've tested this script with bpftrace version 0.14.1 and v0.20.2.
A v3 onion address with a `:` and a five digit port has a length of 68 chars. As noted in bitcoin#25832 (comment) peers e.g. added via hostname might have a longer CNode::m_addr_name. These might be cut off in tracing tools.
A v3 onion address with a `:` and a five digit port has a length of 68 chars. As noted in bitcoin#25832 (comment) peers e.g. added via hostname might have a longer CNode::m_addr_name. These might be cut off in tracing tools.
5b59cfc
to
15dfd72
Compare
🚧 At least one of the CI tasks failed. HintsTry to run the tests locally, according to the documentation. However, a CI failure may still
Leave a comment here, if you need help tracking down a confusing failure. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK 15dfd72 (modulo to lint failure)
uint64_t conn_type_pointer; | ||
uint64_t address_pointer; | ||
bpf_usdt_readarg(1, ctx, &inbound.conn.id); | ||
bpf_usdt_readarg(2, ctx, &address_pointer); | ||
bpf_usdt_readarg(3, ctx, &conn_type_pointer); | ||
bpf_usdt_readarg(4, ctx, &inbound.conn.network); | ||
bpf_usdt_readarg(5, ctx, &inbound.existing); | ||
bpf_probe_read_user_str(&inbound.conn.addr, sizeof(inbound.conn.addr), (void *) address_pointer); | ||
bpf_probe_read_user_str(&inbound.conn.type, sizeof(inbound.conn.type), (void *) conn_type_pointer); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Non blocker comment: this cast from uint64_t
to (void*)
looks odd, maybe it will produce compiler warning in some cases? Also, on 32 bit systems that should be uint32_t
. Can these variables be declared as void*
instead? Like:
uint64_t conn_type_pointer; | |
uint64_t address_pointer; | |
bpf_usdt_readarg(1, ctx, &inbound.conn.id); | |
bpf_usdt_readarg(2, ctx, &address_pointer); | |
bpf_usdt_readarg(3, ctx, &conn_type_pointer); | |
bpf_usdt_readarg(4, ctx, &inbound.conn.network); | |
bpf_usdt_readarg(5, ctx, &inbound.existing); | |
bpf_probe_read_user_str(&inbound.conn.addr, sizeof(inbound.conn.addr), (void *) address_pointer); | |
bpf_probe_read_user_str(&inbound.conn.type, sizeof(inbound.conn.type), (void *) conn_type_pointer); | |
void* conn_type_pointer; | |
void* address_pointer; | |
bpf_usdt_readarg(1, ctx, &inbound.conn.id); | |
bpf_usdt_readarg(2, ctx, &address_pointer); | |
bpf_usdt_readarg(3, ctx, &conn_type_pointer); | |
bpf_usdt_readarg(4, ctx, &inbound.conn.network); | |
bpf_usdt_readarg(5, ctx, &inbound.existing); | |
bpf_probe_read_user_str(&inbound.conn.addr, sizeof(inbound.conn.addr), address_pointer); | |
bpf_probe_read_user_str(&inbound.conn.type, sizeof(inbound.conn.type), conn_type_pointer); |
This adds five new tracepoints with documentation and tests for network connections:
net:inbound_connection
andnet:outbound_connection
net:closed_connnection
net:evicted_inbound_connection
net:misbehaving_connection
I've been using these tracepoints for a few months now to monitor connection lifetimes, re-connection frequency by IP and netgroup, misbehavior, peer discouragement, and eviction and more. Together with the two existing P2P message tracepoints they allow for a good overview of local P2P network activity. Also sort-of addresses #22006 (comment).
I've been back and forth on which arguments to include. For example,
net:evicted_connection
could also include some of the eviction metrics like e.g.last_block_time
,min_ping_time
, ... but I've left them out for now. If wanted, this can be added here or in a follow-up. I've tried to minimize a potential performance impact by measuring executed instructions withgdb
where possible (method described here). I don't think a few hundred extra instructions are too crucial, as connection opens/closes aren't too frequent (compared to e.g. P2P messages). Note: e.g.CreateNodeFromAcceptedSocket()
usually executes between 80k and 90k instructions for each new inbound connection.Also added a bpftrace (tested with v0.14.1)
log_p2p_connections.bt
example script that produces output similar to: