DTrace Network Providers

The following is a design proposal for a collection of DTrace Networking Providers. These aim to provide networking observability and troubleshooting information for Solaris users. A prototype TCP provider was demonstrated at CEC 2006.

I originally posted this at http://www.opensolaris.org/os/community/dtrace/NetworkProvider/ . This project was later completed and merged into the Solaris kernel.

This document will list the probes that may be made available, their arguments, and examples of their proposed usage. Feedback is welcome; please post to dtrace-discuss. NOTE: this is a work in progress, and this network provider is not yet available! The purpose of this site is to flesh out ideas for a future network provider.

Who is talking to my web server?

# dtrace -n 'tcp:::receive /args[2]->tcp_dport == 80/ {
        @pkts[args[1]->ip_saddr] = count();
}'
dtrace: description 'tcp:::receive' matched 1 probe
^C

  192.168.1.8                                                       9
  fe80::214:4fff:fe3b:76c8                                         12
  192.168.1.51                                                     32
  10.1.70.16                                                       83
  192.168.7.3                                                     121
  192.168.101.101                                                 192

What ports are people connecting to?

# dtrace -n 'tcp:::accept-established { @[args[2]->tcp_dport] = count(); }'
dtrace: description 'tcp:::accept-established' matched 1 probe
^C

       79                2
       22               14
       80              327

Author: Brendan Gregg and others, 16-Oct-2006

Aims

The main aims of the network providers are:

Future enhancements and additions to the providers include:

Providers

The Network provider is a collection of several DTrace providers, such as ip and tcp (there is no monolithic net provider). They are:

provider  description
gld       This is the generic LAN device layer, and shows link layer activity such as Ethernet frames. The probes allow frame by frame tracing.
arp       This allows tracing of ARP and RARP packets.
icmp      This allows tracing of ICMP packets, and provides the type and code from the ICMP header.
ip        This is the IP layer, which provides tracing for IPv4 and IPv6. The probes allow packet by packet tracing, and are placed near the interface to GLD.
tcp       This is the TCP layer, and shows what TCP activity is occurring. The probes are placed close to the IP interface, so that TCP events passed down to or received from IP can be traced.
udp       This allows tracing of UDP events.
sctp      This allows tracing of SCTP events.
socket    This allows tracing at the socket layer, close to the application. These probes fire in the same context as the corresponding process, and show data sent to or received from sockets.

Other providers such as icmp and igmp may also be added (and many more).

Modules

The module and function names are from the kernel code, and are usually ignored unless deeper code-based analysis is needed.

Probes

tcp:::accept-established
tcp:::accept-refused
tcp:::connect-request
tcp:::connect-established
tcp:::connect-refused
tcp:::state-bound
tcp:::state-close-wait
tcp:::state-closed
tcp:::state-closing
tcp:::state-established
tcp:::state-fin-wait1
tcp:::state-fin-wait2
tcp:::state-idle
tcp:::state-last-ack
tcp:::state-listen
tcp:::state-syn-received
tcp:::state-syn-sent
tcp:::state-time-wait
tcp:::drop-saturation
tcp:::data-send
tcp:::data-resend
tcp:::data-receive
tcp:::ack-send
tcp:::ack-receive
tcp:::fuse-send
tcp:::fuse-receive
tcp:::send
tcp:::receive
udp:::send
udp:::receive
sctp:::send
sctp:::receive
ip:::send
ip:::receive
arp:::send
arp:::receive
icmp:::send
icmp:::receive
gld:::send
gld:::receive

Send and receive probes fire whenever that layer sends or receives. They can be used for observability and measuring network latency, but their placement is not sufficient for kernel code path latency measurements.

TCP fuse probes fire for localhost traffic (TCP fusion); IP probes do not.

drop-saturation probes fire when packets are dropped due to load; other variants may include drop-error.

TCP Probes

To start with, send and receive fire for all TCP traffic; this includes the TCP handshake, TCP ACKs, and invalid packets such as those with corrupt flags. Since the TCP flags are available in the probe arguments, any particular type of TCP traffic can be matched using DTrace predicates.

Additional probes have been provided for convenience in addition to send and receive. These include the data-* probes for matching TCP packets containing data, connect-* probes for client connections, accept-* probes for servers accepting connections, state-* probes for RFC 793 state changes, and so on.
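
As a rough illustration of the state-* probes, the following sketch counts TCP state changes by probe name, split by whether the connection was an active (host initiated) or passive open. It assumes the proposed tcp provider and the tcps_active member of args[0] behave as described in this document; none of this has been run against a shipping provider.

```d
#!/usr/sbin/dtrace -s

/*
 * Sketch only: assumes the proposed tcp:::state-* probes exist and
 * that args[0] is a tcpstateinfo_t * with a boolean tcps_active.
 */
tcp:::state-*
{
        /* probename is the matched probe, e.g. "state-established" */
        @transitions[probename,
            args[0]->tcps_active ? "active open" : "passive open"] = count();
}
```

The wildcard in the probe name matches every state-* probe, so a single clause covers all of the RFC 793 state changes listed above.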

Outbound TCP Connections

For outbound connections, connect-request fires before the first SYN, and connect-established fires after the SYN-ACK has been received and the final ACK is being sent.

In summary:

TCP event                    DTrace probe
SYN sent                     tcp:::connect-request
SYN-ACK received, ACK sent   tcp:::connect-established
SYN sent, RST received       tcp:::connect-refused

An outbound connection to an open port will fire a connect-request followed by a connect-established. An outbound connection to a closed port will fire a connect-refused.

A possible connect-closed probe (normal close) has been dropped from this design due to difficulties encountered in the tcp code; the state-closed probe may serve a similar purpose.

Inbound TCP Connections

For inbound connections, accept-established fires after the final ACK was received in the three way handshake in response to an inbound SYN. accept-refused fires if a connection was refused, such as a SYN to a closed port.

In summary:

TCP event                    DTrace probe
SYN-ACK sent, ACK received   tcp:::accept-established
RST sent, port closed        tcp:::accept-refused

An inbound connection to an open port will fire an accept-established, and an inbound connection to a closed port will fire an accept-refused.

A possible accept-closed probe (normal close) has been dropped from the design due to difficulties encountered in the tcp code; the state-closed probe may serve a similar purpose.

Note: TCP events such as the three way handshake can also be traced using tcp:::receive and tcp:::send, and examination of the TCP flags (args[2]->tcp_flags).
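
The following is a minimal sketch of that flag-based approach for the first two steps of an inbound handshake. It assumes the proposed tcp:::receive and tcp:::send probes, the tcp_flags member of args[2], and that the TH_SYN and TH_ACK constants are visible to D, as the examples in this document assume. The aggregation labels are made up for illustration.

```d
#!/usr/sbin/dtrace -s

/*
 * Sketch only: relies on the proposed tcp provider and its args[] layout.
 */

/* Inbound SYN without ACK: a remote host is opening a connection. */
tcp:::receive
/(args[2]->tcp_flags & TH_SYN) && !(args[2]->tcp_flags & TH_ACK)/
{
        @steps["1. SYN received", args[1]->ip_saddr] = count();
}

/* Outbound segment with both SYN and ACK set: our reply to that SYN. */
tcp:::send
/(args[2]->tcp_flags & (TH_SYN | TH_ACK)) == (TH_SYN | TH_ACK)/
{
        @steps["2. SYN-ACK sent", args[1]->ip_daddr] = count();
}
```

The final ACK of the handshake cannot be told apart from later ACKs by flags alone, which is one reason the accept-established probe is more convenient.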

Arguments

The following information is made available for the probes, where possible and appropriate:

probes                            args[0]             args[1]           args[2]        args[3]
gld:::send, gld:::receive         gldstateinfo_t *    ipinfo_t *        gldinfo_t *    NULL
ip:::send, ip:::receive           ipstateinfo_t *     ipinfo_t *        ipv4info_t *   ipv6info_t *
tcp:::send, tcp:::receive         tcpstateinfo_t *    ipinfo_t *        tcpinfo_t *    NULL
tcp:::accept-*, tcp:::connect-*   tcpstateinfo_t *    ipinfo_t *        tcpinfo_t *    NULL
tcp:::state-*                     tcpstateinfo_t *    NULL              NULL           NULL
tcp:::fuse-*                      tcpstateinfo_t *    tcpfuseinfo_t *   size_t         NULL
udp:::send, udp:::receive         udpstateinfo_t *    ipinfo_t *        udpinfo_t *    NULL
udp:::stream-*                    udpstateinfo_t *    NULL              NULL           NULL
icmp:::send, icmp:::receive       icmpstateinfo_t *   ipinfo_t *        icmpinfo_t *   NULL

For IPv4 traffic, the ip probes provide header details in args[2] and args[3] is NULL; for IPv6 traffic, the header details are in args[3] and args[2] is NULL. For both kinds of traffic, summary header details are in args[1] - ipinfo_t.
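
A short sketch of how a script might respect that split, assuming the proposed ip:::receive probe and the ipv4info_t and ipv6info_t members listed in this document. Each clause tests ip_ver first, so that the NULL argument for the other IP version is never dereferenced.

```d
#!/usr/sbin/dtrace -s

/*
 * Sketch only: assumes the proposed ip provider and its args[] layout.
 */

/* IPv4: header details are in args[2]; args[3] is NULL. */
ip:::receive
/args[1]->ip_ver == 4/
{
        @ttl[args[1]->ip_saddr, args[2]->ipv4_ttl] = count();
}

/* IPv6: header details are in args[3]; args[2] is NULL. */
ip:::receive
/args[1]->ip_ver == 6/
{
        @hlim[args[1]->ip_saddr, args[3]->ipv6_hlim] = count();
}
```

Fields common to both versions, such as ip_saddr, come from args[1] and need no version check.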

Argument Details

The arguments contain the following members:

arg      type               contents                     types

args[0]  tcpstateinfo_t *   args[0]->tcps_cid            uint64_t
                            args[0]->tcps_loopback       int
                            args[0]->tcps_active         int
                            args[0]->tcps_state          int
                            args[0]->tcps_statestr       string
         notes: tcps_cid is a pointer to a unique connection-ID.
                tcps_loopback is a boolean to identify loopback traffic.
                tcps_active is a boolean for active opens - host initiated connections.
                tcps_state is the numeric TCP state.
                tcps_statestr is a string form of the TCP state.

args[1]  ipinfo_t *         args[1]->ip_ver              uint8_t
                            args[1]->ip_plength          uint16_t
                            args[1]->ip_saddr            string
                            args[1]->ip_daddr            string
         notes: These fields are valid for both IPv4 and IPv6, and have been
                provided for convenience. ip_saddr and ip_daddr are
                stringified versions of the IPv4 or IPv6 addresses.

args[2]  etherinfo_t *      args[2]->e_source            ether_addr_t
                            args[2]->e_dest              ether_addr_t
                            args[2]->e_type              uint16_t
                            args[2]->e_length            uint16_t
                            args[2]->e_ifname            char *
         notes: e_ifname is the interface name, such as hme0.

args[2]  ipv4info_t *       args[2]->ipv4_ver            uint8_t
                            args[2]->ipv4_tos            uint8_t
                            args[2]->ipv4_length         uint16_t
                            args[2]->ipv4_ident          uint16_t
                            args[2]->ipv4_flags          uint8_t
                            args[2]->ipv4_offset         uint16_t
                            args[2]->ipv4_ttl            uint8_t
                            args[2]->ipv4_protocol       uint8_t
                            args[2]->ipv4_src            ipaddr_t
                            args[2]->ipv4_dst            ipaddr_t
                            args[2]->ipv4_hdr            ipha_t
         notes: See RFC 791.
                IPv4 traffic only, /args[1]->ip_ver == 4/

args[3]  ipv6info_t *       args[3]->ipv6_ver            uint8_t
                            args[3]->ipv6_tclass         uint8_t
                            args[3]->ipv6_flow           uint32_t
                            args[3]->ipv6_plen           uint16_t
                            args[3]->ipv6_next           uint8_t
                            args[3]->ipv6_hlim           uint8_t
                            args[3]->ipv6_src            in6_addr_t
                            args[3]->ipv6_dst            in6_addr_t
                            args[3]->ipv6_hdr            struct ip6_hdr
         notes: See RFC 2460.
                IPv6 traffic only, /args[1]->ip_ver == 6/

args[2]  tcpinfo_t *        args[2]->tcp_sport           in_port_t
                            args[2]->tcp_dport           in_port_t
                            args[2]->tcp_seq             uint32_t
                            args[2]->tcp_ack             uint32_t
                            args[2]->tcp_offset          uint8_t
                            args[2]->tcp_flags           uint8_t
                            args[2]->tcp_window          uint16_t
                            args[2]->tcp_data            uintptr_t
         notes: See RFC 793.

args[2]  udpinfo_t *        args[2]->udp_sport           in_port_t
                            args[2]->udp_dport           in_port_t
                            args[2]->udp_length          uint16_t
                            args[2]->udp_data            uintptr_t
         notes: See RFC 768.

args[2]  icmpinfo_t *       args[2]->icmp_type           uchar_t
                            args[2]->icmp_code           uchar_t
         notes: See RFC 792.
The info structures are new, and will be provided by DTrace. They have been based on the RFCs so that their contents are widely understood.

Example Usage

The following are demonstrations of how the net provider could be used to answer various common observability questions.

  1. New inbound TCP connections by port
  2. New inbound TCP connections by source IP address
  3. Inbound TCP packets by local port
  4. Inbound TCP packets by remote address
  5. Inbound TCP packets by remote address, for a specific port, eg: port 80 (HTTP)
  6. Sent TCP bytes by local port
  7. TCP bytes by address and port
  8. TCP bytes by address and port, per second
  9. Detect TCP connect() scan by IP address
  10. Detect TCP SYN stealth scan by IP address
  11. Detect TCP null scan by IP address
  12. Detect TCP FIN stealth scan by IP address
  13. Detect TCP Xmas scan by IP address
  14. Network Utilization by TCP Port
  15. Connect Latency
  16. 1st Byte Latency
  17. Throughput Latency
  18. Connection Lifespan
  19. TCP Saturation Drops by IP address

Examples: General

New inbound TCP connections by port

# dtrace -n 'tcp:::accept-established { @[args[2]->tcp_dport] = count(); }'
dtrace: description 'tcp:::accept-established' matched 1 probe
^C

       79                1
       22                2
      515                5
       80               94

This one-liner shows new inbound TCP connections by port. The sample output shows 94 new connections occurred on port 80, HTTP.

New inbound TCP connections by source IP address

# dtrace -n 'tcp:::accept-established { @[args[1]->ip_saddr] = count(); }'
dtrace: description 'tcp:::accept-established' matched 1 probe
^C

  192.168.1.31                                                      1
  fe80::214:4fff:fe3b:76c8                                          2
  192.168.1.2                                                       7
  192.168.1.51                                                      9
  192.168.1.8                                                      11
  192.168.7.100                                                    17
  192.168.101.101                                                  31

This one-liner shows new inbound connections by source IP address, which is the string form of either an IPv4 or IPv6 address. Here the host 192.168.101.101 has made 31 successful connections to our server while DTrace was tracing.

Inbound TCP packets by local port

# dtrace -n 'tcp:::receive { @pkts[args[2]->tcp_dport] = count(); }'
dtrace: description 'tcp:::receive' matched 1 probe
^C

       79                2
      515                9
       22               32
     2049              121
       80             8462

This one-liner shows inbound TCP packets by port. The sample output shows that 8462 packets arrived for port 80, HTTP.

Inbound TCP packets by remote address

# dtrace -n 'tcp:::receive { @pkts[args[1]->ip_saddr] = count(); }'
dtrace: description 'tcp:::receive' matched 1 probe
^C

  192.168.1.8                                                       9
  fe80::214:4fff:fe3b:76c8                                         12
  192.168.1.51                                                     32
  192.168.1.8                                                      54
  192.168.7.3                                                     121
  192.168.101.101                                                 192

This one-liner shows which hosts TCP packets have been received from.

Inbound TCP packets by remote address, for a specific port, eg: port 80 (HTTP)

# dtrace -n 'tcp:::receive /args[2]->tcp_dport == 80/ {
        @pkts[args[1]->ip_saddr] = count();
}'
dtrace: description 'tcp:::receive' matched 1 probe
^C

  192.168.1.8                                                       9
  fe80::214:4fff:fe3b:76c8                                         12
  192.168.1.51                                                     32
  192.168.1.8                                                      54
  192.168.7.3                                                     121
  192.168.101.101                                                 192

This short script shows which hosts are sending packets to TCP port 80, the web server. This script can easily be tweaked to show which hosts are sending packets to any of the local services, such as DNS, NFS, etc. RPC services can be resolved by comparing output with an "rpcinfo -p".

Sent TCP bytes by local port

# dtrace -n 'tcp:::send { 
        @bytes[args[2]->tcp_sport] = sum(args[1]->ip_plength - args[2]->tcp_offset);
}'
dtrace: description 'tcp:::send' matched 1 probe
^C

      515               16
       79              204
       22              427
     2049             8264
       80           108268

The DTrace one-liner subtracts the TCP header length from the IP payload length, which provides us with the TCP payload length. The output shows that 108,268 bytes (106 Kbytes) were sent from port 80, HTTP, during the time DTrace was tracing.

TCP bytes by address and port

# dtrace -n '
tcp:::receive {
        @bytes[args[1]->ip_saddr, args[2]->tcp_dport] = 
            sum(args[1]->ip_plength - args[2]->tcp_offset);
}
tcp:::send {
        @bytes[args[1]->ip_daddr, args[2]->tcp_sport] =
            sum(args[1]->ip_plength - args[2]->tcp_offset);
}'
dtrace: description 'tcp:::receive' matched 2 probes
^C

  192.168.1.8                                           22                    68
  fe80::214:4fff:fe3b:76c8                              22                   582
  192.168.1.6                                           80                  1216
  192.168.101.101                                      512                  2082
  192.168.1.1                                           22                  2187
  192.168.1.51                                          80                  3763
  192.168.3.5                                          109                  4080
  192.168.1.8                                           80                  8039
  192.168.7.3                                           80                 10921
  192.168.1.1                                         2049                 54080
  192.168.101.101                                       80                287623

This provides a report of IP address, TCP port and bytes transferred, giving an immediate view of which hosts are communicating with our server and which ports they are using. During the above sample, host 192.168.101.101 transferred 287623 bytes with our port 80, HTTP.

TCP bytes by address and port, per second

# dtrace -qn '
tcp:::receive {
        /* key by remote host only, to match the two-column printa() below */
        @bytes[args[1]->ip_saddr] =
            sum(args[1]->ip_plength - args[2]->tcp_offset);
}
tcp:::send {
        @bytes[args[1]->ip_daddr] =
            sum(args[1]->ip_plength - args[2]->tcp_offset);
}
profile:::tick-1sec {
        printf("\n   %-32s %16s\n", "HOST", "BYTES/s");
        printa("   %-32s %@16d\n", @bytes);
        trunc(@bytes);
}
'
dtrace: description 'tcp:::receive' matched 3 probes

   HOST                                     BYTES/s
   192.168.1.8                                   22
   fe80::214:4fff:fe3b:76c8                      54
   192.168.3.5                                  516
   192.168.2.11                                8142

   HOST                                     BYTES/s
   fe80::214:4fff:fe3b:76c8                      54
   192.168.2.11                                 148
   192.168.3.5                                  686
   192.168.101.101                            60872

   HOST                                     BYTES/s
   192.168.1.8                                    6
   192.168.3.5                                 8044
   192.168.101.101                           826306
...

This DTrace script provides a rolling summary of how many bytes each host has transferred by TCP, per second. This makes for a useful tool to immediately identify which host is responsible for consuming network bandwidth. Possible enhancements to this script include adding a field for the destination port, and printing in Kbytes instead of bytes.
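
One way those two enhancements might look is sketched below: the aggregation is keyed by remote host and local port, and normalize() divides each sum by 1024 so that the report is in Kbytes. It assumes the same proposed probes and argument members as the script above; the column headings are made up for illustration.

```d
#!/usr/sbin/dtrace -qs

/*
 * Sketch only: assumes the proposed tcp provider and its args[] layout.
 */
tcp:::receive
{
        @bytes[args[1]->ip_saddr, args[2]->tcp_dport] =
            sum(args[1]->ip_plength - args[2]->tcp_offset);
}

tcp:::send
{
        @bytes[args[1]->ip_daddr, args[2]->tcp_sport] =
            sum(args[1]->ip_plength - args[2]->tcp_offset);
}

profile:::tick-1sec
{
        /* report in Kbytes: divide each sum by 1024 before printing */
        normalize(@bytes, 1024);
        printf("\n   %-32s %8s %16s\n", "HOST", "PORT", "KBYTES/s");
        printa("   %-32s %8d %@16d\n", @bytes);
        trunc(@bytes);
}
```

Because normalize() uses integer division, any host and port pair that moved less than 1024 bytes in the interval is reported as 0.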

Examples: Security Monitoring

While other options such as installing a NIDS (such as snort) would usually offer a better long term traffic monitoring solution, DTrace can be handy for short term traffic analysis.

Detect TCP connect() scan by IP address

# dtrace -n 'tcp:::accept-refused { @num[args[1]->ip_saddr] = count(); }'
dtrace: description 'tcp:::accept-refused' matched 1 probe
^C

  192.168.1.1                                                         1
  220.241.6.37                                                      307

The accept-refused probe fires when a connection is attempted to a closed port. Host 220.241.6.37 attempted 307 times to connect to closed ports, which is likely to be malicious activity such as port scanning. Host 192.168.1.1 had only 1 refused connection, which may be normal activity depending on what their OS is doing.

Detect TCP SYN stealth scan by IP address

# dtrace -n '
tcp:::receive
/syn[args[0]->tcps_cid] && (args[2]->tcp_flags & TH_RST)/
{
        @num["TCP_stealth_scan", args[1]->ip_saddr] = sum(1);
}
tcp:::receive
{
        syn[args[0]->tcps_cid] = 0;
}
tcp:::receive
/args[2]->tcp_flags & TH_SYN && ! (args[2]->tcp_flags & TH_ACK)/
{
        syn[args[0]->tcps_cid]++;
}
'
dtrace: description 'tcp:::receive' matched 2 probes
^C

  192.168.3.5                                                         1
  220.241.6.37                                                        4

This script (or one similar to this) detects half connections - SYNs that are followed by a RST, a sign of a TCP SYN stealth scan. In the above output, host 220.241.6.37 has made 4 half connections, and is most probably scanning our host (or this is a spoofed IP address). The host 192.168.3.5 made 1 half connection, which is not suspicious as this may have completed had we traced for longer. If you are being stealth scanned, you will see many more accept-refused events that appear like a connect scan, and only a few half connections for the ports you have open.

Detect TCP null scan by IP address

# dtrace -n '
tcp:::receive
/args[2]->tcp_flags == 0/
{
        @num[args[1]->ip_saddr] = sum(1);
}'
dtrace: description 'tcp:::receive' matched 1 probe
^C

  220.241.6.37                                                      197

The net provider can be easily used to analyse esoteric TCP traffic, such as null scans. Here null scan packets by host are reported, identifying 220.241.6.37 as sending 197 TCP null packets.

Detect TCP FIN stealth scan by IP address

# dtrace -n '
tcp:::receive
/args[0]->tcps_state != TCPS_ESTABLISHED && args[2]->tcp_flags & TH_FIN/
{
        @num[args[1]->ip_saddr] = sum(1);
}'
dtrace: description 'tcp:::receive' matched 1 probe
^C

  220.241.6.37                                                       45

Here stealth FIN scans are identified by matching on packets to non-established connections (using the TCP state from args[0]), and also matching on the TCP FIN flag. Our evil host sent 45 of these packets. There may certainly be other ways available to identify these scans.

Detect TCP Xmas scan by IP address

# dtrace -n '
tcp:::receive
/args[2]->tcp_flags == (TH_URG + TH_PUSH + TH_FIN)/
{
        @num[args[1]->ip_saddr] = sum(1);
}'
dtrace: description 'tcp:::receive' matched 1 probe
^C

  220.241.6.37                                                      167

There are variations on the Xmas scan, and many different ways to write predicates to match the TCP flags. Here we found 167 likely Xmas packets.

It would be straightforward to write a DTrace script to monitor for many types of malicious TCP traffic and produce a report of host and traffic type; such a script can serve as a temporary host-based NIDS (Network Intrusion Detection System). Producing such reports without the DTrace net provider usually involves installing additional software, which either sets the network interface to promiscuous mode and analyses every packet, such as snort; or processes the output of network capture files, such as snoop's RFC1761 or libpcap.
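
A minimal sketch of such a combined report follows. It reuses the predicates from the one-liners in this section and assumes the proposed probes, argument members and the TH_* and TCPS_ESTABLISHED constants behave as described in this document. The traffic type labels are made up for illustration.

```d
#!/usr/sbin/dtrace -qs

/*
 * Sketch only: assumes the proposed tcp provider and its args[] layout.
 */
tcp:::accept-refused
{
        @report["connect to closed port", args[1]->ip_saddr] = count();
}

tcp:::receive
/args[2]->tcp_flags == 0/
{
        @report["null packet", args[1]->ip_saddr] = count();
}

tcp:::receive
/args[2]->tcp_flags == (TH_URG | TH_PUSH | TH_FIN)/
{
        @report["Xmas packet", args[1]->ip_saddr] = count();
}

tcp:::receive
/args[0]->tcps_state != TCPS_ESTABLISHED && (args[2]->tcp_flags & TH_FIN)/
{
        @report["FIN to non-established", args[1]->ip_saddr] = count();
}

dtrace:::END
{
        printf("%-26s %-40s %8s\n", "TRAFFIC TYPE", "HOST", "COUNT");
        printa("%-26s %-40s %@8d\n", @report);
}
```

Note that a single packet can match more than one clause; for example, an Xmas packet sent to a non-established connection also has FIN set, so it would be counted in both rows.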

Examples: Performance Monitoring

Network Utilization by TCP Port

# dtrace -n '
tcp:::receive {
        /* assume 1 packet with standard ether + IP + TCP header */
        @util[args[2]->tcp_dport] = sum(args[1]->ip_plength + 14);
}
tcp:::send {
        /* assume 1 packet with standard ether + IP + TCP header */
        @util[args[2]->tcp_sport] = sum(args[1]->ip_plength + 14);
}
profile:::tick-1sec {
        /* many assumptions, including 1 NIC @ 100 Mbps (12,500,000 bytes/sec) */
        /* bytes per second / 125,000 = percent utilization */
        normalize(@util, 125000);
        printf("   %12s %12s\n", "PORT", "%UTIL");
        printa("   %12d %@12d\n", @util);
        trunc(@util);
}'
dtrace: description 'tcp:::receive' matched 3 probes
^C

           PORT       %UTIL
             22           0
           2049           2
             80          12

           PORT       %UTIL
           2049           0
             22           1
             80          23
...

This is a short and crude script based on numerous assumptions; however, it will serve to provide an approximation for a single 100 Mbps NIC's utilization by port. While we prefer things to be 100% accurate, this is far better than nothing at all - so long as we bear in mind that the values are an approximation.

Connect Latency

# dtrace -n '
tcp:::connect-request
{
        start[args[0]->tcps_cid] = timestamp;
}

tcp:::connect-established
/start[args[0]->tcps_cid]/
{
        @latency["Connect Latency (ns)", args[1]->ip_daddr] =
            quantize(timestamp - start[args[0]->tcps_cid]);
        start[args[0]->tcps_cid] = 0;
}'
dtrace: description 'tcp:::connect-request' matched 2 probes
^C

  Connect Latency (ns)                                 192.168.101.101

           value  ------------- Distribution ------------- count    
           32768 |                                         0        
           65536 |@@@@@@@                                  2        
          131072 |@@@@@@@@@@@@@@@                          4        
          262144 |@@@@                                     1        
          524288 |@@@@@@@@@@@                              3        
         1048576 |@@@@                                     1        
         2097152 |                                         0        

  Connect Latency (ns)                                  10.1.1.7

           value  ------------- Distribution ------------- count    
         8388608 |                                         0        
        16777216 |@@@@@                                    3        
        33554432 |@@@@@@@@@@@@@@@@                         9        
        67108864 |@@@@@@@@@@@@@@@                          8        
       134217728 |@@@@                                     2        
       268435456 |                                         0        

This script measures outbound connection latency by IP address, and presents this as a distribution plot by nanosecond and count of occurrences. We define this to be the time from the SYN to the SYN-ACK (matched by the first receive on the same conn_s *), so it covers most of the TCP handshake cost. The output shows that one IP address is slower to connect than the other. This metric includes network latency and destination kernel overheads.

1st Byte Latency

# dtrace -n '
tcp:::connect-established
{
        start[args[0]->tcps_cid] = timestamp;
}

tcp:::receive
/start[args[0]->tcps_cid] && (args[1]->ip_plength - args[2]->tcp_offset) > 0/
{
        @latency["1st Byte Latency (ns)", args[1]->ip_saddr] =
            quantize(timestamp - start[args[0]->tcps_cid]);
        start[args[0]->tcps_cid] = 0;
}'
dtrace: description 'tcp:::connect-established' matched 2 probes
^C

  1st Byte Latency (ns)                                192.168.101.101

           value  ------------- Distribution ------------- count    
           32768 |                                         0        
           65536 |@@@@@@                                   2        
          131072 |@@@@@                                    1       
          262144 |@@@@@@@@@@@@@@@                          4       
          524288 |@@@@@@@@@@@                              3        
         1048576 |@@@@                                     1        
         2097152 |                                         0        

This measures a 1st byte latency on an outbound connection, defined as the time from the TCP handshake completing, to the time that the 1st application byte is returned. This metric includes network latency, destination kernel scheduling latency, and application latency.

Throughput Latency

...

Connection Lifespan

The following script measures the lifespan of a connection in milliseconds.

# cat tcplifespan1.d
#!/usr/sbin/dtrace -qs

dtrace:::BEGIN
{
        printf("Tracing... Hit Ctrl-C to end.\n");
}

tcp:::accept-established
{
        raddr[args[0]->tcps_cid] = args[1]->ip_saddr;
        lport[args[0]->tcps_cid] = args[2]->tcp_dport;
        stime[args[0]->tcps_cid] = timestamp;
}

tcp:::state-closed,
tcp:::state-time-wait
/stime[args[0]->tcps_cid]/
{
        @lifespan[raddr[args[0]->tcps_cid], lport[args[0]->tcps_cid]] =
           quantize((timestamp - stime[args[0]->tcps_cid]) / 1000000);
        raddr[args[0]->tcps_cid] = 0;
        lport[args[0]->tcps_cid] = 0;
        stime[args[0]->tcps_cid] = 0;
}

dtrace:::END
{
        printf("Inbound Connection Lifespan Report (ms),\n\n");
        printa("   %-32I %21P %@d\n", @lifespan);
}

And now running this script:

# ./tcplifespan1.d
^C
Inbound Connection Lifespan Report (ms),

   deimos                                            http
           value  ------------- Distribution ------------- count
               1 |                                         0
               2 |@@@@@@@@@@@@@@@@@@@@@@@                  7
               4 |@@@@@@@@@@                               3
               8 |@@@                                      1
              16 |                                         0
              32 |@@@                                      1
              64 |                                         0

   deimos                                          finger
           value  ------------- Distribution ------------- count
               8 |                                         0
              16 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@         4
              32 |@@@@@@@@                                 1
              64 |                                         0

   deimos                                             ssh
           value  ------------- Distribution ------------- count
            2048 |                                         0
            4096 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 1
            8192 |                                         0

tcplifespan1.d measures the lifespan of an inbound TCP connection and prints a distribution plot. In the above output, we captured a dozen HTTP connections, five finger connections and an SSH connection. One HTTP connection was much slower than the others, as it was in the 32 to 63 ms bucket; most of the rest were from 2 to 3 ms. The SSH connection was the slowest (naturally, as this was an interactive session), and was in the 4 to 8 second bucket.

TCP Saturation Drops by IP address

# dtrace -n 'tcp:::drop-saturation {
        @drops[args[1]->ip_saddr, args[1]->ip_daddr] = count();
}'
dtrace: description 'tcp:::drop-saturation' matched 1 probe
^C
  10.1.1.17                         192.168.1.1                             1
  192.168.1.1                       192.168.6.2                             5
  192.168.6.2                       192.168.1.1                             9

This measures TCP drops caused by saturation, by IP address. Nine packets from 192.168.6.2 to 192.168.1.1 were dropped.

Solved Problems

Some of the (numerous) network observability problems that this provider will solve include:

Comments?

Is this a provider that customers would like? Which improvements to the design can we make? Please post comments to the dtrace-discuss mailing list.

Last Updated: 16-Oct-2006