@deusebio
Created September 19, 2023 08:46
Parse tcpdump output
import re

import pandas as pd

# Raw strings keep "\s" from being treated as an invalid string escape.
space_splitter = re.compile(r"\s+")
# Captures "<source> > <target>:" from the text after the first four fields.
regex = re.compile(r"\s*(.*)\s*>\s*(.*?):\s.*")


def parse_line(line):
    """Return (time, network, direction, class, source, target), or None if no match."""
    elements = space_splitter.split(line)
    match = regex.match(" ".join(elements[4:]))
    if match is None:
        return None
    return tuple(elements[:4]) + tuple(x.strip() for x in match.groups())


with open("./output.txt", "r") as fid:
    lines = fid.readlines()

# Parse each line once and keep its index so records map back to the file.
parsed_lines = [parse_line(line) for line in lines]
items = [parsed + (idx,) for idx, parsed in enumerate(parsed_lines) if parsed]
non_records = [idx for idx, parsed in enumerate(parsed_lines) if not parsed]

df = pd.DataFrame.from_records(
    items, columns=["time", "network", "direction", "class", "source", "target", "idx"]
)
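To check the parsing logic without a capture file, the sketch below runs the same two regular expressions on one line. The sample line, its interface name, and its addresses are made up for illustration, written in the style of `tcpdump -i any` output; real captures may contain lines this pattern does not handle.

```python
import re

space_splitter = re.compile(r"\s+")
regex = re.compile(r"\s*(.*)\s*>\s*(.*?):\s.*")


def parse_line(line):
    # First four whitespace-separated fields are kept as-is;
    # the remainder is matched for "<source> > <target>:".
    elements = space_splitter.split(line)
    match = regex.match(" ".join(elements[4:]))
    if match is None:
        return None
    return tuple(elements[:4]) + tuple(x.strip() for x in match.groups())


# Hypothetical sample line (not taken from a real capture).
sample = "08:46:12.123456 eth0 Out IP 10.0.0.1.443 > 10.0.0.2.51234: Flags [P.], length 52\n"
record = parse_line(sample)
print(record)

# A line without a "source > target:" part yields None and would land in non_records.
header = parse_line("tcpdump: listening on any\n")
print(header)
```

Lines that return `None`, such as the tcpdump startup banner, are the ones whose indices end up in `non_records`, which makes it easy to inspect what the pattern skipped.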