
Conversation

@rasool-thesis (Collaborator)

First step: just change the paths in the file with a static command.

Rmoradi added 2 commits May 26, 2025 13:20
… a structured and maintainable Python implementation:

- Reimplements the experiment runner logic in Python (experiment_runner.py)
- Supports dynamic kernel switching and remote rebooting
- Executes and monitors TRex experiments using TrexDriverCLI4_gpt.py
- Logs CPU load and traffic data in a structured format
- Automatically builds test output filenames based on config
- Simplifies future extensions and config reuse

This update improves readability, cross-platform compatibility, and integration with larger Python-based systems.

Signed-off-by: Rmoradi <[email protected]>
Rmoradi and others added 11 commits May 28, 2025 11:56
- Extracts kernel list, test parameters, and paths into a separate config module
- Allows users to modify testbed settings without touching the main logic
- Supports flexibility for changing duration, packet rate, output directories

Signed-off-by: Rmoradi <[email protected]>
- Converts nested kernel-based experiment control to Python
- Supports dynamic experiment configurations through a config file
- Handles kernel switching, rebooting, and SUT initialization via SSH
- Clean and readable JSON output capturing TX/RX and CPU stats
- Strong structure for reproducibility and multi-kernel testing

Signed-off-by: Rmoradi <[email protected]>
Merge dual-threaded TREX + netrace control into a unified execution thread

- Replaces separate threads with a single CombinedExperiment thread
- Executes TREX locally and netrace remotely in a coordinated flow
- Fetches structured CPU load data via SSH from the remote server
- Parses final CPU results from remote output as JSON
- Outputs total TX/RX and CPU metrics clearly with random ID tagging
- Ensures the code is modular and SSH-secure using Paramiko
Refactor cpu_load_netrace_gpt.py to use JSON and dictionary-based logging

- Parses structured CPU usage data from JSON-formatted netrace output
- Computes mean and standard deviation for EVENT_NET_RX_SOFTIRQ
- Organizes output using a dictionary for cleaner downstream usage
- Adds error handling for malformed JSON or missing fields
- Saves output in a dedicated directory relative to the script location
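The mean/standard-deviation step described in this commit could look like the following sketch. It is not the actual cpu_load_netrace_gpt.py code; the JSON field name and the shape of the payload (a list of per-sample loads) are illustrative assumptions:

```python
import json
import statistics

def summarize_softirq(raw: str) -> dict:
    """Summarize EVENT_NET_RX_SOFTIRQ load from JSON-formatted netrace output.

    Assumes the payload carries a list of numeric samples under the
    EVENT_NET_RX_SOFTIRQ key (hypothetical field name for illustration).
    """
    try:
        data = json.loads(raw)
        samples = [float(s) for s in data["EVENT_NET_RX_SOFTIRQ"]]
    except (json.JSONDecodeError, KeyError, TypeError, ValueError) as exc:
        # Malformed JSON or a missing field: report the error, don't crash.
        return {"error": str(exc)}
    return {
        "mean": statistics.mean(samples),
        "stdev": statistics.stdev(samples) if len(samples) > 1 else 0.0,
        "count": len(samples),
    }
```

Returning a dictionary keeps the result easy to consume downstream, which matches the commit's stated goal.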
feat(netrace): convert netrace runner from Bash to Python with JSON config support

- Replaces legacy start_netrace.sh Bash script with structured Python logic
- Reads duration and experiment ID from external JSON configuration
- Creates output directory automatically and saves logs with consistent naming
- Starts netrace using subprocess, handles duration and graceful SIGINT
- Improves portability, maintainability, and ease of customization
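The run loop this commit describes (read duration and experiment ID from JSON, create the output directory, start netrace, stop it gracefully with SIGINT) could be sketched as below. This is an illustrative reconstruction, not the committed script; the config keys, directory name, and the `netrace_cmd` override are assumptions:

```python
import json
import signal
import subprocess
import time
from pathlib import Path

def run_netrace(config_path: str, netrace_cmd=("/root/netrace",)) -> Path:
    """Start netrace for the configured duration and save its log.

    `netrace_cmd` defaults to the binary path used on the testbed; it is
    parameterized here so the sketch can be exercised with any command.
    """
    cfg = json.loads(Path(config_path).read_text())
    duration = int(cfg["duration"])
    exp_id = cfg["exp_id"]

    out_dir = Path("netrace_logs")        # the real script puts this next to itself
    out_dir.mkdir(exist_ok=True)
    log_path = out_dir / f"cpu_load_exp_id_{exp_id}.txt"

    with log_path.open("w") as log:
        proc = subprocess.Popen(list(netrace_cmd),
                                stdout=log, stderr=subprocess.STDOUT)
        time.sleep(duration)
        proc.send_signal(signal.SIGINT)   # graceful stop, like Ctrl-C
        proc.wait(timeout=10)
    return log_path
```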
@plungaroni plungaroni (Contributor) left a comment


Hi @rasool-thesis I saw your code and made some comments.
See if you can address some issues.

Maybe @skorpion17 can take a look in the meantime.

private_key = paramiko.RSAKey.from_private_key_file(self.pkey_path)
ssh.connect(self.remote, port=self.port, username=self.user, pkey=private_key)

script_dir = "/users/Rmoradi/pastrami/scripts"
Contributor

Did you hardcode the path?

Collaborator Author

Oh, I did not see it. I changed it in the updates.


script_dir = "/users/Rmoradi/pastrami/scripts"
command = f"cd {script_dir} && python3 cpu_load_netrace_gpt.py {self.cpu_id} {self.duration} {self.rnd_id} dummy_path"
stdin, stdout, stderr = ssh.exec_command(command)
Contributor

Have you checked that this command is not blocking?

Collaborator Author

Yes, absolutely. You are right: if it takes more than 30 seconds there will be a problem. I solved it in the new update with its own commit.
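One common way to avoid the blocking `stdout.read()` is to wait on a deadline before collecting output. The helper below is a generic sketch; the docstring shows how it would be wired to Paramiko's channel API (`exit_status_ready` is a real Paramiko method, but the surrounding usage is an assumption about this codebase):

```python
import time

def wait_until(ready, timeout_s: float, poll_s: float = 0.5) -> bool:
    """Poll `ready()` until it returns True or the deadline passes.

    With Paramiko this would be used roughly as:

        channel = stdout.channel
        if not wait_until(channel.exit_status_ready, timeout_s=duration + 30):
            raise TimeoutError("remote netrace did not finish in time")
        output_remote = stdout.read().decode() + stderr.read().decode()
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if ready():
            return True
        time.sleep(poll_s)
    return ready()  # one last check at the deadline
```

This keeps the main thread responsive and turns "the command ran too long" into an explicit error rather than a silent hang.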

driver = TrexDriver(self.server, self.txPort, self.rxPort, self.pcap, self.rate, self.duration)
self.output_local = driver.run()

self.output_remote = stdout.read().decode() + stderr.read().decode()
Contributor

If the code on line 40 is not blocking, how can you be sure the data is available?

Collaborator Author

solved


print("cpu_load_output:")

last_brace_open = output_remote.rfind('{')
Contributor

Why do you do this?
If output_remote is JSON, isn't there a simpler way to check whether it exists and parse it?
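A simpler approach along the lines the reviewer suggests: try to parse the whole payload as JSON, and only fall back to scanning lines if the remote script printed log noise before the JSON. This is a sketch, not the patched code:

```python
import json
from typing import Optional

def parse_remote_json(output_remote: str) -> Optional[dict]:
    """Return the JSON object embedded in the remote output, or None."""
    try:
        obj = json.loads(output_remote)
        if isinstance(obj, dict):
            return obj
    except json.JSONDecodeError:
        pass
    # The remote script may print log lines before the JSON payload,
    # so scan individual lines from the end as a fallback.
    for line in reversed(output_remote.strip().splitlines()):
        try:
            obj = json.loads(line)
        except json.JSONDecodeError:
            continue
        if isinstance(obj, dict):
            return obj
    return None
```

Unlike `rfind('{')`, this never slices a brace out of an unrelated log line, and a `None` result makes "no JSON arrived" explicit.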


args = parser.parse_args()

combined_thread = CombinedExperiment(
Contributor

The reason I made two threads was that both cpu_netrace and Trex were blocking.

If that is still the case, my impression is that, as the code is written, it first executes cpu_netrace (and the program blocks); once netrace finishes, Trex starts.

So, if I am right, the two measurements are sequential, not parallel.

Collaborator Author

I changed it in the update; now they work in parallel.
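For reference, the usual shape of "run two blocking measurements in parallel and join afterwards" is sketched below. `run_trex` and `run_netrace` are stand-ins for the blocking TRex driver call and the remote netrace SSH command, not functions from this repository:

```python
import threading

def run_parallel(run_trex, run_netrace) -> dict:
    """Run the two blocking measurement functions concurrently.

    Each callable blocks for the duration of its measurement and
    returns its result; both run at the same time and we only join
    once both have finished.
    """
    results = {}

    def wrap(name, fn):
        results[name] = fn()

    threads = [
        threading.Thread(target=wrap, args=("trex", run_trex)),
        threading.Thread(target=wrap, args=("netrace", run_netrace)),
    ]
    for t in threads:
        t.start()
    for t in threads:      # the measurements overlap; join only at the end
        t.join()
    return results
```

Whether the final code uses two threads or one CombinedExperiment thread, the key property to preserve is that neither measurement waits for the other to finish before starting.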

"python3", os.path.join(SCRIPTS_DIR, "TrexDriverCLI4_gpt.py"),
"-s", "127.0.0.1",
"-r", IP_REMOTE,
"-c", "4",
Contributor

The -c parameter represents the CPU_ID of the CPU to monitor.

In our implementation we chose CPU number 4 for our experiments in CloudLab, but this is not true in general.

It would be good if this parameter were editable rather than hardcoded.
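One way to make the CPU id (and similar parameters) editable is to thread them through argparse instead of hardcoding them when the driver command is built. The function name, defaults, and remote address below are illustrative, not taken from the repository:

```python
import argparse

def build_trex_cmd(args_list=None) -> list:
    """Build the TrexDriverCLI4_gpt.py command with configurable parameters.

    Defaults (CPU 4, remote 10.0.0.2) are placeholders for the CloudLab
    setup; pass args_list for testing, or None to read sys.argv.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument("--cpu-id", type=int, default=4,
                        help="CPU to monitor on the SUT (-c)")
    parser.add_argument("--remote", default="10.0.0.2",
                        help="SUT address (-r)")
    args = parser.parse_args(args_list)
    return ["python3", "TrexDriverCLI4_gpt.py",
            "-s", "127.0.0.1",
            "-r", args.remote,
            "-c", str(args.cpu_id)]
```

The same pattern extends naturally to the -o, -u, --txPort, and --rxPort parameters the reviewer mentions below.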

"-s", "127.0.0.1",
"-r", IP_REMOTE,
"-c", "4",
"-o", "22",
Contributor

The same applies to the parameters -o, -u, --txPort, and --rxPort
(lines 17, 18, 19, 20).

print(result)

pid = extract_value(result, "random_id:")
duration_sec = extract_value(result, "duration_sec:")
Contributor

I don't see TrexDriverCLI4_gpt_onethread.py printing duration_sec: in its output.

Can you check, please?

result = subprocess.check_output(cmd, text=True)
print(result)

pid = extract_value(result, "random_id:")
Contributor

What is this pid?

And why do you associate it with random_id?

exp_record = [
[rate, 4],
{
"PID": pid,
Contributor

I think you are getting a bit confused here.

What do we do with this pid, which is a random id?

Don't we have to take the values of transmitted and received packets from Trex?

(txPackets, rxPackets)

Please check this part carefully and, if it can help you, also compare with my original program in bash.

Here I don't see how the two most important parameters of the whole experiment are captured and recorded.
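If the driver prints the two counters as labeled lines (the `txPackets:`/`rxPackets:` names come from the reviewer's comment; the exact output format is an assumption), extracting and recording them could be as simple as:

```python
import re

def extract_packet_counts(result: str) -> dict:
    """Pull txPackets/rxPackets out of the TRex driver's printed output.

    Assumes lines of the form 'txPackets: <n>' and 'rxPackets: <n>'.
    Raises ValueError if either counter is missing, so the experiment
    record is never silently incomplete.
    """
    counts = {}
    for key in ("txPackets", "rxPackets"):
        match = re.search(rf"{key}:\s*(\d+)", result)
        if match is None:
            raise ValueError(f"missing {key} in driver output")
        counts[key] = int(match.group(1))
    return counts
```

These two values, rather than the random id, are what belongs in the experiment record alongside the rate and CPU metrics.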

- Extended the user-space netrace tool to report per-CPU processing delay and average delay for network softirqs.
- struct record updated to include arrays for latest delay, total delay, and delay count per CPU.
- stats_print() now outputs per-CPU load, last observed delay, and average delay (all in ms).
- Updated skeleton and header includes to match netrace-com BPF changes.
- Improved allocation and cleanup logic to handle new statistics fields.
- All changes fully commented for maintainability and team review.


Why did you load a binary file?

Collaborator Author

As before, we use it to start netrace. In start_netrace we used the command /root/netrace | tee cpu_load_exp_id_${EXP_ID}.txt & to start it, so I compiled the file and uploaded it to use in place of the previous netrace binary.

Create the IPv4 Python file for testing 64B packets
Test output in CSV format
Script for IPv6 testing of 64B packets
Create the test script acting at L2 with 64B packets
Create the script for IPv4 with 1500B packets
Create the script for L2 with 1500B packets
Output of the test script for 64B packets at L2
Output of IPv6 with 64B packets
Test output for IPv4 at 1500B
Test output for 1500B at L2
With this Python file, each test script can run easily through TRex