This folder contains various reproducible and interactive Jupyter notebooks created for the project. These notebooks are also available for viewing on our public JupyterHub instance. The notebook organizational structure is described in detail below.
The notebooks in the data-sources folder explore and analyze the Sippy and TestGrid data sources, as well as log output from OpenShift CI.
- Sippy - A Continuous Integration Private Investigator tool that processes the job results from https://testgrid.k8s.io/. It reports which tests fail most frequently along different dimensions, such as by job, by platform, by SIG, etc.
- TestGrid - A highly-configurable, interactive dashboard for viewing your test results in a grid.
This folder contains:
- a Sippy folder, which consists of:
  - sippy_failure_correlation.ipynb - Notebook which analyzes the available Sippy CI data and determines which test failures appear to be correlated with each other
  - a stage folder, which contains some new features and exploratory work we are trying out. It consists of:
    - sippy_EDA.ipynb - Notebook which explores the Sippy data set to understand its contents
    - sippy-analysis.ipynb - Notebook analyzing the OpenShift CI test/job data from the TestGrid dashboards that Sippy provides
- a TestGrid folder, which consists of:
  - Metrics - This folder contains notebooks that define, calculate, and save several KPIs that we believe are relevant to various personas (developer, manager, etc.) involved in the CI process.
  - testgrid_EDA.ipynb - Notebook which explores the existing TestGrid data at testgrid.k8s.io, giving specific attention to Red Hat's OpenShift CI dashboards.
  - testgrid_indepth_EDA.ipynb - Notebook which follows up on the testgrid_EDA.ipynb notebook and provides additional insights into the TestGrid data.
  - testgrid_metadata_EDA.ipynb - Notebook which explores metadata present at a test level within the existing TestGrid data at testgrid.k8s.io.
  - a background folder, which contains:
    - testgrid_feature_confirmation.ipynb - Notebook determining whether the TestGrid features analyzed in the testgrid_EDA.ipynb notebook are uniform across grids.
- a gcsweb-ci folder, which consists of:
  - a build-logs folder to work with build logs from OpenShift CI jobs:
    - build_log_EDA.ipynb - Notebook that can download build log data and provides an overview of it.
    - build_log_term_freq.ipynb - Notebook that applies term frequency analysis to build logs
- a
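
As a rough illustration of what term frequency analysis on a build log can look like, here is a minimal sketch. It is not taken from build_log_term_freq.ipynb: the `term_frequencies` helper, the tokenization rule, and the sample log text are all assumptions made for this example.

```python
import re
from collections import Counter
from typing import Dict


def term_frequencies(log_text: str) -> Dict[str, float]:
    """Return each token's share of all tokens found in one log."""
    # Assumed tokenization: lowercase, alphabetic runs only, so that
    # timestamps, hashes, and punctuation are dropped.
    tokens = re.findall(r"[a-z]+", log_text.lower())
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {term: n / total for term, n in counts.items()}


# Hypothetical log lines, for illustration only (not real OpenShift CI output).
sample_log = """
error: pod failed to start
error: timeout waiting for pod
info: retrying install
"""

tf = term_frequencies(sample_log)
# Sort by descending frequency, breaking ties alphabetically.
top_terms = sorted(tf.items(), key=lambda kv: (-kv[1], kv[0]))[:3]
print(top_terms)
```

A real build log would likely need more careful tokenization and stop-word handling than this sketch applies.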
The notebooks in the failure-type-classification folder focus on automating the task of test failure classification with TestGrid data. A test failure can be legitimate, or it can be caused by other issues such as an infrastructure flake, an install flake, a flaky test, or some other type of failure. These notebooks explore unsupervised machine learning methods and heuristics to classify the test failures. The notebooks are organized into:
- a background folder, which consists of:
  - testgrid_flakiness_detection.ipynb - Notebook which tries to detect one test failure type, the flaky test
- a stage folder, which consists of:
  - failure_type_classifier.ipynb - Notebook for analyzing TestGrids and generating a report that identifies the tests and dates where 4 different types of failures may have occurred
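
To give a feel for the kind of heuristic that can flag a flaky test, here is a minimal sketch based on how often a test's outcome flips between consecutive runs. It is an assumption made for illustration, not the method used in testgrid_flakiness_detection.ipynb: the `flip_rate` and `looks_flaky` helpers, the 0.3 threshold, and the run histories are all hypothetical.

```python
from typing import Sequence


def flip_rate(results: Sequence[bool]) -> float:
    """Fraction of consecutive run pairs whose outcome changed (pass <-> fail)."""
    if len(results) < 2:
        return 0.0
    flips = sum(1 for prev, cur in zip(results, results[1:]) if prev != cur)
    return flips / (len(results) - 1)


def looks_flaky(results: Sequence[bool], threshold: float = 0.3) -> bool:
    """Flag a test as possibly flaky; the threshold is an arbitrary example value."""
    return flip_rate(results) >= threshold


# Hypothetical run histories (True = pass, False = fail).
stable_then_broken = [True] * 5 + [False] * 5
intermittent = [True, False, True, True, False, True, False, True]

print(flip_rate(stable_then_broken), looks_flaky(stable_then_broken))
print(flip_rate(intermittent), looks_flaky(intermittent))
```

In this sketch, a test that passes consistently and then fails consistently has a low flip rate, which is more suggestive of a legitimate failure, while a test that alternates between passing and failing has a high one.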