The following files provide convenience when running different experiments:
- build_color_summaries.jl - Implements a build_experiments function which builds summaries for all DataGraphs included in the specified ExperimentParams. These are stored as .csv files.
- convert_to_gcare.jl - Implements a convert_dataset function which takes a .graph file (following the Subgraph-Matching format) and converts it into a .txt file (following the G-Care format).
- Experiments.jl - Provides the overall Experiments harness of necessary packages and files. Also defines the timeout constant.
- gcare_summary.py - Code which converts G-Care's custom output format into a DataFrame easily parsed in Julia.
- get_true_cardinalities.jl - Implements a calculate_true_cardinalities function which obtains the true cardinality for a given dataset from a cardinality file, or calculates the exact cardinality if the file does not exist. Also has a function to verify true cardinalities.
- graph_results.jl - Provides different ways to graph results using the output DataFrame from an experiment.
- load_datasets.jl - Implements a load_dataset function which creates the appropriate DataGraph for the given dataset.
- load_querysets.jl - Implements a load_querysets function which creates a dictionary mapping each input dataset to its corresponding list of QueryGraphs.
- run_estimators.jl - Implements a run_estimation_experiments function which, for each given ExperimentParams, creates a .csv file describing the results after running the experiment.
- run_submitted_experiments.sh - A bash script used to run the experiments described in the submitted paper and save the corresponding figures.
- utils.jl - Defines enums for different datasets as well as the ExperimentParams used to describe parameter settings for different experiments.
Generally, to run an experiment, one simply needs to create ExperimentParams describing the parameters for the experiment, call build_experiments to build the summaries used in the experiment, call run_estimation_experiments to obtain the results for the estimation, then choose an appropriate function from graph_results.jl to display the results. Multiple examples of this process are included in Experiments/Scripts.
There are a variety of Julia scripts provided in the Experiments/Scripts folder which perform different experiments and then save figures presenting the results to the Experiments/Results/Figures folder.
For example, to find how using different maximum cycle sizes in the summary affects the cardinality estimation, the max_cycle_size.jl script can be called from the main directory:
$ julia Experiments/Scripts/max_cycle_size.jl
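To run several of the provided scripts in one go, the same command can be built for every script in the folder. The following is a minimal sketch rather than part of the repository: the helper name script_commands is hypothetical, and it assumes every .jl file in Experiments/Scripts is a standalone experiment script meant to be launched from the main directory.

```julia
# Sketch only: build one `julia <script>` command per .jl file in a folder.
# `script_commands` is a hypothetical helper, not part of the repository.
function script_commands(dir::AbstractString)
    # readdir returns names in sorted order; keep only Julia scripts.
    scripts = filter(name -> endswith(name, ".jl"), readdir(dir))
    return [`julia $(joinpath(dir, name))` for name in scripts]
end

scripts_dir = joinpath("Experiments", "Scripts")
if isdir(scripts_dir)
    for cmd in script_commands(scripts_dir)
        println(cmd)  # replace with run(cmd) to actually execute the script
    end
end
```

The loop only prints the commands, since each experiment may take a long time; swapping println for run executes them one after another.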
The Experiments/Results folder contains a folder for figures, and any saved ColorSummary or cardinality estimation results produced by the experiment utility files are also saved here.
After calling build_experiments, for each ExperimentParams, a .csv file storing information about the ColorSummary-building process is stored in Experiments/Results. After the first line with the header, each row of the file stores the dataset, partitioner, number of colors, build phase, time spent in that build phase, and the overall memory footprint. These values are used in figures evaluating the performance of the framework.
Meanwhile, the actual serialized summaries are stored in Experiments/SerializedSummaries to be used in run_estimation_experiments.
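As an illustration of how the build-results .csv described above might be consumed, the sketch below sums the time spent across build phases for each summary configuration. It rests on several assumptions that the text does not confirm: the file is comma-separated, no field contains a comma, and the columns appear in the order listed above. The header names and sample values are placeholders, not real output.

```julia
# Sketch only. Assumed column order (taken from the description above):
# dataset, partitioner, number of colors, build phase, phase time, memory footprint.
function total_build_times(lines)
    totals = Dict{Tuple{String,String,Int},Float64}()
    for line in Iterators.drop(lines, 1)       # skip the header line
        isempty(strip(line)) && continue       # ignore blank lines
        fields = split(strip(line), ',')
        key = (String(fields[1]), String(fields[2]), parse(Int, fields[3]))
        totals[key] = get(totals, key, 0.0) + parse(Float64, fields[5])
    end
    return totals
end

# Placeholder data; header names and values are hypothetical.
sample = """
Dataset,Partitioner,NumColors,BuildPhase,BuildTime,MemoryFootprint
dataset_a,partitioner_x,64,phase_1,1.5,2048
dataset_a,partitioner_x,64,phase_2,2.25,2048
dataset_b,partitioner_x,32,phase_1,0.5,1024
"""

for (config, seconds) in total_build_times(split(sample, '\n'))
    println(config, " => ", seconds)
end
```

For a real file, total_build_times(eachline(path)) can be used in place of the in-memory sample.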
After calling run_estimation_experiments, an estimation result .csv file will be saved in Experiments/Results. After the first line with the header, each row of the file stores the cardinality estimate, true cardinality, estimation time, query type (if included in the given query file), file path to the query file, a boolean indicating whether the estimation failed to compute, and the path width of the query. These values are used in figures evaluating the accuracy of the framework.