Skip to content

Latest commit

 

History

History
 
 

integration

Folders and files

NameName
Last commit message
Last commit date

parent directory

..
 
 
 
 
 
 
 
 

Arrow integration testing

Our strategy for integration testing between Arrow implementations is as follows:

  • Test datasets are specified in a custom human-readable, JSON-based format designed for Arrow

  • Each implementation provides a testing executable capable of converting between the JSON and the binary Arrow file representation

  • The test executable is also capable of validating the contents of a binary file against a corresponding JSON file

Environment setup

The integration test data generator and runner is written in Python and currently requires Python 3.5 or higher. You can create a standalone Python distribution and environment for running the tests by using miniconda. On Linux this is:

MINICONDA_URL=https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
wget -O miniconda.sh $MINICONDA_URL
bash miniconda.sh -b -p miniconda
export PATH=`pwd`/miniconda/bin:$PATH

conda create -n arrow-integration python=3.6 nomkl numpy six
conda activate arrow-integration

If you are on macOS, instead use the URL:

MINICONDA_URL=https://repo.continuum.io/miniconda/Miniconda3-latest-MacOSX-x86_64.sh

After this, you can follow the instructions in the next section.

Running the existing integration tests

First, build the Java and C++ projects. For Java, you must run

mvn package

Now, the integration tests rely on two environment variables which point to the Java arrow-tool JAR and the build path for the C++ executables:

JAVA_DIR=$ARROW_HOME/java
CPP_BUILD_DIR=$ARROW_HOME/cpp/build

VERSION=0.11.0-SNAPSHOT
export ARROW_JAVA_INTEGRATION_JAR=$JAVA_DIR/tools/target/arrow-tools-$VERSION-jar-with-dependencies.jar
export ARROW_CPP_EXE_PATH=$CPP_BUILD_DIR/debug

Here $ARROW_HOME is the location of your Arrow git clone. The $CPP_BUILD_DIR may be different depending on how you built with CMake (in-source or out-of-source).

Once this is done, run the integration tests with (optionally adding --debug for additional output)

python integration_test.py

python integration_test.py --debug  # additional output