This directory contains automated integration tests for the Datadog Agent. It currently tests Agent packaging (installing, updating, and removing the Agent), as well as basic functionality. The tests are executed on virtual machines that span most of the supported Agent platforms. They're currently run on Azure VMs.
This is meant to be executed internally at Datadog and references our internal build structure.
However, people outside of Datadog can run it by editing the repository URL that they are uploaded to in the .kitchen-azure.yml file. You will also need to ensure that the repo branch is pipeline-$CI_PIPELINE_ID
.
Non-bundled dependencies:
Then install bundled gem dependencies:
bundle config set --local path '.'
bundle config set --local gemfile './Gemfile.local'
bundle install
Note: When building on macOS M1, you might run into an error when installing the ffi-yajl
gem. You should be able to get around that by setting the build ldflags
for this gem in bundler configuration (see this Github Issue):
bundle config build.ffi-yajl --with-ldflags="-Wl,-undefined,dynamic_lookup"
These tests are set up to be run on Azure. It is set up to use our automated testing. However, there are also provisions to run it locally.
You will need to create a service principle. You can do that with the following command on the azure cli tool:
az ad sp create-for-rbac --name <name>
Then create a file called azureid.sh
and export the relevant settings into environment variables:
The subscription ID is retrieved from the azure portal, by starting a azure shell. --or-- the "id" returned from az account list
The AZURE_CLIENT_ID is returned as the appId from the above command
the AZURE_CLIENT_SECRET is returned as the password from the above command
the AZURE_TENANT_ID is returned as tenant from the above command --or-- the "tenantId" returned from az account list
export AZURE_CLIENT_ID="$AZURE_CLIENT_ID"
export AZURE_CLIENT_SECRET="$AZURE_CLIENT_SECRET"
export AZURE_TENANT_ID="$AZURE_TENANT_ID"
export AZURE_SUBSCRIPTION_ID="$AZURE_SUBSCRIPTION_ID"
export CI_PIPELINE_ID="$CI_PIPELINE_ID"
export NOT_GITLAB="true"
If for some reason you need to find another version of a specific image you will need to use the Azure CLI.
This will list all the images available on Azure (and take ~10min to run)
az vm image list --all --output table
This will list all the images available on Azure for a specific OS (and take ~2min to run)
az vm image list --offer Ubuntu --all --output table
To see the rest of the rake commands, execute
rake -T
For finer control, you can invoke test-kitchen directly
To list all available tests, execute
kitchen list
And to run one of these tests, execute
kitchen test <test-instance>
For more kitchen commands, execute
kitchen help
Tests are composed of platforms and suites. A suite is a series of commands to execute (usually in the form of Chef recipes) followed by a series of tests that verify that the commands completed as expected. Platforms are systems environments in which the suites are executed. Each test suite is executed on each platform.
To generate a test suite, the platforms to be tested are passed in via the TEST_PLATFORMS environment variable. The exact format of the list that's supplied in this variable will vary depending upon which driver (provider) is being used.
In Azure (which is what is used in the ci pipeline), the list of platforms
is created in the top level .gitlab-ci.yaml, and the syntax is of the form:
short_name1,azure_full_qualified_name1|short_name2,azure_full_qualified_name2
Each driver (provider) will have a slightly different format, which is described in the driver-specific yaml (drivers/azure-driver.yml, drivers/ec2-driver, etc.)
The release candidate is the latest Agent built. In other words, they are the most up-to-date packages on our staging (datad0g.com) deb (unstable branch) and yum repositories.
Agent packaging methods tested:
chef
: Installing the latest release candidate using the chef-datadog. The recipe will determine if the base Agent should be installed instead of the regular one (based on the system version of Python).upgrade-agent6
: Installs the latest release Agent 6 (the latest publicly available Agent 6 version), then upgrades it to the latest release candidate using the platform's package manager.upgrade-agent5
: Installs the latest release Agent 5 (the latest publicly available Agent 5 version), then upgrades it to the latest Agent 6 release candidate using the platform's package manager.install-script
: Installs the latest release candidate using our Agent install script.step-by-step
: Installs the latest release candidate using our step-by-step installation instructions as listed on Dogweb.
For each platform, for each packaging method, the following Agent tests are run (in the order listed):
- The Agent is running:
- the Agent is running after installation is complete
- the configuration has been correctly placed (where applicable)
- the info command output does not contain 'ERROR' and exits with a non-zero value
- The Agent stops:
- the
stop
command does not error when the Agent is running - no Agent processes are running after the Agent has been stopped
- the Agent starts after being stopped
- the
- The Agent restarts:
- the
restart
command does not error when the Agent is running - the Agent is running after a previously-running Agent it is restarted
- the
restart
command does not error when the Agent is stopped - the Agent is running after a previously-stopped Agent it is restarted
- the
- The Agent is removed (if installed using a package manager):
- removing the Agent using the package manager should not error
- the Agent should not be running after it is removed
- the Agent binary should not be present after removal
For the upgrade
method, the version of the agent after the upgrade is tested.
Be sure to set the DD_AGENT_EXPECTED_VERSION
environment variable to the version the agent
should be upgraded to (for instance export DD_AGENT_EXPECTED_VERSION='5.5.0+git.213.59ac9da'
).
This suite covers the installing, upgrading, and removing the Agent on most platforms. The following, among other aspects, are not covered:
- Updates from versions of the Agent prior to the latest publicly available
version. This can easily be added by setting the ['datadog']['agent_version']
attribute to the version that you wish to upgrade from in the
upgrade
suite - Not all supported operating systems are tested. Missing operating systems
include:
- Mac OS 10.x
- Amazon Linux
- Memory leaks or any obscurities that result from the Agent running for a long time, and so short term tests like these will not catch them
- Sending metrics/events over Dogstatsd via client libraries
- Changes made to an environment by a user (no sudo present, system Python removed, supervisor installed etc.)
For basic test-kitchen usage, see its Getting Started guide.
Tests are executed on Azure in ci. Platforms are defined in the OS dependent files kitchen testing definitions in /.gitlab/kitchen_testing/.yaml
Suites define the commands that should be run on a platform as Chef recipes,
and tests that verify the outcome of these commands written in RSpec. Suites
are defined in specific yaml files in the test-definitions
directory.
Add new cookbooks by describing them in Berksfile
. If you want to write your
own cookbook that is specific to this repository, place it in the
site-cookbooks
directory. New cookbooks will not be available in your tests
until rake berks
is executed.
Tests should be placed in test/integration/<suite-name>/rspec/
. They will be
copied and executed on a platform after the suite's recipes have been applied
to the environment. They are not managed by Berks
. Code that can be shared
between suites is in test/integration/common/rspec/
.
Tests do not need to be written in RSpec; Busser, the framework that executes the tests, supports several different testing frameworks (including Bats and minitest).
To support running multiple kitchen suites using multiple back-end providers, the kitchen configuration file (kitchen.yaml) is created dynamically by combining three provided yaml files.
A complete kitchen configuration (kitchen.yaml) is created by taking one of the driver files, adding the common configuration options platforms-common, and then adding the desired test suite(s) from the test-definitions directory.
There is an invoke task for creating the completed kitchen.yml
. The usage for
the invoke task is:
invoke kitchen.genconfig --platform=platform --osversions=osversions --provider=provider --arch=arch --testfiles=testfiles
where: platform is an index into the platforms.json. It is (currently) one of:
- centos
- debian
- suse
- ubuntu
- windows
- amazonlinux
osversions is an index into the per-provider dictionary for the given platform. Examples include
- win2012r2 (which is in both windows/azure and windows/ec2)
- ubuntu-20-04 (which is in ubuntu/azure)
Provider is the kitchen driver to be used. Currently supported are
- azure
- ec2
- vagrant
- hyperv (with a user-supplied platforms file)
arch is:
- x86_64
- arm64
Testfiles is the name of the test-specific file(s) (found in test-definitions ) to be added. The testfiles define the tests that are run, on what OS, on the given provider.
An example command would be invoke kitchen.genconfig --platform ubuntu --osversions all --provider azure --arch x86_64 --testfiles install-script-test
This will generate a kitchen.yml which executes the install-script-test
on all of the defined ubuntu
OS images in azure.
When run in CI, the gitlab job will set up the TEST_PLATFORMS
environment variable,
and then concatenate the azure driver, platforms-common,
and the desired test suite file (for example upgrade7-test).
Test kitchen then expands out each of the TEST_PLATFORMS
into a kitchen platform, and runs
each suite on each provided platform
To run kitchen locally, either to do broad tests on a given build or to develop the kitchen tests themselves, additional driver files are provided for using AWS EC2, Hyper-V, or vagrant as the back end.
To create a kitchen file, take the same steps as above. Combine the desired driver file,
the common file, and the desired test suite(s). Using erb
as a manual step will generate
a kitchen file that can be reused.
At present, the EC2 and Hyper-V driver files have only been tested for Windows targets.