Stateful testing
With @given, your tests are still something that
you mostly write yourself, with Hypothesis providing some data.
With Hypothesis’s stateful testing, Hypothesis instead tries to generate
not just data but entire tests. You specify a number of primitive
actions that can be combined together, and then Hypothesis will
try to find sequences of those actions that result in a failure.
Tip
Before reading this reference documentation, we recommend reading How not to Die Hard with Hypothesis and An Introduction to Rule-Based Stateful Testing, in that order. The implementation details will make more sense once you’ve seen them used in practice, and know why each method or decorator is available.
Note
This style of testing is often called model-based testing, but in Hypothesis is called stateful testing (mostly for historical reasons - the original implementation of this idea in Hypothesis was more closely based on ScalaCheck’s stateful testing where the name is more apt). Both of these names are somewhat misleading: You don’t really need any sort of formal model of your code to use this, and it can be just as useful for pure APIs that don’t involve any state as it is for stateful ones.
It’s perhaps best to not take the name of this sort of testing too seriously. Regardless of what you call it, it is a powerful form of testing which is useful for most non-trivial APIs.
You may not need state machines
The basic idea of stateful testing is to make Hypothesis choose actions as well as values for your test, and state machines are a great declarative way to do just that.
For simpler cases though, you might not need them at all - a standard test
with @given might be enough, since you can use data() in branches or loops.
In fact, that’s how the state machine explorer works internally. For more
complex workloads though, where a higher-level API comes into its own, keep reading!
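As a rough illustration of the simpler approach, the sketch below uses data() inside a loop so that Hypothesis chooses both the actions and their values. The list-as-stack system and the size-counter model are hypothetical stand-ins for a real system under test, not part of Hypothesis.

```python
from hypothesis import given, settings, strategies as st


@settings(max_examples=25, deadline=None)
@given(st.data())
def test_stack_matches_model(data):
    # Hypothetical system under test: a plain list used as a stack.
    # Hypothetical model: a counter of how many items it should hold.
    stack = []
    expected_size = 0

    # Hypothesis picks how many actions to run, then picks each action.
    n_steps = data.draw(st.integers(min_value=0, max_value=20), label="n_steps")
    for _ in range(n_steps):
        action = data.draw(st.sampled_from(["push", "pop"]), label="action")
        if action == "push":
            stack.append(data.draw(st.integers(), label="value"))
            expected_size += 1
        elif stack:  # popping is only valid on a non-empty stack
            stack.pop()
            expected_size -= 1
        assert len(stack) == expected_size
```

This works well for a handful of actions, but the branching grows quickly as actions and their validity conditions multiply, which is where the declarative state machine API tends to pay off.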
Rule-based state machines
- class hypothesis.stateful.RuleBasedStateMachine
A RuleBasedStateMachine gives you a structured way to define state machines.
The idea is that a state machine carries the system under test and some supporting data. This data can be stored in instance variables or divided into Bundles. The state machine has a set of rules which may read data from bundles (or just from normal strategies), push data onto bundles, change the state of the machine, or verify properties. At any given point a random applicable rule will be executed.
A rule is very similar to a normal @given-based test in that it takes
values drawn from strategies and passes them to a user-defined test function,
which may use assertions to check the system’s behavior.
The key difference is that where @given-based tests must be independent,
rules can be chained together - a single test run may involve multiple rule
invocations, which may interact in various ways.
Rules can take normal strategies as arguments, but normal strategies, with
the exception of runner() and data(), cannot take into account
the current state of the machine. This is where bundles come in.
A rule can, in place of a normal strategy, take a Bundle.
A hypothesis.stateful.Bundle is a named collection of generated values that can
be reused by other operations in the test.
They are populated with the results of rules, and may be used as arguments to
rules, allowing data to flow from one rule to another, and rules to work on
the results of previous computations or actions.
Specifically, a rule that specifies target=a_bundle will cause its return
value to be added to that bundle. A rule that specifies an_argument=a_bundle
as a strategy will draw a value from that bundle. A rule can also specify
that an argument chooses a value from a bundle and removes that value by using
consumes(), as in an_argument=consumes(a_bundle).
Note
There is some overlap between what you can do with Bundles and what you can
do with instance variables. Both represent state that rules can manipulate.
If you do not need to draw values that depend on the machine’s state, you
can simply use instance variables. If you do need to draw values that depend
on the machine’s state, Bundles provide a fairly straightforward way to do
this. If you need rules that draw values that depend on the machine’s state
in some more complicated way, you will have to abandon bundles. You can use
runner() and .flatmap() to access the instance from a rule: the strategy
runner().flatmap(lambda self: sampled_from(self.a_list))
will draw from the instance variable a_list. If you need something more
complicated still, you can use data() to
draw data from the instance (or anywhere else) based on logic in the rule.
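To make the runner() and data() options from the note above concrete, here is a minimal sketch. The ListMachine class and its a_list attribute are hypothetical; the list starts non-empty on the assumption that sampled_from() should always have something to choose from.

```python
import hypothesis.strategies as st
from hypothesis.stateful import RuleBasedStateMachine, rule


class ListMachine(RuleBasedStateMachine):
    def __init__(self):
        super().__init__()
        # Start non-empty so that sampled_from always has a value to pick.
        self.a_list = [0]

    @rule(x=st.integers())
    def append(self, x):
        self.a_list.append(x)

    # runner() yields the running machine instance; flatmap then builds a
    # strategy from the instance's current state.
    @rule(x=st.runner().flatmap(lambda self: st.sampled_from(self.a_list)))
    def drawn_value_is_in_list(self, x):
        assert x in self.a_list

    # data() lets the rule body decide what to draw, based on current state.
    @rule(data=st.data())
    def drawn_index_is_valid(self, data):
        i = data.draw(st.integers(0, len(self.a_list) - 1), label="index")
        assert 0 <= i < len(self.a_list)


TestListMachine = ListMachine.TestCase
```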
The following rule based state machine example is a simplified version of a
test for Hypothesis’s example database implementation. An example database
maps keys to sets of values, and in this test we compare one implementation of
it to a simplified in-memory model of its behaviour, which just stores the same
values in a Python dict. The test then runs operations against both the
real database and the in-memory representation of it and looks for discrepancies
in their behaviour.
import shutil
import tempfile
from collections import defaultdict

import hypothesis.strategies as st
from hypothesis.database import DirectoryBasedExampleDatabase
from hypothesis.stateful import Bundle, RuleBasedStateMachine, rule


class DatabaseComparison(RuleBasedStateMachine):
    def __init__(self):
        super().__init__()
        self.tempd = tempfile.mkdtemp()
        self.database = DirectoryBasedExampleDatabase(self.tempd)
        self.model = defaultdict(set)

    keys = Bundle("keys")
    values = Bundle("values")

    @rule(target=keys, k=st.binary())
    def add_key(self, k):
        return k

    @rule(target=values, v=st.binary())
    def add_value(self, v):
        return v

    @rule(k=keys, v=values)
    def save(self, k, v):
        self.model[k].add(v)
        self.database.save(k, v)

    @rule(k=keys, v=values)
    def delete(self, k, v):
        self.model[k].discard(v)
        self.database.delete(k, v)

    @rule(k=keys)
    def values_agree(self, k):
        assert set(self.database.fetch(k)) == self.model[k]

    def teardown(self):
        shutil.rmtree(self.tempd)


TestDBComparison = DatabaseComparison.TestCase
In this we declare two bundles - one for keys, and one for values.
We have two trivial rules which just populate them with data (k and v),
and three non-trivial rules:
save saves a value under a key and delete removes a value from a key,
in both cases also updating the model of what should be in the database.
values_agree then checks that the contents of the database agree with the
model for a particular key.
Note
While this could have been simplified by not using bundles, generating
keys and values directly in the save and delete rules, using bundles
encourages Hypothesis to choose the same keys and values for multiple
operations. The bundle operations establish a “universe” of keys and values
that are used in the rules.
We can now integrate this into our test suite by getting a unittest TestCase from it:
import unittest

TestTrees = DatabaseComparison.TestCase

# Or just run with pytest's unittest support
if __name__ == "__main__":
    unittest.main()
This test currently passes, but if we comment out the line where we call self.model[k].discard(v),
we would see the following output when run under pytest:
AssertionError: assert set() == {b''}
------------ Hypothesis ------------
state = DatabaseComparison()
var1 = state.add_key(k=b'')
var2 = state.add_value(v=var1)
state.save(k=var1, v=var2)
state.delete(k=var1, v=var2)
state.values_agree(k=var1)
state.teardown()
Note how it’s printed out a very short program that will demonstrate the
problem. The output from a rule-based state machine should generally be pretty
close to Python code - if you have custom repr implementations that don’t
return valid Python then it might not be, but most of the time you should just
be able to copy and paste the code into a test to reproduce it.
You can control the detailed behaviour with a settings object on the TestCase (this is a normal Hypothesis settings object using the defaults at the time the TestCase class was first referenced). For example, if you wanted to run fewer examples with larger programs you could change the settings to:
from hypothesis import settings

DatabaseComparison.TestCase.settings = settings(
    max_examples=50, stateful_step_count=100
)
Compared with the defaults (max_examples=100, stateful_step_count=50), this doubles the number of steps each program runs and halves the number of test cases that will be run.
Rules
As said earlier, rules are the most common feature used in RuleBasedStateMachine.
They are defined by applying the rule() decorator on a function.
Note that RuleBasedStateMachine must have at least one rule defined and that
a single function cannot be used to define multiple rules (this is to avoid having
multiple rules doing the same things).
Due to the stateful execution method, rules generally cannot take arguments
from other sources such as fixtures or pytest.mark.parametrize - consider
providing them via a strategy such as sampled_from() instead.
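As a sketch of that suggestion, a value you might otherwise pass through pytest.mark.parametrize can be drawn with sampled_from() as an ordinary rule argument. The SEPARATORS list and the JoinSplitMachine class are hypothetical, chosen only to keep the example self-contained.

```python
import hypothesis.strategies as st
from hypothesis.stateful import RuleBasedStateMachine, rule

# Hypothetical values that might otherwise come from pytest.mark.parametrize.
SEPARATORS = [",", ";", "|"]

# Non-empty lists of non-empty words that never contain a separator.
words = st.lists(st.text(alphabet="abc", min_size=1), min_size=1, max_size=5)


class JoinSplitMachine(RuleBasedStateMachine):
    @rule(sep=st.sampled_from(SEPARATORS), parts=words)
    def join_then_split_round_trips(self, sep, parts):
        # Joining and splitting on the same separator should be lossless here.
        assert sep.join(parts).split(sep) == parts


TestJoinSplit = JoinSplitMachine.TestCase
```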
- hypothesis.stateful.rule(*, targets=(), target=None, **kwargs)
Decorator for RuleBasedStateMachine. Any Bundle present in target or targets will define where the end result of this function should go. If both are empty then the end result will be discarded.
target must be a Bundle, or if the result should be replicated to multiple bundles you can pass a tuple of them as the targets argument. It is invalid to use both arguments for a single rule. If the result should go to exactly one of several bundles, define a separate rule for each case.
kwargs then define the arguments that will be passed to the function invocation. If their value is a Bundle, or if it is consumes(b) where b is a Bundle, then values that have previously been produced for that bundle will be provided. If consumes is used, the value will also be removed from the bundle.
Any other kwargs should be strategies and values from them will be provided.
- hypothesis.stateful.consumes(bundle)
When introducing a rule in a RuleBasedStateMachine, this function can be used to mark bundles from which each value used in a step with the given rule should be removed. This function returns a strategy object that can be manipulated and combined like any other.
For example, a rule declared with
@rule(value1=b1, value2=consumes(b2), value3=lists(consumes(b3)))
will consume a value from Bundle b2 and several values from Bundle b3 to populate value2 and value3 each time it is executed.
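A common use for consumes() is modelling resources that become invalid after an operation. The sketch below is one possible shape, assuming a hypothetical handle type represented by integer ids: closing a handle consumes it, so later rules can no longer draw it.

```python
from hypothesis.stateful import Bundle, RuleBasedStateMachine, consumes, rule


class HandleMachine(RuleBasedStateMachine):
    handles = Bundle("handles")

    def __init__(self):
        super().__init__()
        self.open_handles = set()  # model of which handle ids are open
        self.next_id = 0

    @rule(target=handles)
    def open_handle(self):
        # Each handle id is unique, so it appears in the bundle only once.
        self.next_id += 1
        self.open_handles.add(self.next_id)
        return self.next_id

    @rule(h=handles)
    def use_handle(self, h):
        # Only handles still in the bundle can be drawn, so h must be open.
        assert h in self.open_handles

    @rule(h=consumes(handles))
    def close_handle(self, h):
        # consumes() removes h from the bundle; no later rule can draw it.
        self.open_handles.remove(h)


TestHandles = HandleMachine.TestCase
```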
- hypothesis.stateful.multiple(*args)
This function can be used to pass multiple results to the target(s) of a rule. Just use return multiple(result1, result2, ...) in your rule.
It is also possible to use return multiple() with no arguments in order to end a rule without passing any result.
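For instance, a rule that produces a variable number of results can unpack them into multiple(). This is a minimal sketch with a hypothetical SplitMachine; when the text contains no words, the call reduces to multiple() with no arguments and nothing is added to the bundle.

```python
import hypothesis.strategies as st
from hypothesis.stateful import Bundle, RuleBasedStateMachine, multiple, rule


class SplitMachine(RuleBasedStateMachine):
    words = Bundle("words")

    @rule(target=words, text=st.text(alphabet="ab ", max_size=10))
    def split_text(self, text):
        # Each word becomes its own entry in the bundle.
        return multiple(*text.split())

    @rule(word=words)
    def word_is_clean(self, word):
        # str.split() with no arguments never yields empty or spaced strings.
        assert word
        assert " " not in word


TestSplit = SplitMachine.TestCase
```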
- class hypothesis.stateful.Bundle(name, *, consume=False, draw_references=True)
A collection of values for use in stateful testing.
Bundles are a kind of strategy where values can be added by rules, and (like any strategy) used as inputs to future rules.
The name argument they are passed is the name they are referred to by internally in the state machine; no two bundles may have the same name. It is idiomatic to use the attribute being assigned to as the name of the Bundle:
class MyStateMachine(RuleBasedStateMachine):
    keys = Bundle("keys")
Bundles can contain the same value more than once; this becomes relevant when using consumes() to remove values again.
If the consume argument is set to True, then all values that are drawn from this bundle will be consumed (as above) when requested.
Initializes
Initializes are a special case of rules, which are guaranteed to be run exactly once before any normal rule is called. Note if multiple initialize rules are defined, they will all be called but in any order, and that order will vary from run to run.
Initializes are typically useful to populate bundles:
- hypothesis.stateful.initialize(*, targets=(), target=None, **kwargs)
Decorator for RuleBasedStateMachine.
An initialize decorator behaves like a rule, but all @initialize() decorated methods will be called before any @rule() decorated methods, in an arbitrary order. Each @initialize() method will be called exactly once per run, unless one raises an exception - after which only the .teardown() method will be run. @initialize() methods may not have preconditions.
import hypothesis.strategies as st
from hypothesis.stateful import Bundle, RuleBasedStateMachine, initialize, rule

name_strategy = st.text(min_size=1).filter(lambda x: "/" not in x)


class NumberModifier(RuleBasedStateMachine):
    folders = Bundle("folders")
    files = Bundle("files")

    @initialize(target=folders)
    def init_folders(self):
        return "/"

    @rule(target=folders, parent=folders, name=name_strategy)
    def create_folder(self, parent, name):
        return f"{parent}/{name}"

    @rule(target=files, parent=folders, name=name_strategy)
    def create_file(self, parent, name):
        return f"{parent}/{name}"
Initializes can also allow you to initialize the system under test in a way that depends on values chosen from a strategy. You could do this by putting an instance variable in the state machine that indicates whether the system under test has been initialized or not, and then using preconditions (below) to ensure that exactly one of the rules that initialize it gets run before any rules that depend on it being initialized.
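A strategy-dependent setup can also be written directly as an @initialize() method that takes strategy arguments. The following is a sketch under that assumption; BoundedBufferMachine and its capacity attribute are hypothetical names.

```python
import hypothesis.strategies as st
from hypothesis.stateful import RuleBasedStateMachine, initialize, invariant, rule


class BoundedBufferMachine(RuleBasedStateMachine):
    @initialize(capacity=st.integers(min_value=1, max_value=5))
    def set_up(self, capacity):
        # Runs exactly once, before any @rule(), with a drawn capacity.
        self.capacity = capacity
        self.items = []

    @rule(x=st.integers())
    def push(self, x):
        # The hypothetical buffer silently drops values once it is full.
        if len(self.items) < self.capacity:
            self.items.append(x)

    @invariant()
    def never_over_capacity(self):
        # By default, invariants are only checked once initialization is done,
        # so self.capacity and self.items exist by the time this runs.
        assert len(self.items) <= self.capacity


TestBoundedBuffer = BoundedBufferMachine.TestCase
```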
Preconditions
While it’s possible to use assume() in RuleBasedStateMachine rules, if you
use it in only a few rules you can quickly run into a situation where few or
none of your rules pass their assumptions. Thus, Hypothesis provides a
precondition() decorator to avoid this problem. The precondition()
decorator is used on rule-decorated functions, and must be given a function
that returns True or False based on the RuleBasedStateMachine instance.
- hypothesis.stateful.precondition(precond)
Decorator to apply a precondition for rules in a RuleBasedStateMachine. Specifies a precondition for a rule to be considered as a valid step in the state machine, which is more efficient than using assume() within the rule. The precond function will be called with the instance of RuleBasedStateMachine and should return True or False. Usually it will need to look at attributes on that instance.
For example:
class MyTestMachine(RuleBasedStateMachine):
    state = 1

    @precondition(lambda self: self.state != 0)
    @rule(numerator=integers())
    def divide_with(self, numerator):
        self.state = numerator / self.state
If multiple preconditions are applied to a single rule, it is only considered a valid step when all of them return True. Preconditions may be applied to invariants as well as rules.
from hypothesis.stateful import RuleBasedStateMachine, precondition, rule


class NumberModifier(RuleBasedStateMachine):
    num = 0

    @rule()
    def add_one(self):
        self.num += 1

    @precondition(lambda self: self.num != 0)
    @rule()
    def divide_with_one(self):
        self.num = 1 / self.num
By using precondition() here instead of assume(), Hypothesis can filter the
inapplicable rules before running them. This makes it much more likely that a
useful sequence of steps will be generated.
Note that currently preconditions can’t access bundles; if you need to use preconditions, you should store relevant data on the instance instead.
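One way to follow that advice is to mirror whatever the precondition needs in instance attributes. The sketch below uses a hypothetical QueueMachine and also stacks two preconditions on one rule, so the rule is only eligible when both return True.

```python
import hypothesis.strategies as st
from hypothesis.stateful import RuleBasedStateMachine, precondition, rule


class QueueMachine(RuleBasedStateMachine):
    def __init__(self):
        super().__init__()
        # State lives on the instance so that preconditions can inspect it.
        self.queue = []
        self.paused = False

    @rule(x=st.integers())
    def enqueue(self, x):
        self.queue.append(x)

    @rule()
    def toggle_pause(self):
        self.paused = not self.paused

    # Both preconditions must hold for dequeue to be chosen as a step.
    @precondition(lambda self: not self.paused)
    @precondition(lambda self: len(self.queue) > 0)
    @rule()
    def dequeue(self):
        assert not self.paused
        self.queue.pop(0)


TestQueue = QueueMachine.TestCase
```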
Invariants
Often there are invariants that you want to ensure are met after every step in a process. It would be possible to add these as rules that are run, but they would be run zero or multiple times between other rules. Hypothesis provides a decorator that marks a function to be run after every step.
- hypothesis.stateful.invariant(*, check_during_init=False)
Decorator to apply an invariant for rules in a RuleBasedStateMachine. The decorated function will be run after every rule and can raise an exception to indicate failed invariants.
For example:
class MyTestMachine(RuleBasedStateMachine):
    state = 1

    @invariant()
    def is_nonzero(self):
        assert self.state != 0
By default, invariants are only checked after all @initialize() rules have been run. Pass check_during_init=True for invariants which can also be checked during initialization.
from hypothesis.stateful import RuleBasedStateMachine, invariant, rule


class NumberModifier(RuleBasedStateMachine):
    num = 0

    @rule()
    def add_two(self):
        self.num += 2
        if self.num > 50:
            self.num += 1

    @invariant()
    def divide_with_one(self):
        assert self.num % 2 == 0


NumberTest = NumberModifier.TestCase
Invariants can also have precondition()s applied to them, in which case
they will only be run if the precondition function returns true.
Note that currently invariants can’t access bundles; if you need to use invariants, you should store relevant data on the instance instead.
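As an illustration of a conditional invariant, the sketch below only checks ordering while an instance flag says the list should be sorted. SortedListMachine and its is_sorted flag are hypothetical; the relevant state is kept on the instance rather than in a bundle.

```python
import hypothesis.strategies as st
from hypothesis.stateful import RuleBasedStateMachine, invariant, precondition, rule


class SortedListMachine(RuleBasedStateMachine):
    def __init__(self):
        super().__init__()
        self.items = []
        self.is_sorted = True  # an empty list counts as sorted

    @rule(x=st.integers())
    def append(self, x):
        self.items.append(x)
        self.is_sorted = False  # ordering is no longer guaranteed

    @rule()
    def sort(self):
        self.items.sort()
        self.is_sorted = True

    # Only checked after steps where the precondition returns True.
    @precondition(lambda self: self.is_sorted)
    @invariant()
    def items_are_ordered(self):
        assert self.items == sorted(self.items)


TestSortedList = SortedListMachine.TestCase
```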
More fine grained control
If you want to bypass the TestCase infrastructure you can invoke these
manually. The stateful module exposes the function run_state_machine_as_test,
which takes an arbitrary function returning a RuleBasedStateMachine and an
optional settings parameter and does the same as the class-based runTest
provided.
This is not recommended as it bypasses some important internal functions,
including reporting of statistics such as runtimes and event()
calls. It was originally added to support custom __init__ methods, but
you can now use initialize() rules instead.
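For completeness, a manual invocation might look like the sketch below. CounterMachine, its start argument, and the make_counter factory are hypothetical; the caveats above about bypassing the TestCase infrastructure still apply.

```python
from hypothesis import settings
from hypothesis.stateful import (
    RuleBasedStateMachine,
    rule,
    run_state_machine_as_test,
)


class CounterMachine(RuleBasedStateMachine):
    def __init__(self, start=0):
        super().__init__()
        self.start = start
        self.count = start

    @rule()
    def increment(self):
        self.count += 1
        assert self.count > self.start


def make_counter():
    # Factory returning a machine built with a custom constructor argument.
    return CounterMachine(start=10)


def test_counter_from_ten():
    run_state_machine_as_test(
        make_counter,
        settings=settings(max_examples=20, stateful_step_count=10, deadline=None),
    )
```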