Unit Testing Framework
A common misunderstanding about automated assessments is that people think they are trivial to create. However, my experience tells me that it is, in fact, a difficult task.
The first time I created an automated assessment to grade students’ programs, I wrote almost everything from scratch, thinking it might just require a few lines of code to compare the output from the student’s program against the expected output. However, I soon realized that I needed to write more than just lines to compare outputs. Additionally, I had to write a bunch of code to properly iterate through students’ programs, clean up any intermediate results and side effects, and collect grades and feedback. All these tasks significantly distracted me.
Furthermore, I realized that the assessment code should be organized flexibly. For instance, if an assignment contained two questions in a previous course offering, and you created assessment code to assess both questions, but later you decided to split the assignment into two smaller assignments, each containing only one question, you would want your assessment code to be conveniently reconstructed.
I’ll demonstrate how I managed to use the unit testing framework pytest to handle these problems.
pytest
I’m sure there are more things to discuss as the complexity of the assessment code increases, but here are several essential concepts about pytest that allowed me to write my assessment code efficiently. They are test discovery, fixture, fixture scope, teardown/cleanup, and overriding fixtures.
Test Discovery (see its pytest document)
In our case, we just need to remember that pytest implements standard Python
test discovery. That means it will search for test_*.py
or *_test.py
files,
and in those files, it will collect test
prefixed functions or methods outside
of classes, and test
prefixed functions or methods inside Test
prefixed
classes (without an __init__
method).
For example, consider the following folder structure and code in the files:
tests
├── a.py
└── test_a.py
# a.py
def test_a():
assert True
# test_a.py
def test_a():
assert True
class A:
def test_a(self):
assert True
class TestA:
def test_a(self):
assert True
Then the results of pytest -v tests
will be as follows, since a.py
and class
A
are skipped:
tests/test_a.py::test_a PASSED
tests/test_a.py::TestA::test_a PASSED
Notice that the methods only reference self
in the signature as a formality.
No state is tied to the actual test class. This is a difference from some other
unit testing frameworks.
Fixture (see its pytest document)
In testing, a fixture provides a defined, reliable, and consistent context for the tests. This could include an environment (for example, a database configured with known parameters) or content (such as a dataset).
pytest recognizes a particular function as a fixture if it is decorated with
@pytest.fixture
, and its returned object is a fixture. There can be more than
one fixture for a test. Fixtures can use (or depend on) other fixtures. If an
earlier fixture function has a problem and raises an exception, pytest will stop
executing fixture functions for that test; meanwhile, it will mark the test as
having an error, indicating that the test could not be attempted.
A fixture often returns something which can be later used in test functions. In
the following example, the my_obj
argument in the test_obj
function is the
fixture returned by the my_obj
fixture function.
import pytest
@pytest.fixture
def my_obj():
return "Assume this is an object"
def test_obj(my_obj):
assert my_obj == "Assume this is an object"
pytest has lots of useful built-in fixtures (see its list). Here are some of them which I think are very useful:
capsys
/capfd
: it allows you to access captured output from a test function without caring about setting/resetting output streams.tmp_path
: it provides a temporary directory unique to each test function.request
: it provides information for the requesting test function, see an example here.
Fixture Scope (see its pytest document)
By default, fixtures have the scope of function
, which means they are
destroyed (i.e., the cached objects are destroyed if any) at the end of the
test. However, we can use other scopes so that a fixture function is invoked
only once for multiple tests requiring it.
Fixtures are created when first requested by a test and are destroyed based on their scope:
function
: the default scope, the fixture is destroyed at the end of the test. -class
: the fixture is destroyed during teardown of the last test in the class. -module
: the fixture is destroyed during teardown of the last test in the module. -package
: the fixture is destroyed during teardown of the last test in the package where the fixture is defined, including sub-packages and sub-directories within it. -session
: the fixture is destroyed at the end of the test session.
I would also recommend going through the pytest documentation about fixture availability.
Teardown/Cleanup (see its pytest document)
If a test requires some necessary preparations by requiring one or more fixtures, we would like to have necessary clean-up so that those preparations—which are only necessary for the particular test—do not mess with any other tests.
pytest provides a very simple mechanism to achieve this, which is to use yield
instead of return
inside fixture functions. Any code placed after yield
is
considered teardown code. For example:
@pytest.fixture
def my_fixture():
my_obj = setup_code()
yield my_obj
teardown_code()
If setup_code
throws an exception, pytest will not try to run the
teardown_code
. Otherwise, pytest will always attempt to execute
teardown_code
.
It is enough for us to understand how pytest handles teardown, but it is recommended to read about safe teardown.
Overriding Fixture (see its pytest document)
Fixture functions can be defined in the same .py
file with their requiring
tests. However, it is likely to cause problems if we want to use fixtures
defined in different files. For example:
graph TD;
subgraph test_b.py
fixture_b
test_b
end
subgraph test_a.py
test_a
fixture_a
end
test_a-->|uses|fixture_b;
test_b-->|uses|fixture_a;
Therefore, pytest allows us to put fixture functions into a file called
conftest.py
so that tests from multiple test modules in the directory can
access those fixture functions.
graph TD;
subgraph test_b.py
test_b
end
subgraph test_a.py
test_a
end
subgraph conftest.py
fixture_a
fixture_b
end
test_a-->|uses|fixture_b;
test_b-->|uses|fixture_a;
Please do not mix the concept of fixture scope with this. In the following
example, the module scoped fixture function will return different fixtures for
test_a
and test_b
, but the session scoped fixture function will return the
same fixture.
# conftest.py
import random
import pytest
@pytest.fixture(scope="module")
def module_val():
"""Generate a random number."""
return random.randint(0, 9)
@pytest.fixture(scope="session")
def session_val():
"""Generate a random number."""
return random.randint(0, 9)
# test_a.py
def test_a(module_val, session_val):
# intentionally fail so we can see the values
assert 0, (module_val, session_val)
# test_b.py
def test_b(module_val, session_val):
# intentionally fail so we can see the values
assert 0, (module_val, session_val)
A sample output looks like:
FAILED test_a.py::test_a - AssertionError: (2, 4)
FAILED test_b.py::test_b - AssertionError: (6, 4)
The simplest approach to override fixtures is probably by creating a different
conftest.py
file. In the following example, we override the expected_len
fixture for special cases.
tests/
conftest.py
# content of tests/conftest.py
import pytest
@pytest.fixture
def expected_len():
return 8
test_username.py
# content of tests/test_username.py
def test_len(expected_len):
username = 'username' # ideally this is extracted from the student's file name
assert len(username) == expected_len
special_cases/
conftest.py
# content of tests/special_cases/conftest.py
import pytest
@pytest.fixture
def expected_len():
return 16 # override
test_username_special.py
# content of tests/special_cases/test_username_special.py
def test_len(expected_len):
username = 'special-username' # ideally this is extracted from the student's file name
assert len(username) == expected_len