Read this section to learn about DM's software unit testing policy.
Software Unit Test Policy
Introduction
Unit testing validates the implementation of all objects from the lowest level defined in the detailed design (classes and functions) up to and including the lowest level in the architectural design (equivalent to the WBS components). The next layer of testing, known as integration testing, tests the interfaces between the architectural design objects (i.e. WBS components). This Policy only addresses Unit Testing. Refer to Introduction into testing or why testing is worth the effort (John R. Phillips, 2006) for a short, but good discussion on Unit Testing and its transition into Integration Testing.
Types of Unit Tests
Unit tests should be developed from the detailed design of the baseline, i.e. from either the structure diagrams or the class/function definitions. The types of tests performed during unit testing include:
White-box Tests
These tests are designed by examining the internal logic of each module and defining the input data sets that force the execution of different paths through the logic. Each input data set is a test case.
Black-box Tests
These tests are designed by examining the specification of each module and defining input data sets that will result in different behavior (e.g. outputs). Black-box tests should be designed to exercise the software for its whole range of inputs. Each input data set is a test case.
Performance Tests
If the detailed design placed resource constraints on the performance of a module, compliance with these constraints should be tested. Each input data set is a test case.
The collection of a module’s white-box, black-box, and performance tests is known as a Unit Test suite. A rough measure of a Unit Test suite’s quality is the percentage of the module’s code which is exercised when the test suite is executed.
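To make the distinction between these test types concrete, the following hedged sketch (not part of the policy; the clamp function and its limits are invented for illustration) shows a black-box test case derived only from a module's specification alongside white-box test cases that each force a specific internal branch:

```python
import unittest


def clamp(value, low=0.0, high=1.0):
    """Hypothetical module under test: restrict value to [low, high]."""
    if value < low:        # branch 1
        return low
    if value > high:       # branch 2
        return high
    return value           # branch 3


class ClampBlackBoxTestCase(unittest.TestCase):
    """Black-box: derived only from the specification 'result lies in [low, high]'."""

    def testWholeInputRange(self):
        for value in (-10.0, 0.0, 0.5, 1.0, 10.0):
            result = clamp(value)
            self.assertGreaterEqual(result, 0.0)
            self.assertLessEqual(result, 1.0)


class ClampWhiteBoxTestCase(unittest.TestCase):
    """White-box: each test case forces one of the three internal branches."""

    def testBelowLow(self):
        self.assertEqual(clamp(-1.0), 0.0)   # exercises branch 1

    def testAboveHigh(self):
        self.assertEqual(clamp(2.0), 1.0)    # exercises branch 2

    def testInsideRange(self):
        self.assertEqual(clamp(0.25), 0.25)  # exercises branch 3


if __name__ == "__main__":
    unittest.main()
```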
Responsibility for Implementation
Due to the nature of white-box testing, it is best if the original developer of an object creates the test suite that validates that object. When the object is later modified, the current developer should update the test suite to validate the changed internals.
Important
LSST DM developers are responsible for creating test suites to unit test all objects they implement. Additionally, they are responsible for updating an object’s test suite after modifying the object’s source code.
The division of responsibility between DM developer groups is as follows. The Applications group, which is responsible for all science algorithm development, also develops the unit testers for algorithm components, stage-wrapped algorithms, and the sequences of stage-wrapped algorithms comprising a simple-stage-tester algorithm pipeline (i.e. without parallel processing job management). The Middleware group, so named because it is responsible for all low-level framework software, develops the unit testers for those framework modules. Finally, the SQA team is responsible for the higher-level testers used in integration, system performance, and acceptance testing.
Testing Frameworks
Test suite execution should be managed by a testing framework, also known as a test harness, which monitors the execution status of individual test cases.
C++: boost.test
LSST DM developers should use the single-header variant of the Boost Unit Test Framework. When unit testing C++ private functions with the Boost unit test macros, refer to the standard approach described in private function testing.
Python: unittest
LSST DM developers should use Python’s unittest framework.
lsst.utils.tests provides several utilities for writing Python tests that developers should make use of. In particular, lsst.utils.tests.MemoryTestCase is used to detect memory leaks in C++ objects. MemoryTestCase should be used in all tests, even if C++ code is not explicitly referenced.

This example shows the basic structure of an LSST Python unit test module, including lsst.utils.tests.MemoryTestCase:

```python
import unittest

import lsst.utils.tests as utilsTests


class DemoTestCase(utilsTests.TestCase):
    """Demo test case."""

    def testDemo(self):
        assert True


def suite():
    """Returns a suite containing all the test cases in this module."""
    utilsTests.init()

    suites = []
    # Test suites for this module here
    suites += unittest.makeSuite(DemoTestCase)
    # MemoryTestCase to find C++ memory leaks
    suites += unittest.makeSuite(utilsTests.MemoryTestCase)
    return unittest.TestSuite(suites)


def run(exit=False):
    """Run the tests"""
    utilsTests.run(suite(), exit)


if __name__ == "__main__":
    run(True)
```

Note that MemoryTestCase must always be the final test suite.
Unit Testing Composite Objects
Data Management uses a bottom-up testing method where validated objects are tested with, then added to, a validated baseline. That baseline, in turn, is used as the new validated baseline for further iterative testing. When developing test suites for composite objects, the developer should first ensure that adequate test suites exist for the base objects.
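As a hedged illustration only (the Point, Box, and CompositeShape names and their test modules are hypothetical), a composite object's test module might follow the suite() pattern shown above and also pull in the base objects' already-validated test cases, making the bottom-up dependence explicit:

```python
import unittest

import lsst.utils.tests as utilsTests

# Hypothetical base-object test cases, assumed to exist in this package's tests/
from testPoint import PointTestCase
from testBox import BoxTestCase


class CompositeShapeTestCase(utilsTests.TestCase):
    """Tests for a hypothetical composite object built from Point and Box."""

    def testConstruction(self):
        assert True  # placeholder for real composite-object checks


def suite():
    """Returns a suite for the composite object and its base objects."""
    utilsTests.init()

    suites = []
    # Confirm the validated baseline before testing the composite itself
    suites += unittest.makeSuite(PointTestCase)
    suites += unittest.makeSuite(BoxTestCase)
    suites += unittest.makeSuite(CompositeShapeTestCase)
    # MemoryTestCase must remain the final test suite
    suites += unittest.makeSuite(utilsTests.MemoryTestCase)
    return unittest.TestSuite(suites)


if __name__ == "__main__":
    utilsTests.run(suite(), True)
```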
Automated Nightly and On-Demand Testing
Jenkins is a system which automates the compile/load/test cycle required to validate code changes. In particular, Jenkins automatically performs unit builds and unit tests and reports any failures; this expedites the module's repair and, hopefully, limits the time other developers are impacted by the failure. For details, refer to the workflow documentation on Testing with Jenkins.
Coverage Analysis
Verifying Test Quality
Since Unit Tests are used to validate the implementation of detailed design objects through comprehensive testing, it is important to measure the thoroughness of the test suite. Coverage analysis does this by executing an instrumented version of the code, which records the complete execution path taken, and then calculating metrics indicative of the coverage achieved during execution.
Coverage analysis examines the output of a code instrumented to record every line executed, every conditional branch taken, and every block executed. It then generates metrics on:
- Percent of statements executed
- Percent of methods (and/or functions) executed
- Percent of conditional branches executed
- Percent of a method’s (and/or function’s) entry/exit branches taken.
The metrics give a general idea of the thoroughness of the unit tests. The most valuable aspect of most web-based coverage analysis tools is the color-coded report where the statements not exercised and the branches not taken are vividly evident. The color-coded coverage holes clearly show the developer where unit tests need improvement.
Using the coverage analysis reports, the LSST DM developer should identify the code segments which have not been adequately tested and then revise the unit test suite as appropriate. Coverage analysis reports should be generated in concert with the routine automated Jenkins testing.
DM Coverage Analysis Metrics
Refer to Code Coverage Analysis, by Steve Cornett, for a discussion of coverage metrics and to Minimum Acceptable Code Coverage, also by Steve Cornett, for the companion discussion on determining ‘good-enough’ overall test coverage.
Specific metrics for lines of code executed and/or branch conditionals executed are expected to be defined for Construction.
Using Coverage Analysis Tools
C++
LSST scons builds will automatically instrument all object and link modules with coverage counters when invoked with:

```bash
scons profile=gcov
```

This passes --coverage to all compile and link builds; this is equivalent to -fprofile-arcs -ftest-coverage on compile and -lgcov on link.
Executing the instrumented program causes coverage output to be accumulated. For each instrumented object file, the associated files .gcda and .gcno are created in the object file's directory. Successive runs add to the .gcda files, resulting in a cumulative picture of object coverage.
Use one of the following tools to create the coverage analysis reports to verify that your unit testing coverage is adequate. Editor’s preference is for either ggcov or tggcov since only the local source files are processed; see below for details.
gcov
gcov is the original coverage analysis tool delivered with the GNU C/C++ compilers. The coverage analysis output is placed in the current directory. The analysis is done on all source and include files to which the tool is directed, so be prepared for reports on all accessed system header files if you use gcov.
Use the following to generate coverage analysis on the LSST <module>/src directory:

```bash
cd <module>
scons profile=gcov
gcov -b -o src/ src/*.cc >& src_gcov.log
```
ggcov
ggcov is an alternate coverage analysis tool to gcov which uses a GTK+ GUI. ggcov uses the same profiling data generated from a GCC instrumented code but uses its own analysis engine.
Use the following to bring up the ggcov GUI:

```bash
cd <module>
scons profile=gcov
ggcov -o src/
```
tggcov
tggcov is the non-graphical interface to ggcov.
tggcov creates its output files in the same directory as the source files are located. It creates analysis files for only the local source files (i.e. not the system files).
Use the following for a comprehensive coverage analysis. Output files will be in src/*.cc.tggcov:

```bash
cd <module>
scons profile=gcov
tggcov -a -B -H -L -N -o src/ src
```
gcov output files in git directories
gcov coverage output files should be identified as non-git files to avoid the git warning about untracked files. In order to permanently ignore all gcov output files, add the extensions .gcno and .gcda to the .gitignore file.
Python
Note: No recommendations have been made for Python coverage analysis tools. The following are options to explore when time becomes available.
Coverage.py
Coverage.py, written by Ned Batchelder, is a Python module that measures code coverage during Python execution. It uses the code analysis tools and tracing hooks provided in the Python standard library to determine which lines are executable and which have been executed.
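As a hedged sketch only (not a DM recommendation), Coverage.py's programmatic API could wrap an existing unittest run roughly as follows; this assumes the coverage package is installed and that the test modules live in a tests/ directory:

```python
import unittest

import coverage

cov = coverage.Coverage()              # collect execution data
cov.start()

# Discover and run the package's unittest-based test modules
suite = unittest.defaultTestLoader.discover("tests")
unittest.TextTestRunner(verbosity=2).run(suite)

cov.stop()
cov.save()
cov.report()                           # per-file percent of statements executed
```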
figleaf
figleaf, written by Titus Brown, is a Python code coverage analysis tool, built somewhat on the model of Ned Batchelder’s Coverage.py module. The goals of figleaf are to be a minimal replacement of Coverage.py that supports more configurable coverage gathering and reporting.
Java
No options have been researched.
Python & C++ Test Setup
DM developers frequently use the Python unittest framework to exercise C++ methods and functions. This scenario still supports the use of the C++ coverage analysis tools.
As usual, the developer instruments the C++ routines for coverage analysis at compilation time by building with scons profile=gcov. The C++ routines generated from the SWIG *.i source are also instrumented. Later, when a Python unit tester invokes an instrumented C++ routine, the coverage is recorded into the well-known coverage data files <src>.gcda and <src>.gcno. Post-processing of the coverage data files is done by the developer's choice of C++ coverage analysis tool.
Source: The LSST Project, https://developer.lsst.io/v/DM-5063/coding/unit_test_policy.html
This work is licensed under a Creative Commons Attribution 4.0 License.