Read this section to learn more about the basics of software testing.
Unit testing validates the implementation of all objects from the lowest level defined in the detailed design (classes and functions) up to and including the lowest level in the architectural design (equivalent to the WBS components). The next layer of testing, known as integration testing, tests the interfaces between the architectural design objects (i.e. WBS components). This Policy only addresses Unit Testing. Refer to Introduction into testing or why testing is worth the effort (John R. Phillips, 2006) for a short, but good discussion on Unit Testing and its transition into Integration Testing.
Unit tests should be developed from the detailed design of the baseline, i.e. from either the structure diagrams or the class/function definitions. The types of tests performed during unit testing include:
These tests are designed by examining the specification of each module and defining input data sets that will result in different behavior (e.g. outputs). Black-box tests should be designed to exercise the software for its whole range of inputs. Each input data set is a test case.
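As a concrete sketch (using a hypothetical `clamp` function, not part of any LSST package), the specification's input range can be partitioned into data sets that produce different behavior, with one black-box test case per partition:

```python
import unittest

def clamp(x, lo, hi):
    """Constrain x to the closed interval [lo, hi]."""
    return max(lo, min(x, hi))

class ClampBlackBoxTest(unittest.TestCase):
    """One test case per input partition taken from the specification."""

    def testBelowRange(self):
        self.assertEqual(clamp(-5, 0, 10), 0)   # input below the range

    def testInRange(self):
        self.assertEqual(clamp(7, 0, 10), 7)    # input inside the range

    def testAboveRange(self):
        self.assertEqual(clamp(42, 0, 10), 10)  # input above the range
```

Each method exercises one region of the input domain; together the three cases span the whole range of inputs, as the policy requires.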
The collection of a module’s white-box, black-box, and performance tests is known as a Unit Test suite. A rough measure of a Unit Test suite’s quality is the percentage of the module’s code which is exercised when the test suite is executed.
Due to the nature of white-box testing, it is best if the original developer of an object creates the test suite validating that object. During later object modification, the current developer should update the test suite to validate the changed internals.
LSST DM developers are responsible for creating test suites to unit test all objects they implement. Additionally, they are responsible for updating an object’s test suite after modifying the object’s source code.
The division of responsibility between DM developer groups is highlighted in the diagram below. In essence, the Applications group, which is responsible for all science algorithm development, also develops the unit testers for: algorithm components, a stage wrapped algorithm, and the sequences of stage wrapped algorithms comprising a simple-stage-tester algorithm pipeline (i.e. without parallel processing job management). The Middleware group, so named because it is responsible for all low level framework software, develops the unit testers for those framework modules. Finally, the SQA team is responsible for the higher level testers for integration, system performance, and acceptance testing.
Test suite execution should be managed by a testing framework, also known as a test harness, which monitors the execution status of individual test cases.
LSST DM developers should use the single-header variant of the Boost Unit Test Framework. When unit testing C++ private functions using the Boost util test macros, refer to the standard methods described in private function testing.
LSST DM developers should use Python’s unittest framework.
lsst.utils.tests provides several utilities for writing Python tests that developers should make use of. In particular, lsst.utils.tests.MemoryTestCase is used to detect memory leaks in C++ objects. MemoryTestCase should be used in all tests, even if C++ code is not explicitly referenced.
This example shows the basic structure of an LSST Python unit test module, including the use of MemoryTestCase:

```python
import unittest
import lsst.utils.tests as utilsTests

class DemoTestCase(unittest.TestCase):
    """Demo test case."""
    def testDemo(self):
        self.assertTrue(True)

def suite():
    """Returns a suite containing all the test cases in this module."""
    utilsTests.init()
    suites = []
    # Test suites for this module here
    suites += unittest.makeSuite(DemoTestCase)
    # MemoryTestCase to find C++ memory leaks
    suites += unittest.makeSuite(utilsTests.MemoryTestCase)
    return unittest.TestSuite(suites)

def run(shouldExit=False):
    """Run the tests"""
    utilsTests.run(suite(), shouldExit)

if __name__ == "__main__":
    run(True)
```
MemoryTestCase must always be the final test suite.
Data Management uses a bottom-up testing method where validated objects are tested with, then added to, a validated baseline. That baseline, in turn, is used as the new validated baseline for further iterative testing. When developing test suites for composite objects, the developer should first ensure that adequate test suites exist for the base objects.
Jenkins is a system which automates the compile/load/test cycle required to validate code changes. In particular, Jenkins automatically performs unit builds and unit tests; prompt notification of a failed build or test expedites the module's repair and, hopefully, limits the time other developers are impacted by the failure. For details, refer to the workflow documentation on Testing with Jenkins.
Since Unit Tests are used to validate the implementation of detailed design objects through comprehensive testing, it is important to measure the thoroughness of the test suite. Coverage analysis does this by executing an instrumented build of the code, recording the complete execution path through the code, and then calculating metrics indicative of the coverage achieved during execution.
Coverage analysis examines the output of a code instrumented to record every line executed, every conditional branch taken, and every block executed. It then generates metrics on the percentage of lines, branches, and blocks exercised.
The metrics give a general idea of the thoroughness of the unit tests. The most valuable aspect of most web-based coverage analysis tools is the color-coded report where the statements not exercised and the branches not taken are vividly evident. The color-coded coverage holes clearly show the developer where unit tests need improvement.
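The statement-coverage metric itself reduces to simple set arithmetic. The sketch below (a toy model, not the gcov data format) computes the percentage of executable lines hit by a test run:

```python
def statement_coverage(executable_lines, executed_lines):
    """Percent of executable statements exercised by the test suite."""
    if not executable_lines:
        return 100.0
    covered = executable_lines & executed_lines
    return 100.0 * len(covered) / len(executable_lines)

# Toy example: 10 executable lines, 8 of them hit during the test run;
# lines 8 and 10 are the "coverage holes" a color-coded report would flag.
executable = set(range(1, 11))
executed = {1, 2, 3, 4, 5, 6, 7, 9}
print(statement_coverage(executable, executed))  # 80.0
```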
Using the coverage analysis reports, the LSST DM developer should determine code segments which have not been adequately tested and should then revise the unit test suite as appropriate. Coverage analysis reports should be generated in concert with the routine automated buildbot testing.
Refer to Code Coverage Analysis, by Steve Cornett, for a discussion of coverage metrics and to Minimum Acceptable Code Coverage, also by Steve Cornett, for the companion discussion on determining ‘good-enough’ overall test coverage.
A specific metric for lines of code executed and/or metric for branch conditionals executed is expected to be defined for Construction.
LSST scons builds will automatically instrument all object and link modules with coverage counters when invoked with `scons profile=gcov`. This adds `--coverage` to all compile and link commands, which is equivalent to passing `-fprofile-arcs -ftest-coverage` on compile and `-lgcov` on link.
Executing the instrumented program causes coverage output to be accumulated. For each instrumented object file, the associated data files `<src>.gcda` and `<src>.gcno` are created in the object file's directory. Successive runs add to the `<src>.gcda` files, resulting in a cumulative picture of object coverage.
Use one of the following tools to create the coverage analysis reports to verify that your unit testing coverage is adequate. Editor’s preference is for either ggcov or tggcov since only the local source files are processed; see below for details.
gcov is the original coverage analysis tool delivered with the GNU C/C++ compilers. The coverage analysis output is placed in the current directory. The analysis is done on all source and include files to which the tool is directed, so be prepared for reports on all accessed system header files if you use gcov.
Use the following to generate coverage analysis on an LSST module:

```shell
cd <module>
scons profile=gcov
gcov -b -o src/ src/*.cc >& src_gcov.log
```
ggcov is an alternate coverage analysis tool to gcov which uses a GTK+ GUI. ggcov uses the same profiling data generated from a GCC instrumented code but uses its own analysis engine.
Use the following to bring up the ggcov GUI:

```shell
cd <module>
scons profile=gcov
ggcov -o src/
```
tggcov is the non-graphical interface to ggcov.
tggcov creates its output files in the same directory as the source files are located. It creates analysis files for only the local source files (i.e. not the system files).
Use the following for a comprehensive coverage analysis. Output files will be in the same directory as the source files:

```shell
cd <module>
scons profile=gcov
tggcov -a -B -H -L -N -o src/ src
```
gcov coverage output files should be identified as non-git files to avoid the git warning about untracked files. In order to permanently ignore all gcov output files, add the relevant extensions (e.g. `.gcno`, `.gcda`, `.gcov`) to the repository's `.gitignore` file.
Note: No recommendations have been made for Python coverage analysis tools. The following are options to explore when time becomes available.
Coverage.py, written by Ned Batchelder, is a Python module that measures code coverage during Python execution. It uses the code analysis tools and tracing hooks provided in the Python standard library to determine which lines are executable and which have been executed.
figleaf, written by Titus Brown, is a Python code coverage analysis tool, built somewhat on the model of Ned Batchelder’s Coverage.py module. The goals of figleaf are to be a minimal replacement of Coverage.py that supports more configurable coverage gathering and reporting.
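Coverage.py may not be installed in every environment; the underlying line-tracing idea can be illustrated with Python's standard-library trace module (a minimal sketch with a hypothetical `triangle_kind` function; Coverage.py adds far richer analysis and reporting):

```python
import trace

def triangle_kind(a, b, c):
    """Classify a triangle by its side lengths."""
    if a == b == c:
        return "equilateral"
    if a == b or b == c or a == c:
        return "isosceles"
    return "scalene"

# Count which lines execute while only the scalene path is exercised;
# the equilateral and isosceles returns remain uncovered.
tracer = trace.Trace(count=True, trace=False)
kind = tracer.runfunc(triangle_kind, 3, 4, 5)
counts = tracer.results().counts  # maps (filename, lineno) -> hit count
print(kind, len(counts) > 0)
```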
No options have been researched.
DM developers frequently use the Python unittest framework to exercise C++ methods and functions. This scenario still supports the use of the C++ coverage analysis tools.
As usual, the developer instruments the C++ routines for coverage analysis at compilation time by building with `scons profile=gcov`. The C++ wrapper routines generated by SWIG are also instrumented. Later, when a Python unit test invokes an instrumented C++ routine, the coverage is recorded into the well-known coverage data files `<src>.gcda` and `<src>.gcno`. Post-processing of the coverage data files is done by the developer's choice of C++ coverage analysis tool.
Source: The LSST Project, https://developer.lsst.io/v/DM-5063/coding/unit_test_policy.html
This work is licensed under a Creative Commons Attribution 4.0 License.