Software Unit Test Policy and Coverage Analysis

Read this section to learn more about the basics of software testing.


Software Unit Test Policy


Introduction

Unit testing validates the implementation of all objects from the lowest level defined in the detailed design (classes and functions) up to and including the lowest level in the architectural design (equivalent to the WBS components). The next layer of testing, known as integration testing, tests the interfaces between the architectural design objects (i.e. WBS components). This Policy only addresses Unit Testing. Refer to "Introduction into testing or why testing is worth the effort" (John R. Phillips, 2006) for a short but good discussion of Unit Testing and its transition into Integration Testing.


Types of Unit Tests

Unit tests should be developed from the detailed design of the baseline, i.e. from either the structure diagrams or the class/function definitions. The types of tests performed during unit testing include:


White-box Tests
These tests are designed by examining the internal logic of each module and defining the input data sets that force the execution of different paths through the logic. Each input data set is a test case.
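
As a minimal sketch of the idea, the following uses Python's unittest with one test case per path through a small function. The function shipping_cost and its pricing rules are hypothetical, invented only for illustration; they are not part of any LSST package.

```python
import unittest


def shipping_cost(weight_kg, express=False):
    """Hypothetical module under test with several distinct paths."""
    if weight_kg <= 0:
        raise ValueError("weight must be positive")
    # Flat rate up to 1 kg, then a per-kg surcharge.
    cost = 5.0 if weight_kg <= 1.0 else 5.0 + 2.0 * (weight_kg - 1.0)
    if express:
        cost *= 2
    return cost


class ShippingCostWhiteBoxTest(unittest.TestCase):
    """Each input data set below is chosen to force one path."""

    def test_invalid_weight_path(self):
        with self.assertRaises(ValueError):
            shipping_cost(0)

    def test_light_parcel_path(self):
        self.assertEqual(shipping_cost(1.0), 5.0)

    def test_heavy_parcel_path(self):
        self.assertEqual(shipping_cost(3.0), 9.0)

    def test_express_path(self):
        self.assertEqual(shipping_cost(1.0, express=True), 10.0)


def run_white_box_tests():
    """Run the suite and return the unittest result object."""
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(ShippingCostWhiteBoxTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_white_box_tests()
```

The inputs were picked by reading the function body, which is what distinguishes this from a specification-driven test.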


Black-box Tests

These tests are designed by examining the specification of each module and defining input data sets that will result in different behavior (e.g. outputs). Black-box tests should be designed to exercise the software for its whole range of inputs. Each input data set is a test case.
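
A black-box test can be sketched as follows, assuming a hypothetical function grade whose specification is stated only in its docstring. The test inputs are taken from the boundaries of that specification, without reference to the implementation.

```python
import unittest


def grade(score):
    """Hypothetical spec: 0-49 -> 'fail', 50-84 -> 'pass',
    85-100 -> 'distinction'; anything else raises ValueError."""
    if not 0 <= score <= 100:
        raise ValueError("score out of range")
    if score >= 85:
        return "distinction"
    if score >= 50:
        return "pass"
    return "fail"


class GradeBlackBoxTest(unittest.TestCase):
    """Input data sets derived from the specification alone."""

    def test_valid_range_boundaries(self):
        expected = {
            0: "fail", 49: "fail",
            50: "pass", 84: "pass",
            85: "distinction", 100: "distinction",
        }
        for score, label in expected.items():
            with self.subTest(score=score):
                self.assertEqual(grade(score), label)

    def test_out_of_range_inputs(self):
        for score in (-1, 101):
            with self.subTest(score=score):
                with self.assertRaises(ValueError):
                    grade(score)


def run_black_box_tests():
    """Run the suite and return the unittest result object."""
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(GradeBlackBoxTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_black_box_tests()
```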


Performance Tests
If the detailed design placed resource constraints on the performance of a module, compliance with these constraints should be tested. Each input data set is a test case.
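
A performance test might look like the sketch below. Both the function dedupe and the two-second time budget are assumptions made for illustration; a real test would take its constraint from the detailed design. Wall-clock checks can be sensitive to machine load, so the budget here is deliberately generous.

```python
import time
import unittest


def dedupe(values):
    """Hypothetical module under test: drop duplicates, keep first-seen order."""
    return list(dict.fromkeys(values))


class DedupePerformanceTest(unittest.TestCase):
    # Assumed resource constraint; not taken from any real design document.
    TIME_BUDGET_SECONDS = 2.0

    def test_large_input_within_budget(self):
        values = [i % 1000 for i in range(200_000)]
        start = time.perf_counter()
        result = dedupe(values)
        elapsed = time.perf_counter() - start
        # Check correctness as well, so a fast but wrong result still fails.
        self.assertEqual(result, list(range(1000)))
        self.assertLess(elapsed, self.TIME_BUDGET_SECONDS)


def run_performance_tests():
    """Run the suite and return the unittest result object."""
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(DedupePerformanceTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_performance_tests()
```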

The collection of a module’s white-box, black-box, and performance tests is known as a Unit Test suite. A rough measure of a Unit Test suite’s quality is the percentage of the module’s code which is exercised when the test suite is executed.


Responsibility for Implementation

Due to the nature of white-box testing, it is best if the original developer of an object creates the test suite validating that object. During later object modification, the current developer should update the test suite to validate the changed internals.

Important

LSST DM developers are responsible for creating test suites to unit test all objects they implement. Additionally, they are responsible for updating an object’s test suite after modifying the object’s source code.

The division of responsibility between DM developer groups is highlighted in the diagram below. In essence, the Applications group, which is responsible for all science algorithm development, also develops the unit testers for: algorithm components, a stage wrapped algorithm, and the sequences of stage wrapped algorithms comprising a simple-stage-tester algorithm pipeline (i.e. without parallel processing job management). The Middleware group, so named because it is responsible for all low level framework software, develops the unit testers for those framework modules. Finally, the SQA team is responsible for the higher level testers for integration, system performance, and acceptance testing.

[Figure: Test Development Responsibility]

Testing Frameworks

Test suite execution should be managed by a testing framework, also known as a test harness, which monitors the execution status of individual test cases.
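
To illustrate what monitoring execution status means in practice, the sketch below runs three deliberately different test cases through Python's unittest and reads the per-outcome counts back from the result object. The three case functions are hypothetical and exist only to produce one pass, one failure, and one error.

```python
import io
import unittest


def passing_case():
    assert 1 + 1 == 2


def failing_case():
    # An AssertionError is recorded by the framework as a failure.
    assert 1 + 1 == 3, "deliberate failure"


def erroring_case():
    # Any other exception is recorded as an error.
    raise RuntimeError("deliberate error")


def run_status_demo():
    """Run the three cases and summarize what the framework recorded."""
    suite = unittest.TestSuite(
        unittest.FunctionTestCase(func)
        for func in (passing_case, failing_case, erroring_case)
    )
    # Send the runner's text report to a buffer to keep the demo quiet.
    runner = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0)
    result = runner.run(suite)
    return {
        "run": result.testsRun,
        "failures": len(result.failures),
        "errors": len(result.errors),
        "successful": result.wasSuccessful(),
    }


if __name__ == "__main__":
    print(run_status_demo())
```

The framework keeps running after the first failure and attributes each outcome to an individual test case, which is the main advantage over a hand-written script of assertions.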


C++: boost.test

LSST DM developers should use the single-header variant of the Boost Unit Test Framework. When unit testing C++ private functions using the Boost unit test macros, refer to the standard methods described in private function testing.


Python: unittest

LSST DM developers should use Python’s unittest framework.

lsst.utils.tests provides several utilities for writing Python tests that developers should make use of. In particular, lsst.utils.tests.MemoryTestCase is used to detect memory leaks in C++ objects. MemoryTestCase should be used in all tests, even if C++ code is not explicitly referenced.

This example shows the basic structure of an LSST Python unit test module, including lsst.utils.tests.MemoryTestCase:

import unittest

import lsst.utils.tests as utilsTests


class DemoTestCase(utilsTests.TestCase):
    """Demo test case."""

    def testDemo(self):
        assert True


def suite():
    """Returns a suite containing all the test cases in this module."""
    utilsTests.init()

    suites = []
    # Test suites for this module here
    suites += unittest.makeSuite(DemoTestCase)
    # MemoryTestCase to find C++ memory leaks
    suites += unittest.makeSuite(utilsTests.MemoryTestCase)
    return unittest.TestSuite(suites)


def run(exit=False):
    """Run the tests"""
    utilsTests.run(suite(), exit)


if __name__ == "__main__":
    run(True)

Note that MemoryTestCase must always be the final test suite.


Unit Testing Composite Objects

Data Management uses a bottom-up testing method where validated objects are tested with, then added to, a validated baseline. That baseline, in turn, is used as the new validated baseline for further iterative testing. When developing test suites for composite objects, the developer should first ensure that adequate test suites exist for the base objects.


Automated Nightly and On-Demand Testing

Jenkins is a system which automates the compile/load/test cycle required to validate code changes. In particular, Jenkins automatically performs unit builds and unit tests. Catching a failure promptly expedites the module’s repair and, hopefully, limits the time other developers are impacted by the failure. For details, refer to the workflow documentation on Testing with Jenkins.


Coverage Analysis


Verifying Test Quality

Since Unit Tests are used to validate the implementation of detailed design objects through comprehensive testing, it’s important to measure the thoroughness of the test suite. Coverage analysis does this by executing instrumented code, which records the complete execution path through the code, and then calculating metrics indicative of the coverage achieved during execution.

Coverage analysis examines the output of code instrumented to record every line executed, every conditional branch taken, and every block executed. It then generates metrics on:

  • Percent of statements executed
  • Percent of methods (and/or functions) executed
  • Percent of conditional branches executed
  • Percent of a method’s (and/or function’s) entry/exit branches taken.
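
The first metric in the list above, percent of statements executed, can be sketched in a few lines using the tracing hook in the Python standard library. This is only an illustration of the principle, not how gcov works and not a substitute for a real coverage tool; the function classify and the helper names are hypothetical.

```python
import dis
import sys


def classify(x):
    if x < 0:
        return "negative"
    if x == 0:
        return "zero"
    return "positive"


def executable_lines(func):
    """Line numbers of func that carry bytecode."""
    starts = dis.findlinestarts(func.__code__)
    return {line for _, line in starts if line is not None}


def statement_coverage(func, inputs):
    """Percent of func's executable lines hit while calling it on inputs."""
    code = func.__code__
    executed = set()

    def tracer(frame, event, arg):
        # Only follow frames that belong to the function under test.
        if frame.f_code is not code:
            return None
        if event in ("call", "line"):
            executed.add(frame.f_lineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        for value in inputs:
            func(value)
    finally:
        sys.settrace(previous)  # always restore any earlier trace hook

    lines = executable_lines(func)
    return 100.0 * len(executed & lines) / len(lines)


if __name__ == "__main__":
    print("one input:    %.0f%%" % statement_coverage(classify, [5]))
    print("three inputs: %.0f%%" % statement_coverage(classify, [-1, 0, 5]))
```

With a single positive input, two of the return statements are never reached, so the reported percentage is below 100. Adding a negative and a zero input closes those holes. The exact partial percentage depends on the Python version, because versions differ in which lines they count as executable.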

The metrics give a general idea of the thoroughness of the unit tests. The most valuable aspect of most web-based coverage analysis tools is the color-coded report where the statements not exercised and the branches not taken are vividly evident. The color-coded coverage holes clearly show the developer where unit tests need improvement.

Using the coverage analysis reports, the LSST DM developer should identify code segments which have not been adequately tested and should then revise the unit test suite as appropriate. Coverage analysis reports should be generated in concert with the routine automated Jenkins testing.


DM Coverage Analysis Metrics

Refer to Code Coverage Analysis, by Steve Cornett, for a discussion of coverage metrics and to Minimum Acceptable Code Coverage, also by Steve Cornett, for the companion discussion on determining ‘good-enough’ overall test coverage.

A specific metric for lines of code executed and/or metric for branch conditionals executed is expected to be defined for Construction.


Using Coverage Analysis Tools


C++

LSST scons builds will automatically instrument all object and link modules with coverage counters when invoked with:

scons profile=gcov

This passes --coverage to all compile and link builds; this is equivalent to -fprofile-arcs -ftest-coverage on compile and -lgcov on link.

Executing the instrumented program causes coverage output to be accumulated. Each instrumented object file has two associated files in the object file’s directory: a .gcno file, written at compile time, and a .gcda file, created when the program runs. Successive runs add to the .gcda files, resulting in a cumulative picture of object coverage.

Use one of the following tools to create the coverage analysis reports to verify that your unit testing coverage is adequate. Editor’s preference is for either ggcov or tggcov since only the local source files are processed; see below for details.


gcov

gcov is the original coverage analysis tool delivered with the GNU C/C++ compilers. The coverage analysis output is placed in the current directory. The analysis is done on all source and include files to which the tool is directed so be prepared for reports on all accessed system header files if you use gcov.

Use the following to generate coverage analysis on the LSST <module>/src directory:

cd <module>
scons profile=gcov
gcov -b -o src/ src/*.cc src.gcov >& src_gcov.log
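
The annotated .gcov files that gcov writes are plain text, with each source line prefixed by an execution count and a line number. A count of - marks a non-executable line and ##### marks an executable line that was never run. The sketch below summarizes such a file in Python. The sample text is made up for illustration (it is not real LSST output), and summarize_gcov is a hypothetical helper.

```python
import re

# Made-up sample in the annotated format written by gcov.
SAMPLE_GCOV = """\
        -:    0:Source:src/Demo.cc
        -:    1:#include <cmath>
        1:    2:int sign(int x) {
        1:    3:    if (x < 0) {
        1:    4:        return -1;
        -:    5:    }
    #####:    6:    return 1;
        -:    7:}
"""

# "<count>:<line number>:" at the start of an annotated source line.
LINE_RE = re.compile(r"^\s*([^:\s]+):\s*(\d+):")


def summarize_gcov(text):
    """Return (percent of executable lines run, list of missed line numbers)."""
    executed, missed = [], []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue  # e.g. branch or function summary lines from gcov -b
        count, lineno = match.group(1), int(match.group(2))
        if lineno == 0 or count == "-":
            continue  # file preamble or non-executable line
        if count in ("#####", "====="):
            missed.append(lineno)
        else:
            executed.append(lineno)
    total = len(executed) + len(missed)
    percent = 100.0 * len(executed) / total if total else 0.0
    return percent, missed


if __name__ == "__main__":
    percent, missed = summarize_gcov(SAMPLE_GCOV)
    print("%.1f%% of lines executed; missed lines: %s" % (percent, missed))
```

The list of missed line numbers is usually the more useful output, since it points directly at the code that still needs a test case.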

ggcov

ggcov is an alternative coverage analysis tool to gcov which uses a GTK+ GUI. ggcov uses the same profiling data generated from GCC-instrumented code but uses its own analysis engine.

Use the following to bring up the ggcov GUI:

cd <module>
scons profile=gcov
ggcov -o src/

tggcov

tggcov is the non-graphical interface to ggcov.

tggcov creates its output files in the same directory as the source files. It creates analysis files for only the local source files (i.e. not the system files).

Use the following for a comprehensive coverage analysis. Output files will be in src/*.cc.tggcov:

cd <module>
scons profile=gcov
tggcov -a -B -H -L -N -o src/ src

gcov output files in git directories

gcov coverage output files should be excluded from git to avoid warnings about untracked files. To permanently ignore all gcov output files, add patterns for the .gcno and .gcda extensions (for example, *.gcno and *.gcda) to the .gitignore file.


Python

Note: No recommendations have been made for Python coverage analysis tools. The following are options to explore when time becomes available.


Coverage.py

Coverage.py, written by Ned Batchelder, is a Python module that measures code coverage during Python execution. It uses the code analysis tools and tracing hooks provided in the Python standard library to determine which lines are executable and which have been executed.


figleaf

figleaf, written by Titus Brown, is a Python code coverage analysis tool, built somewhat on the model of Ned Batchelder’s Coverage.py module. The goals of figleaf are to be a minimal replacement of Coverage.py that supports more configurable coverage gathering and reporting.


Java

No options have been researched.


Python & C++ Test Setup

DM developers frequently use the Python unittest framework to exercise C++ methods and functions. This scenario still supports the use of the C++ coverage analysis tools.

As usual, the developer instruments the C++ routines for coverage analysis at compilation time by building with scons profile=gcov. The C++ routines generated from the SWIG *.i source are also instrumented. Later, when a Python unit tester invokes an instrumented C++ routine, the coverage is recorded into the well-known coverage data files <src>.gcda and <src>.gcno. Post-processing of the coverage data files is done by the developer’s choice of C++ coverage analysis tool.


Source: The LSST Project, https://developer.lsst.io/v/DM-5063/coding/unit_test_policy.html
This work is licensed under a Creative Commons Attribution 4.0 License.

Last modified: Friday, August 13, 2021, 3:54 PM