Recursive Data Structures

Read this page. In the previous unit of our course we studied recursive algorithms. Recursion is a concept that also applies to data. Here we look at recursive data structures - lists, trees, and sets. A list is a structure that consists of elements linked together. If an element is linked to more than one element, the structure is a tree. If each element is linked to two (sub) elements, it is called a binary tree. Trees can be implemented using lists, as shown in the resource for this unit. Several examples of the wide applicability of lists are presented. A link points to all the remaining links, i.e. the rest of the list or the rest of the tree; thus, a link points to a list or to a tree - this is data recursion.

The efficiency of the programming process includes both running time and size of data. This page discusses the latter for recursive lists and trees.

Lastly, why read the last section on sets? Sets are another recursive data structure and the last section 2.7.6, indicates their connection with trees, namely, a set data type can be implemented in several different ways using a list or a tree data type. Thus, the programming process includes implementation decisions, in addition, to design or algorithm decisions. Each of these types of decisions is constrained by the features of the programming language used. The decision choices, such as which data structure to use, will impact efficiency and effectiveness of the program's satisfaction of the program's requirements.

Note: You will notice an unusual use of C++ here. What the author is doing is showing how to pass a fixed-value data-structure as a calling argument.

9. Optimizing recursive algorithms with isomorphic data structures

Many images have large regions that are entirely white or black. When superimposing one region on another, if either region is entirely white, we know the result must be the same as the other region. When superimposing one region on another, if either region is entirely black, the result must be entirely black.

We can use the quadtree's hierarchal representation to exploit this. We'll store some extra information in each quadtree, its colour: If it is entirely white, its colour will be white. If it is entirely black, its colour will be black. And if it contains a mix of white and black cells, its colour will be a question mark.

const isOneByOneArray = (something) =>
  Array.isArray(something) && something.length === 1 &&
  Array.isArray(something[0]) && something[0].length === 1;

const contentsOfOneByOneArray = (array) => array[0][0];

const divideSquareIntoRegions = (square) => {
  const upperHalf = firstHalf(square);
  const lowerHalf = secondHalf(square);

  const upperLeft = upperHalf.map(firstHalf);
  const upperRight = upperHalf.map(secondHalf);
  const lowerRight = lowerHalf.map(secondHalf);
  const lowerLeft= lowerHalf.map(firstHalf);

  return [upperLeft, upperRight, lowerRight, lowerLeft];
};

const colour = (something) => {
  if (something.colour != null) {
    return something.colour;
  } else if (something === '⚪️') {
    return '⚪️';
  } else if (something === '⚫️') {
    return '⚫️';
  } else {
    throw "Can't get the colour of this thing";
  }
};

const combinedColour = (...elements) =>
  elements.reduce((acc, element => acc === element ? element : '❓'))

const regionsToQuadTree = ([ul, ur, lr, ll]) => ({
    ul, ur, lr, ll, colour: combinedColour(ul, ur, lr, ll)
  });

const arrayToQuadTree = multirec({
  indivisible: isOneByOneArray,
  value: contentsOfOneByOneArray,
  divide: divideSquareIntoRegions,
  combine: regionsToQuadTree
});

arrayToQuadTree(
  [ ['⚪️', '⚪️'],
    ['⚪️', '⚪️'] ]
).colour
  //=> "⚪️"

arrayToQuadTree(
  [ ['⚪️', '⚪️'],
    ['⚪️', '⚫️'] ]
).colour
  //=> "❓"

arrayToQuadTree(
  [ ['⚫️', '⚫️'],
    ['⚫️', '⚫️'] ]
).colour
  //=> "⚫️"

arrayToQuadTree(
  [ ['⚪️', '⚪️', '⚪️', '⚪️'],
    ['⚪️', '⚫️', '⚪️', '⚪️'],
    ['⚫️', '⚪️', '⚪️', '⚪️'],
    ['⚫️', '⚫️', '⚫️', '⚪️']]
).colour
  //=> "❓"

Now, we can take advantage of every region's computed colour when we superimpose "coloured" quadtrees:

const eitherAreEntirelyColoured = ({ left, right }) =>
  colour(left) !== '❓' || colour(right) !== '❓' ;

const superimposeColoured = ({ left, right }) => {
    if (colour(left) === '⚪️' || colour(right) === '⚫️') {
      return right;
    } else if (colour(left) === '⚫️' || colour(right) === '⚪️') {
      return left;
    } else {
      throw "Can't superimpose these things";
    }
  };

const divideTwoQuadTrees = ({ left, right }) => [
    { left: left.ul, right: right.ul },
    { left: left.ur, right: right.ur },
    { left: left.lr, right: right.lr },
    { left: left.ll, right: right.ll }
  ];

const combineColouredRegions = ([ul, ur, lr, ll]) => ({
    ul, ur, lr, ll, colour: combinedColour(ul, ur, lr, ll)
  });

const superimposeColouredQuadTrees = multirec({
  indivisible: eitherAreEntirelyColoured,
  value: superimposeColoured,
  divide: divideTwoQuadTrees,
  combine: combineColouredRegions
});

We get the same output, but now instead of comparing every cell whenever we superimpose quadtrees, we compare entire regions at a time. If either is "entirely coloured," we can return the other one without recursively drilling down to the level of individual pixels.

There is no savings if both quadtrees are composed of a fairly evenly spread mix of black and white pixels (e.g. a checkerboard pattern), but in cases where there are large expanses of white or black, the difference is substantial.

In the case of comparing the 4x4 canvas and glider images above, the  superimposeArray function requires sixteen comparisons. The  superimposeQuadTrees function requires twenty comparisons. But the superimposeColouredQuadTrees function requires just seven comparisons.

If we were writing an image manipulation application, we'd provide much snappier behaviour using coloured quadtrees to represent images on screen.

The interesting thing about this optimization is that it is tuned to the characteristics of both the data structure and the algorithm: It is not something that is easy to perform in the algorithm without the data structure, or in the data structure without the algorithm.

And it's not the only optimization. Remember our ‘whirling regions' implementation of rotateQuadTree? Here's rotateColouredQuadTree:

const isEntirelyColoured = (something) =>
  colour(something) !== '❓' ;

const rotateColouredQuadTree = multirec({
  indivisible : isEntirelyColoured,
  value : itself,
  divide: quadTreeToRegions,
  combine: regionsToRotatedQuadTree
});

Any region that is entirely white or entirely black is its own rotation, so no further dividing and conquering need be done. For images that have large areas of blank space, the "whirling regions" algorithm is not just aesthetically delightful, it's faster than a brute-force transposition of array elements.

Optimizations like this can only be implemented when the algorithm and the data structure are isomorphic to each other.