Fun with ZDDs: Notes from Knuth’s 14th Annual Christmas Tree Lecture
Example 1. ZDD of a function
Here, Knuth took the example of family
= the family of 2-element subsets of
, that is,
and explained how to construct its ZDD (see diagram above). The family
is
, so its least support is
. The
and
families turn out to be
and
respectively. So we have LO and HI branches (dotted and solid lines in the diagram above) going from
to the ZDD of
and
that we recursively construct. Notice that no node with the same (V, LO, HI) triple appears twice in our data structure (that is, the diagram is reduced). So although Knuth is giving a Christmas “Tree” lecture, a ZDD is not a true tree because overlapping subtrees are combined into one — rather, a ZDD is a DAG (directed acyclec graph).
Example 2. ZDD-base and its computer representation
When working on real problems, there’re usually many functions we’re working with simultaneously. So instead of a ZDD representing a single function, we really have a ZDD-base where common zeads are shared across functions. Knuth constructed the ZDD-base of the two functions
and
. Inside a computer, a ZDD is represented as a linked structure, where each node of the ZDD, zead, is represented as a record with 3 fields: V (Variable ID#), LO (Low Pointer), HI (High Pointer). Sink nodes have their V field set to
, with LO and HI fields pointing back to them (see diagrams above).
ZDD-bases satisfy certain laws:
- No two nodes have the same triple (V, LO, HI) — that is, the structure is reduced.
- No node has HI = 0, except sink
— this is the reason Minato called it zero-suppressed — the idea is to “skip over” such nodes - For non-sinks f, V(LO(f)) > V(f) and V(HI(f)) > V(f) when f > 1 — that is, the nodes are ordered with V-fields are always increasing as we walk down the structure
ZDD-bases, which though conceptually easy to understand, are non-trivial to implement efficiently because behind the scenes, lots of things have to be taken care of:
- New nodes are born and old ones die — nodes have to be efficiently allocated, ref-counted and garbage-collected.
- The UNIQUE-table (hash-table for (V, LO, HI) triple) has to be efficiently maintained and kept fresh with any changes made to the ZDD. A memo-cache has to be similarly maintained for caching recently computed results (like AND/OR of function).
Knuth’s BDD15 (literate) program, available from his Downloadable Programs page should provide an instructive study of the issues and techniques involved. I gained a lot of clarity on tricky BDD topics like ref-counting nodes, reordering of variables, efficient function composition etc, by studying Knuth’s BDD14 program. I used it as a reference whenever I got stuck anywhere with my own BDD implementation. BDD15 should be equally useful, and I plan to find time to study it soon.
Example 3. Nested parentheses
Nested parentheses are a natural way to describe tree structures (LISP being a good example). In this example, Knuth showed how to represent the 5 ways to correctly nest 3 sets of parentheses using a ZDD:
| Parenthesis Nesting | ZDD subset |
|---|---|
| ( ( ( ) ) ) | ![]() |
| ( ( ) ( ) ) | ![]() |
| ( ( ) ) ( ) | ![]() |
| ( ) ( ( ) ) | ![]() |
| ( ) ( ) ( ) | ![]() |
In this representation, subscripts correspond to positions in the string with
/
denoting the kind of parenthesis. The ZDD for this represents a family of 5 sets (each containing 6 elements). In the ZDD, the variable ordering used is
(see the diagram above).
For
parentheses pairs, the ZDD contains 14 nodes and don’t appear to save a whole lot. However, if we consider
parentheses pairs, the ZDD of 602 nodes represents 1.3 trillion solutions! This compact ZDD structure encodes all those solutions without having to store them explicitly; and yet we can ask questions about the solutions. Like, say, analyze solutions that contain
. The figure 1.3 trillion comes because the number of ways to nest
pairs of parentheses in a balanced way is the Catalan number,
. See TAOCP Exercises 2.2.1 — 2, where a proof of this fact is given using a “reflection principle”; and also Eq. 2.3.4.4 — (14), which deals with enumeration of binary tree.
So, using only a small space, ZDDs can represent large families of combinatorial sets of interest, allowing efficient operations on these families and helping solve problems related to them. For example, if a family of sets is represented as a ZDD, one can quickly (in terms of the size of the ZDD) find a random set of the family.











