Knuth’s Algorithm X For the Exact Cover Problem

Few if any names hold as much weight in computer science as Donald Knuth. So when knuth proposes a solution to a problem, you’d be wise to listen. Amongst his (many) famous contributions is the awesomely named “Algorithm X”. To quote wikipedia, Algorithm X is a recursive, nondeterministic, depth-first, backtracking algorithm for solving Exact Cover problems”. It then goes on to mention that the purpose was to show the “dancing links” technique in use. I am not going to going to use DLX in my implementation of Algorithm X. There is no doubt that algorithm x in combination with dancing links is an elegant slice of code. That said, hand rolling a sparse matrix implementation as suggested in DLX from circular doubly-double linked lists is not what you’d call a “simple” endeavor, nor is algorithm it’s self dependent on DLX. In todays post I am going to implement algorithm x using a full matrix implementation, which I believe is beneficial to understanding fully what it is the algorithm is actually doing, especially before trying to implement a more optimized version, such as dancing links.

The Exact Cover Problem

The exact cover problem is a special type of set cover problem. Even if you don’t exactly know what a set cover problem is, you’re probably unknowingly already familiar with them. If you’ve ever played Sudoku, that’s a set cover problem. The 8 queens puzzle is a set cover and exact cover problem. And there are many, many other real world examples. Needles to say, set cover problems are important, and efficient algorithms to solve them are equally so. The exact cover problem is known to be NP complete, but there are algorithms that solve at least smaller instances of them efficiently, such as algorithm x.

They lay mans explanation of what an exact cover is, if you have a list of lists, and exact cover is the subset of that list that contains every item in the set with no overlap. The formal definition of exact cover is this: Given a set, S which is comprised of subsets of set X, the exact cover is the subset of S, S* whose members collectively contain every member of X, and whos set intersections are mutually 0.

To look at it in another way suppose we have the following:

Set X = {1,2,3,4,5,6,7}
Set S = {A,B,C,D,E,F}
A = {1,4,7}
B = {1,4}
C = {4,5,7}
D = {3,5,6}
E = {2,3,6,7}
F = {2, 7}

Exact Cover of X = {B,D,F}

B = {1, 4, }
D = { 3, 5, 6, }
F = { 2, 7}
X = {1, 2, 3, 4, 5, 6, 7}

The above example is borrowed from the wikipedia page for exact cover problems, and is well worth a read for a more detailed overview, including an in-depth work through of the above example.

Algorithm X

The simplest way to model our problem for processing algorithmically is as an incidence matrix. An incidence matrix is a 2d boolean array, with the columns indexed by X and the rows indexed by S. As such, the above data set is transformed into the following:

   A: 1,0,0,1,0,0,1
   B: 1,0,0,1,0,0,0
   C: 0,0,0,1,1,0,1   
   D: 0,0,1,0,1,1,0
   E: 0,1,1,0,0,1,1
   F: 0,1,0,0,0,0,1

It is this representation that the description of algorithm x is described as working on. Knuth describes algorithm X as follows:

 Algorithm X(matrix A, set partial_solution):
If matrix A has no columns (is empty):
the partial_solution is a valid solution;
terminate successfully.
    step 1) choose a column c (deterministically).
    step 2) Choose a row r such that A[r][c] = 1 (nondeterministically).
    step 3) Include row r in partial_solution.
    step 4)
            For each column j such that A[r][j] = 1,
               for each row i such that A[i][j] = 1,
                    delete row i from matrix A.
            delete column j from matrix A.
    step 5) Repeat this algorithm recursively on the reduced matrix A.

All right, seems pretty straight forward… but what’s this stuff about selecting columns deterministically and rows non deterministically? Is there any pseudocode for that? Well, not exactly, but Knuth is a nice guy and he gave us some suggestions. First, he recommends when selecting a column, to chose the first column with the lowest number of 1’s from the matrix. That means if two or more columns tie, choose the column with the lowest index. Ok, not bad we can do that like this:

Integer FindColumn(matrix A):
Integer min = INF, minCol = 0
for each col in matrix A:
count = occurence of 1's in this column
if (count < min):
min = count, minCol = col;
return minCol

Next up is non deterministically choose a row from the matrix, where A[row][col] == 1. I’ll let you in on a secret. When you read “non deterministically” what they mean is try the first, and if that doesn’t work, try the next one. This gives us two bits of information moving forward: one, we need to find all the rows where A[row][col] == 1, and more importantly: this is the condition that we will be back tracking on. First, lets select the rows to try.

List<Integer> findRows(matrix A, int col):
List<Integer> rowsToTry;
for every row in matrix A:
if (A[row][col] == 1)
add row to rowsToTry
return rowsToTry

With those two helpers sketched out, along with knuths pseudocode we have what we need to start implementing. Lets start with our matrix.

#include <iostream>
#include <vector>
#include <set>
using namespace std;

typedef vector<pair<char, vector<int>>> IncidenceMatrix;

void printMatrix(IncidenceMatrix& p) {
    cout<<"\nCurrent Matrix: "<<endl;
    for (auto m : p) {
        cout<<m.first<<": ";
        for (int c : m.second) {
            cout<<c<<" ";

int main() {
    IncidenceMatrix problem = {
    algX(problem, set<char>());

I’ve “tagged’ each row of the matrix with it’s set name, because its numerical index will change as we reduce the matrix, but the set it represents must stay constant, otherwise all of this is for naught. Next is the algorithms for choosing a column and selecting rows.

int findColumn(IncidenceMatrix& p) {
    int min = p.size(), rc = 0;
    for (int col = 0; col < p[0].second.size(); col++) {
        int ones = 0;
        for (int row = 0; row < p.size(); row++) {
            if (p[row].second[col] == 1)
        if (ones < min) {
            min = ones;
            rc = col;
    return rc;

vector<int> findRows(IncidenceMatrix& p, int col) {
    vector<int> rows;
    for (int i = 0; i < p.size(); i++) {
        if (p[i].second[col] == 1) {
    return rows;

Allright, now we can start writing the depth first part of the algorithm. We know our base case is when we hit an empty matrix. When that happens, the set of rows present in our partial solution set is our exact cover, and so we use that as our terminating condition.

bool algX(IncidenceMatrix p, set<char> solnSet) {
    if (p.empty()) {
        cout<<"Exact Cover Set: ";
        for (char r : solnSet) {
            cout<<r<<" ";
        return true;

Next, we choose our column, and select our rows. This is when we start our possible backtracking. As try each row selected in turn, we add its label to our partial solution set. With each row we run our matrix reduction as described in the pseudocode, and then call algX on the reduced matrix. If that recursive call does not lead to a solution, we remove the row from the solution set (back track) and try the next row.

    int col = findColumn(p);
    for (int row : findRows(p, col)) {
        char save = p[row].first;
        if (!algX(reduceMatrix(p, row) , solnSet)) {
    return false;

If after trying all of our selected rows we have not found a solution, we return false and exit. But first, lets talk about reducing the matrix. Knuth described the process as this: “The nondeterministic choice of r means that the algorithm essentially clones itself into independent sub-algorithms; each sub-algorithm inherits the current matrix A, but reduces it with respect to a different row r.”[1] Well, we’re not going to parallelize the program, but we can do the same process linearly with a loop.

We know that we will be removing rows and columns, both of which have the potential to be very costly operations to do in-place on a 2D array structure. Instead, I recommend matrix reduction by re-writing. This also aids in our back tracking, as if the reduced matrix doesnt lead to a solution, its no big deal to throw it away since we still have our original matrix – no need to un-reduce the matrix we just reduced such as had we done the reduction in-place. We first want to analyze our matrix, and gather a list of rows, as well as a list of columns that will not be included in the next iteration. Creating the next matrix is then a simple matter of copying the current matrix, checking each row if we should skip it before adding it, and likewise for the columns”

IncidenceMatrix reduceMatrix(IncidenceMatrix& p, int row) {
    set<int> rowsToRemove, colsToRemove;
    analyzeCurrentMatrix(p, row, rowsToRemove, colsToRemove);
    return buildNextMatrix(p, rowsToRemove, colsToRemove);

void analyzeCurrentMatrix(IncidenceMatrix& p, int row, set<int>& rowsToRemove, set<int>& colsToRemove) {
    for (int col = 0; col < p[row].second.size(); col++) {
        if (p[row].second[col] == 1) {
            for (int i = 0; i < p.size(); i++) {
                if (p[i].second[col] == 1) {

IncidenceMatrix buildNextMatrix(IncidenceMatrix& p, set<int>& rowsToRemove, set<int>& colsToRemove) {
    IncidenceMatrix nextMatrix;
    for (int r = 0; r < p.size(); r++) {
        if (rowsToRemove.find(r) == rowsToRemove.end()) {
            vector<int> nrow;
            for (int c = 0; c < p[r].second.size(); c++) {
                if (colsToRemove.find(c) == colsToRemove.end()) {
            nextMatrix.push_back(make_pair(p[r].first, nrow));
    return nextMatrix;

Once we have it all put together, we can run it on the example from above, by adding diagnostic messages at important points in the algorithm, we get the following annotated output:

max@MaxGorenLaptop:~/dsa$ ./loudx
(1) Selected col: 0
Current Matrix:
A: 1 0 0 1 0 0 1
B: 1 0 0 1 0 0 0
C: 0 0 0 1 1 0 1
D: 0 0 1 0 1 1 0
E: 0 1 1 0 0 1 1
F: 0 1 0 0 0 0 1

(1) Selected row: A
(1) Removing row: A from column 0
(1) Removing row: B from column 0
(1) Removing Column: 0
(1) Removing row: A from column 3
(1) Removing row: B from column 3
(1) Removing row: C from column 3
(1) Removing Column: 3
(1) Removing row: A from column 6
(1) Removing row: C from column 6
(1) Removing row: E from column 6
(1) Removing row: F from column 6
(1) Removing Column: 6

Adding A to partial solution set.

(2) Selected col: 0

Current Matrix:
D: 0 1 1 1

(2) No eligible rows, Backtracking.

Removing: A from partial solution set.

(1) Selected row: B
(1) Removing row: A from column 0
(1) Removing row: B from column 0
(1) Removing Column: 0
(1) Removing row: A from column 3
(1) Removing row: B from column 3
(1) Removing row: C from column 3
(1) Removing Column: 3

Adding B to partial solution set.

(2) Selected col: 2

Current Matrix:
D: 0 1 1 1 0
E: 1 1 0 1 1
F: 1 0 0 0 1

(2) Selected row: D
(2) Removing row: D from column 1
(2) Removing row: E from column 1
(2) Removing Column: 1
(2) Removing row: D from column 2
(2) Removing Column: 2
(2) Removing row: D from column 3
(2) Removing row: E from column 3
(2) Removing Column: 3

Adding D to partial solution set.

(3) Selected col: 0

Current Matrix:
F: 1 1

(3) Selected row: F
(3) Removing row: F from column 0
(3) Removing Column: 0
(3) Removing row: F from column 1
(3) Removing Column: 1

Adding F to partial solution set.

matrix is empty. Exact Cover Found: { B, D, F }

And so there you have it, Algorithm X for the Exact Cover problem, sans the dancing links technique. Set covering, of which exact cover is only one type can be used to solve many types of combinatory problems, as such algorithms and techniques to find them are an important area of research. They’re pretty interesting too. Until next time, Happy Hacking.

Further Reading

[1] Knuth, Donald E,. “Dancing Links”, Stanford University,