Generating Crossword Grids Using Constraint Programming

by Philippe Olivier
Created 2023-10-10

A crossword is a puzzle where a grid of white and black squares must be filled with letters in order to form valid words across in the rows and down in the columns. We will build a constraint programming model to generate crossword grids. The goal of this blog post is to show the thought process that goes into modeling a non-trivial problem. We start small and tackle the modeling difficulties one at a time, adding more components to the model along the way.

1. Parameters and data

To model this problem we will use CP-SAT, from the OR-Tools optimization suite. The input for this model consists in a word list and a grid size, and our plan is for the model to describe a valid grid of the given size containing words from the word list. First, we initialize the model and the parameters.

from ortools.sat.python import cp_model

model = cp_model.CpModel()

rows = 2
columns = 5

We start with a small 2×5 grid to test the model as we're building it. The word list is a file with one word per line. Since the solver only works with integers, we want to convert the letters of the words to integers, i.e., A=1, B=2, etc.

def word_to_numbers(word):
    """Return the list of numbers associated with a word."""
    return list(' ABCDEFGHIJKLMNOPQRSTUVWXYZ'.find(letter)
                for letter in word.upper())


def load_words(path):
    """Load the words from a file and return a wordlist.

    The input file must have one word per line.

    Wordlist structure:

    {
      2: [[1, 25], ..., [19, 7]],
      5: [[16, 15, 7, 23, 18], ..., [4, 8, 11, 10, 16]],
      ...
    }
    """
    with open(path, "r", encoding="utf-8") as f:
        words = list(word.strip() for word in f.readlines())

    wordlist = {}

    for word in words:
        word_length = len(word)

        if word_length in wordlist:
            wordlist[word_length].append(word_to_numbers(word))

        else:
            wordlist[word_length] = [word_to_numbers(word)]

    return wordlist

We create a small word list named wordlist.txt, containing the following words:

ATI
LAGER

Let's see how this turns out once the words are processed by the above functions.

wordlist = load_words("wordlist.txt")

print(wordlist)

{
  3: [[1, 20, 9]],
  5: [[12, 1, 7, 5, 18]]
}

The contents of wordlist are key-value pairs, with the keys the word lengths, and the values the list of words (in letter-integer format) of the key's length. Now that we have our data and our parameters, we can start to build the model.

2. White and black squares

The first step is to represent a grid of size rows × columns, with white squares (for letters) and black squares (to separate words). We define integer variables L ("letters") indicating what letter, if any, is in a given square.

# L[r][c] denotes the letter at row `r` and column `c`.
# A = 1, B = 2, etc, and a black square is 0.
L = [[model.NewIntVar(0, 26, f"L[{r}][{c}]")
      for c in range(columns)]
     for r in range(rows)]

We can already predict that we will have to define a fair number of constraints on black squares. While L already tracks them (when L[r][c] == 0), we define boolean variables B to specifically track if a square is black or not. This will be helpful later on to express constraints on the black squares.

# B[r][c] == 1 if there is a black square at row `r` and column `c`, 0 otherwise.
B = [[model.NewBoolVar(f"B[{r}][{c}]")
      for c in range(columns)]
     for r in range(rows)]

We need to link L with B, so that L[r][c] and B[r][c] both indicate the same thing about a square (whether is contains a letter or not). Once they are linked, setting L[r][c] to a value will force the associated B[r][c] to be consistent with it, and vice-versa.

# Link B and L.
for r in range(rows):
    for c in range(columns):
        model.Add(L[r][c] == 0).OnlyEnforceIf(B[r][c])
        model.Add(L[r][c] != 0).OnlyEnforceIf(B[r][c].Not())

Let's make sure that this works by solving this model and comparing the values of L and B.

import sys

def solve_and_print(show_B=False):
    """Solve the model and print the solution."""
    solver = cp_model.CpSolver()
    status = solver.Solve(model)
    status_name = solver.StatusName(status)
    print(f"Status: {status_name}")
    print()

    if status_name != "OPTIMAL":
        print("We have a problem :(")
        sys.exit()

    for r in range(rows):
        for c in range(columns):
            print('.ABCDEFGHIJKLMNOPQRSTUVWXYZ'[solver.Value(L[r][c])], end="")
        print()

    print()
    if show_B:
        for r in range(rows):
            for c in range(columns):
                print(solver.Value(B[r][c]), end="")
            print()

solve_and_print(show_B=True)

Status: OPTIMAL

AAAAA
.....

00000
11111

It seems to work correctly. If L[r][c] shows a letter, the associated B[r][c] shows 0, and if L[r][c] shows a black square (denoted by the dot), the associated B[r][c] shows 1. Let's push this a little further to see if a conflict between L and B is correctly detected. We force a letter on L[r][c] but force a black square on the associated B[r][c].

model.Add(L[0][0] == 26)  # We set L to 26 (the letter Z)
model.Add(B[0][0] == 1)   # We set B to a black square

solve_and_print(show_B=True)

Status: INFEASIBLE

We have a problem :(

There is of course no solution, because a square cannot be both a letter and not a letter at the same time. The basics of L and B are working, and we can now work on constraining the black squares.

3. Black squares

We notice that the previous solution included a whole row of black squares. This happened because currently nothing prevents any square, or combination of squares, from being black. The whole grid could be made of black squares for that matter, and still be valid. To rid us of this issue, we can enforce that a black square should not be directly bordered, on either side across or down, by another black square. This is achieved by summing all pairs of adjacent B squares across and down, and constraining these sums to be lower or equal to 1. This means that two adjacent squares will either both be white, or a single one of them will be black.

# Black squares may not be adjacent to each other on the top, bottom, left, or right.
for r in range(rows-1):      # Top
    for c in range(columns):
        model.Add(B[r][c] + B[r+1][c] <= 1)

for r in range(1, rows):     # Bottom
    for c in range(columns):
        model.Add(B[r][c] + B[r-1][c] <= 1)

for c in range(1, columns):  # Left
    for r in range(rows):
        model.Add(B[r][c] + B[r][c-1] <= 1)

for c in range(columns-1):  # Right
    for r in range(rows):
        model.Add(B[r][c] + B[r][c+1] <= 1)

solve_and_print(show_B=True)

Status: OPTIMAL

AAAAA
AAAAA

00000
00000

The resulting solution happens to have no black squares, but if it did have any, no two would be directly adjacent to each other. We will see later that the above constraints are too crude to cover all the cases that are of interest to us, and we will require something more sophisticated. But this works well enough for now.

4. From letters to words

We can track letters with L and black squares with B, but we currently have no way of knowing which words are formed by the letters in L, and how they are separated by the B squares. We need to devise a way to track words on the grid, and this implies that:

All words in the grid should be in the word list,
Words should be separated by a black square,
Words should not span past the edges of the grid,
There should be no overlap of words in the same direction, e.g., if CATFISH is in the grid, then it should not be overlapped by CAT and FISH (nor should it be overlapped by any other unrelated word, or part thereof).

This is going to be the biggest component of the model. We will start by modeling this for across words. Once we're done, we should be able to easily adapt the logic for down words.

We create a 3-dimensional matrix of boolean variables A ("across"), indicating if an across word of lenght l starts at row r and column c.

# A[l][r][c] == 1 if an across word of length `l` starts at row `r` and column `c`, 0 otherwise.
A = [[[model.NewBoolVar(f"A[{l}][{r}][{c}]")
       for c in range(columns)]
      for r in range(rows)]
     for l in range(columns+1)]

For example, if the value of A[3][4][5] is equal to 1, this means that an across word of length 3 starts at row 4 and column 5. Conversely, if A[3][4][5] == 0, this means that no across word of length 3 starts at row 4 and column 5 (but a word of a different length could start there, or not, or it could be a black square, or a letter from another word starting elsewhere).

We can use the contents of A to prevent the overlapping of words and to ensure that words are separated by black squares. When A[3][4][5] == 1, for instance, no other across word of any length is allowed to start at row r and column c, as well as in any column 3 squares to the right (this includes the black square that marks the end of the A[3][4][5] word). And A[3][4][5] is allowed to be equal to 1 in the first place because these same constraints enforce that it does not overlap any word on its left.

We also ensure that words do not span past the edges of the grid. If the grid contains 6 columns, for example, then obviously we can set A[3][4][5] to 0 since any 3-letter word starting at column 5 is bound to go past the right edge of the grid.

Not only do we want words to be non-overlapping and in valid positions, but we also want them to be words from the word list. Forcing a sequence of letters to represent a valid word can be achieved with a table constraint, which CP-SAT refers to as allowed assignments. This constraint takes as input a list of variables and a set of tuples of values, and enforces that these variables assume values from one of the tuples. For instance, table([x, y], [(1, 4), (9, 3)]) forces the pair of variables (x, y) to be assigned either the values (1, 4) or (9, 3) (and nothing else). We will use these table constraints to ensure that only words from the word list are found in the grid.

If the only 3-letter words in the word list are ACT and CAR, and that A[3][4][5] == 1, then the associated table constraint ensures that (L[4][5], L[4][6], L[4][7]) are assigned either (1, 3, 20) (ACT) or (3, 1, 18) (CAR). However, we don't want this table constraint enforce anything if A[3][4][5] == 0, so we need to find a way to activate or deactivate it depending on the value of the boolean variable A[3][4][5]. Unfortunately, CP-SAT does not (yet) support turning on and off the AddAllowedAssignments() constraints depending on the value of a boolean variable, so we have to improvise.

def table(model, variables, tuples, option):
    """Add an optional TABLE (AddAllowedAssignments()) constraint to a model.

    A TABLE constraint is added to `model`, where `variables` assume the values of one of `tuples`,
    but only when `option` is true.
    """
    # One boolean variable per tuple indicates if the values of that tuple are assigned to the
    # variables or not.
    b = [model.NewBoolVar(f"b[{i}]") for i in range(len(tuples))]

    # Set of values that can be assigned to the various variables.
    possible_values = {i: set() for i in range(len(variables))}
    max_value = max(max(t) for t in tuples)
    for t in tuples:
        for i, j in enumerate(t):
            possible_values[i].add(j)

    # is_assigned[i][j] indicates if variable `i` is assigned value `j`.
    is_assigned = [[model.NewBoolVar(f"is_assigned[{i}][{j}]")
                    for j in range(max_value+1)]
                   for i in range(len(variables))]

    # Some assignments are impossible since the value is found in no tuple.
    for i in range(len(variables)):
        for j in range(max_value+1):
            if j not in possible_values[i]:
                model.Add(is_assigned[i][j] == 0).OnlyEnforceIf(option)

    # One value must be assigned to each variable.
    for i in is_assigned:
        model.Add(cp_model.LinearExpr.Sum(i) == 1).OnlyEnforceIf(option)

    # Link `is_assigned` and `variables`.
    for i in range(len(variables)):
        for j in range(max_value+1):
            model.Add(variables[i] == j).OnlyEnforceIf(is_assigned[i][j])

    # TABLE constraint.
    for i, t in enumerate(tuples):
        model.AddBoolAnd([is_assigned[j][t[j]] for j in range(len(t))]).OnlyEnforceIf(b[i])

    # Only one tuple may be assigned to the variables.
    model.Add(cp_model.LinearExpr.Sum(b) == 1).OnlyEnforceIf(option)

The above is not technically a table constraint in the sense in which we usually understand it in constraint programming (a global constraint with a filtering algorithm, etc). But it is the best we can do with the means available to us, so for the purposes of the model we will call it a table constraint. If a future CP-SAT version has the option of turning AddAllowedAssignments() on and off with a boolean variable, then we will be able to use that as a real table constraint.

We have everything we need to correctly constrain the values of A and those of L (with the table constraints). We can enforce the desired behavior of these variables by going through the A variables one by one:

If there are no words of length l in the word list, then all the A[l][*][*] can be set to 0 since no word of length l can start anywhere in the grid.
If l is too large for the grid, or if not enough squares remain on the right to fit a word of length l, then we can set those variables to 0. For example, if the grid contains 6 columns, then all A[3][*][5] can be set to 0.
We prevent word overlap and separate words with black squares by enforcing sums on A. For instance, if A[3][4][5] == 1 then the constraint A[*][4][4] + A[*][4][5] + A[*][4][6] + A[*][4][7] + A[*][4][8] == 1 (the three letters of the word, plus the two bordering black squares) is enforced.
The table constraints force certain words in certain positions, and are activated (or deactivated) by the values of the A boolean variables.

# Across constraints. These also prevent any overlap between across words.
for l in range(columns+1):
    for r in range(rows):
        for c in range(columns):
            # A[l][r][c] == 0 if there are no words of length `l` in the word list.
            if l not in wordlist:
                model.Add(A[l][r][c] == 0)

            # A[l][r][c] == 0 if not enough squares remain on the right to fit the word.
            elif l > columns - c:
                model.Add(A[l][r][c] == 0)

            # The word fits in the row. If a word of length `l` starts at row `r` and column `c`, no
            # other word may start inside of it or adjacent to it. The word should also be bordered
            # on each side either by the grid edge or by a black square.
            else:
                # The word fills the whole row.
                if l == columns:
                    model.Add(cp_model.LinearExpr.Sum(list(A[i][r][c+j]
                                                           for i in range(columns+1)
                                                           for j in range(columns-c)))
                              == 1).OnlyEnforceIf(A[l][r][c])

                # The word starts on the left edge and does not fill the whole row.
                elif c == 0:
                    model.Add(cp_model.LinearExpr.Sum(list(A[i][r][c+j]
                                                           for i in range(columns+1)
                                                           for j in range(l+1)))
                              == 1).OnlyEnforceIf(A[l][r][c])

                # The word ends on the right edge and does not fill the whole row.
                elif c + l == columns:
                    model.Add(cp_model.LinearExpr.Sum(list(A[i][r][c+j]
                                                           for i in range(columns+1)
                                                           for j in range(columns-c)))
                              == 1).OnlyEnforceIf(A[l][r][c])

                # The word does not start or end on an edge.
                else:
                    model.Add(cp_model.LinearExpr.Sum(list(A[i][r][c-1+j]
                                                           for i in range(columns+1)
                                                           for j in range(l+2)))
                              == 1).OnlyEnforceIf(A[l][r][c])

                # Since `AddAllowedAssignments().OnlyEnforceIf()` is not (yet) supported, we have to
                # do it by hand.
                table(model, [L[r][c+i] for i in range(l)], wordlist[l], A[l][r][c])

Recall that the words in our word list are ATI and LAGER. If A[5][0][0] is set to 1, for example, the table constraint should force the word LAGER to be assigned to the first row of the grid.

model.Add(A[5][0][0] == 1)

solve_and_print()

Status: OPTIMAL

LAGER
UU...

We get LAGER in the first row as expected, but where is ATI? We know that A forces L to assume specific values because of the table constraints. The problem here is that A and L are half-reified, meaning that while A[l][r][c] == 1 forces L to assume some values (the letters of a word), neither L[r][c] nor B[r][c] imply anything about A[*][r][c]. In other words, the fact that A implies something about L and B does not mean that L and B imply anything about A. In the above case, the solver simply sets all A[*][1][*] variables to 0, allowing the L[1][*] and B[1][*] variables to assume any value (in this case, the letter U or a black square). We need to link A to B, and since B is linked to L, this should sort everything out.

# If there is a letter in a square, then an across word covering that letter must be active.
for r in range(rows):
    for c in range(columns):
        model.Add(cp_model.LinearExpr.Sum(list(A[i][r][c-j]
                                               for i in range(columns+1)
                                               for j in range(min(i, c+1))))
                  == 1).OnlyEnforceIf(B[r][c].Not())

What the above does is to enforce that if a certain square is not black, then there must be an across word somewhere on the left long enough to cover that square.

solve_and_print()

Status: OPTIMAL

LAGER
LAGER

Everything appears to work now. To summarize, our model can now:

Keep track of letters with the L integer variables.
Keep track of black squares with the B boolean variables.
Keep track of whole across words with the A boolean variables and a bunch of other stuff (no word overlap, etc).

The word LAGER appears twice in the grid, so the next step is obvious: We need to eliminate duplicate words from the grid.

5. Duplicate words

At first glance, the solution to the issue of preventing duplicate words may not appear obvious. Upon closer inspection, however, it seems like the key to solving this probably lies with the table constraints, as it is they who actually set the letters of whole words in L. If these constraints can set letters in L, what would prevent them from setting other values elsewhere?

What if we gave each word a unique integer identifier? We could then have variables keep track of which identifier is used (i.e., which word is in the grid), and have our table constraints set the word identifier values as well as the word letters themselves. Then, we would just have to ensure that every identifier is unique, and voilà!

First, we must assign a unique identifier to each word. The easiest way to do so is when we are initially loading words. Let's modify the load_words() function and add an identifier as an extra "letter" at the end of each word.

def load_words(path):
    """Load the words from a file and return a wordlist.

    The input file must have one word per line.

    Wordlist structure:

    {
      2: [[1, 25, 0], ..., [19, 7, 62]],
      5: [[16, 15, 7, 23, 18, 121], ..., [4, 8, 11, 10, 16, 1062]],
      ...
    }

    Note: The last item in each tuple is the word identifier.
    """
    with open(path, "r", encoding="utf-8") as f:
        words = list(word.strip() for word in f.readlines())

    wordlist = {}

    i = 0  # Identifier counter.

    for word in words:
        word_length = len(word)

        # We add the identifier as an "extra" letter at the end of the word.
        if word_length in wordlist.keys():
            wordlist[word_length].append(word_to_numbers(word) + [i])

        else:
            wordlist[word_length] = [word_to_numbers(word) + [i]]

        i += 1  # Increment the identifier for the next word.

    return wordlist

We need to create variables to track the identifiers of the words found in the grid. We define integer variables IA ("identifiers across") to track the identifiers of the across words.

# IA[r][c] denotes the identifier of the across word starting at row `r` and column `c`.
IA = [[model.NewIntVar(0, sum(len(wordlist[l]) for l in wordlist) + rows*columns*2, f"IA[{r}][{c}]")
       for c in range(columns)]
      for r in range(rows)]

Note that the domain of the IA variables is larger than the total number of identifiers. Since the number of words in the grid will always be lower than the number of squares in the grid (because words have more than one letter and the grid contains black squares), and that the number of words in the word list could be small, we need to have enough values in IA to go around.

Now we need to adjust the table constraints in the nested loop used for the across words. We simply replace table(model, [L[r][c+i] for i in range(l)], wordlist[l], A[l][r][c]) with table(model, [L[r][c+i] for i in range(l)] + [IA[r][c]], wordlist[l], A[l][r][c]) in order to allow the constraint to set the across identifier along with the letters of the word.

What this means for IA is that if A[l][r][c] == 1, then IA[r][c] == X, where X is the identifier of the word of length l starting at row r and column c. Conversely, if A[l][r][c] == 0, then IA[r][c] == X, where X is any free identifier not used by a "real" word on the grid.

Since we'll soon need ID ("identifiers down") for the down words, let's get that out of the way.

# ID[r][c] denotes the identifier of the down word starting at row `r` and column `c`.
ID = [[model.NewIntVar(0, sum(len(wordlist[l]) for l in wordlist) + rows*columns*2, f"ID[{r}][{c}]")
       for c in range(columns)]
      for r in range(rows)]

Finally, we want all the values found in IA and ID to be different. This way, no word should be found more than once in the grid.

# Ensure that all identifiers are different.
model.AddAllDifferent([IA[r][c]
                       for r in range(rows)
                       for c in range(columns)] +
                      [ID[r][c]
                       for r in range(rows)
                       for c in range(columns)])

Let's see how we fare now.

solve_and_print()

Status: OPTIMAL

LAGER
.ATI.

We're getting there! Now we need to implement this for down words.

6. Down words

The logic is very similar for down words. We just have to make minor adjustments to what has been done for the across words.

# D[l][r][c] == 1 if a down word of length `l` starts at row `r` and column `c`, 0 otherwise.
D = [[[model.NewBoolVar(f"D[{l}][{r}][{c}]")
       for c in range(columns)]
      for r in range(rows)]
     for l in range(rows+1)]

# Down constraints. These also prevent any overlap between down words.
for l in range(rows+1):
    for r in range(rows):
        for c in range(columns):
            # D[l][r][c] == 0 if there are no words of length `l` in the word list.
            if l not in wordlist:
                model.Add(D[l][r][c] == 0)

            # D[l][r][c] == 0 if not enough squares remain on at the bottom to fit the word.
            elif l > rows - r:
                model.Add(D[l][r][c] == 0)

            # The word fits in the column. If a word of length `l` starts at row `r` and column `c`,
            # no other word may start inside of it or adjacent to it. The word should also be
            # bordered on each side either by the grid edge or by a black square.
            else:
                # The word fills the whole column.
                if l == rows:
                    model.Add(cp_model.LinearExpr.Sum(list(D[i][r+j][c]
                                                           for i in range(rows+1)
                                                           for j in range(rows-r)))
                              == 1).OnlyEnforceIf(D[l][r][c])

                # The word starts on the top edge and does not fill the whole column.
                elif r == 0:
                    model.Add(cp_model.LinearExpr.Sum(list(D[i][r+j][c]
                                                           for i in range(rows+1)
                                                           for j in range(l+1)))
                              == 1).OnlyEnforceIf(D[l][r][c])

                # The word ends on the bottom edge and does not fill the whole column.
                elif r + l == rows:
                    model.Add(cp_model.LinearExpr.Sum(list(D[i][r+j][c]
                                                           for i in range(rows+1)
                                                           for j in range(rows-r)))
                              == 1).OnlyEnforceIf(D[l][r][c])

                # The word does not start or end on an edge.
                else:
                    model.Add(cp_model.LinearExpr.Sum(list(D[i][r-1+j][c]
                                                           for i in range(rows+1)
                                                           for j in range(l+2)))
                              == 1).OnlyEnforceIf(D[l][r][c])

                # Since `AddAllowedAssignments().OnlyEnforceIf()` is not (yet) supported, we have to
                # do it by hand.
                table(model, [L[r+i][c] for i in range(l)] + [ID[r][c]], wordlist[l], D[l][r][c])

# If there is a letter in a square, then a down word covering that letter must be active.
for r in range(rows):
    for c in range(columns):
        model.Add(cp_model.LinearExpr.Sum(list(D[i][r-j][c]
                                               for i in range(rows+1)
                                               for j in range(min(i, r+1))))
                  == 1).OnlyEnforceIf(B[r][c].Not())

The across and down logic is fully implemented. We want the final grid to look like this:

LAGER
ATI.E

We need to add a the missing down words to the word list:

ATI
LAGER
LA
AT
GI
RE

Let's run this again to get the grid we expect.

solve_and_print()

Status: INFEASIBLE

We have a problem :(

Well, this is definitely not what we hoped for. The model is correct, and our word list contains all the necessary words… or does it? Let's take another look at the expected output:

LAGER
ATI.E

We notice that the black square creates two "lone letters", i.e., words containing a single letter. Firstly, we have the E from LAGER in the top row that creates a lone letter in the down direction. And secondly, we have the E from RE in the rightmost column that creates a lone letter in the across direction. Let's see if that was really the problem by adding these two lone letters to the word list and running the model again.

ATI
LAGER
LA
AT
GI
RE
E
E

solve_and_print()

Status: OPTIMAL

LAGER
ATI.E

We are making good progress. The next step is to deal with the problem of lone letters.

7. Lone letters

The naive solution to the problem of lone letters is to simply add a bunch of single letters to the word list in order to get rid of the issue. But this is neither efficient nor elegant, so let's tackle this properly. Recall the previous constraints enforcing that if a square is not black (i.e., if it is a letter), then a word must cover it (below is for across).

# If there is a letter in a square, then an across word covering that letter must be active.
for r in range(rows):
    for c in range(columns):
        model.Add(cp_model.LinearExpr.Sum(list(A[i][r][c-j]
                                               for i in range(columns+1)
                                               for j in range(min(i, c+1))))
                  == 1).OnlyEnforceIf(B[r][c].Not())

We want to transform the condition OnlyEnforceIf(B[r][c].Not()) into something that is not just about black squares, but also includes lone letters. We define lone letters as letters that are bordered on each side, across or down (depending on the context) with a grid edge and/or a black square. We start by defining boolean variables, across and down, to keep track of lone letters.

# LLA[r][c] == 1 if the across letter at row `r` and column `c` is a lone letter, 0 otherwise.
LLA = [[model.NewBoolVar(f"LLA[{r}][{c}]")
        for c in range(columns)]
       for r in range(rows)]
        
# LLD[r][c] == 1 if the down letter at row `r` and column `c` is a lone letter, 0 otherwise.
LLD = [[model.NewBoolVar(f"LLD[{r}][{c}]")
        for c in range(columns)]
       for r in range(rows)]

We now need to constrain the values of LLA ("lone letter across") and LLD ("lone letter down"). If a letter is bordered by an edge, then it is a lone letter if the square on its other side is black. And if a letter is bordered by two black squares, then it is a lone letter.

# Constraints for LLA. A letter is a lone letter if it is bordered on each side with a grid edge or
# a black square.
for r in range(rows):
    # Edge-adjacent squares.
    model.Add(LLA[r][0] == B[r][1])
    model.Add(LLA[r][columns-1] == B[r][columns-2])

    # Other squares.
    for c in range(1, columns-1):
        model.Add(B[r][c-1] + B[r][c+1] == 2).OnlyEnforceIf(LLA[r][c])
        model.Add(B[r][c-1] + B[r][c+1] <= 1).OnlyEnforceIf(LLA[r][c].Not())

# Constraints for LLD. A letter is a lone letter if it is bordered on each side with a grid edge or
# a black square.
for c in range(columns):
    # Edge-adjacent squares.
    model.Add(LLD[0][c] == B[1][c])
    model.Add(LLD[rows-1][c] == B[rows-2][c])

    # Other squares.
    for r in range(1, rows-1):
        model.Add(B[r-1][c] + B[r+1][c] == 2).OnlyEnforceIf(LLD[r][c])
        model.Add(B[r-1][c] + B[r+1][c] <= 1).OnlyEnforceIf(LLD[r][c].Not())

We're halfway there. We define LLAB ("lone letter across or black square") and LLDB ("lone letter down or black square").

# LLAB[r][c] == 1 if the across letter at row `r` and column `c` is a lone letter or a black square,
# 0 otherwise.
LLAB = [[model.NewBoolVar(f"LLAB[{r}][{c}]")
         for c in range(columns)]
        for r in range(rows)]

# LLDB[r][c] == 1 if the down letter at row `r` and column `c` is a lone letter or a black square,
# 0 otherwise.
LLDB = [[model.NewBoolVar(f"LLDB[{r}][{c}]")
         for c in range(columns)]
        for r in range(rows)]

We enforce constraints on LLAB and LLDB by using the values of LLA, LLD, and B. LL*B[r][c] == 1 if either LL*[r][c] == 1, or if B[r][c] == 1.

# Constraints for LLAB.
for r in range(rows):
    for c in range(columns):
        model.Add(LLA[r][c] + B[r][c] >= 1).OnlyEnforceIf(LLAB[r][c])
        model.Add(LLA[r][c] + B[r][c] == 0).OnlyEnforceIf(LLAB[r][c].Not())

# Constraints for LLDB.
for r in range(rows):
    for c in range(columns):
        model.Add(LLD[r][c] + B[r][c] >= 1).OnlyEnforceIf(LLDB[r][c])
        model.Add(LLD[r][c] + B[r][c] == 0).OnlyEnforceIf(LLDB[r][c].Not())

Finally, we replace the previous conditions OnlyEnforceIf(B[r][c].Not()) with OnlyEnforceIf(LLAB[r][c].Not()) and OnlyEnforceIf(LLDB[r][c].Not()).

# If there is a letter in a square, and that this letter is not a lone letter, then an across word
# covering that letter must be active.
for r in range(rows):
    for c in range(columns):
        model.Add(cp_model.LinearExpr.Sum(list(A[i][r][c-j]
                                               for i in range(columns+1)
                                               for j in range(min(i, c+1))))
                  == 1).OnlyEnforceIf(LLAB[r][c].Not())

# If there is a letter in a square, and that this letter is not a lone letter, then a down word
# covering that letter must be active.
for r in range(rows):
    for c in range(columns):
        model.Add(cp_model.LinearExpr.Sum(list(D[i][r-j][c]
                                               for i in range(rows+1)
                                               for j in range(min(i, r+1))))
                  == 1).OnlyEnforceIf(LLDB[r][c].Not())

This looks good, let's see what we get now.

solve_and_print()

Status: OPTIMAL

.U.U.
A.U.U

Well, this works (technically), albeit a little too well. Now the problem is that a grid full of lone letters is considered valid. We fix this by saying that a letter is not allowed to be a lone letter both across and down. This way, every letter has to be part of a "real" word.

# A letter shouldn't be a lone letter both across and down.
for r in range(rows):
    for c in range(columns):
        model.Add(LLA[r][c] + LLD[r][c] <= 1)

Does this fix the issue?

solve_and_print()

Status: OPTIMAL

LAGER
A.I.E

It almost works. We get a valid grid, although it is not the full grid that we expected:

LAGER
ATI.E

The black squares still need some more modeling work.

8. Black squares (again)

We could still improve the model and get the full grid we want by adding an objective that minimizes the sum of black squares.

# Minimize the number of black squares.
model.Minimize(cp_model.LinearExpr.Sum(
    list(cp_model.LinearExpr.Sum(
        list(c for c in row)) for row in B)))

solve_and_print()

Status: OPTIMAL

LAGER
ATI.E

This is as perfect as we can make it. However, we will see later that this model does not scale very well. As such, we choose to forego an objective function.

While we have discarded the idea of minimizing the number of black squares, we should still ensure that their number is kept in check. Based on observations of some of the New York Times' 13×13 crossword grids, the ratio of black squares to white squares is about one black square for every five white squares. We enforce such a ratio on our grids, to make sure that we don't end up with too many black squares.

# Limit the number of black squares.
model.Add(cp_model.LinearExpr.Sum(
    list(cp_model.LinearExpr.Sum(
        list(c for c in row)) for row in B))
          <= int(rows*columns/5))  # int(x) is the same as floor(x)

9. Finishing touches

It is often the case that while everything seems to work fine on a small scale, some unexpected issues pop up when the size of the problem is increased. Let's increase the size of the grid to 6×6 and change the word list to:

ACT
ASK
BAD
BAR
BED
BUY
CAMERA
COURSE

The solver outputs the following grid.

solve_and_print()

Status: OPTIMAL

.B.B.B
CAMERA
.R.D.D
A.B.A.
COURSE
T.Y.K.

Interesting! If you draw a horizontal line right in the middle of the grid, you get two smaller grids that are entirely independent from each other. In other words, the bigger grid is simply comprised two smaller grids glued together. This is undesirable since we prefer that all words be more tightly interconnected. This can be fixed by ensuring that the number of black squares found in adjacent rows or columns be strictly smaller than the size of such rows or columns.

# The words in the grid must be fully interconnected (we should not be able to split a grid into
# smaller sub-grids).
for r in range(rows-1):
    model.Add(cp_model.LinearExpr.Sum(list(B[r][c] for c in range(columns))) +
              cp_model.LinearExpr.Sum(list(B[r+1][c] for c in range(columns)))
              < columns)

for c in range(columns-1):
    model.Add(cp_model.LinearExpr.Sum(list(B[r][c] for r in range(rows))) +
              cp_model.LinearExpr.Sum(list(B[r][c+1] for r in range(rows)))
              < rows)

Let's try this on a smaller grid with the full word list.

solve_and_print()

Status: OPTIMAL

BABE
ALT.
CA.A
K.BW

Well, now you can draw a diagonal line to split the grid in two! Preventing all diagonals of black squares going from one edge of the grid to another would not be enough, as we could still generate grids like this one:

.XXX
X.X.
XX.X
XXXX

We have to rethink our black square constraints from scratch. The critical observation to make here is that a 3×3 grid cannot possibly be divided into independent sub-grids if the number of black squares in it is less than or equal to 2. Even cutting a corner of the the grid would be impossible as that would result in a lone letter both across and down, which is already forbidden:

XXX
XX.
X.X

It follows that any combination of such grids cannot split a larger grid into independent sub-grids. Let's scrap our previous black square constraints (even the early ones preventing adjacent black squares), and replace them with this new constraint.

# Prevent 3x3 sub-grids from having too many black squares.
# This also prevents independent sub-grids.
if rows >= 3 and columns >= 3:
    for r in range(0, rows-2):
        for c in range(0, columns-2):
            model.Add(cp_model.LinearExpr.Sum(list(B[i][j]
                                                   for i in range(r, r+3)
                                                   for j in range(c, c+3)))
                      <= 2)

Note that we still keep the black square constraint enforcing the ratio of black squares to white squares.

This completes the modeling of this problem. The complete source code of the model can be found here. Now let's run some benchmarks!

10. Benchmarks

10.1. Preliminary notes

The data consists in 200 words each for the lengths 2 to 10, for a total of 1800 words. These are all common English words. The tests were performed on an i5-1240P CPU with 32GB RAM, using OR-Tools v9.7 on Linux v6.5.4. All tests are averaged over 10 instances. We track the time taken to build the model, the time taken to solve it, and the total time elapsed (build time + solve time).

10.2. Results

For a typical use-case of crossword grid generation, we would like to allow larger grids to have longer words (in addition to all the shorter word found in smaller grids). As such, we are first interested in the scalability of the model w.r.t. the combined effects of the grid size and the number of words in the word list. We compare the build and solve times of grid sizes ranging from 2×2 to 10×10. The word lists used for each grid include all words of length 2, up to the size of the grid. For instance, the results for x=5 show the build/solve/total times for generating a 5×5 grid using all the 2, 3, 4, and 5-letter words (800 words total).

The model does not seem to scale very well under these conditions. In fact, for the larger grid sizes, it becomes a struggle to even build the model, let alone solve it. This is due to the increasing number of table constraints, that are themselves made up of several variables and constraints.

For the second test, we try using the same set of words for all grid sizes. We use all 2, 3, 4, and 5-letter words (800 words total) for grid sizes 5×5 to 10×10.

The model now appears to scale slightly better with the grid size. There are still more table constraints on larger grids for the same words as for the smaller grids, obviously. However, since we use the same word list for all grids, we are not introducting a whole new host of table constraints for longer words in larger grids.

In the third test, we use the same 10×10 grid size but change the number of words in the word list. The x-axis indicates the largest words allowed in the word list (all shorter words are also used). For example, the word list for x=5 will include all 2, 3, 4, and 5-letter words (800 words total).

With a fixed grid size, the model appears to scale somewhat linearly with the number of words in the word list. In short, once a grid size has been chosen, increasing the size of the word list is not going to have a dramatic impact.

Finally, we generate one 11×11 crossword grid using the whole word list. This is the largest problem size that will fit in the memory of my machine (32GB RAM + 32GB swap). This includes the memory required to store the search tree until a solution is found. The build time was 235s and the solve time was 730s (965s total), resulting in the following grid:

ARABIC.AZ.A
..DA.ALT.JR
DAS.BROTHER
BALLOT.AL.A
.R.GB.AC.AN
BOB.BACKING
ANALYSTS.AE
S.CC.HI.HH.
ECO.BLOCKED
.ANCIENT.IO
LN.BOYS.AMD

The model is very memory-hungry for grids containing 100 squares or more.

11. Conclusion

We have shown how to build a non-trivial model step by step. We started out with two simple and unconstrained grids, one for letters and one for black squares. We constrained these grids further and further along the way, to match the requirements of the problem. As we got deeper into modeling, we had to take a few steps back and rework some constraints concerning black squares. In the end, we were able to get interesting crossword grids.

We went through a lot of trouble to modelize the black tile placement constraints. Instead, we could have started off with a grid prefilled with some black squares, and simply filled it with words. This would have allowed us to directly use the built-in table constraint without the need for a boolean-based activation/deactivation scheme. In fact, this would have spared us most of the modeling effort, needing only a few table constraints, along with constraints preventing duplicate words.

So, why not just do this and call it a day? This would work very well, as long as the word list contains enough words. If you're looking to build, say, a thematic grid, then your word list is likely to be small. By using a grid prefilled with black squares, you're cutting the solution space by a lot. This means that while testing a word list on a specific grid template is going to be easy, finding a template that works with the word list is going to be hard. Our model directly encapsulates the black tile placement dimension of the problem.

This adventure is not over. The model could yet be enhanced in several ways. We could try to improve the performance of the model by replacing the table constraints with regular constraints. A crossword expert could create a partial grid, and the model could handle the trickier parts and finish the job. Finally, we could scrape Wikipedia pages for words on a given topic to automatically create thematic word lists, and use those to generate thematic crossword grids (then use LLMs to create tricky word hints for the word list).