Solving MathDoku with Mixed-Integer Programming

KenKen™ is Hebrew for YesYes

julia

math

programming

Author

Vincent “VM” Mercator

Published

October 28, 2023

Modified

May 13, 2024

MathDoku (also known as KenKen™, KenDoku™, CalcuDoku, and SquareLogic) is a mathematical logic puzzle that I really liked in middle school. Like Sudoku, MathDoku is based on the Latin square: the goal of the puzzle is to fill in an N-by-N grid with the digits 1 to N in such a way that each row and column contains one of each digit. The MathDoku puzzle grid is further subdivided into cages, collections of adjacent cells whose numbers must perform an operation to reach a target value. With a little bit of mathematical rearranging, we can convert MathDoku puzzles into a mixed-integer programming problem and leverage the power of linear programming to solve them.

In general, optimization problems require making the right decisions subject to some constraints to complete some objective as much or as little as possible. To convert this into mathematics, we need to find the chosen extremum of some objective function by choosing the right decision variables that are subject to constraints. Linear programming is a technique used for solving optimization problems where both the objective function and constraints are defined by linear equations, equalities, or inequalities relating to the decision variables. Mixed-integer linear programming is a further restriction on linear programming, where some to all decision variables are integers.¹ I’m not going to go into too much detail about linear programming in this article, but it’s a fascinating tool to use for solving problems. If you want to learn more about linear programming, I highly recommend watching “What in the world is a linear program?” by OptWhiz, because I think it’s the best introduction to linear programming that I’ve seen so far.

Furthermore, I’ve chosen Julia as my language of choice for this article, since it is more suited to scientific computing problems than Python and its one-based indexing makes formulating the problem a little easier. For optimizing, I’m using the JuMP optimization library to connect to HiGHS, a powerful linear optimization tool.

using Dates
using HiGHS
using JuMP
using Printf

Setting up the MathDoku in Julia

Here is a 6-by-6 MathDoku puzzle that I generated with KSudoku on my computer that we’re going to solve using mixed-integer programming. If you want to try this puzzle yourself, you can download the XML file.

A 6-by-6 MathDoku Puzzle. Cage 1 has the instruction 30-times, and has cells at indices (1, 1), (1, 2), (1, 3). Cage 2 has the instruction 7-plus, and has cells at indices (1, 4), (1, 5). Cage 3 has the instruction 2-equals, and has cells at indices (1, 6). Cage 4 has the instruction 30-times, and has cells at indices (2, 1), (3, 1), (4, 1). Cage 5 has the instruction 2-times, and has cells at indices (2, 2), (2, 3). Cage 6 has the instruction 21-plus, and has cells at indices (2, 4), (2, 5), (2, 6), (3, 5). Cage 7 has the instruction 2-divide, and has cells at indices (3, 2), (4, 2). Cage 8 has the instruction 4-times, and has cells at indices (3, 3), (3, 4). Cage 9 has the instruction 90-times, and has cells at indices (3, 6), (4, 5), (4, 6), (5 5). Cage 10 has the instruction 90-times, and has cells at indices (4, 3), (5, 2), (5, 3). Cage 11 has the instruction 3-equals, and has cells at indices (4, 4). Cage 12 has the instruction 1-equals, and has cells at indices (5, 1). Cage 13 has the instruction 120-times, and has cells at indices (5, 4), (6, 3), (6, 4), (6, 5). Cage 14 has the instruction 3-minus, and has cells at indices (5, 6), (6, 6). Cage 15 has the instruction 1-minus, and has cells at indices (6, 1), (6, 2). — The 6x6 MathDoku puzzle used in this article.

Operations as an Enum

Most variations of MathDoku that I’ve seen have five different types of operations, which I’m going to represent using an enum. I’ve described the operations for each cage in the table below.

Cage Type	Symbol	Cage Rule Description
Equality	= (or None)	The one cell value equals the target number.
Addition	+	The cell values sum together to the target number.
Subtraction	-	The two cell values have a difference equal to the target number.
Multiplication	×	The cell values have the target number as their product.
Division	÷	The maximum cell value divided by the minimum cell value equals the target number.

"""
Different operations that a MathDoku cage can have.
"""
@enum Operation begin
    equals
    plus
    minus
    times
    divide
end

Side Note

Each operation has special requirements about cage sizes. Equality must contain 1 cell, subtraction cages and division cages must contain 2 cells, and addition and multiplication cages must have at least 2 cells. Checking for this requirement is outside the scope of this article.

Cages as a Struct

We can represent a cage its target number, its operation, and the MathDoku puzzle grid coordinates that it occupies. We can represent these coordinates as a vector of CartesianIndices with length 2. This built-in type can extract values from matrices, which makes the most sense for keeping track of cage locations on the MathDoku puzzle’s grid. The CartesianIndex’s first element corresponds to a cell’s row, and its second element corresponds to a cell’s column.

"""
A single cage (collection of cells) in a MathDoku puzzle.
"""
struct Cage
    """Correct value when applying the operation on the cell values."""
    target::Int
    """Mathematical operation used to produce the result."""
    operation::Operation
    """Coordinate pairs belonging to this cage."""
    coords::Vector{CartesianIndex{2}}
    """Instantiate Cage with a tuple vector to avoid repeatedly typing `CartesianIndex`."""
    function Cage(target::Int, operation::Operation, coords::Vector{Tuple{Int,Int}})
        return new(target, operation, [CartesianIndex(i) for i in coords])
    end
end;

Repeatedly typing out CartesianIndex multiple times when defining a cage makes reading code difficult, so I’m going to instead create a constructor method that converts an inputted vector of tuples into a vector of CartesianIndices. Here are all fifteen cages from my MathDoku puzzle instantiated as Cage structs.

c1 = Cage(30, times, [(1, 1), (1, 2), (1, 3)])
c2 = Cage(7, plus, [(1, 4), (1, 5)])
c3 = Cage(2, equals, [(1, 6)])
c4 = Cage(30, times, [(2, 1), (3, 1), (4, 1)])
c5 = Cage(2, times, [(2, 2), (2, 3)])
c6 = Cage(21, plus, [(2, 4), (2, 5), (2, 6), (3, 5)])
c7 = Cage(2, divide, [(3, 2), (4, 2)])
c8 = Cage(4, times, [(3, 3), (3, 4)])
c9 = Cage(90, times, [(3, 6), (4, 5), (4, 6), (5, 5)])
c10 = Cage(90, times, [(4, 3), (5, 2), (5, 3)])
c11 = Cage(3, equals, [(4, 4)])
c12 = Cage(1, equals, [(5, 1)])
c13 = Cage(120, times, [(5, 4), (6, 3), (6, 4), (6, 5)])
c14 = Cage(3, minus, [(5, 6), (6, 6)])
c15 = Cage(1, minus, [(6, 1), (6, 2)])

my_cages = (c1, c2, c3, c4, c5, c6, c7, c8, c9, c10, c11, c12, c13, c14, c15);

Side Note

Cages that belong to the same MathDoku puzzle are not supposed to overlap (i.e., share cell coordinates). Checking for this is also outside the scope of this article.

Assembling the MathDoku Puzzle

Armed with all the necessary components, we can now define our MathDoku puzzle as the set of cages that define it. To calculate the MathDoku puzzle’s size, we sum up the lengths of each cage’s coordinates vector and calculate the square root. This value is important for building the linear program; I’m going to include it as a part of the struct as well by creating a constructor method that calculates it.

"""
A MathDoku puzzle.
"""
struct MathDoku
    """Cages belonging to the MathDoku puzzle."""
    cages::Tuple{Vararg{Cage}}
    """Size length of the MathDoku puzzle."""
    size::Int
    """Instantiate a new MathDoku puzzle by inputting the cages that define it."""
    function MathDoku(cages::Tuple{Vararg{Cage}})
        return new(cages, Int(√sum([length(cage.coords) for cage in cages])))
    end
end;

Let’s define the MathDoku puzzle we want to solve. We’ve already defined its cages, so we can reuse them in the definition.

md = MathDoku(my_cages);

Solving the MathDoku

Now that we have our MathDoku, let’s solve it using mixed-integer programming.

md_model = Model(HiGHS.Optimizer)

A JuMP Model
├ solver: HiGHS
├ objective_sense: FEASIBILITY_SENSE
├ num_variables: 0
├ num_constraints: 0
└ Names registered in the model: none

Each cell in an N-by-N MathDoku puzzle has a possibility of being any one of N numbers from 1 to N, but must be one number at a time. So, we can represent these values for a single cell as a one-hot encoded vector of binary values with size N where the position of the true value in the vector corresponds to the value of the cell.

For example:

[false, true, false, false, false, false] => 2

We have one of these vectors for each cell in the MathDoku puzzle, so we can represent this as an N-by-N-by-N multidimensional binary array C. The first dimension corresponds to the values stored in each row of the puzzle, the second dimension corresponds to the values stored in each column of the puzzle, and the third dimension corresponds to which digit is in that cell. The candidate vectors for row i and column j are in the 1-by-1-by-N “tube” that contains all elements located in row i and column j of the multidimensional array.

\[ c_{i,j,k} = \begin{cases} 1 & \text{if cell }(i, j) = k\\ 0 & \text{otherwise} \end{cases} \quad \forall \left(i, j, k\right) \in \{1, \ldots, N\}^3 \]

@variable(md_model, candidates[row=1:(md.size), col=1:(md.size), digit=1:(md.size)], Bin);

Row, Column, and Candidate Constraints

Since we know that the solved MathDoku puzzle is a Latin square, we know that it must have one number of each type per row and column. To do this, we begin adding constraints to our candidate variables such that every row of candidate variables and column of candidate variables must sum to one. As previously mentioned, we also need to constrain the candidates so that only one value in each 1-by-1-by-N tube is true at a time.²

This leads to the following constraints:

\[ \begin{aligned} \text{rows: } & \sum_{i = 1}^{N} c_{i,j,k} = 1 \quad \forall \left(j, k\right) \in \{1, \ldots, N\}^2 \\ \text{columns: } & \sum_{j = 1}^{N} c_{i,j,k} = 1 \quad \forall \left(i, k\right) \in \{1, \ldots, N\}^2 \\ \text{candidates: } & \sum_{k = 1}^{N} c_{i,j,k} = 1 \quad \forall \left(i, j\right) \in \{1, \ldots, N\}^2 \\ \end{aligned} \]

This is how we write those constraints in Julia.

# Latin Square Constraints
for a in 1:(md.size)
    for b in 1:(md.size)
        # Only one number of each type per row
        @constraint(md_model, sum(candidates[:, a, b]) == 1)
        # Only one number of each type per column
        @constraint(md_model, sum(candidates[a, :, b]) == 1)
        # Only one number can occupy each cell
        @constraint(md_model, sum(candidates[a, b, :]) == 1)
    end
end

Cage Constraints for Cell Values

Let’s define some expressions to make constraining the cells in each cage easier. These are shorthand for specific linear combinations of variables that we can reuse so that we don’t have to clutter our constraints with overly complicated equations.

Let’s define the N-by-N matrix of expressions V that contains the values of each cell in the MathDoku grid. Each element in V is equal to the corresponding tube of candidates multiplied element-wise by their position and summed together, as shown here.³

\[ v_{i,j} = \sum_{k = 1}^{N} \left(k \times c_{i,j,k}\right) \quad \forall \left(i, j\right) \in \{1, \ldots, N\}^2 \]

@expression(
    md_model,
    vals[row=1:(md.size), col=1:(md.size)],
    sum(candidates[row, col, :] .* collect(1:(md.size)))
);

Equality Cage Constraints

Equality cages are the easiest cages to form constraints around. The value of the single cell in the cage v_i,j must be equal to the target number t. If we wanted to, we could expand this by substituting v_i,j with its definition.⁴

\[ \begin{aligned} &v_{i,j} &= t \\ \Leftrightarrow & \sum_{k = 1}^{N} \left(k \times c_{i,j,k}\right) & = t \end{aligned} \]

Addition Cage Constraints

Addition cages are a little harder, but still doable. The sum of the values in cage A must be equal to the target number t.⁵

\[ \sum_{(i, j) \in \mathbf{A}} v_{i,j} = t \]

Subtraction Cage Constraints

Subtraction cages are the first cage type that we have to introduce more auxiliary variables to deal with; this is because the raw constraint required for subtraction cages shown below is nonlinear.

\[ \begin{aligned} \left| v_{i_1, j_1} - v_{i_2, j_2} \right| & = t \\ & \Downarrow & \\ \{ v_{i_1, j_1} - v_{i_2, j_2} = t \} & \text{ or } \{ v_{i_1, j_1} - v_{i_2, j_2} = -t \} \end{aligned} \]

To solve this, we’ll have to add a binary variable δ to linearize the solution. Using this binary variable δ, We can set the right-hand side of the equation equal to be equal to either t or -t, but not both.⁶

\[ \delta \in \{0, 1\} \]

\[ \left.\begin{matrix} \{ v_{i_1, j_1} - v_{i_2, j_2} &=& -t \} & \text{ or} \\ \{ v_{i_1, j_1} - v_{i_2, j_2} &=& t \} \end{matrix}\right\} \Rightarrow v_{i_1, j_1} - v_{i_2, j_2} = t \left(2 \delta - 1 \right) \]

Division Cage Constraints

While we could get by with subtraction cages by adding auxiliary variables, division itself is nonlinear! How are we supposed to rewrite a division cage’s constraints below as a system of linear equations?

\[ \{ v_{i_1, j_1} \div v_{i_2, j_2} = t \} \text{ or } \left\{ v_{i_1, j_1} \div v_{i_2, j_2} = \frac{1}{t} \right\} \]

We can use properties of logarithms to help us out. By definition, the product of two numbers is equal to the sum of their logarithms, and the difference of two numbers is equal to the difference of their logarithms. The logarithm of the inverse of a number is equal to the negative logarithm of that number.

\[ \begin{aligned} \log(a \times b) &= \log(a) + \log(b) \\ \log(a \div b) &= \log(a) - \log(b) \\ \log\left(\frac{1}{a}\right) &= -\log(a) \end{aligned} \]

By using logarithm properties, we can perform substitution to convert our division constraints into a subtraction constraint and condense it using the same tactic for subtraction cages. But, the resulting constraint is a linear combination of the logarithms of our values.

\[ \begin{aligned} \{ v_{i_1, j_1} \div v_{i_2, j_2} = t \} & \text{ or } \left\{ v_{i_1, j_1} \div v_{i_2, j_2} = \frac{1}{t} \right\} \\ & \Downarrow & \\ \{ \log(v_{i_1, j_1}) - \log(v_{i_2, j_2}) = \log(t) \} & \text{ or } \{ \log(v_{i_1, j_1}) - \log(v_{i_2, j_2}) = -\log(t) \} \\ & \Downarrow & \\ \log\left(v_{i_1, j_1}\right) - \log\left(v_{i_2, j_2}\right) & = \log(t) \left(2 \delta - 1 \right) \end{aligned} \]

We can create a new expression matrix L that will be equal to the log of the value expressions V when the puzzle is fully solved. Since each 1x1xN “tube” of candidate values should have one true value that corresponds to V, we can define the corresponding log-value in L equal to summing the element-wise product with the logarithms of 1 to N. We can use these expressions to skirt around the fact that logarithms themselves are also nonlinear.⁷

\[ \begin{aligned} \ell_{i,j} &= \sum_{k = 1}^{N} \left(\log(k) \times c_{i,j,k}\right) \quad \forall \left(i, j\right) \in \{1, \ldots, N\}^2 \\ \ell_{i,j} &= \log\left(v_{i,j}\right) \text{ when constraints are solved}\ \end{aligned} \]

With the elements of the expression matrix L, we can write the constraints for the division cage below.

\[ \begin{aligned} \log\left(v_{i_1, j_1}\right) - \log\left(v_{i_2, j_2}\right) & = \log(t) \left(2 \delta - 1 \right) \\ & \Downarrow & \\ \ell_{i_1, j_1} - \ell_{i_2, j_2} & = \log(t) \left(2 \delta - 1 \right) \end{aligned} \]

This is how we would define the log-values in Julia.

@expression(
    md_model,
    log_vals[row=1:(md.size), col=1:(md.size)],
    sum(candidates[row, col, :] .* log2.(1:(md.size)))
);

Multiplication Cage Constraints

It’s no surprise that multiplication itself is also nonlinear; but with the logarithm-values provided from figuring out division cages, we now know that we can convert some nonlinear constraints into linear constraints. With the logarithm-value expressions, we can convert a product of values into a sum of log-values.⁸

\[ \begin{aligned} &\prod_{(i, j) \in \mathbf{A}} v_{i,j} &=& t \\ \Rightarrow & \sum_{(i, j) \in \mathbf{A}} \log\left(v_{i,j}\right) &=& \log(t) \\ \Rightarrow & \sum_{(i, j) \in \mathbf{A}} \ell_{i,j} &=& \log(t) \\ \end{aligned} \]

Completing the Cage Constraints in Julia

Since we don’t need the auxiliary binary variable δ for each cage, we can set it to 0 for cages where it is not needed. Therefore, this is how we write our cage constraints for each cage in the MathDoku puzzle.

# Auxiliary binary variables for subtraction and division cages
@variable(md_model, δ[1:length(md.cages)], Bin)
# Cage Constraints
for (idx, cage) in enumerate(md.cages)
    if cage.operation == equals
        @constraint(md_model, vals[cage.coords[1]] == cage.target)
        fix(δ[idx], 0)
    elseif cage.operation == plus
        @constraint(md_model, sum(vals[cage.coords]) == cage.target)
        fix(δ[idx], 0)
    elseif cage.operation == minus
        @constraint(md_model, -(vals[cage.coords]...) == cage.target * (2 * δ[idx] - 1))
    elseif cage.operation == times
        @constraint(md_model, sum(log_vals[cage.coords]) == log2(cage.target))
        fix(δ[idx], 0)
    elseif cage.operation == divide
        @constraint(
            md_model, -(log_vals[cage.coords]...) == log2(cage.target) * (2 * δ[idx] - 1)
        )
    else
        throw("Invalid cage operation")
    end
end

Solving the MathDoku

Now that we have completed the constraints, initialized the variables, and used the expressions, we can solve the MathDoku puzzle. Since we assume that the puzzle only has one solution, we don’t need to define an objective function to solve it.

optimize!(md_model)
lp_solution = round.(Int, value.(vals))
display(lp_solution)

Running HiGHS 1.11.0 (git hash: 364c83a51e): Copyright (c) 2025 HiGHS under MIT licence terms
MIP  has 123 rows; 231 cols; 844 nonzeros; 231 integer variables (219 binary)
Coefficient ranges:
  Matrix [1e+00, 6e+00]
  Cost   [0e+00, 0e+00]
  Bound  [1e+00, 1e+00]
  RHS    [1e+00, 2e+01]
Presolving model
123 rows, 192 cols, 736 nonzeros  0s
105 rows, 149 cols, 582 nonzeros  0s
12 rows, 0 cols, 0 nonzeros  0s
0 rows, 0 cols, 0 nonzeros  0s
Presolve: Optimal

Src: B => Branching; C => Central rounding; F => Feasibility pump; J => Feasibility jump;
     H => Heuristic; L => Sub-MIP; P => Empty MIP; R => Randomized rounding; Z => ZI Round;
     I => Shifting; S => Solve LP; T => Evaluate node; U => Unbounded; X => User solution;
     z => Trivial zero; l => Trivial lower; u => Trivial upper; p => Trivial point

        Nodes      |    B&B Tree     |            Objective Bounds              |  Dynamic Constraints |       Work      
Src  Proc. InQueue |  Leaves   Expl. | BestBound       BestSol              Gap |   Cuts   InLp Confl. | LpIters     Time

         0       0         0   0.00%   0               0                  0.00%        0      0      0         0     0.1s

Solving report
  Status            Optimal
  Primal bound      0
  Dual bound        0
  Gap               0% (tolerance: 0.01%)
  P-D integral      0
  Solution status   feasible
                    0 (objective)
                    0 (bound viol.)
                    0 (int. viol.)
                    0 (row viol.)
  Timing            0.06 (total)
                    0.00 (presolve)
                    0.00 (solve)
                    0.00 (postsolve)
  Max sub-MIP depth 0
  Nodes             0
  Repair LPs        0 (0 feasible; 0 iterations)
  LP iterations     0 (total)
                    0 (strong br.)
                    0 (separation)
                    0 (heuristics)

6×6 Matrix{Int64}:
 6  5  1  4  3  2
 3  1  2  6  4  5
 5  2  4  1  6  3
 2  4  5  3  1  6
 1  6  3  2  5  4
 4  3  6  5  2  1

Look at that! There’s the solution to the MathDoku puzzle.

Issues with This Approach

The main problem with using linear programming to solve logic puzzles like MathDoku and Sudoku is that the tools for these types of problems find at most one solution. This is perfectly fine for most optimization problems, as they need to find the best values for decision variables; unlike them, logic puzzles like MathDoku aren’t supposed to have more than one solution. Alternative methods of formulating these logic puzzles as constraint satisfaction problems or iterating through possible solutions using search algorithms would work for finding other solutions, but they would take much more time.

Package Versions & Last Run Date

Julia 1.11.7
JuMP 1.29.1
HiGHS 1.19.0
Last Run 2025-10-04T12:49:40.564

References

Ercsey-Ravasz, Maria, and Zoltan Toroczkai. “The Chaos Within Sudoku.” arXiv, August 2012. https://doi.org/10.48550/arXiv.1208.0370.

JuMP core developers and contributors. “JuMP,” October 2023. https://jump.dev/JuMP.jl/stable/JuMP.pdf.

Kalvelagen, Erwin. “Yet Another Math Programming Consultant: MIP Modeling: From Sudoku to KenKen via Logarithms.” Yet Another Math Programming Consultant, October 2016. https://yetanothermathprogrammingconsultant.blogspot.com/2016/10/mip-modeling-from-sudoku-to-kenken.html.

Melkonian, Vardges. “An Integer Programming Model for the KenKen Problem.” American Journal of Operations Research 6, no. 3 (May 2016): 213–25. https://doi.org/10.4236/ajor.2016.63022.

Footnotes

JuMP core developers and contributors, “JuMP”.↩︎
Ercsey-Ravasz and Toroczkai, “The Chaos Within Sudoku”.↩︎
Ercsey-Ravasz and Toroczkai.↩︎
Kalvelagen, “MIP Modeling”.↩︎
Melkonian, “An Integer Programming Model for the KenKen Problem”.↩︎
Melkonian.↩︎
Kalvelagen, “MIP Modeling”.↩︎
Kalvelagen.↩︎

Reuse

CC BY SA 4.0 International License(View License)