A Knowledge Compilation Take on Binary Boolean Optimization

Florent Capelli
join work with Alberto Del Pia and Silvia di Gregorio

Université d’Artois - CRIL

Industrial and Systems Engineering Department, UW Madison

October 27, 2023

Boolean Optimization Problem

BPO problem: $\max_{x_1,\dots,x_n \in \{0,1\}^n} P(x_1,\dots,x_n)$

where $P$ is a polynomial.

Observation: $P$ may be assumed to be multilinear since $x^2 = x$ over $\{0,1\}$

$P = \sum_{e \in E} \alpha_e \prod_{i \in e} x_i$

where $E \subseteq 2^V$

Example

$P(x_1,x_2,x_3) = x_1x_2x_3 - 2x_1x_3 + 3x_1$

$x_1$	$x_2$	$x_3$	$P(x)$
0	0	0	0
0	0	1	0
0	1	0	0
0	1	1	0
1	0	0	3
1	0	1	1
1	1	0	3
1	1	1	2

$P(1,0,0)=P(1,1,0)=3$ are maximal

Complexity of BPO

Bad news: Solving BPO is NP-hard.

Intuition: a polynomial can encode many things:

$OR(x,y) = x+y-xy$ encodes $x \vee y$ on $\{0,1\}$
For a graph $G=(V,E)$ , with $N$ vertices and $M$ edges $VC(V)=\sum_{\{v,w\} \in E} OR(v,w)$ $VC(\tilde{V}) = M$ iff $\tilde{V}$ encodes a vertex cover of $G$
$MVC(V) = 2N \times (VC(V) - M)-\sum_{x \in V} v$ is maximal at $\tilde{V}$ iff $\tilde{V}$ encodes a minimal vertex cover!

Solving BPO as ILP

BPO is a non-linear optimization problem.

Make it linear so that we can use LP solvers!

$\max_{x \in \{0,1\}^V} \sum_{e \in E} \alpha_e \prod_{v \in e} x_v$ rewrites as:

$\max \sum_{e \in E} \alpha_e y_e$ such that

$y_e = \prod_{v \in e} x_v$

$y_e \leq \prod_{v \in e} x_v$ and $\prod_{v \in e} x_v \leq y_e$

$y_e \leq x_v$ for $v \in e$ and $\prod_{v \in e} x_v \leq y_e$

$y_e \leq x_v$ for $v \in e$ and $\sum_{v \in e} x_v \leq y_e-1 + |e|$

$x_v, y_e \in \{0,1\}$

Integer Linear Program solvers can now solve it!

BPO as a Boolean Function Problem

Boolean Function

$f \subseteq \{0,1\}^X$ is a Boolean function on variables $X$ .

An assignment $\tau : X \rightarrow \{0,1\}$ satisfies $f$ iff $\tau \in f$ .

Example: Boolean function on $\{x,y,z\}$ :

x	y	z
0	0	0
0	0	1
0	1	0
0	1	1
1	1	1

Represented as a formula $x \Rightarrow (y \wedge z)$ or by $(\neg x \vee y) \wedge (\neg x \vee z)$

Weighted Boolean Function

For $w : X \times \{0,1\} \rightarrow \mathbb{R}$ and $\tau \in \{0,1\}^X$ consider:

$w(\tau) = \prod_{x \in X} w(x, \tau(x)) \text{ and } w(f) = \sum_{\tau \in f} w(\tau)$

Example:

$w(x,0) = 1$ ,
$w(x,1)=2$ ,
$w(y,0) = 3$ ,
$w(y,1)=-3$ ,
$w(z,0)=5$ ,
$w(z,1)=-5$

x	y	z		$w$
0	0	0	$135$	$15$
0	0	1	$13-5$	$-15$
0	1	0	$1-35$	$-15$
0	1	1	$1-3-5$	$15$
1	1	1	$2-3-5$	$30$
$w(f)$			$15-15-15+15+30$	30

Algebraic Model Counting

$w(f) = \sum_{\tau \in f} \prod_{x \in X} w(x, \tau(x))$

$w(f) = \bigoplus_{\tau \in f} \bigotimes_{x \in X} w(x, \tau(x))$

where $\mathbb{K} = (K,\oplus, \otimes, 0_\oplus, 1_\otimes)$ is a semi-ring

That is:

$\oplus, \otimes$ commutative, associative
$a \oplus 0_\oplus = a$ , $b \otimes 1_\otimes = b$
$\otimes$ distributes over $\oplus$ : $(a \otimes (b \oplus c)) = (a \otimes b) \oplus (a \otimes c)$ .

Examples:

$(\mathbb{R}, +, \times, 0 , 1)$
Any fields, e.g., $\mathbb{Z} / p\mathbb{Z}$
Arctic semi-ring: $(\mathbb{Q}, \max, +, -\infty, 0)$

AMC over Arctic Semiring

$w(f) = \max_{\tau \in f} \sum_{x \in X} w(x, \tau(x))$

This is an optimization task!

Example:

$w(x,0) = 1$ ,
$w(x,1)=2$ ,
$w(y,0) = 3$ ,
$w(y,1)=-3$ ,
$w(z,0)=5$ ,
$w(z,1)=-5$

x	y	z		$w$
0	0	0	$1+3+5$	$9$
0	0	1	$1+3-5$	$-1$
0	1	0	$1-3+5$	$3$
0	1	1	$1-3-5$	$-7$
1	1	1	$2-3-5$	$-6$
$w(f)$			$\max(9,-1,3,-7,-6)$	9

BPO and Boolean Functions

For $P := \sum_{e \in E} \alpha_e \prod_{i \in e} x_i$ define: $f_P := \bigwedge_{e \in E} C_e$ where $C_e := Y_e \Leftrightarrow \bigwedge_{i \in e} X_i$

$C_e$ encodes $y_e = \prod_{i \in e} x_i$ !

and $w_P$ on $(\mathbb{Q}, \max, +, -\infty, 0)$ as:

$w_P(Y_e,1) = \alpha_e$ and
$w_P(X_i,b)=w_P(Y_e,0) = 0$ for $b \in \{0,1\}$ .

Encoding BPO as Boolean function: an example

Example: $P(x_1,x_2,x_3) =$ $x_1x_2x_3$ $- 2x_1x_3$ $+ 3x_1$

$f_P =$ $(Y_1 \Leftrightarrow (X_1 \wedge X_2 \wedge X_3))$ $\wedge$ $(Y_2 \Leftrightarrow (X_1 \wedge X_3))$ $\wedge$ $(Y_3 \Leftrightarrow X_1)$
$w_P(Y_1,1) = 1$ , $w_P(Y_2,1)=-1$ and $w_P(Y_3,1)=3$ .

$\begin{align*} w_P(f_P) & = \max_{\tau \in f_P} w_P(f_\tau) \\ & = w_P(Y_1=0, Y_2=0, Y_3=1, X_1=1, X_2=0, X_3=0) \\ & = 3 \\ & = P(1,0,0) \end{align*}$

BPO as a Boolean Function

$\bigoplus_{\tau \in f_P} \bigotimes_{x \in X \cup Y} w_P(x, \tau(x)) = \max P(x_1,\dots,x_n)$

where $f_P = \bigwedge_{e \in E} C_e$ .

We can use existing toolbox for AMC to solve BPO

theoretical results
AND practical results

Knowledge Compilation how to solve AMC

Representing Boolean functions

How can we represent Boolean function: $f \subseteq \{0,1\}^X$

So far we have seen: list every satisfying assignment of $f$ (aka Truth Table)

Easy to manipulate since the representation is explicit
Not compact

CNF Formulas

$F = \bigwedge (\bigvee \ell)$ where $\ell$ is a literal $x$ or $\neg x$ for some variable $x$ .

Examples:

$F_1=(x \vee \neg y) \wedge (\neg x \vee y)$

$x$	$y$	$F_1$
$0$	$0$	$1$
$0$	$1$	$0$
$1$	$0$	$0$
$1$	$1$	$1$

$F_2=(x \vee \neg z) \wedge (\neg x \vee y) \wedge (x \vee y \vee z)$

$x$	$y$	$z$	$F_2$
$1$	$1$	$1$	$1$
$0$	$1$	$0$	$1$
$1$	$1$	$0$	$1$
$*$	$*$	$*$	$0$

The SAT Problem

CNF formulas are extremely simple yet can encode many interesting problems.

Cook, Levin, 1971: The problem SAT of deciding whether a CNF formula is satisfiable is NP-complete.

Valiant 1979: The problem #SAT of counting the satisfying assignment of a CNF formula is #P-complete.

Very unlikely that efficient algorithms exists for solving SAT / #SAT
Thriving community nevertheless addresses this problem in practice
SAT Solver very efficient in many applications

Relevance of CNF formulas

Natural encoding: succinctly encodes many problems, witnessed by the many existing industrial benchmarks.
Intractable for algebraic model counting

Looking for tradeoffs between Truth Tables and CNFs!

Circuit Based Representations

Research has focused on factorized representation.

An example

Data structure based on decision nodes to represent “ $(x+y+z)$ is even”.

Path for $x=1$ , $y=0$ and $z=1$ is accepting.

OBDDs

Previous data structure are Ordered Binary Decision Diagrams.

Directed Acyclic graphs with one source
Sinks are labeled by $0$ or $1$
Internal nodes are decision nodes on a variable in $x_1, \dots, x_n$
Variables tested in order.

Counting with OBDDs

How many $3 \times 3$ $\{0,1\}$ -matrices have a row full of ones?

Tractability of OBDDs

This idea can be generalized to any OBDDs:

Let $f \subseteq \{0,1\}^X$ be a function computed by an OBDD having $E$ edges. We can compute $\#f$ with $O(E)$ arithmetic operations.

Generalises to many tasks:

Evaluate $Pr(f)$ if probabilities $Pr(x=1)$ are given for each $x \in X$
Enumerate $f$
Algebraic Model Counting on any semi-ring.

Back to BPO

$w_P(f_P) = \max P(x_1,\dots,x_n)$ where

$f_P = \bigwedge_{e \in E} Y_e \Leftrightarrow \bigwedge_{i \in e} X_i$
$w_P(Y_e,1)=\alpha_e$ and $0$ otherwise

Solving BPO:

transform $f_P$ into an OBDD
compute $w_P(f_P)$ via dynamic programming on the OBDD itself

A Knowledge Compiler for OBDD

Exhaustive DPLL with Caching based on Shannon Expansion:

$F = (x \vee y \vee z) \wedge (x \vee \neg y \vee \neg z) \wedge (\neg x \vee \neg y \vee \neg z) \wedge (\neg x \vee y \vee z)$

$F[x=0] = (y \vee z) \wedge (\neg y \vee \neg z)$
$F[x=1] = (\neg y \vee \neg z) \wedge (y \vee z)$
$F[x=1,y=1] = \neg z$
$F[x=1,y=0] = z$
$F[x=0,y=1] = \neg z$ $= F[x=1,y=1]$
$F[x=0,y=0] = z$ $= F[x=1,y=0]$

This scheme is parameterized by:

caching policy
branching heuristics

Exploiting decomposition

For many tasks, such as model counting, it is interesting to detect syntactic decomposable part of the formula, that is:

$F(X) = G(Y) \wedge H(Z)$ and $Y \cap Z = \emptyset$

decDNNF: OBDD + $\wedge$ -gates decomposable
Still allows for algebraic model counting via the identity $w(F) = w(G) \times w(H)$
Compilers can be adapted to detect this rule.

The D4 compiler

D4 is a top-down compiler as shown earlier:

Use oracle calls to a SAT solver with clause learning to cut branches and speed up later computation
Use heuristics to decompose the formula so that it breaks into smaller connected components.

instance	d4 (s)	scip (s)
bernasconi.20.3	0.002	0.01
bernasconi.20.5	0.04	8.91
bernasconi.20.10	1.21	119.20
bernasconi.20.15	14.92	479.15
bernasconi.25.3	0.00	0.01
bernasconi.25.6	0.19	151.65
bernasconi.25.13	12.59	1 698.18
bernasconi.25.19	442.26	TIMEOUT
bernasconi.25.25	TIMEOUT	TIMEOUT

Tractability results

Tractable classes of BPO

$P(x_1,\dots,x_n) = \sum_{e \in E} \alpha_e \prod_{i \in e} x_i \text{ where } E \subseteq 2^V$

$H=(V,E)$ is a hypergraph.

Exploit the structure of $H$ to solve BPO more efficiently.

A toy example

Tree BPO: BPO problem where $H$ is a tree.

Example: $x_1x_2+5x_1x_3+3x_0x_1-2x_0x_4+3x_4x_5$

Many Tractable Classes

If $H$ has tree width $k$ then one can solve BPO in time $2^{O(k)}poly(H)$ .
If $H$ is $\beta$ -acyclic then one can solve BPO in time $poly(H)$ .

Dedicated algorithm for each class.

A strange symmetry

Very similar results from Boolean function literature:

If a CNF $F$ has tree width $k$ then one can construct a DNNF for $F$ of size $2^{O(k)}poly(F)$ .
If a CNF $F$ is $\beta$ -acyclic then one can construct a DNNF for $F$ $poly(F)$ .

Is there a connection?

Encoding BPO as a CNF

For $P := \sum_{e \in E} \alpha_e \prod_{i \in e} x_i$ define: $f_P := \bigwedge_{e \in E} C_e$ where $C_e := Y_e \Leftrightarrow \bigwedge_{i \in e} X_i$

$C_e$ can be encoded as the conjunction of:

$\bigvee_{i \in e} \neg X_i \vee Y_e$
$\neg Y_e \vee X_i$ for every $i \in e$

$f_P$ is naturally encoded as a CNF $F_P$ that preserves tree width.

Tractability of BPO via KC

Every known tractability for BPO can be recovered in our framework as follows:

Encode $P$ as a CNF formula $F_P$
Transform $F_P$ into a polynomial size tractable representation $C_P$ using known results
Solve AMC on $C_P$

And we get new tractability results!

Beyond BPO

KC approach very versatile:

Solve top-k BPO: find the $k$ best solutions of $P$ by finding the $k$ best in the circuit
Solve BPO + Cardinality constraints: $\max P(x_1,\dots,x_n)\ such\ that \sum_{i=1}^n x_i \in S$ where $S \subseteq [n]$ by transforming the circuit
Solve pseudo BPO: $P$ can contains monomial of the form $\prod_{i \in A} x_i \prod_{i \in B} (1-x_i)$

Conclusion

Connection between BPO and Boolean functions:

Recover known results and generalize them using the existing rich literature
Seems to have practical relevance

Perspective:

KC only exploits combinatorics of the underlying Boolean function. How could we mix existing more algebraic techniques?