A simpler FPRAS for #NFA

Florent Capelli

June 22, 2023

The #NFA problem

Given a non-deterministic finite automaton \(A\) (NFA) and \(n \in \mathbb{N}\) (given in unary), compute the size of

\[L_n(A) = \{w \in L(A) : |w|=n\}\]

A hard problem

#NFA is #P-complete, that is, as hard as #SAT. Indeed:

Given a CNF formula \(F = \bigwedge_i\ C_i\) with \(n\) variables, where \(C_i = \bigvee_j \ell_j\), we can construct an automaton \(A_F\) accepting exactly \(2^n-\#F\) words of length \(n\).

For \(C = x_1 \vee \neg x_2 \vee x_4\), \(\neg C\) can be “represented” as an automaton \(A_C\) accepting exactly the words of \(\{0,1\}^n\) that encode assignments falsifying \(C\).

\(\neg F\) is then represented by \(\bigcup_{C \in F}\ A_{C}\), which accepts \(2^n-\#F\) words of length \(n\).

End of the story

#SAT is really hard (even to approximate), and so is #NFA. Thank you for your attention!

But \(A_F\) accepts \(2^n-\#F\) words, not \(\#F\): the reduction is not parsimonious.

We still have hope to approximate!

Approximation

\(\tilde{f}\) is an ε-additive approximation of \(f\) if

\[f - \varepsilon < \tilde{f} < f + \varepsilon\]

\(\tilde{f}\) is a \((1\pm\varepsilon)\)-approximation of \(f\) if

\[f - \varepsilon f < \tilde{f} < f + \varepsilon f,\]

that is, \((1-\varepsilon)f < \tilde{f} < (1+\varepsilon)f\).

Additive approximation is still hard for #NFA:

  • if \(\tilde{N}\) is an ε-additive approximation of \(2^n-\#F\), then \(2^n-\tilde{N}\) is an ε-additive approximation of \(\#F\);
  • so an ε-additive approximation for \(\#NFA\) yields an ε-additive approximation for \(\#SAT\), which is known to be NP-hard!

Multiplicative approximation, however, is tractable for #NFA: Arenas, Croquevielle, Jayaram, Riveros, JACM 2021.

#NFA Admits an FPRAS: Efficient Enumeration, Counting, and Uniform Generation for Logspace Classes.

FPRAS

PTIME algorithm returning a \((1\pm\varepsilon)\)-approximation with high probability.

\(\tilde{f}\) is an FPRAS for \(f : A \rightarrow \mathbb{N}\) if, on input \(x \in A\), \(\varepsilon \in [0,1]\), \(\delta \in [0,1]\):

  • \(\tilde{f}\) runs in time polynomial in \(\varepsilon^{-1}\), \(|x|\) and \(\log(\delta^{-1})\),
  • \(\tilde{f}(x, \varepsilon, \delta)\) is a \((1 \pm \varepsilon)\)-approximation of \(f(x)\) with probability \(\geq 1-\delta\).

FPRAS and #NFA

#NFA admits an FPRAS, that is, there exists a polynomial time algorithm that given an NFA \(A\), \(n \in \mathbb{N}\), \(\varepsilon, \delta \in [0,1]\), returns \(\tilde{a}\) such that

\[\mathsf{Pr}[(1-\varepsilon)|L_n(A)| < \tilde{a} < (1+\varepsilon)|L_n(A)|] \geq 1-\delta.\]

This talk: let’s prove this!

Monte-Carlo: mother of all FPRAS

Approximate \(r = {|S| \over |U|}\) with \(\tilde{r}={|S \cap T| \over |T|}\), where \(T\) is a multiset of independent uniform samples from \(U\).

If \(|T| > TMC(\varepsilon, \delta) = O({\log(\delta^{-1}) \over \varepsilon^2})\), then \[|\tilde{r}-r|<\varepsilon\] with probability \(\geq 1-\delta\).
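As a reference point, here is a minimal Python sketch of this estimator; `sample_u` and `in_s` are hypothetical callables standing for the uniform sampler over \(U\) and the membership test for \(S\), and the constant in the sample size is illustrative:

```python
import math
import random

def monte_carlo_ratio(sample_u, in_s, eps, delta):
    """Estimate r = |S|/|U| from uniform samples of U.

    With |T| = O(log(1/delta)/eps^2) samples, |r_tilde - r| < eps holds
    with probability >= 1 - delta (Hoeffding bound).
    """
    t = math.ceil(3 * math.log(2 / delta) / eps ** 2)  # TMC(eps, delta)
    hits = sum(1 for _ in range(t) if in_s(sample_u()))
    return hits / t

# Example: the density of multiples of 3 in {0, ..., 999} is about 1/3.
r_tilde = monte_carlo_ratio(lambda: random.randrange(1000),
                            lambda x: x % 3 == 0, eps=0.05, delta=0.01)
```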

Monte-Carlo is short-sighted

Approximate \(s=|S|=r|U|\) with \(\tilde{s}=\tilde{r}|U|={|S \cap T| \over |T|}|U|\), where \(T\) is again a multiset of uniform samples from \(U\).

If \(|T| > O({\log(\delta^{-1}) \over {\color{red} r}\varepsilon^2})\),

\(\tilde{s}\) is a \((1\pm\varepsilon)\)-approximation of \(s\)

with probability \(\geq 1-\delta\). When \(r\) is tiny, this sample size blows up: plain Monte-Carlo alone does not give an FPRAS.

Union of sets

Let \(A_1, \dots, A_m \subseteq U\) so that we have:

  • the size \(a_i\) of \(A_i\),
  • a uniform sampler \(u_i\) for \(A_i\),
  • membership testing (\(x \in^? A_i\)).

Estimate \(a=|\bigcup_{i=1}^m A_i|\).

KLM Algorithm (Revisited)

Karp, Luby, Madras Algorithm (1989):

  • \(B_i = A_i \setminus \bigcup_{j<i} A_j\). We have \(\biguplus_i B_i = \bigcup_i A_i\).

  • Sketch \(S_i \subseteq A_i\) to get \(\tilde{r_i} = {|S_i \cap B_i|\over|S_i|}\), an approximation of \(r_i = {|B_i| \over |A_i|}\).

  • Return \(\tilde{a}=\sum_i a_i \tilde{r_i}\).

Intuition: \(a_i r_i = |B_i|\), so \(\sum_i a_i r_i = \sum_i |B_i| = a\).
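A minimal Python sketch of the algorithm, assuming (hypothetically) that each \(A_i\) comes with its exact size, a uniform sampler and a membership test:

```python
import math

def klm_estimate(sizes, samplers, members, eps, delta):
    """Estimate |A_1 ∪ ... ∪ A_m| à la Karp-Luby-Madras.

    sizes[i]    -- a_i = |A_i| (exact),
    samplers[i] -- returns a uniform element of A_i,
    members[j]  -- membership test x ∈? A_j.
    """
    m = len(sizes)
    # |S_i| = TMC(eps/m, delta/m); the constant 3 is illustrative.
    t = math.ceil(3 * m ** 2 * math.log(2 * m / delta) / eps ** 2)
    estimate = 0.0
    for i in range(m):
        sketch = [samplers[i]() for _ in range(t)]   # S_i: uniform in A_i
        hits = sum(1 for x in sketch                 # x ∈ B_i: avoids earlier A_j
                   if all(not members[j](x) for j in range(i)))
        estimate += sizes[i] * hits / t              # a_i · r̃_i
    return estimate
```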

KLM Algorithm: A proof!

  • \(\tilde{r_i} = {|B_i \cap S_i| \over |S_i|}\) for \(S_i \subseteq A_i\) uniformly drawn.

  • If \(|S_i| > TMC({\varepsilon \over m}, {\delta \over m})\), then with probability \(\geq 1-{\delta \over m}\): \[|r_i - \tilde{r_i}| < {\varepsilon \over m}\]

  • Hence, by a union bound, with probability \(\geq 1-\delta\): \(\forall i, |r_i - \tilde{r_i}| < {\varepsilon \over m}\)

  • \(\sum_i a_i \tilde{r_i}\leq\sum_i a_i r_i +{\varepsilon\over m} \sum_i a_i \leq |\bigcup_i A_i|(1+\varepsilon)\)

because \(\sum_i a_i \leq m a\) (and symmetrically for the lower bound).

Time to relax!

Let \(A_1, \dots, A_m \subseteq U\) so that we have:

  • a \((1\pm\varepsilon)^p\)-approximation \(\tilde{a_i}\) of \(|A_i|\),
  • uniform sampler/membership.

Estimate \(a=|\bigcup_{i=1}^m A_i|\).

\[\tilde{a} = \sum_{i} \tilde{a_i}\tilde{r_i}\] is a \((1\pm\varepsilon)^{p+1}\)-approximation of \(a\) with probability \(\geq 1-\delta\).

YAKLM, yet another KLM

Let \(A_1, \dots, A_m \subseteq U\) so that we have:

  • a \((1\pm\varepsilon)^p\)-approximation \(\tilde{a_i}\) of \(|A_i|\),
  • uniform sampler/membership.

Estimate \(a_I=|\bigcup_{i \in I} A_i|\) for \(I \subseteq [m]\)

  • Preprocessing phase: only a polynomial amount of sampling is allowed.
  • Online phase: for any \(I\), the algorithm has to answer \(\tilde{a_I}\) in polynomial time without any further sampling
  • Guarantee: \(\mathbf{Pr}(\forall I \subseteq [m], \tilde{a_I}\) is \((1\pm\varepsilon)^{p+1}\)-approx of \(a_I)\) \(\geq 1-\delta\).

Solving YAKLM

  • Preprocessing: Sketch \(S_i \subseteq A_i\) for every \(i \leq m\)

  • Online phase: on input \(I\), return \(\sum_{i \in I} \tilde{a_i}\tilde{r_{i,I}}\) where

\[\tilde{r_{i,I}} = {|S_i \cap (A_i \setminus \bigcup_{j \in I, j<i} A_j)| \over |S_i|}\]

  • Works as before, as long as \(\tilde{r_{i,I}}\) is “good” for every \(I \subseteq [m]\): by a union bound over the \(2^m\) possible queries, it suffices to sample \(S_i\) of size \(TMC({\varepsilon\over m}, {\delta \over {\color{red} 2^m}})\) (see the sketch after this list).

  • Recall \(TMC(\varepsilon, \delta) = O({\log(\delta^{-1}) \over \varepsilon^2})\) so \(TMC({\varepsilon\over m}, {\delta \over 2^m})\) is polynomial!!
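A sketch of both phases under the same hypothetical interface (approximate sizes, uniform samplers, membership tests); only `__init__` samples, while `estimate` answers any \(I\) from the stored sketches:

```python
import math

class Yaklm:
    """Preprocess once, then estimate |∪_{i∈I} A_i| for any I ⊆ [m]."""

    def __init__(self, approx_sizes, samplers, members, eps, delta):
        m = len(approx_sizes)
        # |S_i| = TMC(eps/m, delta/2^m): a union bound over all 2^m
        # queries, yet still polynomial (constant 3 illustrative).
        t = math.ceil(3 * m ** 2 * ((m + 1) * math.log(2)
                                    + math.log(1 / delta)) / eps ** 2)
        self.approx_sizes = approx_sizes
        self.members = members
        self.sketches = [[samplers[i]() for _ in range(t)] for i in range(m)]

    def estimate(self, I):
        """Online phase: answers from the stored sketches, no new samples."""
        I = sorted(I)
        total = 0.0
        for pos, i in enumerate(I):
            hits = sum(1 for x in self.sketches[i]       # x ∈ S_i and
                       if all(not self.members[j](x)     # x ∉ A_j for j ∈ I, j < i
                              for j in I[:pos]))
            total += self.approx_sizes[i] * hits / len(self.sketches[i])
        return total
```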

Dynamic programming FPRAS for #NFA

Given an NFA with state set \(Q\) and a subset \(Q' \subseteq Q\), let \(L_k(Q')\) be the set of words of length \(k\) accepted by the NFA when \(Q'\) is taken as the set of initial states, and let \(N(Q',k)=|L_k(Q')|\).

The FPRAS works by computing, for every \(q \in Q\) and \(k \leq n\):

  • \(\tilde{N}(q,k)\): a \((1 \pm \varepsilon)^k\)-approximation of \(N(q,k)\).

  • \(\tilde{S}(q,k) \subseteq L_k(q)\): a uniformly drawn sketch of \(L_k(q)\), “big enough”.

YAKLM and #NFA

Fundamental relation: \[\begin{align} N(q,k+1) & = \sum_{a \in \Sigma} N(\delta(q,a), k) \\ & = \sum_{a \in \Sigma} |\bigcup_{q' \in \delta(q,a)}L_k(q')| \end{align}\]

To get \(N(q,k+1)\), we need to estimate \(|\bigcup_{q' \in Q'}L_k(q')|\) for some \(Q' \subseteq Q\).

Use the sketches, Luke: if the sketches \(\tilde{S}(q,k) \subseteq L_k(q)\) are big enough, YAKLM gives a good approximation \(\tilde{N}(Q',k)\) for every \(Q' \subseteq Q\) with high probability!
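To see the fundamental relation in action, here is a brute-force sanity check on a toy NFA (states and transitions are made up for illustration; the FPRAS replaces the exact union sizes below with YAKLM estimates):

```python
from itertools import product

# Toy NFA over {a, b}: accepts the words of length >= 2 whose
# second-to-last letter is 'a'.
final = {2}
delta = {(0, "a"): {0, 1}, (0, "b"): {0},
         (1, "a"): {2},    (1, "b"): {2},
         (2, "a"): set(),  (2, "b"): set()}

def accepts(Q, w):
    """Does the NFA accept w when started from the state set Q?"""
    for c in w:
        Q = set().union(*(delta[q, c] for q in Q))
    return bool(Q & final)

def L(q, k):
    """L_k(q): the words of length k accepted from q (brute force)."""
    return {"".join(w) for w in product("ab", repeat=k) if accepts({q}, w)}

# Check N(q, k+1) = sum_a |union_{q' in delta(q,a)} L_k(q')| at q=0, k=3.
q, k = 0, 3
lhs = len(L(q, k + 1))
rhs = sum(len(set().union(*(L(qp, k) for qp in delta[q, c]))) for c in "ab")
assert lhs == rhs
```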

Last Ingredient: sampling

YAKLM lets us compute \(\tilde{N}(q,k+1)\) from good sketches \(\tilde{S}(q,k)\).

For the induction to work, we need to uniformly sample a sketch \(\tilde{S}(q,k+1) \subseteq L_{k+1}(q)\)

\[\begin{align} L_{k+1}(q) & = \biguplus_{a \in \Sigma} a \cdot L_k(\delta(q,a)) \end{align}\]

The problem boils down to: given disjoint \(A_1, \dots, A_m\), how can we uniformly sample in \(\biguplus A_i\)?

Sampling: the easy case

Let \(A_1, \dots, A_m\) be disjoint sets for which we are given:

  • a uniform sampler for each \(A_i\),
  • \(a_i = |A_i|\).

Then the following uniformly samples elements from \(A=\biguplus A_i\):

  • pick \(i\) with probability \(a_i \over \sum_{j=1}^m a_j\)
  • return \(a \in A_i\) picked uniformly.

Indeed: for any \(a \in A_i\), the probability of returning \(a\) is \({a_i \over \sum_{j=1}^m a_j}\times{1 \over a_i} = {1 \over |A|}\).
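A direct Python rendering of this sampler; the two example sets are made up for illustration:

```python
import random

def sample_disjoint_union(sizes, samplers):
    """Uniform sampling from A = A_1 ⊎ ... ⊎ A_m, given the exact
    sizes a_i and a uniform sampler for each A_i."""
    i = random.choices(range(len(sizes)), weights=sizes)[0]  # Pr[i] = a_i / Σ_j a_j
    return samplers[i]()                                     # then uniform in A_i

# Example: A_1 = {0,...,9}, A_2 = {10,...,39}; each element gets probability 1/40.
x = sample_disjoint_union([10, 30],
                          [lambda: random.randrange(0, 10),
                           lambda: random.randrange(10, 40)])
```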

Sampling: the realistic case

Let \(A_1, \dots, A_m\) be disjoint sets for which we are given:

  • a uniform sampler for each \(A_i\),
  • \(\tilde{a_i}\) a \((1\pm\varepsilon)^p\)-approximation of \(|A_i|\).

Can we uniformly sample in \(A\)? Not really…

  • pick \(i\) with probability \(\tilde{a_i} \over \sum_{j=1}^m \tilde{a_j}\)
  • return \(a \in A_i\) picked uniformly.

is almost uniform but not quite.

Biases get amplified through the induction; this is not good enough.

Las Vegas Uniform Sampler

Uniform Samplers as long as they do not fail!

A Las Vegas Uniform Sampler for a set \(A\) is an algorithm \(M\) returning \(\tilde{x} \in A \cup \{\bot\}\) for which there exists \(u \in [0,1]\) such that for every \(a \in A\):

\[\mathbf{Pr}[\tilde{x}=a]=u\]

The failure probability of \(M\) is given as \(\mathbf{Pr}[\tilde{x}=\bot]=1-u|A|\).

Controlled Las Vegas Uniform Sampler

A Controlled Las Vegas Uniform Sampler up to \(\beta\) for a set \(A\) is an algorithm \(M\) that, on input \(\alpha \leq \beta\), returns \(\tilde{x} \in A \cup \{\bot\}\) such that for every \(a \in A\):

\[\mathbf{Pr}[\tilde{x}=a]=\alpha\]

The failure probability of \(M(\alpha)\) is hence \(\mathbf{Pr}[\tilde{x}=\bot]=1-\alpha|A|\).
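For intuition only, a controlled LVUS is easy to build when an exact uniform sampler and the exact size are available (a toy construction, not the one used in the induction below, where only approximate sizes exist):

```python
import random

def controlled_lvus(sample, size, alpha):
    """Toy controlled LVUS from an exact uniform sampler over A with
    |A| = size: each a ∈ A is returned with probability exactly alpha,
    and ⊥ (None) with probability 1 - alpha * size."""
    assert 0 <= alpha * size <= 1          # needs alpha ≤ 1/|A|
    if random.random() < alpha * size:     # succeed with probability α·|A|
        return sample()                    # then each a has probability α
    return None                            # ⊥
```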

Combining Controlled LVUS

Let \(A_1, A_2\) be two disjoint sets such that:

  • \(\tilde{a_i}\) is a \((1\pm\varepsilon)\)-approximation of \(|A_i|\).
  • \(M_i\) is a Controlled LVUS for \(A_i\) up to \(\beta \over |A_i|\).

Algorithm on input \(\alpha\):

  • Pick \(i\) with proba \(\tilde{a_i}\over \tilde{a_1}+\tilde{a_2}\)
  • Sample \(\tilde{x}\) in \(A_i\) with control \(\alpha{\tilde{a_1}+\tilde{a_2} \over \tilde{a_i}}\)

This is a Controlled LVUS for \(A=A_1 \uplus A_2\) as long as \(\alpha \leq {\beta \over |A|}{(1-\varepsilon) \over (1+\varepsilon)}\).
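A Python sketch of this combination step; `M1` and `M2` are assumed to be controlled LVUS taking a control value and returning an element or `None` for \(\bot\):

```python
import random

def combine_lvus(M1, M2, a1_tilde, a2_tilde):
    """Combine controlled LVUS for disjoint A_1, A_2 into one for A_1 ⊎ A_2."""
    def M(alpha):
        s = a1_tilde + a2_tilde
        if random.random() < a1_tilde / s:    # side 1 with probability ã_1/(ã_1+ã_2)
            return M1(alpha * s / a1_tilde)   # Pr[a] = (ã_1/s)·(α·s/ã_1) = α
        return M2(alpha * s / a2_tilde)       # symmetrically, α for a ∈ A_2
    return M

# With the toy controlled_lvus above, for instance:
# M1 = lambda al: controlled_lvus(lambda: random.randrange(10), 10, al)
# M2 = lambda al: controlled_lvus(lambda: random.randrange(10, 40), 30, al)
# M = combine_lvus(M1, M2, 10.2, 29.5)   # ã_i are (1±ε)-approximations
```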

An induction compatible statement

Let \(A_1, A_2\) be two disjoint sets such that:

  • \(\tilde{a_i}\) is a \((1\pm\varepsilon)^p\)-approximation of \(|A_i|\).
  • \(M_i\) is a Controlled LVUS for \(A_i\) up to \({1 \over |A_i|}{(1-\varepsilon)^{p(p-1)/2} \over (1+\varepsilon)^{p(p-1)/2}}\).

Then we get a Controlled LVUS for \(A=A_1 \uplus A_2\) for any \(\alpha \leq {1 \over |A|}{(1-\varepsilon)^{p(p+1)/2} \over (1+\varepsilon)^{p(p+1)/2}}\).

At maximal control, the failure probability is \(1 - {(1-\varepsilon)^{p(p+1)/2} \over (1+\varepsilon)^{p(p+1)/2}}\), which can be made small.

Piecing things together

Assume we have computed:

  • \(\tilde{N}(q,k)\): a \((1 \pm \varepsilon)^k\)-approximation of \(N(q,k)\),

  • \(\tilde{S}(q,k) \subseteq L_k(q)\): a uniformly drawn sketch of \(L_k(q)\), “big enough”.

(YAKLM): Compute \(\tilde{N}(q, k+1)\) from the sketches \(\tilde{S}(q,k)\) and the approximations \(\tilde{N}(q,k)\).

(Controlled LVUS): We inductively get a LVUS for \(L_{k+1}(q)\) with the relation

\[\begin{align} L_{k+1}(q) & = \biguplus_{a \in \Sigma} a \cdot L_k(\delta(q,a)) \end{align}\]

Failure probability can be made small by choosing \(\varepsilon\) very small (still polynomial).

Polishing

To finish the proof:

  • Observe that a \((1\pm{\varepsilon\over 2p})^p\)-approximation is a \((1\pm\varepsilon)\)-approximation (see the bound after this list),
  • Estimate the probability that some sketches are not giving good approximations,
  • Estimate the probability that some call to a LVUS fails,
  • Choose \(\varepsilon' = f(\varepsilon, \delta)\) small enough to make these probabilities small enough.
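For the first observation, a standard chain of bounds suffices (valid for \(0 \leq \varepsilon \leq 1\)):

\[\Big(1+\frac{\varepsilon}{2p}\Big)^p \leq e^{\varepsilon/2} \leq 1+\varepsilon \qquad\text{and}\qquad \Big(1-\frac{\varepsilon}{2p}\Big)^p \geq 1-\frac{\varepsilon}{2} \geq 1-\varepsilon,\]

using \(1+x \leq e^x\), then \(e^{\varepsilon/2} \leq 1+\varepsilon\) for \(\varepsilon \leq 1\), and Bernoulli's inequality for the lower bound.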

To get a probability greater than \(3/4\) of returning a good approximation…

\[\varepsilon' < {1 \over 32}m^{-4}(n+1)^{-7}\log(16mn)^{-1}\]