Florent Capelli
Université d’Artois - CRIL
Probabilistic Circuits and Logic Workshop - Simons Institute
October 18, 2023
Join Queries:
where tuple over
Example:
People | Id | Name | City |
---|---|---|---|
1 | Alice | Paris | 2 | Bob | Lens |
3 | Chiara | Roma | |
4 | Djibril | Berlin | |
5 | Émile | Paris | |
6 | Francesca | Roma |
Capital | City | Country |
---|---|---|
Berlin | Germany | |
Paris | France | |
Roma | Italy |
Q | City | Country | Id | Name |
---|---|---|---|---|
Paris | France | 1 | Alice | |
Roma | Italy | 3 | Chiara | |
Berlin | Germany | 4 | Djibril | |
Roma | Italy | 6 | Francesca |
Q | City | Country | Id | Name |
---|---|---|---|---|
Berlin | Germany | 4 | Djibril | |
Paris | France | 1 | Alice | |
Roma | Italy | 3 | Chiara | |
Roma | Italy | 6 | Francesca |
Reorder in lexicographical order.
Given and , simulate array access to where is the element of by lexicographical order.
Q | City | Country | Id | Name |
---|---|---|---|---|
Berlin | Germany | 4 | Djibril | |
Paris | France | 1 | Alice | |
Roma | Italy | 3 | Chiara | |
Roma | Italy | 6 | Francesca |
is .
Preprocessing expensive: possibly
Answering DA queries Cheap:
Can we have a better preprocessing?
There are classes of queries for which one can decide in polynomial time both in and .
Central class of join queries because of their tractability.
Total of solutions.
One can use these annotations top-down to solve direct access queries on compatible orders
Given a join query , if is a compatible order then we can answer direct access queries with precomputation time and access time .
Compatible order?
An -leaf in a hypergraph is such that there is , .
A hypergraph is -acyclic iff one can obtain by successively removing -leaves in : induces an order on called an -elimination order
[Brault Baron, 2014], simplification of [Graham-Yu-Özsoyoglu, 1979], also known as “without disruptive trio” in [Carmeli, Tziavelis, Gatterbauer, Kimelfeld, Riedewald, 2020]
Given a join query , if is a -elimination order then we can answer direct access queries with precomputation time and access time .
[Carmeli, Tziavelis, Gatterbauer, Kimelfeld, Riedewald, 2020]
Algorithm schema:
Factorised DB POV: the trace of Yannakakis Algorithm is roughly a decision-DNNF of size linear.
0 | 0 | 0 |
0 | 0 | 1 |
0 | 1 | 0 |
0 | 1 | 1 |
1 | 2 | 0 |
1 | 2 | 1 |
1 | 0 | 2 |
1 | 1 | 2 |
2 | 2 | 2 |
Let be a JQ and an -elimination order of .
It produces an ordered circuit computing of size
Precomputation: Compute for each decision gate on and value , how many solutions of we can build by setting to .
Find the solution.
Access: construct the solution one variable at a time.
Find the solution.
Access: construct the solution one variable at a time.
Given a join query , if is a -elimination order then we can answer direct access queries with precomputation time and access time .
Algorithm schema:
Why bother? It looks like the same algorithm on a slightly different data structure.
It is not: join tree has an extra underlying “tree” structure, the circuit obtained from this has more structure.
There exist relations with small ordered circuit but big tree-structured circuits [LICS17].
Can we solve more direct access problem with our technique?
Negative Join Queries (NJQ):
Good candidates for tractability of Negative Join Queries: every is acyclic.
This is a property known as -acyclicity.
Characterization: is -acyclic iff there exists a -elimination order , ie iteratively removing -leaves, ie, such that edges containing are ordered by :
Alternatively: a -elimination order is an order that is an -elimination order for every subquery.
Let be an NJQ and a -elimination order for . Exhaustive DPLL on , and with order returns an ordered circuit of size .
Generalization of [LICS17].
Corollary: Direct Access for -acyclic NJQ with preprocessing and access time for lexicographical orders based on (reversed) -elimination orders.
Side observation: we know that “join tree” based techniques fail for -acyclic NJQ!
Technique generalizes to: