Florent Capelli, Oliver Irwin
CRIL, UniversitΓ© dβArtois
April 19, 2024
Join Query :
where is a tuple over
Example:
| id | name | city |
|---|---|---|
| 1 | Alice | Paris |
| 2 | Bob | Lens |
| 3 | Chiara | Rome |
| 4 | Djibril | Berlin |
| 5 | Γmile | Dortmund |
| 6 | Francesca | Rome |
| city | country |
|---|---|
| Berlin | Germany |
| Paris | France |
| Rome | Italy |
| city | country | name | id |
|---|---|---|---|
| Paris | France | Alice | 1 |
| Rome | Italy | Chiara | 3 |
| Berlin | Germany | Djibril | 4 |
| Rome | Italy | Francesca | 6 |
Quickly access , the element of .
| city | country | name | id |
|---|---|---|---|
| Paris | France | Alice | 1 |
| Rome | Italy | Chiara | 3 |
| Berlin | Germany | Djibril | 4 |
| Rome | Italy | Francesca | 6 |
? .
Naive algorithm: materialize in an array, sort it. Access.
| city | country | name | id |
|---|---|---|---|
| Berlin | Germany | Djibril | 4 |
| Paris | France | Alice | 1 |
| Rome | Italy | Chiara | 3 |
| Rome | Italy | Francesca | 6 |
Precomputation : at least (may be worst), very costly
Access : , nearly free
Variable order :
| city | country | name | id |
|---|---|---|---|
| Berlin | Germany | Djibril | 4 |
| Paris | France | Alice | 1 |
| Rome | Italy | Chiara | 3 |
| Rome | Italy | Francesca | 6 |
In this talk: only lexicographical orders.
Direct Access generalizes many tasks that have been previously studied:
Naive Direct Access:
Can we have better preprocessing and reasonable access time?
For example:
Computing given and is -hard.
No Direct Access algorithm with good guarantees for every and .
Data complexity assumption: for a fixed , what is the best preprocessing for an access time ?
In this work, all presented complexity in data complexity will also be polynomial for combined complexity.
Direct Access for lexicographical order induced by ?
| a | b |
|---|---|
| 0 | 0 |
| 1 | 1 |
| 2 | 1 |
| b | c |
|---|---|
| 0 | 0 |
| 0 | 1 |
| 0 | 2 |
| 1 | 1 |
| 1 | 2 |
Precomputation :
Access :
Direct Access for lexicographical order induced by ?
Reduces to multiplying two -matrices over :
Given a query and order on its variables, we can compute such that:
So, if we understand everything for Direct Access and lexicographical orders, what is our contribution?
Negation interpreted over a given domain :
| 0 | 1 | 0 |
| 0 | 0 | 0 |
| 0 | 0 | 1 |
| 0 | 1 | 1 |
| 1 | 0 | 0 |
| 1 | 0 | 1 |
| 1 | 1 | 0 |
| 1 | 1 | 1 |
, domain .
Positive encoding: preprocessing
| 0 | 1 | 0 |
| 1 | 0 | 1 |
Linear preprocessing!
linear preprocessing
non-linear preprocessing
Subqueries may be harder to solve than the query itself!
Equivalent to if
non-linear preprocessing (triangle)
DA for implies DA for for every !
Good candidate for :
Signed-HyperOrder Width
For a (positive) JQ signed JQ, and a variable ordering, we can solve DA with
Our contribution : new island of tractability for Signed JQ!
Signed HyperOrder Width (and incidentally, our result) generalizes:
Basically, everything that is known to be tractable on SCQ/NCQ.
| 0 | 0 | 0 |
| 0 | 0 | 1 |
| 0 | 1 | 0 |
| 0 | 1 | 1 |
| 1 | 0 | 1 |
| 1 | 0 | 2 |
| 1 | 1 | 1 |
| 1 | 1 | 2 |
| 1 | 2 | 0 |
| 1 | 2 | 1 |
| 2 | 0 | 1 |
| 2 | 0 | 2 |
| 2 | 2 | 1 |
| 2 | 2 | 2 |
Factorized representation of relation :
Ordered: decision gates below only mention with .
For on domain , variables , DA possible :
Idea : for each gate over and for each domain value
compute the size of the relation where is set to a value
Compute the 7th solution 111
Compute the 13th solution 221
SCQ , .
Preprocessing:
Direct Access :
, considered constant here!
Compilation based on a variation of DPLL :