Discussion:
INNER JOIN Optimization
petmal Malik
14 years ago
Permalink
Hello.

I have two tables to join. TABLE_A (contains column 'a') and TABLE_BC (contains columns 'b' and 'c'). There is a condition on TABLE_BC. The two tables are joined by 'rowid'.
Something like:

SELECT a, b, c FROM main.TABLE_A INNER JOIN main.TABLE_BC WHERE (b > 10.0 AND c < 10.0) ON main.TABLE_A.rowid = main.TABLE_BC.rowid ORDER BY a;

Alternatively:

SELECT a, b, c FROM main.TABLE_A AS s1 INNER JOIN (SELECT rowid, b, c FROM main.TABLE_BC WHERE (b > 10.0 AND c < 10.0)) AS s2 ON s1.rowid = s2.rowid ORDER BY a;

I need to do this a couple of time with different TABLE_A, but TABLE_BC does not change... I could therefore speed things up by creating a temporary in-memory database (mem) for the constant part of the query.

CREATE TABLE mem.cache AS SELECT rowid, b, c FROM main.TABLE_BC WHERE (b > 10.0 AND c < 10.0);

followed by (many):

SELECT a, b, c FROM main.TABLE_A INNER JOIN mem.cache ON main.TABLE_A.rowid = mem.cache.rowid ORDER BY a;

I get the same result set from all the queries above, but the last option is by far the fastest one.
The problem is that I would like to avoid splitting the query into two parts. I would expect SQLite to do the same thing for me automatically (at least in the second scenario), but it does not seem to happen... Why is that?

Thanks.
Pavel Ivanov
14 years ago
Permalink
Post by petmal Malik
The problem is that I would like to avoid splitting the query into two parts. I would expect SQLite to do the same thing for me automatically (at least in the second scenario), but it does not seem to happen... Why is that?
In short, because SQLite cannot read your mind.

To understand the answer compare speeds of executing one query (with
one TABLE_A) and creating an in-memory database, creating a table in
it and using that table in one query (with the same TABLE_A). I bet
the first option (straightforward query without in-memory database)
will be much faster. So SQLite selects the fastest way to execute your
query. It cannot predict what the future queries will be to understand
how to execute the whole set of queries faster. You can do that and
you should split your query in two parts.


Pavel
...
Continue reading on narkive:
Loading...