Discussion:
[sqlite] SQLITE gives incorrect results for 'NOT IN' query if partial index exists
Deon Brewis
2018-12-08 08:02:02 UTC
Permalink
The 'NOT IN' query at the bottom should not return any results. It returns '2'.

Repro:
=====
create table foo(y, z);
insert into foo values(1,1);
insert into foo values(2,2);
CREATE INDEX fooZ on Foo(y) where z=1;

create table bar(x);
insert into bar values(1);
insert into bar values(2);

select * from bar WHERE x NOT IN (SELECT y from foo); -- this will wrongly return 2.

Issue:
=====
In the query plan, sqlite chooses to use the FooZ index:

explain query plan select * from bar WHERE x NOT IN (SELECT y from foo);
QUERY PLAN
|--SCAN TABLE bar
`--USING INDEX fooZ FOR IN-OPERATOR


FooZ is not a valid index for this subquery, it is incomplete due to the partial index expression, with the result is doing a NOT IN over it returns incorrect results.

- Deon
Olivier Mascia
2018-12-08 11:10:42 UTC
Permalink
Post by Deon Brewis
The 'NOT IN' query at the bottom should not return any results. It returns '2'.
=====
create table foo(y, z);
insert into foo values(1,1);
insert into foo values(2,2);
CREATE INDEX fooZ on Foo(y) where z=1;
create table bar(x);
insert into bar values(1);
insert into bar values(2);
select * from bar WHERE x NOT IN (SELECT y from foo); -- this will wrongly return 2.
=====
explain query plan select * from bar WHERE x NOT IN (SELECT y from foo);
QUERY PLAN
|--SCAN TABLE bar
`--USING INDEX fooZ FOR IN-OPERATOR
FooZ is not a valid index for this subquery, it is incomplete due to the partial index expression, with the result is doing a NOT IN over it returns incorrect results.
Indeed. As well as, logically:

select * from bar WHERE x IN (SELECT y from foo); -- this will wrongly return 1.

As expected a simple WHERE 1 in the subselect cancels the wrong index selection:

select * from bar WHERE x NOT IN (SELECT y from foo WHERE 1); -- OK, no results
--
Best Regards, Meilleures salutations, Met vriendelijke groeten,
Olivier Mascia
Richard Hipp
2018-12-08 13:10:49 UTC
Permalink
Thanks for the bug report and for the repro script.

This problem appears to have been in the code ever since partial
indexes were first introduced in version 3.8.0 (2013-08-26). I have
some other things I have to deal with today. I will get to this as
soon as I can. There is a ticket at
https://www.sqlite.org/src/tktview/1d958d90596593a77420e590a6ab71756484f576
Post by Deon Brewis
The 'NOT IN' query at the bottom should not return any results. It returns '2'.
=====
create table foo(y, z);
insert into foo values(1,1);
insert into foo values(2,2);
CREATE INDEX fooZ on Foo(y) where z=1;
create table bar(x);
insert into bar values(1);
insert into bar values(2);
select * from bar WHERE x NOT IN (SELECT y from foo); -- this will wrongly return 2.
=====
explain query plan select * from bar WHERE x NOT IN (SELECT y from foo);
QUERY PLAN
|--SCAN TABLE bar
`--USING INDEX fooZ FOR IN-OPERATOR
FooZ is not a valid index for this subquery, it is incomplete due to the
partial index expression, with the result is doing a NOT IN over it returns
incorrect results.
- Deon
_______________________________________________
sqlite-users mailing list
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
--
D. Richard Hipp
***@sqlite.org
Deon Brewis
2018-12-08 21:53:18 UTC
Permalink
Thanks a lot for the quick fix!

I'm curious how that test that you added works? i.e. What causes the test to fail if the results are wrong? (Sorry, I know almost nothing about the SQLITE test framework).
https://www.sqlite.org/src/info/15bc915dd7ae25bf

- Deon

-----Original Message-----
From: ***@gmail.com <***@gmail.com> On Behalf Of Richard Hipp
Sent: Saturday, December 8, 2018 5:11 AM
To: SQLite mailing list <sqlite-***@mailinglists.sqlite.org>
Cc: Deon Brewis <***@outlook.com>; ***@integral.be
Subject: Re: [sqlite] SQLITE gives incorrect results for 'NOT IN' query if partial index exists

Thanks for the bug report and for the repro script.

This problem appears to have been in the code ever since partial indexes were first introduced in version 3.8.0 (2013-08-26). I have some other things I have to deal with today. I will get to this as soon as I can. There is a ticket at
https://www.sqlite.org/src/tktview/1d958d90596593a77420e590a6ab71756484f576
Post by Deon Brewis
The 'NOT IN' query at the bottom should not return any results. It returns '2'.
=====
create table foo(y, z);
insert into foo values(1,1);
insert into foo values(2,2);
CREATE INDEX fooZ on Foo(y) where z=1;
create table bar(x);
insert into bar values(1);
insert into bar values(2);
select * from bar WHERE x NOT IN (SELECT y from foo); -- this will wrongly return 2.
=====
explain query plan select * from bar WHERE x NOT IN (SELECT y from
foo); QUERY PLAN
|--SCAN TABLE bar
`--USING INDEX fooZ FOR IN-OPERATOR
FooZ is not a valid index for this subquery, it is incomplete due to
the partial index expression, with the result is doing a NOT IN over
it returns incorrect results.
- Deon
_______________________________________________
sqlite-users mailing list
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
--
D. Richard Hipp
***@sqlite.org
Richard Hipp
2018-12-08 22:47:33 UTC
Permalink
Post by Deon Brewis
I'm curious how that test that you added works? i.e. What causes the test
to fail if the results are wrong?
The particular test case you are referring to is written in in the TCL
language. The TCL tests are the oldest set of tests for SQLite since
SQLite is really a TCL-extension that escaped into the wild.

Everything in TCL is a command followed by zero or more arguments. In
this sense, TCL is very much like Bourne shell. COMMAND ARG1 ARG2
ARG3 .... Where TCL excels is in how it quotes the arguments. Curly
braces {...} are quoting characters that nest. Take, for example,
the "if" command in TCL:

if {$a<0} {
set a [expr {-$a}]
} else {
set a [expr {$a+10}]
}

In this case, the "if" command has four arguments

if EXPR SCRIPT else SCRIPT

When the "if" command runs, it evaluates its first argument EXPR. If
EXPR is true, then the if command runs the SCRIPT given in the second
argument. Otherwise it runs the SCRIPT in the fourth argument. The
magic, you see, is in the use of nested curly braces for quoting.

The test command you refer to is this:

do_execsql_test index6-12.1 {
DROP TABLE IF EXISTS t1;
DROP TABLE IF EXISTS t2;
CREATE TABLE t1(a,b);
INSERT INTO t1 VALUES(1,1);
INSERT INTO t1 VALUES(2,2);
CREATE TABLE t2(x);
INSERT INTO t2 VALUES(1);
INSERT INTO t2 VALUES(2);
SELECT 'one', * FROM t2 WHERE x NOT IN (SELECT a FROM t1);
CREATE INDEX t1a ON t1(a) WHERE b=1;
SELECT 'two', * FROM t2 WHERE x NOT IN (SELECT a FROM t1);
} {}

The name of the command is "do_execsql_test". That command takes
three arguments:

do_execsql_test TESTNAME SQL-SCRIPT EXPECTED-RESULT

This command simply runs the SQL found in its second argument and
accumulates the results. The accumulated result should exactly match
the third argument. If it does not match, then it prints an error
message and increments the error counter.

If you start with the canonical SQL source code, you can generate the
appropriate TCL interpreter by typing

./configure; make testfixture

Or on windows:

nmake /f Makefile.msc testfixture.exe

Then you say "./testfixture test/index6.test" to run that particular test file.
--
D. Richard Hipp
***@sqlite.org
Loading...