Discussion:
[sqlite] Defect: Redundant CTE table materialization gives wrong answer.
petern
2018-01-19 05:04:32 UTC
Permalink
WITH flips(s) AS (VALUES (random()>0), (random()>0), (random()>0))
SELECT sum(s),(SELECT sum(s) FROM flips) FROM flips;
sum(s),"(SELECT sum(s) FROM flips)"
1,3
--Expected output is 1,1.

Why isn't the constant notional table table [flips] materialized just once
per CTE?

FYI. PostgreSQL 9.6 materializes [flips] exactly once per CTE:

WITH flips(s) AS (VALUES (random()), (random()), (random()))
SELECT sum(s),(SELECT sum(s) FROM flips) FROM flips;
sum sum
1.503042308613658 1.503042308613658

Also FYI, double checked: random() is indeed run 3 times by PostgreSQL.

WITH flips(s) AS (VALUES (random()), (random()), (random()))
SELECT * FROM flips;

s
0.760850821621716
0.9941047639586031
0.48273737309500575

Peter
Shane Dev
2018-01-19 06:17:31 UTC
Permalink
Good question
Post by petern
WITH flips(s) AS (VALUES (random()>0), (random()>0), (random()>0))
SELECT sum(s),(SELECT sum(s) FROM flips) FROM flips;
sum(s),"(SELECT sum(s) FROM flips)"
1,3
--Expected output is 1,1.
Why isn't the constant notional table table [flips] materialized just once
per CTE?
WITH flips(s) AS (VALUES (random()), (random()), (random()))
SELECT sum(s),(SELECT sum(s) FROM flips) FROM flips;
sum sum
1.503042308613658 1.503042308613658
Also FYI, double checked: random() is indeed run 3 times by PostgreSQL.
WITH flips(s) AS (VALUES (random()), (random()), (random()))
SELECT * FROM flips;
s
0.760850821621716
0.9941047639586031
0.48273737309500575
Peter
_______________________________________________
sqlite-users mailing list
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Simon Slavin
2018-01-19 06:46:22 UTC
Permalink
Post by petern
WITH flips(s) AS (VALUES (random()), (random()), (random()))
SELECT * FROM flips;
s
0.760850821621716
0.9941047639586031
0.48273737309500575
Are you expecting the three values to be the same ? What about

INSERT INTO MyTable VAKUES (random(), random(), random())

? Would you expect those three values to be the same ?

If you think of random() as an external function, do you think of it as marked SQLITE_DETERMINISTIC as defined in the following page ?

<https://sqlite.org/c3ref/create_function.html>

Simon.
petern
2018-01-19 06:56:58 UTC
Permalink
NO. I wrote that to rule out the objection that PostrgeSQL is getting the
right answer because random() is only being computed once per statement.
I naturally DO expect random() to run every time it is called. What I
wrote is a formality for those thinking ahead about what else could be
happening.
Post by Simon Slavin
Post by petern
WITH flips(s) AS (VALUES (random()), (random()), (random()))
SELECT * FROM flips;
s
0.760850821621716
0.9941047639586031
0.48273737309500575
Are you expecting the three values to be the same ? What about
INSERT INTO MyTable VAKUES (random(), random(), random())
? Would you expect those three values to be the same ?
If you think of random() as an external function, do you think of it as
marked SQLITE_DETERMINISTIC as defined in the following page ?
<https://sqlite.org/c3ref/create_function.html>
Simon.
_______________________________________________
sqlite-users mailing list
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Clemens Ladisch
2018-01-19 08:10:43 UTC
Permalink
Post by petern
WITH flips(s) AS (VALUES (random()>0), (random()>0), (random()>0))
SELECT sum(s),(SELECT sum(s) FROM flips) FROM flips;
sum(s),"(SELECT sum(s) FROM flips)"
1,3
--Expected output is 1,1.
Why isn't the constant notional table table [flips] materialized just once
per CTE?
FYI. PostgreSQL 9.6 materializes [flips] exactly once per CTE-users
Its documentation says:
<https://www.postgresql.org/docs/9.6/static/queries-with.html>
| A useful property of WITH queries is that they are evaluated only once
| per execution of the parent query, even if they are referred to more
| than once by the parent query or sibling WITH queries. Thus, expensive
| calculations that are needed in multiple places can be placed within
| a WITH query to avoid redundant work. Another possible application is
| to prevent unwanted multiple evaluations of functions with side-
| effects. However, the other side of this coin is that the optimizer is
| less able to push restrictions from the parent query down into a WITH
| query than an ordinary subquery.

This is an implementation detail of Postgres, and it is not required by
the SQL specification. SQLite chose the other side of the coin.


Regards,
Clemens
petern
2018-01-19 09:30:35 UTC
Permalink
Another implementation detail? I was wondering where you were, Clemens.
You're entitled to your opinion but I'd rather hear from somebody with
actual source check-in privileges.

I've posted at least 5 legitimate defects which triangulate on a
consistency problem when trying to access properties of the current
statement by any of CTE, cross join, aux_data, pointer, and subtype API's.

What response did I get? Crickets - and then Clemens telling me I'm
imagining things because these are ALL implementation details.

Well, Clemens you are definitely wrong about the CTE. If table aliases of
CTE's were intended as ordinary independent views, the CTE syntax makes no
sense.
Every other implementation agrees with the expected set-mathematical
definition. SQLite is producing the wrong answer and I'm looking forward
to seeing the trouble ticket that fixes this problem.

Peter
Post by Clemens Ladisch
Post by petern
WITH flips(s) AS (VALUES (random()>0), (random()>0), (random()>0))
SELECT sum(s),(SELECT sum(s) FROM flips) FROM flips;
sum(s),"(SELECT sum(s) FROM flips)"
1,3
--Expected output is 1,1.
Why isn't the constant notional table table [flips] materialized just
once
Post by petern
per CTE?
FYI. PostgreSQL 9.6 materializes [flips] exactly once per CTE-users
<https://www.postgresql.org/docs/9.6/static/queries-with.html>
| A useful property of WITH queries is that they are evaluated only once
| per execution of the parent query, even if they are referred to more
| than once by the parent query or sibling WITH queries. Thus, expensive
| calculations that are needed in multiple places can be placed within
| a WITH query to avoid redundant work. Another possible application is
| to prevent unwanted multiple evaluations of functions with side-
| effects. However, the other side of this coin is that the optimizer is
| less able to push restrictions from the parent query down into a WITH
| query than an ordinary subquery.
This is an implementation detail of Postgres, and it is not required by
the SQL specification. SQLite chose the other side of the coin.
Regards,
Clemens
_______________________________________________
sqlite-users mailing list
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
R Smith
2018-01-19 14:08:31 UTC
Permalink
Post by petern
You're entitled to your opinion but I'd rather hear from somebody with
actual source check-in privileges.
Perhaps mail a dev directly then?  On here you will never satisfy that
requirement.
Shane Dev
2018-01-20 23:17:00 UTC
Permalink
The following statement executes the random() function twice -

sqlite> select random() union all select random();
random()
2678358683566407062
-5528866137931448843
sqlite> explain select random() union all select random();
addr opcode p1 p2 p3 p4 p5 comment
---- ------------- ---- ---- ---- ------------- -- -------------
0 Init 0 1 0 00
1 Function0 0 0 1 random(0) 00
2 ResultRow 1 1 0 00
3 Function0 0 0 1 random(0) 00
4 ResultRow 1 1 0 00
5 Halt 0 0 0 00

but if random() is replaced with round(1.1) -

sqlite> select round(1.1) union all select round(1.1);
round(1.1)
1.0
1.0
sqlite> explain select round(1.1) union all select round(1.1);
addr opcode p1 p2 p3 p4 p5 comment
---- ------------- ---- ---- ---- ------------- -- -------------
0 Init 0 6 0 00
1 Copy 2 1 0 00
2 ResultRow 1 1 0 00
3 Copy 2 1 0 00
4 ResultRow 1 1 0 00
5 Halt 0 0 0 00
6 Real 0 3 0 1.1 00
7 Function0 1 3 2 round(1) 01
8 Goto 0 1 0 00

It seems to be executed once only. Does this happen because random() is
flagged non-deterministic?
Post by Clemens Ladisch
Post by petern
WITH flips(s) AS (VALUES (random()>0), (random()>0), (random()>0))
SELECT sum(s),(SELECT sum(s) FROM flips) FROM flips;
sum(s),"(SELECT sum(s) FROM flips)"
1,3
--Expected output is 1,1.
Why isn't the constant notional table table [flips] materialized just
once
Post by petern
per CTE?
FYI. PostgreSQL 9.6 materializes [flips] exactly once per CTE-users
<https://www.postgresql.org/docs/9.6/static/queries-with.html>
| A useful property of WITH queries is that they are evaluated only once
| per execution of the parent query, even if they are referred to more
| than once by the parent query or sibling WITH queries. Thus, expensive
| calculations that are needed in multiple places can be placed within
| a WITH query to avoid redundant work. Another possible application is
| to prevent unwanted multiple evaluations of functions with side-
| effects. However, the other side of this coin is that the optimizer is
| less able to push restrictions from the parent query down into a WITH
| query than an ordinary subquery.
This is an implementation detail of Postgres, and it is not required by
the SQL specification. SQLite chose the other side of the coin.
Regards,
Clemens
_______________________________________________
sqlite-users mailing list
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
petern
2018-01-20 23:57:16 UTC
Permalink
Exactly. But that doesn't resolve the problem of duplicated
non-deterministic tables in a CTE.
CTE is the acryonym for Common TABLE Expression, not Common View
Expression.

eg: WITH x AS (<query with non-deterministic rows>)
SELECT * FROM x UNION SELECT * FROM x; --Should return just x, not
x1 UNION x2.

Peter
Post by Shane Dev
The following statement executes the random() function twice -
sqlite> select random() union all select random();
random()
2678358683566407062
-5528866137931448843
sqlite> explain select random() union all select random();
addr opcode p1 p2 p3 p4 p5 comment
---- ------------- ---- ---- ---- ------------- -- -------------
0 Init 0 1 0 00
1 Function0 0 0 1 random(0) 00
2 ResultRow 1 1 0 00
3 Function0 0 0 1 random(0) 00
4 ResultRow 1 1 0 00
5 Halt 0 0 0 00
but if random() is replaced with round(1.1) -
sqlite> select round(1.1) union all select round(1.1);
round(1.1)
1.0
1.0
sqlite> explain select round(1.1) union all select round(1.1);
addr opcode p1 p2 p3 p4 p5 comment
---- ------------- ---- ---- ---- ------------- -- -------------
0 Init 0 6 0 00
1 Copy 2 1 0 00
2 ResultRow 1 1 0 00
3 Copy 2 1 0 00
4 ResultRow 1 1 0 00
5 Halt 0 0 0 00
6 Real 0 3 0 1.1 00
7 Function0 1 3 2 round(1) 01
8 Goto 0 1 0 00
It seems to be executed once only. Does this happen because random() is
flagged non-deterministic?
Post by Clemens Ladisch
Post by petern
WITH flips(s) AS (VALUES (random()>0), (random()>0), (random()>0))
SELECT sum(s),(SELECT sum(s) FROM flips) FROM flips;
sum(s),"(SELECT sum(s) FROM flips)"
1,3
--Expected output is 1,1.
Why isn't the constant notional table table [flips] materialized just
once
Post by petern
per CTE?
FYI. PostgreSQL 9.6 materializes [flips] exactly once per CTE-users
<https://www.postgresql.org/docs/9.6/static/queries-with.html>
| A useful property of WITH queries is that they are evaluated only once
| per execution of the parent query, even if they are referred to more
| than once by the parent query or sibling WITH queries. Thus, expensive
| calculations that are needed in multiple places can be placed within
| a WITH query to avoid redundant work. Another possible application is
| to prevent unwanted multiple evaluations of functions with side-
| effects. However, the other side of this coin is that the optimizer is
| less able to push restrictions from the parent query down into a WITH
| query than an ordinary subquery.
This is an implementation detail of Postgres, and it is not required by
the SQL specification. SQLite chose the other side of the coin.
Regards,
Clemens
_______________________________________________
sqlite-users mailing list
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Cory Nelson
2018-01-21 03:31:36 UTC
Permalink
CTEs are not as-if-memoized in SQL Server either. I can't find any
standards language saying if they should or shouldn't be, which
typically indicates "anything goes".
Post by petern
Exactly. But that doesn't resolve the problem of duplicated
non-deterministic tables in a CTE.
CTE is the acryonym for Common TABLE Expression, not Common View
Expression.
eg: WITH x AS (<query with non-deterministic rows>)
SELECT * FROM x UNION SELECT * FROM x; --Should return just x, not
x1 UNION x2.
Peter
Post by Shane Dev
The following statement executes the random() function twice -
sqlite> select random() union all select random();
random()
2678358683566407062
-5528866137931448843
sqlite> explain select random() union all select random();
addr opcode p1 p2 p3 p4 p5 comment
---- ------------- ---- ---- ---- ------------- -- -------------
0 Init 0 1 0 00
1 Function0 0 0 1 random(0) 00
2 ResultRow 1 1 0 00
3 Function0 0 0 1 random(0) 00
4 ResultRow 1 1 0 00
5 Halt 0 0 0 00
but if random() is replaced with round(1.1) -
sqlite> select round(1.1) union all select round(1.1);
round(1.1)
1.0
1.0
sqlite> explain select round(1.1) union all select round(1.1);
addr opcode p1 p2 p3 p4 p5 comment
---- ------------- ---- ---- ---- ------------- -- -------------
0 Init 0 6 0 00
1 Copy 2 1 0 00
2 ResultRow 1 1 0 00
3 Copy 2 1 0 00
4 ResultRow 1 1 0 00
5 Halt 0 0 0 00
6 Real 0 3 0 1.1 00
7 Function0 1 3 2 round(1) 01
8 Goto 0 1 0 00
It seems to be executed once only. Does this happen because random() is
flagged non-deterministic?
Post by Clemens Ladisch
Post by petern
WITH flips(s) AS (VALUES (random()>0), (random()>0), (random()>0))
SELECT sum(s),(SELECT sum(s) FROM flips) FROM flips;
sum(s),"(SELECT sum(s) FROM flips)"
1,3
--Expected output is 1,1.
Why isn't the constant notional table table [flips] materialized just
once
Post by petern
per CTE?
FYI. PostgreSQL 9.6 materializes [flips] exactly once per CTE-users
<https://www.postgresql.org/docs/9.6/static/queries-with.html>
| A useful property of WITH queries is that they are evaluated only once
| per execution of the parent query, even if they are referred to more
| than once by the parent query or sibling WITH queries. Thus, expensive
| calculations that are needed in multiple places can be placed within
| a WITH query to avoid redundant work. Another possible application is
| to prevent unwanted multiple evaluations of functions with side-
| effects. However, the other side of this coin is that the optimizer is
| less able to push restrictions from the parent query down into a WITH
| query than an ordinary subquery.
This is an implementation detail of Postgres, and it is not required by
the SQL specification. SQLite chose the other side of the coin.
Regards,
Clemens
_______________________________________________
sqlite-users mailing list
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
--
Cory Nelson
http://int64.org
petern
2018-01-21 05:04:17 UTC
Permalink
Besides PostrgreSQL, MySQL also says x is TABLE with exactly one row as x
UNION x = x

WITH x AS (SELECT rand()) SELECT * FROM x UNION SELECT * FROM x;
rand()
0.6710336931711377

Peter
Post by Cory Nelson
CTEs are not as-if-memoized in SQL Server either. I can't find any
standards language saying if they should or shouldn't be, which
typically indicates "anything goes".
Post by petern
Exactly. But that doesn't resolve the problem of duplicated
non-deterministic tables in a CTE.
CTE is the acryonym for Common TABLE Expression, not Common View
Expression.
eg: WITH x AS (<query with non-deterministic rows>)
SELECT * FROM x UNION SELECT * FROM x; --Should return just x, not
x1 UNION x2.
Peter
Post by Shane Dev
The following statement executes the random() function twice -
sqlite> select random() union all select random();
random()
2678358683566407062
-5528866137931448843
sqlite> explain select random() union all select random();
addr opcode p1 p2 p3 p4 p5 comment
---- ------------- ---- ---- ---- ------------- -- -------------
0 Init 0 1 0 00
1 Function0 0 0 1 random(0) 00
2 ResultRow 1 1 0 00
3 Function0 0 0 1 random(0) 00
4 ResultRow 1 1 0 00
5 Halt 0 0 0 00
but if random() is replaced with round(1.1) -
sqlite> select round(1.1) union all select round(1.1);
round(1.1)
1.0
1.0
sqlite> explain select round(1.1) union all select round(1.1);
addr opcode p1 p2 p3 p4 p5 comment
---- ------------- ---- ---- ---- ------------- -- -------------
0 Init 0 6 0 00
1 Copy 2 1 0 00
2 ResultRow 1 1 0 00
3 Copy 2 1 0 00
4 ResultRow 1 1 0 00
5 Halt 0 0 0 00
6 Real 0 3 0 1.1 00
7 Function0 1 3 2 round(1) 01
8 Goto 0 1 0 00
It seems to be executed once only. Does this happen because random() is
flagged non-deterministic?
Post by Clemens Ladisch
Post by petern
WITH flips(s) AS (VALUES (random()>0), (random()>0), (random()>0))
SELECT sum(s),(SELECT sum(s) FROM flips) FROM flips;
sum(s),"(SELECT sum(s) FROM flips)"
1,3
--Expected output is 1,1.
Why isn't the constant notional table table [flips] materialized
just
Post by petern
Post by Shane Dev
Post by Clemens Ladisch
once
Post by petern
per CTE?
FYI. PostgreSQL 9.6 materializes [flips] exactly once per CTE-users
<https://www.postgresql.org/docs/9.6/static/queries-with.html>
| A useful property of WITH queries is that they are evaluated only
once
Post by petern
Post by Shane Dev
Post by Clemens Ladisch
| per execution of the parent query, even if they are referred to more
| than once by the parent query or sibling WITH queries. Thus,
expensive
Post by petern
Post by Shane Dev
Post by Clemens Ladisch
| calculations that are needed in multiple places can be placed within
| a WITH query to avoid redundant work. Another possible application
is
Post by petern
Post by Shane Dev
Post by Clemens Ladisch
| to prevent unwanted multiple evaluations of functions with side-
| effects. However, the other side of this coin is that the optimizer
is
Post by petern
Post by Shane Dev
Post by Clemens Ladisch
| less able to push restrictions from the parent query down into a
WITH
Post by petern
Post by Shane Dev
Post by Clemens Ladisch
| query than an ordinary subquery.
This is an implementation detail of Postgres, and it is not required
by
Post by petern
Post by Shane Dev
Post by Clemens Ladisch
the SQL specification. SQLite chose the other side of the coin.
Regards,
Clemens
_______________________________________________
sqlite-users mailing list
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
--
Cory Nelson
http://int64.org
_______________________________________________
sqlite-users mailing list
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Loading...