Discussion:
R-Tree Storage Optimization for Points
Mohit Sindhwani
2014-06-19 04:27:39 UTC
Permalink
Hello! We are using SQLite3 for storing geographical points that can be
queried using a bounding box (find everything that lies within this
box). Obviously, this query fits the capabilities of the RTree module
very well and it is a simple 2 dimensional search using an R-Tree that
has 5 columns.

However, since these are points that are stored in the table, x1=x2 and
y1=y2 when we do the insertion. As a former embedded systems engineer,
this feels like a waste since I can see that we are inserting exactly
the same value into the table.

INSERT into data_rtree(1000, 10, 5, 10, 5);
INSERT into data_rtree(1000, 17, 1, 17, 1);
and so on.

Is there a way that we could optimize the module so that we don't need
to store the same value twice? We are using this on a system with
constrained resources, so it helps to reduce the amount of storage space
we need for our database.

Thanks,
Mohit.
Noel Frankinet
2014-06-19 06:19:30 UTC
Permalink
Hi Mohit,

Maybe you should use the spatialite extension ?

Noël
Post by Mohit Sindhwani
Hello! We are using SQLite3 for storing geographical points that can be
queried using a bounding box (find everything that lies within this box).
Obviously, this query fits the capabilities of the RTree module very well
and it is a simple 2 dimensional search using an R-Tree that has 5 columns.
However, since these are points that are stored in the table, x1=x2 and
y1=y2 when we do the insertion. As a former embedded systems engineer,
this feels like a waste since I can see that we are inserting exactly the
same value into the table.
INSERT into data_rtree(1000, 10, 5, 10, 5);
INSERT into data_rtree(1000, 17, 1, 17, 1);
and so on.
Is there a way that we could optimize the module so that we don't need to
store the same value twice? We are using this on a system with constrained
resources, so it helps to reduce the amount of storage space we need for
our database.
Thanks,
Mohit.
_______________________________________________
sqlite-users mailing list
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
--
Noël Frankinet
Strategis sprl
0478/90.92.54
Mohit Sindhwani
2014-06-19 07:00:23 UTC
Permalink
Hi Noël,

Thanks for our reply.
Post by Noel Frankinet
Hi Mohit,
Maybe you should use the spatialite extension ?
Noël
I have to see if indeed spatialite handles the data more efficiently
since it also relies on the R-Tree for quite a bit of stuff. That said,
I do remember that once upon a time (admittedly 3 - 4 years ago), we had
trouble getting Spatialite compiled for a Windows CE target. Maybe, it
is time to revisit that again.


Best Regards,
Mohit.
Noel Frankinet
2014-06-19 07:21:37 UTC
Permalink
It should be painless if you omit geos, I think.
Post by Mohit Sindhwani
Hi Noël,
Thanks for our reply.
Post by Noel Frankinet
Hi Mohit,
Maybe you should use the spatialite extension ?
Noël
I have to see if indeed spatialite handles the data more efficiently since
it also relies on the R-Tree for quite a bit of stuff. That said, I do
remember that once upon a time (admittedly 3 - 4 years ago), we had trouble
getting Spatialite compiled for a Windows CE target. Maybe, it is time to
revisit that again.
Best Regards,
Mohit.
_______________________________________________
sqlite-users mailing list
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
--
Noël Frankinet
Strategis sprl
0478/90.92.54
Wolfgang Enzinger
2014-06-19 15:54:17 UTC
Permalink
Post by Mohit Sindhwani
However, since these are points that are stored in the table, x1=x2 and
y1=y2 when we do the insertion. As a former embedded systems engineer,
this feels like a waste since I can see that we are inserting exactly
the same value into the table.
INSERT into data_rtree(1000, 10, 5, 10, 5);
INSERT into data_rtree(1000, 17, 1, 17, 1);
and so on.
Is there a way that we could optimize the module so that we don't need
to store the same value twice?
Not sure why you think you have to store those point coordinates twice.

This works:

sqlite> CREATE VIRTUAL TABLE abc USING rtree(id,x,y);
sqlite> INSERT INTO abc VALUES(1,20,30);
sqlite> SELECT id FROM abc WHERE x>=10 AND x<=30 AND y >=20 AND y<=40;
1
sqlite> SELECT id FROM abc WHERE x>=40 AND x<=50 AND y >=40 AND y<=50;
sqlite>

HTH,
Wolfgang
Mohit Sindhwani
2014-06-19 16:57:12 UTC
Permalink
Hi Wolfgang,
Post by Wolfgang Enzinger
Not sure why you think you have to store those point coordinates twice.
sqlite> CREATE VIRTUAL TABLE abc USING rtree(id,x,y);
sqlite> INSERT INTO abc VALUES(1,20,30);
sqlite> SELECT id FROM abc WHERE x>=10 AND x<=30 AND y >=20 AND y<=40;
1
sqlite> SELECT id FROM abc WHERE x>=40 AND x<=50 AND y >=40 AND y<=50;
sqlite>
I do feel a bit stupid after reading your email... but I guess I was
working on the basis that the data we have is 2 dimensional and my
recollection was that we need 2 items per dimension.

Am I reading this wrong?
The SQLite R*Tree module is implemented as a virtual table. Each R*Tree
index is a virtual table with an odd number of columns between 3 and 11.
The first column is always a 64-bit signed integer primary key. The
other columns are pairs, one pair per dimension, containing the minimum
and maximum values for that dimension, respectively. A 1-dimensional
R*Tree thus has 3 columns. A 2-dimensional R*Tree has 5 columns. A
3-dimensional R*Tree has 7 columns. A 4-dimensional R*Tree has 9
columns. And a 5-dimensional R*Tree has 11 columns. The SQLite R*Tree
implementation does not support R*Trees wider than 5 dimensions.

Best Regards,
Mohit.
Dan Kennedy
2014-06-19 17:06:12 UTC
Permalink
Post by Mohit Sindhwani
Hi Wolfgang,
Post by Wolfgang Enzinger
Not sure why you think you have to store those point coordinates twice.
sqlite> CREATE VIRTUAL TABLE abc USING rtree(id,x,y);
sqlite> INSERT INTO abc VALUES(1,20,30);
sqlite> SELECT id FROM abc WHERE x>=10 AND x<=30 AND y >=20 AND y<=40;
1
sqlite> SELECT id FROM abc WHERE x>=40 AND x<=50 AND y >=40 AND y<=50;
sqlite>
I do feel a bit stupid after reading your email... but I guess I was
working on the basis that the data we have is 2 dimensional and my
recollection was that we need 2 items per dimension.
Am I reading this wrong?
The SQLite R*Tree module is implemented as a virtual table. Each
R*Tree index is a virtual table with an odd number of columns between
3 and 11. The first column is always a 64-bit signed integer primary
key. The other columns are pairs, one pair per dimension, containing
the minimum and maximum values for that dimension, respectively. A
1-dimensional R*Tree thus has 3 columns. A 2-dimensional R*Tree has 5
columns. A 3-dimensional R*Tree has 7 columns. A 4-dimensional R*Tree
has 9 columns. And a 5-dimensional R*Tree has 11 columns. The SQLite
R*Tree implementation does not support R*Trees wider than 5 dimensions.
Probably not. The CREATE TABLE code above actually creates a
1-dimensional r-tree with deceptive column names. Column "y" contains
the maximum value for the first dimension:

SQLite version 3.8.5 2014-06-19 12:34:33
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> CREATE VIRTUAL TABLE abc USING rtree(id,x,y);
sqlite> INSERT INTO abc VALUES(NULL, 20, 10);
Error: constraint failed
sqlite>
Alysson Gonçalves de Azevedo
2014-06-19 17:10:08 UTC
Permalink
Post by Dan Kennedy
sqlite> INSERT INTO abc VALUES(NULL, 20, 10);
*The first column is always a 64-bit signed integer primary key*. The other
Post by Dan Kennedy
columns are pairs, one pair per dimension, containing the minimum and
maximum values for that dimension, respectively.
Alysson Gonçalves de Azevedo

"Anarcho-syndicalism is a way of preserving freedom." - Monty Python
Post by Dan Kennedy
Hi Wolfgang,
Post by Wolfgang Enzinger
Not sure why you think you have to store those point coordinates twice.
sqlite> CREATE VIRTUAL TABLE abc USING rtree(id,x,y);
sqlite> INSERT INTO abc VALUES(1,20,30);
sqlite> SELECT id FROM abc WHERE x>=10 AND x<=30 AND y >=20 AND y<=40;
1
sqlite> SELECT id FROM abc WHERE x>=40 AND x<=50 AND y >=40 AND y<=50;
sqlite>
I do feel a bit stupid after reading your email... but I guess I was
working on the basis that the data we have is 2 dimensional and my
recollection was that we need 2 items per dimension.
Am I reading this wrong?
The SQLite R*Tree module is implemented as a virtual table. Each R*Tree
index is a virtual table with an odd number of columns between 3 and 11.
The first column is always a 64-bit signed integer primary key. The other
columns are pairs, one pair per dimension, containing the minimum and
maximum values for that dimension, respectively. A 1-dimensional R*Tree
thus has 3 columns. A 2-dimensional R*Tree has 5 columns. A 3-dimensional
R*Tree has 7 columns. A 4-dimensional R*Tree has 9 columns. And a
5-dimensional R*Tree has 11 columns. The SQLite R*Tree implementation does
not support R*Trees wider than 5 dimensions.
Probably not. The CREATE TABLE code above actually creates a 1-dimensional
r-tree with deceptive column names. Column "y" contains the maximum value
SQLite version 3.8.5 2014-06-19 12:34:33
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> CREATE VIRTUAL TABLE abc USING rtree(id,x,y);
sqlite> INSERT INTO abc VALUES(NULL, 20, 10);
Error: constraint failed
sqlite>
_______________________________________________
sqlite-users mailing list
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Dan Kennedy
2014-06-19 18:23:20 UTC
Permalink
Post by Alysson Gonçalves de Azevedo
Post by Dan Kennedy
sqlite> INSERT INTO abc VALUES(NULL, 20, 10);
*The first column is always a 64-bit signed integer primary key*.
Right, but if you insert NULL it assigns a value automatically. The
constraint failure is because the minimum value of the first dimension
is larger than the maximum.


SQLite version 3.8.5 2014-06-19 12:34:33
...
sqlite> CREATE VIRTUAL TABLE abc USING rtree(id,x,y);
sqlite> INSERT INTO abc VALUES(1, 20, 10);
Error: constraint failed

sqlite> INSERT INTO abc VALUES(NULL, 10, 20);
sqlite> SELECT * FROM abc;
1|10.0|20.0
Post by Alysson Gonçalves de Azevedo
The other
Post by Dan Kennedy
columns are pairs, one pair per dimension, containing the minimum and
maximum values for that dimension, respectively.
Alysson Gonçalves de Azevedo
"Anarcho-syndicalism is a way of preserving freedom." - Monty Python
Post by Dan Kennedy
Hi Wolfgang,
Post by Wolfgang Enzinger
Not sure why you think you have to store those point coordinates twice.
sqlite> CREATE VIRTUAL TABLE abc USING rtree(id,x,y);
sqlite> INSERT INTO abc VALUES(1,20,30);
sqlite> SELECT id FROM abc WHERE x>=10 AND x<=30 AND y >=20 AND y<=40;
1
sqlite> SELECT id FROM abc WHERE x>=40 AND x<=50 AND y >=40 AND y<=50;
sqlite>
I do feel a bit stupid after reading your email... but I guess I was
working on the basis that the data we have is 2 dimensional and my
recollection was that we need 2 items per dimension.
Am I reading this wrong?
The SQLite R*Tree module is implemented as a virtual table. Each R*Tree
index is a virtual table with an odd number of columns between 3 and 11.
The first column is always a 64-bit signed integer primary key. The other
columns are pairs, one pair per dimension, containing the minimum and
maximum values for that dimension, respectively. A 1-dimensional R*Tree
thus has 3 columns. A 2-dimensional R*Tree has 5 columns. A 3-dimensional
R*Tree has 7 columns. A 4-dimensional R*Tree has 9 columns. And a
5-dimensional R*Tree has 11 columns. The SQLite R*Tree implementation does
not support R*Trees wider than 5 dimensions.
Probably not. The CREATE TABLE code above actually creates a 1-dimensional
r-tree with deceptive column names. Column "y" contains the maximum value
SQLite version 3.8.5 2014-06-19 12:34:33
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> CREATE VIRTUAL TABLE abc USING rtree(id,x,y);
sqlite> INSERT INTO abc VALUES(NULL, 20, 10);
Error: constraint failed
sqlite>
_______________________________________________
sqlite-users mailing list
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Wolfgang Enzinger
2014-06-19 19:01:18 UTC
Permalink
Post by Dan Kennedy
Probably not. The CREATE TABLE code above actually creates a
1-dimensional r-tree with deceptive column names. Column "y" contains
SQLite version 3.8.5 2014-06-19 12:34:33
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> CREATE VIRTUAL TABLE abc USING rtree(id,x,y);
sqlite> INSERT INTO abc VALUES(NULL, 20, 10);
Error: constraint failed
sqlite>
I stand corrected. Should have tried this before:

sqlite> INSERT INTO abc VALUES(2,30,20);
Error: constraint failed

Note to self: r-tree is about *ranges* in 1 to 5 dimensions.

Wolfgang
Mohit Sindhwani
2014-06-20 11:08:52 UTC
Permalink
Hello All...
Post by Wolfgang Enzinger
sqlite> INSERT INTO abc VALUES(2,30,20);
Error: constraint failed
Note to self: r-tree is about *ranges* in 1 to 5 dimensions.
Coming back to the original problem again... I was wondering if there is
a way that we could save space on the R-Tree storage if the item being
inserting is just a single point (such that x1=x2 and y1=y2).

Thanks for the answers thus far.

Best Regards,
Mohit.
Clemens Ladisch
2014-06-20 12:08:04 UTC
Permalink
I was wondering if there is a way that we could save space on the
R-Tree storage if the item being inserting is just a single point
(such that x1=x2 and y1=y2).
Not without changing the SQLite code.

A non-leaf R-tree node must store the extents covered by all its
children, so these are (n-dimensional) rectangles. At the moment,
SQLite assumes that user data has exactly the same format, so such
a change would not be trivial.


Regards,
Clemens
Mohit Sindhwani
2014-06-20 14:52:26 UTC
Permalink
Post by Clemens Ladisch
Not without changing the SQLite code.
A non-leaf R-tree node must store the extents covered by all its
children, so these are (n-dimensional) rectangles. At the moment,
SQLite assumes that user data has exactly the same format, so such
a change would not be trivial.
Thanks Clemens - I was afraid that might be the case. I guess that's a
project for a different time and day.

Best Regards,
Mohit.

Loading...