Andrea Aime
2017-12-29 18:59:12 UTC
Hi,
I'm writing some software that can read data off GeoPackage (SQLite + rtree
+ standardized
set of metadata tables) as an alternative format for spatial databases,
like PostgreSql with the
PostGIS extension.
Now, when I use PostGIS the query plan optimizer checks the bbox provided
in the query
and verifies if using the spatial index is a good idea, or not. At
conferences I've been told
that the query has to be rather selective (e.g., retrieve less than 10% of
the data) in order
for the index to actually be used.
With SQLite R-Tree I'm using either a join with the index virtual table, or
a subquery
retrieving the ids from the rtree. Regardless, the query is basically
ordering SQLite
to use the index.
So I was wondering, is there any opportunity to run a blazing fast
pre-query against
the index that will tell me whether joining/subquerying into the rtree is
going to be a win, or not?
Also, while I'm here, in PostGIS there is an option to cluster a table
along the spatial
index, in order to reduce IO when the spatial index is the main access
driver (which is often
the case in geographic information systems). I looked at tables with no
rowids, but
it does not seem like a way to do it (spatial index not being suitable for
primary key).
Anything else that could be done here?
Cheers
Andrea
I'm writing some software that can read data off GeoPackage (SQLite + rtree
+ standardized
set of metadata tables) as an alternative format for spatial databases,
like PostgreSql with the
PostGIS extension.
Now, when I use PostGIS the query plan optimizer checks the bbox provided
in the query
and verifies if using the spatial index is a good idea, or not. At
conferences I've been told
that the query has to be rather selective (e.g., retrieve less than 10% of
the data) in order
for the index to actually be used.
With SQLite R-Tree I'm using either a join with the index virtual table, or
a subquery
retrieving the ids from the rtree. Regardless, the query is basically
ordering SQLite
to use the index.
So I was wondering, is there any opportunity to run a blazing fast
pre-query against
the index that will tell me whether joining/subquerying into the rtree is
going to be a win, or not?
Also, while I'm here, in PostGIS there is an option to cluster a table
along the spatial
index, in order to reduce IO when the spatial index is the main access
driver (which is often
the case in geographic information systems). I looked at tables with no
rowids, but
it does not seem like a way to do it (spatial index not being suitable for
primary key).
Anything else that could be done here?
Cheers
Andrea