Gabriele Lanaro
2017-04-19 18:50:19 UTC
Dear SQLite communiy,
I’m trying to increase the performance of a query to its maximum possible
speed.
The schema is roughly constituted of two tables, one is a full text index,
fts5_table, while the other is called datatable.
The query in question is a full text query on the full text index joined
with the other table.
*SELECT* fts5.column, datatable.column2 *FROM* fts5_table, datatable *WHERE*
fts5_table *MATCH* ‘term’ *AND* datatable.id = fts5_column.rowid *ORDER BY*
rank *ASC* *LIMIT* 10;
Another query looks for counts of the same kind:
*SELECT* count(datatable.id) *FROM* fts5_table, datatable *WHERE*
fts5_table *MATCH* ‘term’ *AND* datatable.id = fts5_column.rowid
On a database of about the size of 1 GB of text (evenly distributed amount
100 000 rows). To give an idea of the speeds obtained I achieve a speed of
about 40 ms for the first and 80 ms for the second query.
Things I tried:
1) ANALYZE
Decreased the speed by orders of magnitude.
2) pragma cache_size=-256000;
Improved the performance considerably, increasing further than this value
has no effect
3) pragma mmap_size
I didn’t see an improvement
4) putting the database in a ramdisk
No improvement
I was wondering which other strategies could be effecting in increasing the
query speed.
I also noticed that the *ORDER BY* statement in the first query slows down
the query quite a bit (and this is probably due to calculating the bm25
score on all the results to get a ranking) I was wondering if there’s other
ways to prevent that.
Thanks,
Gabriele
I’m trying to increase the performance of a query to its maximum possible
speed.
The schema is roughly constituted of two tables, one is a full text index,
fts5_table, while the other is called datatable.
The query in question is a full text query on the full text index joined
with the other table.
*SELECT* fts5.column, datatable.column2 *FROM* fts5_table, datatable *WHERE*
fts5_table *MATCH* ‘term’ *AND* datatable.id = fts5_column.rowid *ORDER BY*
rank *ASC* *LIMIT* 10;
Another query looks for counts of the same kind:
*SELECT* count(datatable.id) *FROM* fts5_table, datatable *WHERE*
fts5_table *MATCH* ‘term’ *AND* datatable.id = fts5_column.rowid
On a database of about the size of 1 GB of text (evenly distributed amount
100 000 rows). To give an idea of the speeds obtained I achieve a speed of
about 40 ms for the first and 80 ms for the second query.
Things I tried:
1) ANALYZE
Decreased the speed by orders of magnitude.
2) pragma cache_size=-256000;
Improved the performance considerably, increasing further than this value
has no effect
3) pragma mmap_size
I didn’t see an improvement
4) putting the database in a ramdisk
No improvement
I was wondering which other strategies could be effecting in increasing the
query speed.
I also noticed that the *ORDER BY* statement in the first query slows down
the query quite a bit (and this is probably due to calculating the bm25
score on all the results to get a ranking) I was wondering if there’s other
ways to prevent that.
Thanks,
Gabriele