Mike Shal
2009-10-02 18:39:02 UTC
It seems to be common knowledge that running selects inside a
transaction should offer no performance benefit (for example, this
thread has a number of replies to that effect:
http://www.mail-archive.com/sqlite-users-CzDROfG0BjIdnm+***@public.gmane.org/msg41699.html).
However, I noticed a curious behavior in my application. I was running
a large number of selects on my database, and when I removed the
seemingly superfluous begin/commit statements around the selects, the
application slowed down. I was able to reproduce this using the
sqlite3 command line:
$ time sqlite3 tmpdb '.read transaction.sql' > /dev/null
real 0m2.014s
user 0m1.985s
sys 0m0.030s
$ time sqlite3 tmpdb '.read notransaction.sql' > /dev/null
real 0m2.800s
user 0m2.222s
sys 0m0.578s
$ diff -au notransaction.sql transaction.sql
--- notransaction.sql 2009-10-02 14:12:12.000000000 -0400
+++ transaction.sql 2009-10-02 14:12:12.000000000 -0400
@@ -1,3 +1,4 @@
+begin;
select * from foo where entry='A';
select * from foo where entry='a';
select * from foo where entry='aa';
@@ -49998,3 +49999,4 @@
select * from foo where entry='degraded';
select * from foo where entry='degradedly';
select * from foo where entry='degradedness';
+commit;
Granted, it's not an order-of-magnitude difference, but ~800ms out of
2800ms seems like a pretty significant savings for just adding a
begin/commit around a bunch of select statements. These results were
fairly consistent with SQLite 3.6.18, 3.6.11, and 3.6.5 built from
source (I tried a few different versions to see if it might have been
a regression).
Does anybody know why just adding the begin/commit here improves
performance? If I have to run a large number of selects like this in
my application, should I always wrap them in a transaction?
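In case it helps to see the pattern concretely, here is a minimal
sketch of what I mean, using the SQLite C API (illustration only, not
my actual application code; the table matches the scripts below, and
real error checking is omitted):
#### batched-selects.c (illustration only)
#include <stdio.h>
#include <sqlite3.h>

int main(void)
{
    sqlite3 *db;
    sqlite3_stmt *stmt;
    /* stand-ins for the words from /usr/share/dict/words */
    const char *words[] = { "A", "a", "aa" };
    int i;

    if (sqlite3_open("tmpdb", &db) != SQLITE_OK) {
        fprintf(stderr, "open failed: %s\n", sqlite3_errmsg(db));
        return 1;
    }

    /* the seemingly superfluous begin/commit in question */
    sqlite3_exec(db, "begin;", 0, 0, 0);

    sqlite3_prepare_v2(db, "select * from foo where entry = ?;", -1,
                       &stmt, 0);
    for (i = 0; i < 3; i++) {
        sqlite3_bind_text(stmt, 1, words[i], -1, SQLITE_STATIC);
        while (sqlite3_step(stmt) == SQLITE_ROW)
            printf("%s\n", (const char *)sqlite3_column_text(stmt, 1));
        sqlite3_reset(stmt);
    }
    sqlite3_finalize(stmt);

    sqlite3_exec(db, "commit;", 0, 0, 0);
    sqlite3_close(db);
    return 0;
}
#### end batched-selects.c
(Compiles with something like: gcc batched-selects.c -lsqlite3.)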
If you want to try it out yourself, these are the scripts I used to
create a database of words from /usr/share/dict/words and to generate
the SQL for the test:
#### create.sh (overwrites 'tmpdb' and 'create.sql' in the current directory)
#! /bin/sh
if [ $# -lt 1 ]; then
    echo "Usage: $0 num" 1>&2
    exit 1
fi
rm -f tmpdb
# Emit the schema, then one insert per word, all inside one transaction.
(
    echo 'create table foo (id integer primary key not null, entry varchar(100), unique(entry));'
    echo 'begin;'
    for i in `head -n $1 /usr/share/dict/words`; do
        echo "insert into foo(entry) values('$i');"
    done
    echo 'commit;'
) > create.sql
echo "Create:"
time sqlite3 tmpdb '.read create.sql'
#### end create.sh
#### read.sh (overwrites 'transaction.sql' and 'notransaction.sql' in the current directory)
#! /bin/sh
if [ $# -lt 1 ]; then
    echo "Usage: $0 num" 1>&2
    exit 1
fi
rm -f transaction.sql notransaction.sql
# One select per word for the no-transaction case...
for i in `head -n $1 /usr/share/dict/words`; do
    echo "select * from foo where entry='$i';" >> notransaction.sql
done
# ...and the same selects wrapped in begin/commit for the other.
(echo 'begin;'; cat notransaction.sql; echo 'commit;') > transaction.sql
sync
echo "Without transaction:"
time sqlite3 tmpdb '.read notransaction.sql' > /dev/null
echo "Transaction:"
time sqlite3 tmpdb '.read transaction.sql' > /dev/null
#### end read.sh
The parameter is the number of words to insert (for create.sh) or the
number of selects to run (for read.sh):
$ sh create.sh 50000
$ sh read.sh 50000
Thanks,
-Mike