Discussion:
How dangerous is PRAGMA Synchronous OFF?
Dinsmore, Jeff
2005-11-16 14:49:10 UTC
Permalink
I have a little engine doing HL7 messaging using SQLite3 from Tcl
(Windows Server 2003 with a local attached RAID 5 for database).

I'm looking for some speed improvements, so yesterday I experimented
with PRAGMA synchronous = OFF. Holy Cats - it's WAAAY faster - like
night and day.

So, I'd like to use synchronous = OFF, but I'd like to understand a
little more about the possible side effects.

I understand that if the OS crashes or if I lose power to the server
it's possible that the SQLite database could be corrupted assuming that
the failure happens at a time after SQLite has handed off a database
write to the OS but before the OS has committed the write to disk.

My server is UPS'd and the server has dual power supplies, so power
should be pretty stable. The Windows OS also seems to be pretty stable
and the RAID has battery protected cache. So I'm fairly comfortable with
my OS, hardware and power reliability.

The Big Question: What evidence will I see from SQLite that a database
file is corrupt? Will it fail to open and return an error to that
effect? Or, will it (possibly) open the database and then fail somewhere
down the line when it hits the corrupted area of the database?

I have never experienced any SQLite database corruption. Does anyone
have a feel for how common database corruption is in the real world?

Thanks,

Jeff Dinsmore
IT - Interfaces
Ridgeview Medical Center
jeff.dinsmore-a/B9otNJ57NCiBvDG4KOnGD2FQJk+8+***@public.gmane.org
952.442.2191 x6592



Ridgeview Medical Center Confidentiality Notice: This email message, including any attachments, is for the sole use of the intended recipient(s) and may contain confidential and privileged information. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy all copies of the original message.
Preston Z
2005-11-17 00:54:38 UTC
Permalink
If the power never goes out and no programs ever crash on you system then
Synchronous = OFF is for you, but the rest of the world might still want it
ON. Really it sounds like the thing you need to worry about most is the
unexpected termination of your program. If you aren't worried about that...
The bottom line is how important is the speed VS Data?

Your program must do a lot of data modification for you to see the kind of
improvements that you sound like you are getting by turning off the Sync.
Are you using transactions where you can? Transactions can speed up things
quite a bit if you have more than one row of data that you are modifying at
a time... personally i would try transactions to speed things up before i
did something as drastic as turning off the sync.

As for detecting a corrupt database, it won't tell you on opening it that it
is corrupt... it will even execute statements till it hits a piece of the db
that is corrupt (at least the corruption that i have seen, which was from a
bad disk, so might not be the same). There is a symptom of a corrupt
database that i think is almost a sure thing.. run vacuum or dump the
database from the command line tool. If these run OK then you are probably
fine.

Anybody have a better way to tell if a db is garbage?

I hope i answered at least part of your question...
--Preston
Subject: [sqlite] How dangerous is PRAGMA Synchronous OFF?
Date: Wed, 16 Nov 2005 08:49:10 -0600
I have a little engine doing HL7 messaging using SQLite3 from Tcl
(Windows Server 2003 with a local attached RAID 5 for database).
I'm looking for some speed improvements, so yesterday I experimented
with PRAGMA synchronous = OFF. Holy Cats - it's WAAAY faster - like
night and day.
So, I'd like to use synchronous = OFF, but I'd like to understand a
little more about the possible side effects.
I understand that if the OS crashes or if I lose power to the server
it's possible that the SQLite database could be corrupted assuming that
the failure happens at a time after SQLite has handed off a database
write to the OS but before the OS has committed the write to disk.
My server is UPS'd and the server has dual power supplies, so power
should be pretty stable. The Windows OS also seems to be pretty stable
and the RAID has battery protected cache. So I'm fairly comfortable with
my OS, hardware and power reliability.
The Big Question: What evidence will I see from SQLite that a database
file is corrupt? Will it fail to open and return an error to that
effect? Or, will it (possibly) open the database and then fail somewhere
down the line when it hits the corrupted area of the database?
I have never experienced any SQLite database corruption. Does anyone
have a feel for how common database corruption is in the real world?
Thanks,
Jeff Dinsmore
IT - Interfaces
Ridgeview Medical Center
952.442.2191 x6592
Ridgeview Medical Center Confidentiality Notice: This email message,
including any attachments, is for the sole use of the intended recipient(s)
and may contain confidential and privileged information. Any unauthorized
review, use, disclosure or distribution is prohibited. If you are not the
intended recipient, please contact the sender by reply email and destroy
all copies of the original message.
Paul Smith
2005-11-17 10:32:18 UTC
Permalink
Post by Preston Z
If the power never goes out and no programs ever crash on you system then
Synchronous = OFF is for you, but the rest of the world might still want
it ON. Really it sounds like the thing you need to worry about most is the
unexpected termination of your program. If you aren't worried about that...
I'm pretty sure that even with Synchronous=off, a program crash won't cause
a problem. It's "only" OS crashes or power failures that are a problem.

From the docs:
"With synchronous OFF (0), SQLite continues without pausing as soon as it
has handed data off to the operating system. If the application running
SQLite crashes, the data will be safe, but the database might become
corrupted if the operating system crashes or the computer loses power
before that data has been written to the disk surface. On the other hand,
some operations are as much as 50 or more times faster with synchronous OFF."

So, if you have a UPS (with ordered shutdown software!) and are happy with
the OS reliability, I'd consider using "synchronous off" as long as I kept
regular backups, and didn't care if very recent data was lost. (But this is
really the case with most DBs - even with 'synchronous normal' there is a
small risk that the database could be corrupted)
Post by Preston Z
The bottom line is how important is the speed VS Data?
As for detecting a corrupt database, it won't tell you on opening it that
it is corrupt... it will even execute statements till it hits a piece of
the db that is corrupt (at least the corruption that i have seen, which
was from a bad disk, so might not be the same).
What has happened here with data corruption has generally been that the
program has started OK, then crashed later on. (At least, I'm guessing it
was database corruption because the crash happened in the SQLite DLL, and
restoring a backup DB file stopped the crashing)


Paul VPOP3 - Internet Email Server/Gateway
support-***@public.gmane.org http://www.pscs.co.uk/
Henry Miller
2005-11-17 14:28:09 UTC
Permalink
Post by Preston Z
Anybody have a better way to tell if a db is garbage?
You will know in the real world because things don't work randomly and
the users will complain.

My experience is like yours, a corrupt database can be worked with much
like a normal one so long as you stay away from any parts that are
corrupt.

In addition to the list of causes, I would like to add the one that bit
us:

Not noticing that sqlite requires that each instance be used in exactly
one thread of your program. (Someone in our early design decided to
wrap sqlite in a singleton class and then left. Since things worked
and he was a good programer normally we didn't find the problem until
customers ran into corruption)
d***@public.gmane.org
2005-11-17 01:29:35 UTC
Permalink
Post by Dinsmore, Jeff
The Big Question: What evidence will I see from SQLite that a database
file is corrupt? Will it fail to open and return an error to that
effect? Or, will it (possibly) open the database and then fail somewhere
down the line when it hits the corrupted area of the database?
The latter.
Post by Dinsmore, Jeff
I have never experienced any SQLite database corruption. Does anyone
have a feel for how common database corruption is in the real world?
Database corruption is uncommon. The usual causes are

(1) Running with synchronous=OFF
(2) Running on a network filesystem with broken file locking
(3) Using a customized OS layer (in place of the standard
os_unix.c or os_win.c) that doesn't work quite right.
(4) Manually deleting or renaming hot journals
d***@public.gmane.org
2005-11-17 01:33:37 UTC
Permalink
Post by Preston Z
If the power never goes out and no programs ever crash on you system then
Synchronous = OFF is for you, but the rest of the world might still want it
ON.
Minor clarification: Program crashes will not cause data corruption with
synchronous=off. But OS crashes will. As will power failures.
Post by Preston Z
Are you using transactions where you can? Transactions can speed up things
quite a bit if you have more than one row of data that you are modifying at
a time... personally i would try transactions to speed things up before i
did something as drastic as turning off the sync.
Preston's advice is very sound here. Listen to him.
Post by Preston Z
As for detecting a corrupt database, it won't tell you on opening it that it
is corrupt... it will even execute statements till it hits a piece of the db
that is corrupt (at least the corruption that i have seen, which was from a
bad disk, so might not be the same). There is a symptom of a corrupt
database that i think is almost a sure thing.. run vacuum or dump the
database from the command line tool. If these run OK then you are probably
fine.
Anybody have a better way to tell if a db is garbage?
VACUUM is good. Also "PRAGMA integrity_check". Note, however, that
both commands have to read the entire database file and so can run
slowing on large databases.

--
D. Richard Hipp <***@hwaci.com>
roger-9O0up6nH0NtWk0Htik3J/
2005-11-17 10:06:06 UTC
Permalink
Post by d***@public.gmane.org
Database corruption is uncommon. The usual causes are
(1) Running with synchronous=OFF
Which just about sums it up!

But, a question, I assume there is no risk if you are only using
selects, i.e. the db is read only. Is there any performance advantage
in having synchronous off in these circumstances? I suspect not.

I have in mind another scenario, that of bulk inserts to a newly created
table. Would it be viable and safe to:

PPRAGMA synchronous=OFF
Create table (with no indices)
Insert ad nauseum
PPRAGMA synchronous=FULL
Create Index....
Christian Smith
2005-11-17 16:08:58 UTC
Permalink
Post by roger-9O0up6nH0NtWk0Htik3J/
Post by d***@public.gmane.org
Database corruption is uncommon. The usual causes are
(1) Running with synchronous=OFF
Which just about sums it up!
But, a question, I assume there is no risk if you are only using
selects, i.e. the db is read only. Is there any performance advantage
in having synchronous off in these circumstances? I suspect not.
I have in mind another scenario, that of bulk inserts to a newly created
PPRAGMA synchronous=OFF
Create table (with no indices)
Insert ad nauseum
PPRAGMA synchronous=FULL
Create Index....
No more safe than the rest of the discussion. Plus, all you need to do is
wrap the lot in a transaction, and you'll only have synchronous IO when
updating the rollback journal and when closing the transaction. As many
row inserts can be aggregated into single journal updates, and journal
updates will only happen when you spill data from the cache, small to
medium bulk inserts should only incur the final closing synchronous IO
when committing the transaction.

Christian
--
/"\
\ / ASCII RIBBON CAMPAIGN - AGAINST HTML MAIL
X - AGAINST MS ATTACHMENTS
/ \
Loading...