Nikhil Deshpande
7 years ago
Hi,
We have an application that in a Linux VM that's running into
SQLite DB corruption (after weeks and months of running,
4 such instances yet in different VMs).
We would appreciate some help in debugging this further to identify
source of corruption!
Symptom is btree page corruption, e.g.
There were no power-off or reboots in near time vicinity when the
corruption was detected. We have poured over this document
https://sqlite.org/howtocorrupt.html
many times to check if any of the conditions could apply,
but so far no leads yet.
We have also been unable to reproduce the corruption by stressing
application's SQLite DB read/write code paths for a week.
I'm attaching showdb output for the DB header and 2 corrupt pages
if it's of any hint.
---
A bit more application setup context/information:
- Linux kernel 4.4.41
- glibc 2.22
- Ext4 file system, mounted as (rw,relatime,data=ordered).
- Writer C++ process: sqlite-3.17
- Creates a set of "time series" tables, each table has 2 numeric
columns (timestamp, int) during initialization.
- Every 1 minute, 2 threads will do total 15 writes. (using "INSERT OR
REPLACE ... (timestamp, int)" SQL into 15 tables).
- SQLite DB opened with SQLITE_OPEN_READWRITE | SQLITE_OPEN_CREATE
flags, initialized with "PRAGMA journal_mode=wal;", threading mode
is Serialized for the libsqlite build, uses default VFS ("unix").
All other config params are default (e.g. autovacuum is disabled
etc.).
- A separate thread runs "PRAGMA quick_check;" periodically every 5
minutes, in its own separate DB connection.
- Reader process: sqlite-3.11 + Python 2.7.11
- Periodically reads time series tables for a given timestamp range
(usually latest 5 minutes) using SELECT queries (no INSERT/UPDATE/
DELETE from this process).
- Uses same same "PRAGMA journal_mode=wal", uses the sqlite3 DBAPI
module from Python standard library.
Apart from above 2, no other processes are accessing the SQLite DB file.
We have updated both the reader and writer to use latest SQLite 3.21,
but without understanding the cause of corruption, we are unable to
say if this update to latest 3.21 would indeed prevent further
occurrences.
Thanks,
Nikhil
We have an application that in a Linux VM that's running into
SQLite DB corruption (after weeks and months of running,
4 such instances yet in different VMs).
We would appreciate some help in debugging this further to identify
source of corruption!
Symptom is btree page corruption, e.g.
$ sqlite3 stats.sqlite "pragma integrity_check;"
*** in database main ***
Page 3818: btreeInitPage() returns error code 11
Page 46: btreeInitPage() returns error code 11
Error: database disk image is malformed
(Same error is raised for SELECT queries too.)*** in database main ***
Page 3818: btreeInitPage() returns error code 11
Page 46: btreeInitPage() returns error code 11
Error: database disk image is malformed
There were no power-off or reboots in near time vicinity when the
corruption was detected. We have poured over this document
https://sqlite.org/howtocorrupt.html
many times to check if any of the conditions could apply,
but so far no leads yet.
We have also been unable to reproduce the corruption by stressing
application's SQLite DB read/write code paths for a week.
I'm attaching showdb output for the DB header and 2 corrupt pages
if it's of any hint.
---
A bit more application setup context/information:
- Linux kernel 4.4.41
- glibc 2.22
- Ext4 file system, mounted as (rw,relatime,data=ordered).
- Writer C++ process: sqlite-3.17
- Creates a set of "time series" tables, each table has 2 numeric
columns (timestamp, int) during initialization.
- Every 1 minute, 2 threads will do total 15 writes. (using "INSERT OR
REPLACE ... (timestamp, int)" SQL into 15 tables).
- SQLite DB opened with SQLITE_OPEN_READWRITE | SQLITE_OPEN_CREATE
flags, initialized with "PRAGMA journal_mode=wal;", threading mode
is Serialized for the libsqlite build, uses default VFS ("unix").
All other config params are default (e.g. autovacuum is disabled
etc.).
- A separate thread runs "PRAGMA quick_check;" periodically every 5
minutes, in its own separate DB connection.
- Reader process: sqlite-3.11 + Python 2.7.11
- Periodically reads time series tables for a given timestamp range
(usually latest 5 minutes) using SELECT queries (no INSERT/UPDATE/
DELETE from this process).
- Uses same same "PRAGMA journal_mode=wal", uses the sqlite3 DBAPI
module from Python standard library.
Apart from above 2, no other processes are accessing the SQLite DB file.
We have updated both the reader and writer to use latest SQLite 3.21,
but without understanding the cause of corruption, we are unable to
say if this update to latest 3.21 would indeed prevent further
occurrences.
Thanks,
Nikhil