[sqlite] multiple processes working on one database file

Discussion:

Christoph P.U. Kukulies

2016-10-05 10:10:54 UTC

Would it be possible to have multiple processes (webapplication) working
on a singe Sqlite database file? WRT locking, deadlocks etc.

--
Christoph

Simon Slavin

2016-10-05 10:25:31 UTC

Permalink

Would it be possible to have multiple processes (webapplication) working on a singe Sqlite database file? WRT locking, deadlocks etc.

Yes. SQLite incorporates multi-process/multi-thread locking, exponential backoff, and such things. This all works fine unless you have intentionally turned off those features.

Don't forget to set a timeout (I'd suggest at least one minute) for each connection:

<https://www.sqlite.org/pragma.html#pragma_busy_timeout>

Simon.

Jens Alfke

2016-10-05 15:30:06 UTC

Permalink

Post by Simon Slavin
Yes. SQLite incorporates multi-process/multi-thread locking, exponential backoff, and such things. This all works fine unless you have intentionally turned off those features.

I did find that SQLite’s own locking was less effective/fair than using a mutex, and in situations with a lot of write contention could lead to starvation and timeouts. It’s possible I was doing things wrong, as the wrapper library I started with included a SQLite busy-handler that would wait-and-retry. (But IIRC, if I removed the busy handler altogether, SQLite would just immediately return a db-busy error if another thread had a lock.)

There were situations where one thread was rapidly writing to the database, in multiple transactions but with only brief gaps between them, and other connections on other threads that wanted to make small writes were starved.

Apparently the waiting connections were trying to spin-lock, i.e. attempting to get the write lock, and on failure sleeping a little while and retrying. But if the first thread has the write lock 99% of the time, it can take quite a while for another thread to get lucky and snatch the lock. If there are multiple other threads wanting the lock, it takes multiple times as long.

To fix this I added my own (pthread) mutex and used it as a bottleneck between my code and SQLite, so before trying any writes a thread would wait to lock the mutex. The advantage is that the mutex keeps a queue of waiting threads, so it grants access fairly, to the thread that’s been waiting the longest. But this gets harder to implement, of course, if the database is being used in multiple *processes*, not just threads in a single process.

—Jens

Simon Slavin

2016-10-05 16:53:33 UTC

Permalink

Post by Jens Alfke
I did find that SQLite’s own locking was less effective/fair than using a mutex, and in situations with a lot of write contention could lead to starvation and timeouts. It’s possible I was doing things wrong, as the wrapper library I started with included a SQLite busy-handler that would wait-and-retry. (But IIRC, if I removed the busy handler altogether, SQLite would just immediately return a db-busy error if another thread had a lock.)

SQLite has its own built-in busy-handler which is written to work extremely efficiently with the things SQLite usually needs to do (though not necessarily with the way you're using SQLite). It's not obvious whether the wrapper library you're using simply calls this one or implements its own, possibly less-efficient, one.

Post by Jens Alfke
There were situations where one thread was rapidly writing to the database, in multiple transactions but with only brief gaps between them, and other connections on other threads that wanted to make small writes were starved.
Apparently the waiting connections were trying to spin-lock, i.e. attempting to get the write lock, and on failure sleeping a little while and retrying. But if the first thread has the write lock 99% of the time, it can take quite a while for another thread to get lucky and snatch the lock. If there are multiple other threads wanting the lock, it takes multiple times as long.

The backoff algorithm implemented in SQLite is very likely to give each connection its fair turn. At least it did so in my tests of greedy vs. non-greedy three- and seven-process situations. But it does still fail to do this correctly sometimes depending on the timing of the requests.

Post by Jens Alfke
To fix this I added my own (pthread) mutex and used it as a bottleneck between my code and SQLite, so before trying any writes a thread would wait to lock the mutex. The advantage is that the mutex keeps a queue of waiting threads, so it grants access fairly, to the thread that’s been waiting the longest. But this gets harder to implement, of course, if the database is being used in multiple *processes*, not just threads in a single process.

Indeed. Yet another problem that nobody has completely solved yet.

Simon.

Jay Kreibich

2016-10-06 15:02:26 UTC

Permalink

Post by Jens Alfke

Post by Jens Alfke
I did find that SQLite’s own locking was less effective/fair than using

a mutex, and in situations with a lot of write contention could lead to
starvation and timeouts. It’s possible I was doing things wrong, as the
wrapper library I started with included a SQLite busy-handler that would
wait-and-retry. (But IIRC, if I removed the busy handler altogether, SQLite
would just immediately return a db-busy error if another thread had a lock.)
SQLite has its own built-in busy-handler which is written to work
extremely efficiently with the things SQLite usually needs to do (though
not necessarily with the way you're using SQLite). It's not obvious
whether the wrapper library you're using simply calls this one or
implements its own, possibly less-efficient, one.

I'll admit I haven't done this low level type of SQLite programming in some
versions, but the traditional issue is that very, very few applications
handle deadlocking correctly. Even with an intelligent busy-handler, there
are situations when multiple processes can get into a deadlock and the only
way out is for one or more processes to abort their current transactions.
I've seen very few applications that handle that situation correctly, and
it is generally something that can't be put in a wrapper or utility library
(unless it is abstracting out all transactions and database interactions).

-j