Discussion:
[sqlite] Is it possible that dropping a big table takes very long
Cecil Westerhof
2016-04-16 09:59:06 UTC
Permalink
I am playing a bit with SQLite. I first had a table with 1E8 elements. When
trying to drop this it looked like SQLite got hung. I tried it from DB
Browser and a Java program.
I just tried it with a table of 1E7 elements. That was dropped in about 13
seconds.
I will try it again with 1E8 elements, but it takes 4½ hours to generate.
Is it possible that SQLite has trouble dropping very large tables? It was
5.2 GB. With 1E7 elements the table is 512 MB.

The definition of the table:
CREATE TABLE testUniqueUUID (
    UUID blob,

    PRIMARY KEY(UUID)
    CHECK(TYPEOF(UUID) = 'blob' AND
          LENGTH(UUID) = 16 AND
          SUBSTR(HEX(UUID), 13, 1) == '4' AND
          SUBSTR(HEX(UUID), 17, 1) IN ('8', '9', 'A', 'B')
    )
);
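(Background on the CHECK: it enforces the version-4 UUID layout, where hex
digit 13 must be the version nibble '4' and hex digit 17 the variant nibble,
8 through B. A minimal Java sketch of a generator whose output would pass
this CHECK; class and method names are illustrative:)

import java.nio.ByteBuffer;
import java.util.UUID;

public class UuidBlob {
    // java.util.UUID.randomUUID() yields a version-4, IETF-variant UUID,
    // so its 16-byte big-endian form satisfies the CHECK above.
    static byte[] randomUuidBytes() {
        UUID u = UUID.randomUUID();
        ByteBuffer bb = ByteBuffer.allocate(16);
        bb.putLong(u.getMostSignificantBits());
        bb.putLong(u.getLeastSignificantBits());
        return bb.array();
    }
}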
--
Cecil Westerhof
Simon Slavin
2016-04-16 12:51:49 UTC
Permalink
Post by Cecil Westerhof
I first had a table with 1E8 elements. When
trying to drop this it looked like SQLite got hung.
Please tell us which version of SQLite and which journal mode you're using.

My guess is that the operations for 1e7 rows fit in one of the caches involved but that the same is not true of the 1e8 row table.
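If one of the caches is the limit, a cheap experiment is to give SQLite a much
larger page cache before the DROP. A minimal JDBC sketch (class name and
database path are illustrative; the URL form assumes the common sqlite-jdbc
driver):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class DropWithBigCache {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection("jdbc:sqlite:test.db");
             Statement st = conn.createStatement()) {
            // A negative cache_size is in KiB: this asks for roughly 1 GiB.
            st.execute("PRAGMA cache_size = -1048576");
            long t0 = System.nanoTime();
            st.execute("DROP TABLE testUniqueUUID");
            System.out.printf("DROP took %.1f s%n", (System.nanoTime() - t0) / 1e9);
        }
    }
}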

Simon.
Cecil Westerhof
2016-04-16 13:32:08 UTC
Permalink
Post by Simon Slavin
Post by Cecil Westerhof
I first had a table with 1E8 elements. When
trying to drop this it looked like SQLite got hung.
Please tell us which version of SQLite and which journal mode you're using.
​I work with Java. With:
SELECT SQLITE_VERSION()
I get:
3.8.11​


​How can I get the journal mode in Jav?​
With DB Browser I get Delete.
But when I in sqlite3 give:
PRAGMA schema.journal_mode;
I get:
Error: unknown database schema
--
Cecil Westerhof
Simon Slavin
2016-04-16 13:41:16 UTC
Permalink
Post by Cecil Westerhof
SELECT SQLITE_VERSION()
3.8.11
Thanks.
Post by Cecil Westerhof
​How can I get the journal mode in Jav?​
With DB Browser I get Delete.
PRAGMA schema.journal_mode;
Error: unknown database schema
That is not well explained: 'schema' there is a placeholder for the name of an attached database, such as 'main'. Try just

PRAGMA journal_mode;

Simon.
Cecil Westerhof
2016-04-16 13:46:58 UTC
Permalink
Post by Simon Slavin
Post by Cecil Westerhof
​How can I get the journal mode in Jav?​
With DB Browser I get Delete.
PRAGMA schema.journal_mode;
Error: unknown database schema
That is not well explained. Try just
PRAGMA journal_mode;
​That gives delete.
--
Cecil Westerhof
R Smith
2016-04-16 12:52:25 UTC
Permalink
Post by Cecil Westerhof
I am playing a bit with SQLite. I first had a table with 1E8 elements. When
trying to drop this it looked like SQLite got hung. I tried it from DB
Browser and a Java program.
I just tried it with a table of 1E7 elements. That was dropped in about 13
seconds.
I will try it again with 1E8 elements, but it takes 4½ hours to generate.
Is it possible that SQLite has trouble dropping very large tables? It was
5.2 GB. With 1E7 elements the table is 512 MB.
CREATE TABLE testUniqueUUID (
UUID blob,
PRIMARY KEY(UUID)
CHECK(TYPEOF(UUID) = 'blob' AND
LENGTH(UUID) = 16 AND
SUBSTR(HEX(UUID), 13, 1) == '4' AND
SUBSTR(HEX(UUID), 17, 1) IN ('8', '9', 'A', 'B')
)
);
I just duplicated the experiment (though I did not include the CHECK
constraint) on SQLitespeed using the standard SQLite library, and here
are the results for the 10 million rows (1E+7):


-- SQLite version 3.9.2 [ Release: 2015-11-02 ] on SQLitespeed version 2.0.2.4.

-- Script Items: 4 Parameter Count: 0
-- 2016-04-16 14:42:43.333 | [Info] Script Initialized, Started executing...
--
================================================================================================

CREATE TABLE testUniqueUUID (
UUID BLOB PRIMARY KEY
);

-- Item Stats: Item No: 1  Query Size (Chars): 62
-- VM Work Steps: 43  Rows Modified: 0
-- Full Query Time: 0d 00h 00m and 00.001s
-- Query Result: Success.
--
------------------------------------------------------------------------------------------------

WITH RndGen(i,RndBlob) AS (
SELECT 0, (randomblob(16))
UNION ALL
SELECT i+1, (randomblob(16)) FROM RndGen WHERE i<10000000
)
INSERT INTO testUniqueUUID (UUID) SELECT RndBlob FROM RndGen;


-- Item Stats: Item No: 2  Query Size (Chars): 199
-- Result Columns: 0  Result Rows: 0
-- VM Work Steps: 290000050  Rows Modified: 10000001
-- Full Query Time: 0d 00h 02m and 10.878s
-- Query Result: Success.
--
------------------------------------------------------------------------------------------------

SELECT UUID FROM testUniqueUUID LIMIT 10;


-- UUID
-- ------------------------------------
-- 0xA3044750B1A8567E7FD9DACD5C0C64CF
-- 0xC6C6AAFAE6179E7B28867D5FB6AED7A6
-- 0x2267D5856D5D7601FA9E0D8A1E6A66BC
-- 0x63BEB2ECC58EA6D02D30ED27A3A50971
-- 0x18477B93BD35C7A2ED83010619CA3887
-- 0x47D7F3284B094CBE3BF6D77DC974F147
-- 0x77736E93FAFE0436199CE84760A1072A
-- 0x015E14BEA6D3C889958329CAF9C11F5C
-- 0x1805A44908518BE6D6DE6BA63B5A9B71
-- 0xE21DA4DFD367286DE89343FB02B9F8EF

-- Item Stats: Item No: 3  Query Size (Chars): 43
-- Result Columns: 1  Result Rows: 10
-- VM Work Steps: 48  Rows Modified: 0
-- Full Query Time: 0d 00h 00m and 00.001s
-- Query Result: Success.
--
------------------------------------------------------------------------------------------------

DROP TABLE testUniqueUUID;

-- Item Stats: Item No: 4  Query Size (Chars): 28
-- VM Work Steps: 149  Rows Modified: 0
-- Full Query Time: 0d 00h 00m and 00.721s
-- Query Result: Success.
--
------------------------------------------------------------------------------------------------

-- Script Stats: Total Script Execution Time: 0d 00h 02m and 11.733s
-- Total Script Query Time: 0d 00h 02m and 11.601s
-- Total Database Rows Changed: 10000001
-- Total Virtual-Machine Steps: 290000290
-- Last executed Item Index: 4
-- Last Script Error:
--
------------------------------------------------------------------------------------------------

-- 2016-04-16 14:44:55.054 | [Success] Script Success.

As you can see, the INSERT obviously takes some time (even more so if
the CHECK constraint is added), but the DROP TABLE takes almost no time
here...

Let me try the 100 million rows, this may take some time - I will post
again when it is done.

Cheers,
Ryan
Cecil Westerhof
2016-04-16 13:39:33 UTC
Permalink
Post by R Smith
Post by Cecil Westerhof
I am playing a bit with SQLite. I first had a table with 1E8 elements. When
trying to drop this it looked like SQLite got hung. I tried it from DB
Browser and a Java program.
I just tried it with a table of 1E7 elements. That was dropped in about 13
seconds.
I will try it again with 1E8 elements, but it takes 4½ hours to generate.
Is it possible that SQLite has trouble dropping very large tables? It was
5.2 GB. With 1E7 elements the table is 512 MB.
CREATE TABLE testUniqueUUID (
UUID blob,
PRIMARY KEY(UUID)
CHECK(TYPEOF(UUID) = 'blob' AND
LENGTH(UUID) = 16 AND
SUBSTR(HEX(UUID), 13, 1) == '4' AND
SUBSTR(HEX(UUID), 17, 1) IN ('8', '9', 'A', 'B')
)
);
I just duplicated the experiment (though I did not include the CHECK
constraint) on SQLitespeed using the standard SQLite library, and here are
the results:
-- SQLite version 3.9.2 [ Release: 2015-11-02 ] on SQLitespeed version 2.0.2.4.
-- Script Items: 4 Parameter Count: 0
-- 2016-04-16 14:42:43.333 | [Info] Script Initialized, Started executing...
--
================================================================================================
CREATE TABLE testUniqueUUID (
UUID BLOB PRIMARY KEY
);
-- Full Query Time: 0d 00h 00m and 00.001s
-- Query Result: Success.
--
------------------------------------------------------------------------------------------------
WITH RndGen(i,RndBlob) AS (
SELECT 0, (randomblob(16))
UNION ALL
SELECT i+1, (randomblob(16)) FROM RndGen WHERE i<10000000
)
INSERT INTO testUniqueUUID (UUID) SELECT RndBlob FROM RndGen;
-- Result Columns: 0  Result Rows: 0
-- Full Query Time: 0d 00h 02m and 10.878s
-- Query Result: Success.
--
------------------------------------------------------------------------------------------------
​For me this took about 7 minutes. But I do also more,​
Post by R Smith
SELECT UUID FROM testUniqueUUID LIMIT 10;
-- UUID
-- ------------------------------------
-- 0xA3044750B1A8567E7FD9DACD5C0C64CF
-- 0xC6C6AAFAE6179E7B28867D5FB6AED7A6
-- 0x2267D5856D5D7601FA9E0D8A1E6A66BC
-- 0x63BEB2ECC58EA6D02D30ED27A3A50971
-- 0x18477B93BD35C7A2ED83010619CA3887
-- 0x47D7F3284B094CBE3BF6D77DC974F147
-- 0x77736E93FAFE0436199CE84760A1072A
-- 0x015E14BEA6D3C889958329CAF9C11F5C
-- 0x1805A44908518BE6D6DE6BA63B5A9B71
-- 0xE21DA4DFD367286DE89343FB02B9F8EF
-- Result Columns: 1  Result Rows: 10
-- Full Query Time: 0d 00h 00m and 00.001s
-- Query Result: Success.
--
------------------------------------------------------------------------------------------------
DROP TABLE testUniqueUUID;
-- Full Query Time: 0d 00h 00m and 00.721s
-- Query Result: Success.
--
------------------------------------------------------------------------------------------------
-- Script Stats: Total Script Execution Time: 0d 00h 02m and 11.733s
-- Total Script Query Time: 0d 00h 02m and 11.601s
-- Total Database Rows Changed: 10000001
-- Total Virtual-Machine Steps: 290000290
-- Last executed Item Index: 4
--
------------------------------------------------------------------------------------------------
-- 2016-04-16 14:44:55.054 | [Success] Script Success.
As you can see, the INSERT obviously takes some time (even more so if the
CHECK constraint is added), but the DROP TABLE takes almost no time here...
​The drop is a very big difference: .7 seconds or 13. Is almost 20 times as
long. Could I be doing something wrong?​
Post by R Smith
Let me try the 100 million rows, this may take some time - I will post
again when it is done.
​I am curious.

If useful, I could share the program I am using.
--
Cecil Westerhof
R Smith
2016-04-16 14:00:47 UTC
Permalink
Post by Cecil Westerhof
Post by R Smith
-- 2016-04-16 14:44:55.054 | [Success] Script Success.
As you can see, the INSERT obviously takes some time (even more so if the
CHECK constraint is added), but the DROP TABLE takes almost no time here...
​The drop is a very big difference: .7 seconds or 13. Is almost 20 times as
long. Could I be doing something wrong?​
The big one is still running, so we'll see - but I do already think
there is something very different. Simon might be on to something with
the cache. The journal mode should also make a difference, but not that
big, I think.
Once this run finishes, I will try the 10-Mil one with different journal
modes. I have never noticed a large time taken for dropping tables
though, but then I do not often drop very large tables.
Post by Cecil Westerhof
If useful, I could share the program I am using.
As long as the version is known and the modes used, the rest of the
software differences should be small - it's usually down to hardware.
Cecil Westerhof
2016-04-16 14:09:27 UTC
Permalink
Post by R Smith
-- 2016-04-16 14:44:55.054 | [Success] Script Success.
Post by R Smith
As you can see, the INSERT obviously takes some time (even more so if the
CHECK constraint is added), but the DROP TABLE takes almost no time here...
​The drop is a very big difference: .7 seconds or 13. Is almost 20 times
as
long. Could I be doing something wrong?​
The big one is still running, so we'll see - but I do think already there
is something very different. Simon might be on to something with the cache.
The journal mode should also make a difference, but not that big I think.
Once this run finishes, I will try the 10-Mil one with different journal
modes. I have never noticed a large time taken for dropping tables though,
but then I do not often drop very large tables.
​I am known for doing strange things. ;-)​
If useful, I could share the program I am using.
As long as the version is known and the modes used, the rest of the
software differences should be small - it's usually down to hardware.
​One strange thing the commandline and DB Browser are using ​3.8.10.2 while
Java is using 3.8.11.
--
Cecil Westerhof
R Smith
2016-04-16 14:19:26 UTC
Permalink
Post by Cecil Westerhof
One strange thing: the command line and DB Browser are using 3.8.10.2
while Java is using 3.8.11.
Your command-line is simply outdated - you can download the newest from
http://sqlite.org/download/
DB-Browser might have a newer version also, you can check their Github
space.
JAVA gets released when it does, not much you can do there.

Latest version is 3.12.1

That said, I don't think the time difference is version-related.
Cecil Westerhof
2016-04-16 14:27:32 UTC
Permalink
Post by R Smith
Post by Cecil Westerhof
One strange thing: the command line and DB Browser are using 3.8.10.2
while Java is using 3.8.11.
Your command-line is simply outdated - you can download the newest from
http://sqlite.org/download/
DB-Browser might have a newer version also, you can check their Github
space.
JAVA gets released when it does, not much you can do there.
Latest version is 3.12.1
That said, I don't think the time difference is version-related.
​I have to much fun playing, so I postpone the updates. ;-)

I find it a bit strange, because I thought openSUSE was mostly reasonably
up to date.

And I want to switch to Debian, which is even further behind at 3.8.7.
--
Cecil Westerhof
R Smith
2016-04-16 18:36:34 UTC
Permalink
Post by Cecil Westerhof
Post by R Smith
Let me try the 100 million rows, this may take some time - I will post
again when it is done.
​I am curious.
Well, here it is then, 100-million rows: the INSERT took a lot of time,
nearly 5 hours, but again, the DROP was only a few seconds.

The journal mode used here is also DELETE - which is quite fast. I am
not quite sure why your DROP would take that long. I think that this
exercise was large enough to reproduce any cache-related problems on my
system too, so I'm not suspecting that anymore. Perhaps someone else
might have a thought....



-- SQLite version 3.9.2 [ Release: 2015-11-02 ] on SQLitespeed version 2.0.2.4.

-- Script Items: 4 Parameter Count: 0
-- 2016-04-16 14:52:07.639 | [Info] Script Initialized, Started executing...
--
================================================================================================

CREATE TABLE testUniqueUUID (
UUID BLOB PRIMARY KEY
);

-- Item Stats: Item No: 1  Query Size (Chars): 62
-- VM Work Steps: 43  Rows Modified: 0
-- Full Query Time: 0d 00h 00m and 00.001s
-- Query Result: Success.
--
------------------------------------------------------------------------------------------------

WITH RndGen(i,RndBlob) AS (
SELECT 0, (randomblob(16))
UNION ALL
SELECT i+1, (randomblob(16)) FROM RndGen WHERE i<100000000
)
INSERT INTO testUniqueUUID (UUID) SELECT RndBlob FROM RndGen;


-- Item Stats: Item No: 2  Query Size (Chars): 200
-- Result Columns: 0  Result Rows: 0
-- VM Work Steps: -1394967246  Rows Modified: 100000001
-- Full Query Time: 0d 04h 56m and 56.963s
-- Query Result: Success.
--
------------------------------------------------------------------------------------------------

SELECT UUID FROM testUniqueUUID LIMIT 10;


-- UUID
-- ------------------------------------
-- 0x504D9DE6FF67E1BB868E0A2C66C0BCD6
-- 0x85F00A6BECC22F7247569B38EE38FA4C
-- 0x2129D4229A9541F911FD7078C6F58885
-- 0xC07C44C16CBAF4447348EF9058ADCEB4
-- 0xF71DA6C5A845A8AA84561D42E06FA986
-- 0x3F36F6EAC1D9B02D3C96B2678E990F11
-- 0xF127EC52E4C915A7D5B136B5B0999E6A
-- 0x4660D0FC0BA0D860E1578372EEFD664E
-- 0xB73F66F8791C79F62A94529389B55641
-- 0x99E266627DC455E49E604858074629E6

-- Item Stats: Item No: 3  Query Size (Chars): 43
-- Result Columns: 1  Result Rows: 10
-- VM Work Steps: 48  Rows Modified: 0
-- Full Query Time: -- --- --- --- --.----
-- Query Result: Success.
--
------------------------------------------------------------------------------------------------

DROP TABLE testUniqueUUID;

-- Item Stats: Item No: 4  Query Size (Chars): 28
-- VM Work Steps: 149  Rows Modified: 0
-- Full Query Time: 0d 00h 00m and 08.052s
-- Query Result: Success.
--
------------------------------------------------------------------------------------------------

-- Script Stats: Total Script Execution Time: 0d 04h 57m and 05.231s
-- Total Script Query Time: 0d 04h 57m and 05.016s
-- Total Database Rows Changed: 100000001
-- Total Virtual-Machine Steps: -1394967006
-- Last executed Item Index: 4
-- Last Script Error:
--
------------------------------------------------------------------------------------------------

-- 2016-04-16 19:49:12.857 | [Success] Script Success.
Cecil Westerhof
2016-04-16 19:25:58 UTC
Permalink
Post by R Smith
Post by R Smith
Let me try the 100 million rows, this may take some time - I will post
again when it is done.
​I am curious.
Well, here it is then, 100-million rows: the INSERT took a lot of time,
nearly 5 hours, but again, the DROP was only a few seconds.
​That is what I would expect, but not what I see. :'-(

I am filling the database again, but now with a text UUID instead of a blob
UUID. That takes a ‘little’ more time. When it is filled I will try again.

I have another problem as well. My CPU usage is about 15%, but the load average is
also about 15. (This is on a Linux system.) This results (sometimes) in a
very sluggish system. Can the load be a SQLite problem, or is it a Java
problem? (When the program is not running, the load average is a lot lower.)
--
Cecil Westerhof
Simon Slavin
2016-04-16 19:39:57 UTC
Permalink
Post by Cecil Westerhof
I am filling the database again, but now with a text UUID instead of a blob
UUID. That takes a ‘little’ more time. When it is filled I will try again.
I assume you're doing many INSERT commands between BEGIN and END.
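If not, wrapping the whole loop in one transaction is the usual first fix. A
minimal JDBC sketch of the pattern (class name and database path are
illustrative; the URL form assumes the common sqlite-jdbc driver; text UUIDs
as in the run being discussed):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.util.UUID;

public class TxInsert {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection("jdbc:sqlite:test.db")) {
            conn.setAutoCommit(false);                 // implicit BEGIN
            try (PreparedStatement ps = conn.prepareStatement(
                    "INSERT INTO testUniqueUUID (UUID) VALUES (?)")) {
                for (int i = 0; i < 1_000_000; i++) {
                    ps.setString(1, UUID.randomUUID().toString());
                    ps.executeUpdate();
                }
            }
            conn.commit();                             // END: one journal write
        }
    }
}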
Post by Cecil Westerhof
I have another problem as well. My CPU usage is about 15%, but the load average is
also about 15. (This is on a Linux system.) This results (sometimes) in a
very sluggish system. Can the load be a SQLite problem, or is it a Java
problem? (When the program is not running, the load average is a lot lower.)
SQLite is usually input/output bound (except sometimes when your database is in memory). It's running at the speed of your storage device. It's normal to find SQLite executing commands as fast as it can but with plenty of CPU left over.

Simon.
Rob Willett
2016-04-16 19:44:26 UTC
Permalink
Cecil,

If you have a load average of 15 then that normally means you have a
massively overloaded Linux box. I don’t know your system but I get
worried around a load average of 3-4 on our boxes. Load Average is a
very crude measurement but a high number tends to be bad.

If your CPU is only running at 15% (how do you know this?) then the
problem is likely elsewhere. My first thought is swap space, check what
the disk io is, the fact you mention Java would immediately make me look
at the amount of memory allocated to the JVM. For some reason many
JVMs allocate a pitifully small amount of memory; Java then runs out
of memory pretty quickly and spends the next few secs/mins/hours
fighting with the OS for resources.

Top, netstat, ps etc. are your friends here. You need to look at them and
see what’s kicking your load average into the stratosphere. I’d be
surprised if it’s SQLite; we don’t normally drop tables, but from
memory, dropping a 10GB table took no time. It might have been seconds
or a minute, we didn’t measure it as it wasn’t an issue.

Rob
Cecil Westerhof
2016-04-16 20:42:52 UTC
Permalink
Post by Rob Willett
If you have a load average of 15 then that normally means you have a
massively overloaded Linux box. I don’t know your system but I get worried
around a load average of 3-4 on our boxes. Load Average is a very crude
measurement but a high number tends to be bad.
​Yes, normally it is about 2-3 on my system.

Post by Rob Willett
If your CPU is only running at 15% (how do you know this?) then the
problem is likely elsewhere.
​I am using htop for that.


Post by Rob Willett
My first thought is swap space, check what the disk io is, the fact you
mention Java would immediately make me look at the amount of memory
allocated to the JVM. For some reason many JVM’s allocate a pitifully small
amount of memory, Java then runs out of memory pretty quickly and spends
the next few secs/mins/hours fighting with the OS for resources.
​At the moment it is difficult to check, because of the sluggishness, ​

​but luckily I have a service running that puts vmstat info in a SQLite
database. :-D So after the program is finished I can look into the data.​

The Java program I am running does not use swap and also not much memory
(113 MB). I wrote some scripts for getting that kind of information:
https://github.com/CecilWesterhof/BashLibrary/tree/master/bin


Post by Rob Willett
Top, netstat, ps etc. are your friends here. You need to look at them and
see what’s kicking your load average into the stratosphere. I’d be surprised
if it’s SQLite,
​I think Java is more likely, but it does not hurt to check.


Post by Rob Willett
we don’t normally drop tables, but from memory, dropping a 10GB table took
no time. It might have been seconds or a minute, we didn’t measure it as it
wasn’t an issue.
​At the moment I am filling the table. After is is filled I will try
dropping it again.
--
Cecil Westerhof
R Smith
2016-04-17 02:16:11 UTC
Permalink
Post by Cecil Westerhof
I am filling the database again, but now with text UUID instead of blob
UUID. That takes a ‘little’ more time. When it is filled I try again.
Don't forget to copy the DB file once it is populated, that way you can
do multiple drop-tests with copies of the DB file, so no more filling it
with data.
If you can make a file that consistently drops the table for a long
time, the devs could use it to determine the glitch.
Cecil Westerhof
2016-04-17 07:48:54 UTC
Permalink
Post by Cecil Westerhof
I am filling the database again, but now with a text UUID instead of a blob
UUID. That takes a ‘little’ more time. When it is filled I will try again.
Post by R Smith
Don't forget to copy the DB file once it is populated, that way you can do
multiple drop-tests with copies of the DB file, so no more filling it with
data.
If you can make a file that consistently drops the table for a long time,
the devs could use it to determine the glitch.
​I thought about that, but thanks anyway.

I have to start again, because I was generating it with a text UUID instead
of a blob UUID. It ran for almost 17 hours, but it looked like it would take
at least another three days. So I cancelled it. I will start the blob
version again. That takes about 4½ hours.
--
Cecil Westerhof
Keith Medcalf
2016-04-16 23:03:59 UTC
Permalink
Post by Cecil Westerhof
I have another problem as well. My CPU usage is about 15%, but the load average is
also about 15. (This is on a Linux system.) This results (sometimes) in a
very sluggish system. Can the load be a SQLite problem, or is it a Java
problem? (When the program is not running, the load average is a lot
lower.)
You have a slow CPU with multiple cores (15% means you probably have quad core -- probably with Intel Hyper-Slowness as well (as in not SMT), or you have dual-core SMT), and the dispatcher is brain-dead and dispatching the single-threaded application on the same CPU as more critical OS workers, resulting in swamping that core with compute use and preventing competing threads (such as the OS or the gooey) from being other than -- well -- gooey.

You claim this is Linux. There are many brands and flavours (and versions) of Linux, all tuned to behave differently. You do not specify the particular version and brand it is.
Cecil Westerhof
2016-04-17 07:40:56 UTC
Permalink
Post by Keith Medcalf
Post by Cecil Westerhof
I have another problem as well. My CPU usage is about 15%, but the load average is
also about 15. (This is on a Linux system.) This results (sometimes) in a
very sluggish system. Can the load be a SQLite problem, or is it a Java
problem? (When the program is not running, the load average is a lot lower.)
You have a slow CPU with multiple cores (15% means you probably have quad
core -- probably with Intel Hyper-Slowness as well (as in not SMT), or you
have dual-core SMT), and the dispatcher is brain-dead and dispatching the
single threaded application on the same CPU as more critical OS workers
resulting in swamping that core with compute use and preventing competing
threads (such as the OS or the gooey) from being other than -- well --
gooey.
You claim this is Linux. There are many brands and flavours (and
versions) of Linux, all tuned to behave differently. You do not specify
the particular version and brand it is.
​It never hurts to give some extra information. I only expected SQLite
help. But I do not mind to get other help. :-D

The CPU is: Intel(R) Core(TM) i7-3632QM CPU @ 2.20GHz.
It is a little strange: it has four cores but Linux thinks it has eight.
None of the eight has a high load: sometimes one reaches 60-70%, but most
of the time the load is a lot lower.
The operating system is openSUSE 13.2.

free -m gives:
                   total       used       free     shared    buffers     cached
Mem:               15923      15737        185       1300        105       8131
-/+ buffers/cache:  7500       8422
Swap:               2053       1659        394


​RSSMemory usage for java
========================================
RSSMemory 6 MB by PID=9705
RSSMemory 54 MB by PID=9731
RSSMemory 75 MB by PID=28844
RSSMemory 112 MB by PID=18743
RSSMemory 121 MB by PID=28880
RSSMemory 345 MB by PID=1036
========================================
Total used RSSMemory: 713 MB

swap usage for java
========================================
swap 2 MB by PID=28880
swap 7 MB by PID=28844
swap 128 MB by PID=9731
swap 132 MB by PID=9705
swap 442 MB by PID=1036
========================================
Total used swap: 711 MB

The program has PID 18743 and does not use swap.

ps -l 18743 gives:
F S UID PID PPID C PRI NI ADDR SZ WCHAN TTY TIME CMD
0 S 1000 18743 28670 4 80 0 - 1718302 futex_ pts/12 45:29 java RandomUUID CheckUUID.properties

My own psPid 18743 gives:
USER PID PPID TT START ELAPSED TIME STAT COMMAND
cecil 18743 28670 pts/12 Apr16 16:29:42 00:45:27 Sl+ java RandomUUID CheckUUID.properties

So it does not have a high CPU usage: about 5%.

I think I have to cancel the program, because I have:
23:03:12: Inserted 6.00e+07 UUID's
00:50:33: Inserted 6.10e+07 UUID's
02:56:31: Inserted 6.20e+07 UUID's
04:56:06: Inserted 6.30e+07 UUID's
06:53:03: Inserted 6.40e+07 UUID's
09:14:08: Inserted 6.50e+07 UUID's

That will take at least another 3 days: 35 more steps of 1E6 UUID's at
roughly two hours each.

This is the text-based UUID. I expected it to take longer, but the
blob-based version took 4½ hours. So that is a very big difference. But
that is something for another post.

It took almost half an hour to write, because of the sluggish behaviour.
:'-(
--
Cecil Westerhof
Rob Willett
2016-04-17 08:13:03 UTC
Permalink
Cecil,

Linux reporting 8 cores is due to hyper-threading on the four cores.
That’s normal.

One of the cores reporting a high usage is normal if you have a
single-threaded app that simply cannot be moved, e.g. many perl programs
exhibit this behaviour. That’s fine and to be expected.

I do not have the whole thread for what you reported but I did read
somebody ask if you have put your inserts between a BEGIN/END
transaction. That will make a massive difference to your speed. Also
I’m unclear as to how Java fits in all of this. Perhaps you gave a
better indication further up the thread I do not have.

There are a lot of very, very talented people on the list; I am not
one of them :), I would strongly urge you to provide more information
rather than less, e.g. you provide summary information for Java by RSS.
It would be a lot more useful to have the full raw data so people can
look for quirks and anomalies rather than simply you interpreting it for
us. I am NOT the best person to talk about SQLite performance, however I
am more familiar with Unix/Linux performance and administration.

Since most people here do not have the same issues as you and I have no
doubt they are hitting Sqlite far harder than I or you can, I would look
at what you are doing that is different and Java keeps jumping out at
me. I have run (though not programmed as I’m not a Java developer) a
number of programs running Java as the backend onto large DB2 databases
and the first thing the developers would do would be to increase the
amount of memory that Java is allowed to allocate and use. They would
tend to throw 4GB at each JVM or more if the sysadmins would let them.

I still don’t get a feeling for where the problem is though as the
information is fragmentary.

Rob
Cecil Westerhof
2016-04-17 09:38:36 UTC
Permalink
Post by Rob Willett
I do not have the whole thread for what you reported but I did read
somebody ask if you have put your inserts between a BEGIN/END transaction.
That will make a massive difference to your speed. Also I’m unclear as to
how Java fits in all of this. Perhaps you gave a better indication further
up the thread I do not have.
​Oops, the I did not read good enough. :-(

I start with a:

conn.setAutoCommit(false);

but that is not the same? It does make a big difference. Without it, 1E4
records take 25 minutes; with it, it takes less than a second. I did not
even try 1E5 without it, but with it, it takes 2-3 seconds.

It is at least partly to do with SQLite. When I had the load of 15 I was
using the table definition:
CREATE TABLE testUniqueUUID (
    UUID text,

    PRIMARY KEY(UUID)
);

I am running it again with the table definition:
CREATE TABLE testUniqueUUID (
    UUID blob,

    PRIMARY KEY(UUID)
    CHECK(TYPEOF(UUID) = 'blob' AND
          LENGTH(UUID) = 16 AND
          SUBSTR(HEX(UUID), 13, 1) == '4' AND
          SUBSTR(HEX(UUID), 17, 1) IN ('8', '9', 'A', 'B')
    )
);

and the load is now around ten. Still high, but it does not cripple my
system.​
Post by Rob Willett
There are a lot of very, very talented people on the list, I am an not one
of them :), I would strongly urge you to provide more information rather
than less, e.g. you provide summary information for Java by RSS. It would
be a lot more useful to have the full raw data so people can look for
quirks and anomalies rather than simply you interpreting it for us. I am
NOT the best person to talk about SQLite performance, however I am more
familiar with Unix/Linux performance and administration.
​Which information would be useful? Ihis is the /proc/PID/status
information:
Name: java
State: S (sleeping)
Tgid: 26455
Ngid: 0
Pid: 26455
PPid: 28670
TracerPid: 0
Uid: 1000 1000 1000 1000
Gid: 100 100 100 100
FDSize: 256
Groups: 7 33 100 486 498 1000
NStgid: 26455
NSpid: 26455
NSpgid: 26455
NSsid: 28670
VmPeak: 6935564 kB
VmSize: 6873208 kB
VmLck: 0 kB
VmPin: 0 kB
VmHWM: 111664 kB
VmRSS: 104784 kB
VmData: 6828392 kB
VmStk: 140 kB
VmExe: 4 kB
VmLib: 17148 kB
VmPTE: 492 kB
VmPMD: 32 kB
VmSwap: 0 kB
Threads: 20
SigQ: 1/63646
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 0000000000000000
SigCgt: 0000000181005ccf
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000
CapBnd: 0000003fffffffff
Seccomp: 0
Cpus_allowed: ff
Cpus_allowed_list: 0-7
Mems_allowed:
00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000001
Mems_allowed_list: 0
voluntary_ctxt_switches: 6
nonvoluntary_ctxt_switches: 1

To my untrained eye VmPeak and VmSize look high.

Post by Rob Willett
Since most people here do not have the same issues as you and I have no
doubt they are hitting Sqlite far harder than I or you can, I would look at
what you are doing that is different and Java keeps jumping out at me.
​That was also my idea. I only wanted to confirm it was not a SQLite
problem. But looking at changing the UUID from text to blob makes such a
big difference, I suspect it is partly a SQLite problem.


Post by Rob Willett
I have run (though not programmed as I’m not a Java developer) a number of
programs running Java as the backend onto large DB2 databases and the first
thing the developers would do would be to increase the amount of memory
that Java is allowed to allocate and use. They would tend to throw 4GB at
each JVM or more if the sysadmins would let them.
​Seeing that ​the program does not use swap, I do not think that the amount
of memory the program has is a problem.
--
Cecil Westerhof
Simon Slavin
2016-04-17 10:03:31 UTC
Permalink
Post by Cecil Westerhof

​conn.setAutoCommit(false);
but that is not the same?
Yes, that does the same as BEGIN ... END. At least, according to the documentation it does.

But you caused me to look up how the JDBC works, especially for operations which involve a lot of memory. It turns out that this is not a 'thin' shim which just translates Java calls to SQLite. It's a 'thick' shim and does lots of things between the two to make all its databases look like they work the same way.

The result of this is that almost everything you see resulting from your calls is done by JDBC, not SQLite. This includes whatever caused your initial query about some operations taking a long time. Whatever it is, it's probably some consequence of how JDBC works, not how SQLite works, and experts on Java are going to understand it better than experts on SQLite.

You can probably verify this by downloading the SQLite shell tool and performing the same operations in it (e.g. DROP TABLE) as you do in your Java code. I'm betting you don't get the same slowdowns in the same places.

Simon.
Cecil Westerhof
2016-04-17 10:44:21 UTC
Permalink
Post by Simon Slavin
Post by Cecil Westerhof

​conn.setAutoCommit(false);
but that is not the same?
Yes, that does the same as BEGIN ... END. At least, according to the
documentation it does.
But you caused me to look up how the JDBC works, especially for operations
which involve a lot of memory. It turns out that this is not a 'thin' shim
which just translates Java calls to SQLite. It's a 'thick' shim and does
lots of things between the two to make all its databases look like they
work the same way.
Another field I should expand my knowledge in. ;-)



Post by Simon Slavin
The result of this is that almost everything you see resulting from your
calls is done by JDBC, not SQLite. This includes whatever caused your
initial query about some operations taking a long time. Whatever it is,
it's probably some consequence of how JDBC works, not how SQLite works, and
experts on Java are going to understand it better than experts on SQLite.
You can probably verify this by downloading the SQLite shell tool and
performing the same operations in it (e.g. DROP TABLE) as you do in your
Java code. I'm betting you don't get the same slowdowns in the same places.
​Another two hours before the database is filled and then I can start
experimenting on copies of it.

​Well the ‘simple’ exercise was not so simple, but it helps to understand
things better. :-)
--
Cecil Westerhof
Keith Medcalf
2016-04-17 15:13:12 UTC
Permalink
Perfectly linear. The time waster in creating the records is the index, which copies all the data plus the native primary key (record number) into a duplicate structure (the index btree). Creating the index and the base table at the same time, while linear as well, is even slower (as would be expected since you are doing multiple times the I/O for each row inserted).

Anyway, behaviour is linear in data insertion, index generation, and dropping the table (which, as one would expect, takes only as much time as is needed to walk the pages and move them to the free list, which may include writing them to the journal).

I don't see the issue you are having and "dropping" a table with 1e8 records and a single unique index takes about 30 seconds.

Perhaps you have really slow busy-wait style I/O? The laptop this was run on has the same CPU as you do, and the single thread ran maxed out (100% of a core) using about 12% total of the CPU (one core single execution unit). I/O is irrelevant for me as this has a very fast SSD. (As a side note, a very fast and well cached SSD is indeed faster than a well cached spinning disk -- not by a lot, but it is faster -- especially on cache misses -- which, with a good cache, only occur when the cache is cold).

NOTE that I killed the 1e9 insert with the index update at insert time. Clearly doing it all in a single transaction does not work very well.
sqlite test.db < test.sql
.timer on
.eqp on
select sqlite_version();
3.13.0
Run Time: real 0.000 user 0.000000 sys 0.000000
Run Time: real 0.000 user 0.000000 sys 0.000000
Error: near line 5: no such table: x
vacuum;
Run Time: real 0.031 user 0.000000 sys 0.000000
pragma temp_store=1;
Run Time: real 0.000 user 0.000000 sys 0.000000
pragma cache_size=65535;
Run Time: real 0.000 user 0.000000 sys 0.000000

create table x (uuid blob not null);
Run Time: real 0.016 user 0.000000 sys 0.000000
insert into x select randomblob(16) from generate_series where start=1 and stop=1e2;
--EQP-- 0,0,0,SCAN TABLE generate_series VIRTUAL TABLE INDEX 3:
Run Time: real 0.031 user 0.000000 sys 0.000000
create unique index ux on x(uuid);
Run Time: real 0.016 user 0.000000 sys 0.000000
drop table x;
Run Time: real 0.031 user 0.000000 sys 0.000000
vacuum;
Run Time: real 0.031 user 0.000000 sys 0.031250

create table x (uuid blob not null);
Run Time: real 0.016 user 0.000000 sys 0.000000
insert into x select randomblob(16) from generate_series where start=1 and stop=1e3;
--EQP-- 0,0,0,SCAN TABLE generate_series VIRTUAL TABLE INDEX 3:
Run Time: real 0.031 user 0.000000 sys 0.000000
create unique index ux on x(uuid);
Run Time: real 0.015 user 0.000000 sys 0.000000
drop table x;
Run Time: real 0.031 user 0.000000 sys 0.015625
vacuum;
Run Time: real 0.032 user 0.000000 sys 0.000000

create table x (uuid blob not null);
Run Time: real 0.031 user 0.000000 sys 0.000000
insert into x select randomblob(16) from generate_series where start=1 and stop=1e4;
--EQP-- 0,0,0,SCAN TABLE generate_series VIRTUAL TABLE INDEX 3:
Run Time: real 0.031 user 0.015625 sys 0.000000
create unique index ux on x(uuid);
Run Time: real 0.031 user 0.015625 sys 0.000000
drop table x;
Run Time: real 0.016 user 0.000000 sys 0.000000
vacuum;
Run Time: real 0.016 user 0.000000 sys 0.000000

create table x (uuid blob not null);
Run Time: real 0.031 user 0.000000 sys 0.000000
insert into x select randomblob(16) from generate_series where start=1 and stop=1e5;
--EQP-- 0,0,0,SCAN TABLE generate_series VIRTUAL TABLE INDEX 3:
Run Time: real 0.094 user 0.031250 sys 0.015625
create unique index ux on x(uuid);
Run Time: real 0.109 user 0.078125 sys 0.000000
drop table x;
Run Time: real 0.031 user 0.000000 sys 0.000000
vacuum;
Run Time: real 0.031 user 0.000000 sys 0.000000

create table x (uuid blob not null);
Run Time: real 0.032 user 0.000000 sys 0.015625
insert into x select randomblob(16) from generate_series where start=1 and stop=1e6;
--EQP-- 0,0,0,SCAN TABLE generate_series VIRTUAL TABLE INDEX 3:
Run Time: real 0.547 user 0.453125 sys 0.046875
create unique index ux on x(uuid);
Run Time: real 1.230 user 1.125000 sys 0.046875
drop table x;
Run Time: real 0.079 user 0.046875 sys 0.000000
vacuum;
Run Time: real 0.047 user 0.000000 sys 0.015625

create table x (uuid blob not null);
Run Time: real 0.031 user 0.000000 sys 0.015625
insert into x select randomblob(16) from generate_series where start=1 and stop=1e7;
--EQP-- 0,0,0,SCAN TABLE generate_series VIRTUAL TABLE INDEX 3:
Run Time: real 5.375 user 4.468750 sys 0.453125
create unique index ux on x(uuid);
Run Time: real 15.948 user 14.671875 sys 0.812500
drop table x;
Run Time: real 0.594 user 0.406250 sys 0.156250
vacuum;
Run Time: real 0.125 user 0.000000 sys 0.093750

create table x (uuid blob not null);
Run Time: real 0.015 user 0.000000 sys 0.000000
insert into x select randomblob(16) from generate_series where start=1 and stop=1e8;
--EQP-- 0,0,0,SCAN TABLE generate_series VIRTUAL TABLE INDEX 3:
Run Time: real 51.468 user 45.500000 sys 4.656250
create unique index ux on x(uuid);
Run Time: real 177.413 user 162.921875 sys 13.343750
drop table x;
Run Time: real 7.219 user 3.843750 sys 3.328125
vacuum;
Run Time: real 0.891 user 0.031250 sys 0.828125

create table x (uuid blob not null);
Run Time: real 0.016 user 0.000000 sys 0.000000
insert into x select randomblob(16) from generate_series where start=1 and stop=1e9;
--EQP-- 0,0,0,SCAN TABLE generate_series VIRTUAL TABLE INDEX 3:
Run Time: real 562.266 user 502.359375 sys 57.875000
create unique index ux on x(uuid);
Run Time: real 3607.984 user 2197.453125 sys 478.156250
drop table x;
Run Time: real 976.447 user 81.703125 sys 383.546875
vacuum;
Run Time: real 2.527 user 0.046875 sys 2.390625

create table x (uuid blob not null primary key);
Run Time: real 0.035 user 0.000000 sys 0.000000
insert into x select randomblob(16) from generate_series where start=1 and stop=1e2;
--EQP-- 0,0,0,SCAN TABLE generate_series VIRTUAL TABLE INDEX 3:
Run Time: real 0.031 user 0.000000 sys 0.000000
drop table x;
Run Time: real 0.032 user 0.000000 sys 0.000000
vacuum;
Run Time: real 0.029 user 0.000000 sys 0.015625

create table x (uuid blob not null primary key);
Run Time: real 0.030 user 0.000000 sys 0.000000
insert into x select randomblob(16) from generate_series where start=1 and stop=1e3;
--EQP-- 0,0,0,SCAN TABLE generate_series VIRTUAL TABLE INDEX 3:
Run Time: real 0.034 user 0.000000 sys 0.000000
drop table x;
Run Time: real 0.034 user 0.000000 sys 0.000000
vacuum;
Run Time: real 0.036 user 0.000000 sys 0.000000

create table x (uuid blob not null primary key);
Run Time: real 0.020 user 0.000000 sys 0.000000
insert into x select randomblob(16) from generate_series where start=1 and stop=1e4;
--EQP-- 0,0,0,SCAN TABLE generate_series VIRTUAL TABLE INDEX 3:
Run Time: real 0.060 user 0.031250 sys 0.000000
drop table x;
Run Time: real 0.040 user 0.000000 sys 0.000000
vacuum;
Run Time: real 0.020 user 0.000000 sys 0.000000

create table x (uuid blob not null primary key);
Run Time: real 0.028 user 0.000000 sys 0.000000
insert into x select randomblob(16) from generate_series where start=1 and stop=1e5;
--EQP-- 0,0,0,SCAN TABLE generate_series VIRTUAL TABLE INDEX 3:
Run Time: real 0.387 user 0.328125 sys 0.015625
drop table x;
Run Time: real 0.028 user 0.015625 sys 0.000000
vacuum;
Run Time: real 0.034 user 0.000000 sys 0.000000

create table x (uuid blob not null primary key);
Run Time: real 0.020 user 0.000000 sys 0.000000
insert into x select randomblob(16) from generate_series where start=1 and stop=1e6;
--EQP-- 0,0,0,SCAN TABLE generate_series VIRTUAL TABLE INDEX 3:
Run Time: real 3.614 user 3.375000 sys 0.140625
drop table x;
Run Time: real 0.100 user 0.062500 sys 0.000000
vacuum;
Run Time: real 0.040 user 0.000000 sys 0.015625

create table x (uuid blob not null primary key);
Run Time: real 0.043 user 0.000000 sys 0.000000
insert into x select randomblob(16) from generate_series where start=1 and stop=1e7;
--EQP-- 0,0,0,SCAN TABLE generate_series VIRTUAL TABLE INDEX 3:
Run Time: real 46.307 user 42.921875 sys 2.812500
drop table x;
Run Time: real 0.954 user 0.593750 sys 0.312500
vacuum;
Run Time: real 0.184 user 0.015625 sys 0.140625

create table x (uuid blob not null primary key);
Run Time: real 0.031 user 0.000000 sys 0.000000
insert into x select randomblob(16) from generate_series where start=1 and stop=1e8;
--EQP-- 0,0,0,SCAN TABLE generate_series VIRTUAL TABLE INDEX 3:
Run Time: real 1765.842 user 801.453125 sys 889.921875
drop table x;
Run Time: real 23.707 user 5.796875 sys 8.375000
vacuum;
Run Time: real 1.031 user 0.015625 sys 0.984375
Cecil Westerhof
2016-04-17 16:13:38 UTC
Permalink
Post by Keith Medcalf
Perfectly linear. The time waster in creating the records is the index,
which copies all the data plus the native primary key (record number) into
a duplicate structure (the index btree). Creating the index and the base
table at the same time, while linear as well, is even slower (as would be
expected since you are doing multiple times the I/O for each row inserted).
Anyway, behaviour is linear in data insertion, index generation, and
dropping the table (which, as one would expect, takes only as much time as
is needed to walk the pages and move them to the free list, which may
include writing them to the journal).
I don't see the issue you are having and "dropping" a table with 1e8
records and a single unique index takes about 30 seconds.
Perhaps you have really slow busy-wait style I/O? The laptop this was run
on has the same CPU as you do, and the single thread ran maxed out (100%
of a core) using about 12% total of the CPU (one core single execution
unit). I/O is irrelevant for me as this has a very fast SSD. (As a side
note, a very fast and well cached SSD is indeed faster than a well cached
spinning disk -- not by a lot, but it is faster -- especially on cache
misses -- which, with a good cache, only occur when the cache is cold).
NOTE that I killed the 1e9 insert with the index update at insert time.
Clearly doing it all in a single transaction does not work very well.
​I should change the program to not do all in one swoop. Auto commit is a
bad idea, but no commit also.​
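A minimal sketch of that middle ground, committing every N rows (chunk size,
class name, and database path are illustrative):

import java.nio.ByteBuffer;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.util.UUID;

public class ChunkedInsert {
    public static void main(String[] args) throws Exception {
        final long TOTAL = 100_000_000L;  // 1E8 rows
        final int CHUNK = 1_000_000;      // commit every 1E6 rows
        try (Connection conn = DriverManager.getConnection("jdbc:sqlite:test.db")) {
            conn.setAutoCommit(false);
            try (PreparedStatement ps = conn.prepareStatement(
                    "INSERT INTO testUniqueUUID (UUID) VALUES (?)")) {
                for (long i = 1; i <= TOTAL; i++) {
                    UUID u = UUID.randomUUID();        // version 4, passes the CHECK
                    ByteBuffer bb = ByteBuffer.allocate(16);
                    bb.putLong(u.getMostSignificantBits());
                    bb.putLong(u.getLeastSignificantBits());
                    ps.setBytes(1, bb.array());        // 16-byte blob form
                    ps.executeUpdate();
                    if (i % CHUNK == 0) conn.commit(); // bounded journal and cache
                }
            }
            conn.commit();
        }
    }
}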


The strange thing is that the blob variant takes a lot of time now also.
First it took only 4½ hours; now it has already been busy for eight hours
and has only reached 8.9E7.

14:36:01: Inserted 8.40e+07 UUID's
14:54:47: Inserted 8.50e+07 UUID's
15:30:19: Inserted 8.60e+07 UUID's
15:54:02: Inserted 8.70e+07 UUID's
16:17:01: Inserted 8.80e+07 UUID's
17:24:20: Inserted 8.90e+07 UUID's


Something else I find strange (but maybe it is not): there are 19 threads
besides the main one:
ps -lL -p 26455
F S UID PID PPID LWP C PRI NI ADDR SZ WCHAN TTY TIME CMD
0 S 1000 26455 28670 26455 0 80 0 - 1718302 futex_ pts/12 00:00:00 java
1 D 1000 26455 28670 26458 14 80 0 - 1718302 - pts/12 01:06:35 java
1 S 1000 26455 28670 26459 0 80 0 - 1718302 futex_ pts/12 00:00:00 java
1 S 1000 26455 28670 26460 0 80 0 - 1718302 futex_ pts/12 00:00:00 java
1 S 1000 26455 28670 26461 0 80 0 - 1718302 futex_ pts/12 00:00:00 java
1 S 1000 26455 28670 26462 0 80 0 - 1718302 futex_ pts/12 00:00:00 java
1 S 1000 26455 28670 26463 0 80 0 - 1718302 futex_ pts/12 00:00:00 java
1 S 1000 26455 28670 26464 0 80 0 - 1718302 futex_ pts/12 00:00:00 java
1 S 1000 26455 28670 26465 0 80 0 - 1718302 futex_ pts/12 00:00:00 java
1 S 1000 26455 28670 26466 0 80 0 - 1718302 futex_ pts/12 00:00:00 java
1 S 1000 26455 28670 26467 0 80 0 - 1718302 futex_ pts/12 00:00:00 java
1 S 1000 26455 28670 26468 0 80 0 - 1718302 futex_ pts/12 00:00:00 java
1 S 1000 26455 28670 26469 0 80 0 - 1718302 futex_ pts/12 00:00:00 java
1 S 1000 26455 28670 26470 0 80 0 - 1718302 futex_ pts/12 00:00:00 java
1 S 1000 26455 28670 26471 0 80 0 - 1718302 futex_ pts/12 00:00:00 java
1 S 1000 26455 28670 26472 0 80 0 - 1718302 futex_ pts/12 00:00:00 java
1 S 1000 26455 28670 26473 0 80 0 - 1718302 futex_ pts/12 00:00:00 java
1 S 1000 26455 28670 26474 0 80 0 - 1718302 futex_ pts/12 00:00:00 java
1 S 1000 26455 28670 26475 0 80 0 - 1718302 futex_ pts/12 00:00:00 java
1 S 1000 26455 28670 26476 0 80 0 - 1718302 futex_ pts/12 00:00:13 java

The D of the first thread signifies uninterruptible sleep (usually IO). The
others are waiting for an event to complete.

It is a command-line program and I do not create threads. Or is this normal
in Java programs?

If it is of interest: I am using Java 8.


​And the sluggish behaviour is back again. :'-(​
--
Cecil Westerhof
Simon Slavin
2016-04-17 16:23:40 UTC
Permalink
Post by Cecil Westerhof
Or is this normal
in Java programs?
Yes. You're discovering that Java is rubbish. 'futex' for Java is what most environments call 'mutex' and Java doesn't do locking well. It's this which is causing your delays, not SQLite or anything around it.

Google 'java futex' if you want details, but there's little point since you can't do much about it. Just stay updated to the latest OS and the latest Java available to you and hope someone figured it out.

Simon.
Cecil Westerhof
2016-04-17 22:55:57 UTC
Permalink
Post by Simon Slavin
Post by Cecil Westerhof
Or is this normal
in Java programs?
Yes. You're discovering that Java is rubbish. 'futex' for Java is what
most environments call 'mutex' and Java doesn't do locking well. It's this
which is causing your delays, not SQLite or anything around it.
Google 'java futex' if you want details, but there's little point since
you can't do much about it. Just stay updated to the latest OS and the
latest Java available to you and hope someone figured it out.
​I put a strace on it. This was what I got:
Process 26455 attached with 20 threads
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
99.80 11245.498406 42527 264435 130887 futex
0.09 10.480000 3493333 3 fsync
0.08 8.886784 0 39275508 read
0.02 2.552284 0 16397440 write
0.00 0.367716 0 50708676 lseek
0.00 0.048000 8000 6 6 restart_syscall
0.00 0.000442 1 737 sched_yield
0.00 0.000000 0 2 close
0.00 0.000000 0 1 stat
0.00 0.000000 0 4 fstat
0.00 0.000000 0 4 mmap
0.00 0.000000 0 128 mprotect
0.00 0.000000 0 6 rt_sigprocmask
0.00 0.000000 0 1 rt_sigreturn
0.00 0.000000 0 2 2 access
0.00 0.000000 0 4 madvise
0.00 0.000000 0 7 fcntl
0.00 0.000000 0 3 unlink
0.00 0.000000 0 1 gettid
0.00 0.000000 0 2 sched_getaffinity
------ ----------- ----------- --------- --------- ----------------
100.00 11267.833632 106646970 130895 total


​Pretty damaging for futex: almost half of the calls went wrong and 99.8%
of the time is spend there​.

​I have some finding out to do.​
--
Cecil Westerhof
Rowan Worth
2016-04-18 02:04:07 UTC
Permalink
Post by Cecil Westerhof
Process 26455 attached with 20 threads
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
99.80 11245.498406 42527 264435 130887 futex
0.09 10.480000 3493333 3 fsync
0.08 8.886784 0 39275508 read
0.02 2.552284 0 16397440 write
0.00 0.367716 0 50708676 lseek
------ ----------- ----------- --------- --------- ----------------
100.00 11267.833632 106646970 130895 total

​Pretty damaging for futex: almost half of the calls went wrong and 99.8%
of the time is spend there​.
​I have some finding out to do.​
I'm not sure this is damning for futex. You said you are not creating
threads yourself, which means you probably only have one I/O bound thread
actually doing anything. The other 19 threads are presumably spawned by the
APIs you are using and are likely just waiting for something to do. jstack
<PID> will give you details on each thread in the JVM so you can see what
they are up to.

It's not like the futex calls are wasting CPU (in this case) - the threads
are idle until the kernel wakes them up.

-Rowan
Cecil Westerhof
2016-04-18 18:01:39 UTC
Permalink
Post by Rowan Worth
Post by Cecil Westerhof
Process 26455 attached with 20 threads
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
99.80 11245.498406 42527 264435 130887 futex
0.09 10.480000 3493333 3 fsync
0.08 8.886784 0 39275508 read
0.02 2.552284 0 16397440 write
0.00 0.367716 0 50708676 lseek
------ ----------- ----------- --------- --------- ----------------
100.00 11267.833632 106646970 130895 total

​Pretty damaging for futex: almost half of the calls went wrong and 99.8%
of the time is spend there​.
​I have some finding out to do.​
I'm not sure this is damning for futex. You said you are not creating
threads yourself, which means you probably only have one I/O bound thread
actually doing anything. The other 19 threads are presumably spawned by the
APIs you are using and are likely just waiting for something to do. jstack
<PID> will give you details on each thread in the JVM so you can see what
they are up to.
It's not like the futex calls are wasting CPU (in this case) - the threads
are idle until the kernel wakes them up.
99.8 percent of the time is spent on futex. That seems a tad much.
--
Cecil Westerhof
Rowan Worth
2016-04-19 02:43:43 UTC
Permalink
Post by Keith Medcalf
Post by Rowan Worth
Post by Cecil Westerhof
Process 26455 attached with 20 threads
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
99.80 11245.498406 42527 264435 130887 futex
0.09 10.480000 3493333 3 fsync
0.08 8.886784 0 39275508 read
0.02 2.552284 0 16397440 write
0.00 0.367716 0 50708676 lseek
------ ----------- ----------- --------- --------- ----------------
100.00 11267.833632 106646970 130895 total
It's not like the futex calls are wasting CPU (in this case) - the
threads
Post by Rowan Worth
are idle until the kernel wakes them up.
99.8 percent of the time is spent on futex. That seems a tad much.
If you have a lot of idle threads of course you'll see a lot of time spent
in futex. The threads aren't *doing* anything or using resources in that
time, just waiting for something to happen. I would wager at least 130,800
of those "errors" were ETIMEDOUT.

It's like stracing bash and going "wow, look at how much time is spent in
read() and wait(); those syscalls must be really inefficient".

Your syscall results are entirely unsurprising. The appearance of 20
threads in single-threaded Java code is mildly surprising, but on an 8
core machine you may have 16 garbage collection threads. Then there's the
main thread, and your sqlite binding may create another, so there's not too
much left to account for.
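
One hedged way to check that wager, assuming strace and the coreutils
timeout command are available:

timeout 30 strace -f -e trace=futex -p 26455 2>&1 | grep -c ETIMEDOUT
# traces only futex calls for 30 seconds (strace writes to stderr) and
# counts how many of them returned ETIMEDOUT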

-Rowan
Cecil Westerhof
2016-04-19 12:57:24 UTC
Permalink
Post by Rowan Worth
Post by Keith Medcalf
Post by Rowan Worth
Post by Cecil Westerhof
Process 26455 attached with 20 threads
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
99.80 11245.498406 42527 264435 130887 futex
0.09 10.480000 3493333 3 fsync
0.08 8.886784 0 39275508 read
0.02 2.552284 0 16397440 write
0.00 0.367716 0 50708676 lseek
------ ----------- ----------- --------- --------- ----------------
100.00 11267.833632 106646970 130895 total
It's not like the futex calls are wasting CPU (in this case) - the
threads
Post by Rowan Worth
are idle until the kernel wakes them up.
99.8 percent of the time is spent on futex. That seems a tad much.
If you have a lot of idle threads of course you'll see a lot of time spent
in futex. The threads aren't *doing* anything or using resources in that
time, just waiting for something to happen. I would wager at least 130,800
of those "errors" were ETIMEDOUT.
It's like stracing bash and going "wow, look at how much time is spent in
read() and wait(); those syscalls must be really inefficient".
Your syscall results are entirely unsurprising. The appearance of 20
threads in single-threaded Java code is mildly surprising, but on an 8
core machine you may have 16 garbage collection threads. Then there's the
main thread, and your sqlite binding may create another, so there's not too
much left to account for.
OK, I will take your word for it. This is new territory for me, so what
looks strange to me can be completely normal. Something else I have to
learn about. :-)
--
Cecil Westerhof
Cecil Westerhof
2016-04-17 19:44:30 UTC
Permalink
Post by Cecil Westerhof
The strange thing is that the blob variant takes a lot of time now also.
First it took only 4½ hour, now it is already busy for eight hours and only
has come to 8.9E7.
14:36:01: Inserted 8.40e+07 UUID's
14:54:47: Inserted 8.50e+07 UUID's
15:30:19: Inserted 8.60e+07 UUID's
15:54:02: Inserted 8.70e+07 UUID's
16:17:01: Inserted 8.80e+07 UUID's
17:24:20: Inserted 8.90e+07 UUID's
​It turned out that it is a combination of factors. Firefox sometimes takes
a lot of resources, so I decided to quit Firefox. And voila:​
17:24:20: Inserted 8.90e+07 UUID's
18:28:02: Inserted 9.00e+07 UUID's
19:24:13: Inserted 9.10e+07 UUID's
19:36:41: Inserted 9.20e+07 UUID's
19:42:18: Inserted 9.30e+07 UUID's
19:47:46: Inserted 9.40e+07 UUID's
19:52:43: Inserted 9.50e+07 UUID's
19:57:50: Inserted 9.60e+07 UUID's
20:02:36: Inserted 9.70e+07 UUID's
20:07:29: Inserted 9.80e+07 UUID's
20:12:17: Inserted 9.90e+07 UUID's
20:16:59: Inserted 1.00e+08 UUID's

I should continue this on a Linux/Java newsgroup. But I thought it was
interesting to know.

Now I can continue with what I started this thread for. ;-)
--
Cecil Westerhof
Scott Robison
2016-04-17 19:59:06 UTC
Permalink
Post by Cecil Westerhof
Post by Cecil Westerhof
The strange thing is that the blob variant takes a lot of time now also.
First it took only 4½ hour, now it is already busy for eight hours and
only
Post by Cecil Westerhof
has come to 8.9E7.
14:36:01: Inserted 8.40e+07 UUID's
14:54:47: Inserted 8.50e+07 UUID's
15:30:19: Inserted 8.60e+07 UUID's
15:54:02: Inserted 8.70e+07 UUID's
16:17:01: Inserted 8.80e+07 UUID's
17:24:20: Inserted 8.90e+07 UUID's
​It turned out that it is a combination of factors. Firefox sometimes takes
a lot of resources, so I decided to quit Firefox. And voila:​
17:24:20: Inserted 8.90e+07 UUID's
18:28:02: Inserted 9.00e+07 UUID's
19:24:13: Inserted 9.10e+07 UUID's
19:36:41: Inserted 9.20e+07 UUID's
19:42:18: Inserted 9.30e+07 UUID's
19:47:46: Inserted 9.40e+07 UUID's
19:52:43: Inserted 9.50e+07 UUID's
19:57:50: Inserted 9.60e+07 UUID's
20:02:36: Inserted 9.70e+07 UUID's
20:07:29: Inserted 9.80e+07 UUID's
20:12:17: Inserted 9.90e+07 UUID's
20:16:59: Inserted 1.00e+08 UUID's
I should continue this on a Linux/Java newsgroup. But I thought it was
interesting to know.
It can be hard to accurately measure performance on a modern multitasking
multiuser system. As you've observed, external processes can have a huge
impact on the "independent and isolated" process. Whenever I'm trying to
measure performance, I close all other applications, maybe disconnect from
the network, and turn off services, depending on just how accurate I want
the measurement to be (not all measurements are equally important / picky).

It's entirely possible that the table drop is related to something similar
to this.
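
One extra step on Linux, sketched here and assuming root privileges, is to
start each timed run with a cold page cache so runs are comparable:

sync                                         # flush dirty pages to disk
echo 3 | sudo tee /proc/sys/vm/drop_caches   # drop page/dentry/inode caches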
--
Scott Robison
Cecil Westerhof
2016-04-17 22:51:04 UTC
Permalink
Post by Scott Robison
Post by Cecil Westerhof
Post by Cecil Westerhof
The strange thing is that the blob variant takes a lot of time now
also.
Post by Cecil Westerhof
Post by Cecil Westerhof
First it took only 4½ hour, now it is already busy for eight hours and
only
Post by Cecil Westerhof
has come to 8.9E7.
14:36:01: Inserted 8.40e+07 UUID's
14:54:47: Inserted 8.50e+07 UUID's
15:30:19: Inserted 8.60e+07 UUID's
15:54:02: Inserted 8.70e+07 UUID's
16:17:01: Inserted 8.80e+07 UUID's
17:24:20: Inserted 8.90e+07 UUID's
​It turned out that it is a combination of factors. Firefox sometimes
takes
Post by Cecil Westerhof
a lot of resources, so I decided to quit Firefox. And voila:​
17:24:20: Inserted 8.90e+07 UUID's
18:28:02: Inserted 9.00e+07 UUID's
19:24:13: Inserted 9.10e+07 UUID's
19:36:41: Inserted 9.20e+07 UUID's
19:42:18: Inserted 9.30e+07 UUID's
19:47:46: Inserted 9.40e+07 UUID's
19:52:43: Inserted 9.50e+07 UUID's
19:57:50: Inserted 9.60e+07 UUID's
20:02:36: Inserted 9.70e+07 UUID's
20:07:29: Inserted 9.80e+07 UUID's
20:12:17: Inserted 9.90e+07 UUID's
20:16:59: Inserted 1.00e+08 UUID's
I should continue this on a Linux/Java newsgroup. But I thought it was
interesting to know.
It can be hard to accurately measure performance on a modern multitasking
multiuser system. As you've observed, external processes can have a huge
impact on the "independent and isolated" process. Whenever I'm trying to
measure performance, I close all other applications, maybe disconnect from
the network, turn off services. Depending on just how accurate I want to be
with the measurement (not all measurements are as important / picky).
​That I know, but I was only going for big O. I really did not expect
differences like I am getting now. Well, I learned something.​
Post by Scott Robison
It's entirely possible that the table drop is related to something similar
to this.
​I can give positive news about this. In DB Browser it now took 20 seconds.
The strange thing is that Revert Changes took much longer. I did not expect
this, so sadly I do not have timing for this.

The Java program took 2 minutes and 5 seconds. But this was with Firefox
running. (I have some things to do.) Inserting goes a lot faster:
21:56:59: Going to create 1.00e+08 random UUID's
21:57:26: Inserted 1.00e+06 UUID's
21:58:06: Inserted 2.00e+06 UUID's
21:58:51: Inserted 3.00e+06 UUID's
21:59:40: Inserted 4.00e+06 UUID's
22:00:31: Inserted 5.00e+06 UUID's
22:01:32: Inserted 6.00e+06 UUID's
22:02:37: Inserted 7.00e+06 UUID's
22:03:47: Inserted 8.00e+06 UUID's
22:04:52: Inserted 9.00e+06 UUID's
22:06:20: Inserted 1.00e+07 UUID's
22:07:58: Inserted 1.10e+07 UUID's
22:09:35: Inserted 1.20e+07 UUID's
22:11:18: Inserted 1.30e+07 UUID's

But it is possible that this has to do with the commits I do. Just before
displaying I do a commit, instead of after all the work is done.
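
For what it's worth, the usual way to keep bulk inserts fast is one big (or
batched) transaction; a minimal sqlite3 sketch, using a hypothetical table t:

sqlite3 test.sqlite <<EOT
CREATE TABLE t (x INTEGER);
BEGIN;
INSERT INTO t
WITH r(i) AS (SELECT 1 UNION ALL SELECT i + 1 FROM r WHERE i < 1000000)
SELECT i FROM r;
COMMIT;
EOT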

Using the sqlite3 executable took 2 minutes. That makes the 20 seconds of
DB Browser a little strange.

Well, I have to do everything over this week on a system that is not doing
anything else.
--
Cecil Westerhof
Dan Kennedy
2016-04-18 10:30:02 UTC
Permalink
Post by Cecil Westerhof
I am playing a bit with SQLite. I first had a table with 1E8 elements. When
trying to drop this it looked like SQLite got hung. I tried it from DB
Browser and a Java program.
I just tried it with a table of 1E7 elements. That was dropped in about 13
seconds.
I will try it again with 1E8 elements, but it takes 4½ hours to generate.
Is it possible that SQLite has trouble dropping very large tables? It was
5.2 GB. With 1E7 elements the table is 512 MB.
CREATE TABLE testUniqueUUID (
UUID blob,
PRIMARY KEY(UUID)
CHECK(TYPEOF(UUID) = 'blob' AND
LENGTH(UUID) = 16 AND
SUBSTR(HEX(UUID), 13, 1) == '4' AND
SUBSTR(HEX(UUID), 17, 1) IN ('8', '9', 'A', 'B')
)
);
Are there any other tables in the db with foreign keys that refer to the
table being dropped?

If so, try with foreign keys disabled:

PRAGMA foreign_keys = 0;
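
A quick, approximate check for such references, sketched as a textual match
against sqlite_master (so treat the result as a hint, not proof):

sqlite3 checkUUID.sqlite \
  "SELECT name FROM sqlite_master WHERE sql LIKE '%REFERENCES testUniqueUUID%';"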

Dan.
Olivier Mascia
2016-04-18 10:47:17 UTC
Permalink
Post by Cecil Westerhof
I am playing a bit with SQLite. I first had a table with 1E8 elements. When
trying to drop this it looked like SQLite got hung. I tried it from DB
Browser and a Java program.
I just tried it with a table of 1E7 elements. That was dropped in about 13
seconds.
I will try it again with 1E8 elements, but it takes 4½ hours to generate.
Is it possible that SQLite has trouble dropping very large tables? It was
5.2 GB. With 1E7 elements the table is 512 MB.
CREATE TABLE testUniqueUUID (
UUID blob,
PRIMARY KEY(UUID)
CHECK(TYPEOF(UUID) = 'blob' AND
LENGTH(UUID) = 16 AND
SUBSTR(HEX(UUID), 13, 1) == '4' AND
SUBSTR(HEX(UUID), 17, 1) IN ('8', '9', 'A', 'B')
)
);
Are there any other tables in the db with foreign keys that refer to the table being dropped?
PRAGMA foreign_keys = 0;
Dan.
Not sure if this was already discussed here these last days (I don't think so, having quickly browsed this discussion), but I would suggest checking these on the filled-in DB before the delete:

PRAGMA compile_options; (has SQLITE_SECURE_DELETE been set for compiling the library?)
PRAGMA auto_vacuum; (is it set to 0 or else?)
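
Both can be checked in one go from the shell; a minimal sketch against the
database in question:

sqlite3 checkUUID.sqlite 'PRAGMA compile_options; PRAGMA auto_vacuum;'
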
--
Meilleures salutations, Met vriendelijke groeten, Best Regards,
Olivier Mascia, integral.be/om
Cecil Westerhof
2016-04-19 12:49:23 UTC
Permalink
PRAGMA compile_options; (has SQLITE_SECURE_DELETE been set for compiling
the library?)
​sqlite3 checkUUID.sqlite
SQLite version 3.8.7.1 2014-10-29 13:59:56
Enter ".help" for usage hints.
sqlite> PRAGMA compile_options;
ENABLE_COLUMN_METADATA
ENABLE_FTS3
ENABLE_RTREE
ENABLE_UNLOCK_NOTIFY
ENABLE_UPDATE_DELETE_LIMIT
MAX_SCHEMA_RETRY=25
OMIT_LOOKASIDE
SECURE_DELETE
SOUNDEX
SYSTEM_MALLOC
TEMP_STORE=1
THREADSAFE=1

PRAGMA auto_vacuum; (is it set to 0 or else?)
​sqlite> PRAGMA auto_vacuum;
0



​Some other information. I did some testing on an old Aspire One. This is
with a table with 1E7 records. This gives a filesize of 512 MB.

sqlite3 checkUUID.sqlite <<<"DROP TABLE testUniqueUUID;"
real 24m20.05s
user 0m13.14s
sys 0m59.42s


I thought: what if I delete the records before I drop the table?
sqlite3 checkUUID.sqlite <<<"DELETE FROM testUniqueUUID; DROP TABLE
testUniqueUUID;"
real 13m59.44s
user 0m11.52s
sys 0m48.68s

This takes 40% less time and 15% less resources. That is quite a difference.


Then I thought: I am probably doing this wrong, the DELETE should be in a
transaction. So I did:
sqlite3 checkUUID.sqlite <<<"BEGIN TRANSACTION; DELETE FROM testUniqueUUID;
COMMIT TRANSACTION; DROP TABLE testUniqueUUID;"
real 17m21.39s
user 0m11.26s
sys 0m52.26s

But that is about the same performance.


Is it not strange that first DELETE and then DROP is so much more efficient
than just a DROP? I find it counter-intuitive. But I am certainly not an
expert in databases.
--
Cecil Westerhof
Simon Slavin
2016-04-19 13:12:09 UTC
Permalink
Post by Cecil Westerhof
Is it not strange that first DELETE and then DROP is so much more efficient
than just a DROP?
Yes.

Can you please try the same comparison in the SQLite shell tool? You can use

.timer ON
DELETE FROM myTable;
DROP TABLE myTable;
.timer OFF

Simon.
Olivier Mascia
2016-04-19 13:27:44 UTC
Permalink
Post by Olivier Mascia
SECURE_DELETE
Are you aware of the net effect of SQLITE_SECURE_DELETE?

https://www.sqlite.org/compile.html#secure_delete

The documentation talks about a "small performance penalty", yet on very large tables, especially if overflowing by large the cache size, the amount of IO for overwriting deleted data with zeroes can become substantial. Not to say this is the sole reason for your issue, but it should have an impact on it.

You should run "PRAGMA secure_delete=0" before doing this bulk deletion and see how it impacts your timings.
--
Meilleures salutations, Met vriendelijke groeten, Best Regards,
Olivier Mascia, integral.be/om
Cecil Westerhof
2016-04-19 14:04:35 UTC
Permalink
Post by Olivier Mascia
Post by Olivier Mascia
SECURE_DELETE
Are you aware of the net effect of SQLITE_SECURE_DELETE?
https://www.sqlite.org/compile.html#secure_delete
The documentation talks about a "small performance penalty", yet on very
large tables, especially if overflowing by large the cache size, the amount
of IO for overwriting deleted data with zeroes can become substantial. Not
to say this is the sole reason for your issue, but it should have an impact
on it.
You should run "PRAGMA secure_delete=0" before doing this bulk deletion
and see how it impacts your timings.
​Nope, I am just starting to play with SQLite. Maybe playtime is over and
reading time starts. ;-)

Thanks.
--
Cecil Westerhof
Cecil Westerhof
2016-04-20 08:36:55 UTC
Permalink
Post by Olivier Mascia
Post by Olivier Mascia
SECURE_DELETE
Are you aware of the net effect of SQLITE_SECURE_DELETE?
https://www.sqlite.org/compile.html#secure_delete
The documentation talks about a "small performance penalty", yet on very
large tables, especially if overflowing by large the cache size, the amount
of IO for overwriting deleted data with zeroes can become substantial. Not
to say this is the sole reason for your issue, but it should have an impact
on it.
You should run "PRAGMA secure_delete=0" before doing this bulk deletion
and see how it impacts your timings.
I have some interesting metrics. Certainly the
PRAGMA SECURE_DELETE=0;
has significant influence. But that only makes the DELETE before the DROP
have a bigger influence.

I am not hindered by any deep knowledge of low-level database IO, so it
is possible that not everything I say makes sense when you have this
knowledge. Just educate me. :-)


First of all, DELETE before DROP with SECURE_DELETE always looks to be a
very good option. Travelling home yesterday while doing other stuff, it took
7 minutes, while without the DELETE it took 14 minutes. At the moment I
have not reproduced it again, and sadly I do not have the data from that
run. :'-( If I manage to do it again, I will share the data.

As a programmer I am lazy, so I have made a Bash script to make life
easier. I have attached it with two logs. The first is when I ran the
script without doing anything else, the second while Iceweasel was
running. (Firefox on Debian.)

On a clean system:
SECURE_DELETE=1, DROP ONLY 17 minutes
SECURE_DELETE=1, DELETE and DROP 14 minutes
SECURE_DELETE=0, DROP only 4½ minutes
SECURE_DELETE=0, DELETE and DROP ½ minute

With Iceweasel running:
SECURE_DELETE=1, DROP ONLY 45 minutes
SECURE_DELETE=1, DELETE and DROP 41 minutes
SECURE_DELETE=0, DROP only 11 minutes
SECURE_DELETE=0, DELETE and DROP ½ minute


I am baffled. Still DELETE before DROP is a lot more efficient. And it
looks like it is not bothered by other programs running (most of
the time). I would think that a DROP should take the least time: it only
has to mark a complete table as free space. So why does it take
significantly more time? And why is the DELETE before DROP not
influenced by other running programs while DROP only is?

Also: I understand that SECURE_DELETE=1 takes more time, but the
difference seems excessive. What could be happening here?


The SECURE_DELETE=1 certainly is something to warn people about. (In the
current implementation.) By the way: I agree with it being on by default.
--
Cecil Westerhof
Dominique Devienne
2016-04-20 08:44:43 UTC
Permalink
Post by Cecil Westerhof
I am baffled. Still DELETE before DROP is a lot more efficient. And it
looks that it is not bothered when other programs are running (most of
I agree. That's weird. Needs investigating indeed.
Post by Cecil Westerhof
Also: I understand that SECURE_DELETE=1 takes more time, but the
difference seems excessive. What could be happening here?
Secure delete is basically writing to the disk several times, with random
data.
From [1], it appears to be between 1 and 35 passes. Don't know how SQLite
does secure-delete. --DD

[1]
http://www.howtogeek.com/72130/learn-how-to-securely-delete-files-in-windows/
Cecil Westerhof
2016-04-20 08:50:25 UTC
Permalink
Post by Dominique Devienne
Post by Cecil Westerhof
I am baffled. Still DELETE before DROP is a lot more efficient. And it
looks that it is not bothered when other programs are running (most of
I agree. That's weird. Needs investigating indeed.
Do the developers read this list, or should I post a bug report?


Post by Dominique Devienne
Post by Cecil Westerhof
Also: I understand that SECURE_DELETE=1 takes more time, but the
difference seems excessive. What could be happening here?
Secure delete is basically writing to the disk several times, with random
data.
From [1], it appears to be between 1x and 35x times. Don't know how SQLite
does secure-delete. --DD
​I will look into it later. Thanks.
--
Cecil Westerhof
R Smith
2016-04-20 10:35:06 UTC
Permalink
Post by Cecil Westerhof
Post by Dominique Devienne
Post by Cecil Westerhof
I am baffled. Still DELETE before DROP is a lot more efficient. And it
looks that it is not bothered when other programs are running (most of
I agree. That's weird. Needs investigating indeed.
Do the developers read this list, or should I post a bug report?
The Devs do read the list, and often post, and they will be very
interested in what you have discovered if it is not a system anomaly on
your side. (Perhaps even if it is). Can you post the DB file somewhere
and the steps to reproduce?

Also, there is no need to file a bug report for this since it is not a
bug. Something is typically considered a bug if the answer produced
by a query is wrong. Since everything happens as expected, you are
dealing with an efficiency regression at best - still of interest, but
not a bug. (The distinction matters because bugs get the highest
priority for attention, and sometimes cause unscheduled releases to fix
- something an optimization would never cause).


Cheers,
Ryan
Cecil Westerhof
2016-04-20 13:31:12 UTC
Permalink
Post by R Smith
Post by Cecil Westerhof
I am baffled. Still DELETE before DROP is a lot more efficient. And it
Post by Cecil Westerhof
looks that it is not bothered when other programs are running (most of
I agree. That's weird. Needs investigating indeed.
Do the developers read this list, or should I post a bug report?
The Devs do read the list, and often post, and they will be very
interested in what you have discovered if it is not a system anomaly on
your side. (Perhaps even if it is). Can you post the DB file somewhere and
the steps to reproduce?
​It is 512 MB, thus that is a bit difficult. But would a Java program to
generate it suffice?

The steps to reproduce is to run the script I posted. That supposes you run
Linux, but it should be easy to adapt.
Post by R Smith
Also, there is no need to file a bug report for this since it is not a
bug. A bug is typically considered to be a bug if the answer produced from
a query is wrong. Since everything happens as expected, you are dealing
with an efficiency regression at best - still of interest, but not a bug.
(The distinction matters because bugs get the highest priority for
attention, and sometimes cause unscheduled releases to fix - something an
optimization would never cause).
Well, I would expect a DROP to be more efficient than a DELETE and DROP. ;-)
But I agree that wrong results are much more important.
--
Cecil Westerhof
Dominique Devienne
2016-04-20 14:23:44 UTC
Permalink
​Do the developers read this list, or should I post a bug report?
They do, and it's actually the preferred way to discuss
issues/bugs/performances, etc...

Dr. Hipp even wrote not long ago that, if I recall correctly, "bugs"
entered there would be aggressively removed unless discussed here first.

SQLite is not a classical OSS project. It is public domain, but has no
"outside" developers.
No pull requests here. And the bug system is mostly reserved for SQLite
devs/authors, AFAIK. --DD
R Smith
2016-04-20 14:07:21 UTC
Permalink
Post by Cecil Westerhof
Post by R Smith
The Devs do read the list, and often post, and they will be very
interested in what you have discovered if it is not a system anomaly on
your side. (Perhaps even if it is). Can you post the DB file somewhere and
the steps to reproduce?
​It is 512 MB, thus that is a bit difficult. But would a Java program to
generate it suffice?
The steps to reproduce is to run the script I posted. That supposes you run
Linux, but it should be easy to adapt.
The problem is, our running the same script (as has been posted)
produces a DB whose table takes a mere few seconds to drop, as
expected. Something in your setup is producing DB files that, on your
setup, take ages to drop and are perhaps better deleted and then dropped,
but we cannot reproduce those results. So if you can post the DB
somewhere, on some server or file upload site, the devs could examine it
and conclude either that your file somehow also takes ages to drop here,
with a possible difference in time between delete-then-drop and
just drop, - OR - that your file takes little time to drop here,
delete or not.

Once the result is known the devs will know that either:
- The produced DB file is not the problem, nor is the SQLite code, your
setup is just weird (perhaps then further examine/test with your
compile options), OR
- The produced DB does have a problem with dropping vs.
delete-then-drop, so how can it happen?... and thus time to look at what
can be done to make it better.

But without the file, all the above is a guess.

Did you gzip it? Is it still 512MB?
Most file upload sites can handle half a gig easily.
Cecil Westerhof
2016-04-20 15:56:48 UTC
Permalink
Post by Cecil Westerhof
Post by R Smith
The Devs do read the list, and often post, and they will be very
interested in what you have discovered if it is not a system anomaly on
your side. (Perhaps even if it is). Can you post the DB file somewhere and
the steps to reproduce?
​It is 512 MB, thus that is a bit difficult. But would a Java program to
generate it suffice?
The steps to reproduce is to run the script I posted. That supposes you run
Linux, but it should be easy to adapt.
The problem is, our running the same script (as has been posted) produces
a DB that takes a mere few seconds to drop the table, as expected.
Something in your setup is producing DB files that, with your setup, takes
ages to drop and perhaps better to delete and then drop, but we cannot
reproduce those results, so if you can post the DB somewhere, on some
server or file upload site, the devs could examine it and conclude whether
your system produces a file which somehow takes ages to drop with possible
difference in time for drop and delete or just drop, - OR - that your file
takes little time to drop here with delete or not.
- The produced DB file is not the problem, nor is the SQLite code, your
setup is just weird (perhaps then further examine/test with your compile
options), OR
- The produced DB does have a problem with dropping vs. delete-then-drop,
so how can it happen?... and thus time to look at what can be done to make
it better.
But without the file, all the above is a guess.
Did you gzip it? Is it still 512MB?
Most file upload sites can handle half a gig easily.
It is still 411 MB. When I am home I will also try it on another system to
see if the timing is different there. Is there an upload site you
recommend?

By the way: it could be that Java is the culprit.

--
Cecil Westerhof
R Smith
2016-04-20 16:11:19 UTC
Permalink
Post by Cecil Westerhof
It is still 411 MB. When I am home I will also try it on another system to
see if the timing is different there. Is there an upload site you
recommend?
Dropbox usually works fine, as does M$'s OneDrive, Google's
GoogleDrive, or anything else you can see here:

http://lmgtfy.com/?q=file+upload+sites
Post by Cecil Westerhof
By the way: it could be that Java is the culprit.

It might well be, but without that file, we are all just guessing.
Cecil Westerhof
2016-04-20 20:00:14 UTC
Permalink
Post by R Smith
Post by Cecil Westerhof
It is still 411 MB. When I am home I will also try it on another system to
see if the timing is different there. Is there an upload site you
recommend?
Dropbox usually works fine, so does M$'s OneDrive, or Google's
How stupid that I did not think about Google Drive. :'-( Here it is:
https://drive.google.com/file/d/0BzW5q7uL-6z0SDdya2REaFNFUVE

​I am very curious.
--
Cecil Westerhof
Scott Robison
2016-04-20 20:21:30 UTC
Permalink
Post by Cecil Westerhof
https://drive.google.com/file/d/0BzW5q7uL-6z0SDdya2REaFNFUVE
​I am very curious.
I downloaded this and opened it with sqlite3.exe shell version 3.11.0.

D:\>\bin\sqlite3.exe checkUUID.sqlite
SQLite version 3.11.0 2016-02-15 17:29:24
Enter ".help" for usage hints.
sqlite> .schema
sqlite>

It's a big enough file, but there is no schema present.
--
Scott Robison
Cecil Westerhof
2016-04-20 20:35:19 UTC
Permalink
Post by Scott Robison
Post by Cecil Westerhof
https://drive.google.com/file/d/0BzW5q7uL-6z0SDdya2REaFNFUVE
​I am very curious.
I downloaded this and opened it with sqlite3.exe shell version 3.11.0.
D:\>\bin\sqlite3.exe checkUUID.sqlite
SQLite version 3.11.0 2016-02-15 17:29:24
Enter ".help" for usage hints.
sqlite> .schema
sqlite>
It's a big enough file, but there is no schema present.
​That is strange, because I get:
sqlite3 checkUUID.sqlite
SQLite version 3.8.7.1 2014-10-29 13:59:56
Enter ".help" for usage hints.
sqlite> .schema
CREATE TABLE testUniqueUUID (
UUID blob,

PRIMARY KEY(UUID)
CHECK(TYPEOF(UUID) = 'blob' AND
LENGTH(UUID) = 16 AND
SUBSTR(HEX(UUID), 13, 1) == '4' AND
SUBSTR(HEX(UUID), 17, 1) IN ('8', '9', 'A', 'B')
)
);


​I am going to investigate.​
Cecil Westerhof
2016-04-20 20:40:52 UTC
Permalink
Post by Cecil Westerhof
Post by Scott Robison
Post by Cecil Westerhof
https://drive.google.com/file/d/0BzW5q7uL-6z0SDdya2REaFNFUVE
​I am very curious.
I downloaded this and opened it with sqlite3.exe shell version 3.11.0.
D:\>\bin\sqlite3.exe checkUUID.sqlite
SQLite version 3.11.0 2016-02-15 17:29:24
Enter ".help" for usage hints.
sqlite> .schema
sqlite>
It's a big enough file, but there is no schema present.
sqlite3 checkUUID.sqlite
SQLite version 3.8.7.1 2014-10-29 13:59:56
Enter ".help" for usage hints.
sqlite> .schema
CREATE TABLE testUniqueUUID (
UUID blob,
PRIMARY KEY(UUID)
CHECK(TYPEOF(UUID) = 'blob' AND
LENGTH(UUID) = 16 AND
SUBSTR(HEX(UUID), 13, 1) == '4' AND
SUBSTR(HEX(UUID), 17, 1) IN ('8', '9', 'A', 'B')
)
);

​I am going to investigate.​
​Interesting: when I unzip the .gz file, I have the same problem. Something
went wrong. I will try to correct it.​
--
Cecil Westerhof
Cecil Westerhof
2016-04-20 21:55:07 UTC
Permalink
Post by Cecil Westerhof
Post by Cecil Westerhof
Post by Scott Robison
Post by Cecil Westerhof
https://drive.google.com/file/d/0BzW5q7uL-6z0SDdya2REaFNFUVE
​I am very curious.
I downloaded this and opened it with sqlite3.exe shell version 3.11.0.
D:\>\bin\sqlite3.exe checkUUID.sqlite
SQLite version 3.11.0 2016-02-15 17:29:24
Enter ".help" for usage hints.
sqlite> .schema
sqlite>
It's a big enough file, but there is no schema present.
sqlite3 checkUUID.sqlite
SQLite version 3.8.7.1 2014-10-29 13:59:56
Enter ".help" for usage hints.
sqlite> .schema
CREATE TABLE testUniqueUUID (
UUID blob,
PRIMARY KEY(UUID)
CHECK(TYPEOF(UUID) = 'blob' AND
LENGTH(UUID) = 16 AND
SUBSTR(HEX(UUID), 13, 1) == '4' AND
SUBSTR(HEX(UUID), 17, 1) IN ('8', '9', 'A', 'B')
)
);

​I am going to investigate.​
​Interesting: when I unzip the .gz file, I have the same problem.
Something went wrong. I will try to correct it.​
This one I downloaded, unpacked and tried. It worked. So it should be
correct now:
https://drive.google.com/file/d/0BzW5q7uL-6z0UjFhRWZWSENnZW8/

​Made with zip.
--
Cecil Westerhof
Scott Robison
2016-04-20 22:17:31 UTC
Permalink
Post by Cecil Westerhof
This one I downloaded, unpacked and tried. It worked. So it should be
https://drive.google.com/file/d/0BzW5q7uL-6z0UjFhRWZWSENnZW8/
​Made with zip.
Windows 10 1511 with Feb 2016 updates, x64
i7-6700K 4.00GHz
32 GiB RAM
3 TB 7200 RPM SATA HD
SQLite version 3.11.0 2016-02-15 17:29:24

secure delete on, drop only, Run Time: real 44.058 user 2.406250 sys
11.453125

secure delete on, delete all, Run Time: real 57.858 user 2.343750 sys
13.203125
secure delete on, drop after delete all, Run Time: real 0.125 user 0.000000
sys 0.015625

secure delete off, drop only, Run Time: real 6.070 user 0.546875 sys
1.171875

secure delete off, delete all, Run Time: real 6.024 user 0.453125 sys
1.250000
secure delete off, drop after delete all, Run Time: real 0.125 user
0.000000 sys 0.015625

Nothing seems out of sorts here. Note that I'm running two idle VMs, a
remote desktop instance, Hyper-V manager, two idle command prompts, 9
chrome browser tabs open (each tab being an independent process), and an
idle bittorrent client. Given the CPU & RAM and the idle nature of most
processes, it's not a heavy load.
--
Scott Robison
jungle Boogie
2016-04-20 23:04:08 UTC
Permalink
Post by Cecil Westerhof
This one I downloaded, unpacked and tried. It worked. So it should be
https://drive.google.com/file/d/0BzW5q7uL-6z0UjFhRWZWSENnZW8/
Windows 8.1, 4 gigs of RAM, Dell Latitude with an i5 processor from a few
years ago, lots of tabs and apps open; here's some results.

SQLite version 3.12.1 2016-04-08 15:09:49

sqlite> drop table testuniqueuuid;
Run Time: real 9.763 user 1.015625 sys 2.031250

sqlite> delete from testuniqueuuid;
Run Time: real 19.149 user 1.156250 sys 2.109375

I didn't bother with secure deletes in my case.

This whole thread sounds like a good use case for flame graphs:
http://brendangregg.com/flamegraphs
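
A minimal sketch of that on Linux, assuming perf and Brendan Gregg's
FlameGraph scripts (stackcollapse-perf.pl, flamegraph.pl) are on the PATH:

perf record -F 99 -g -p "$(pgrep -n sqlite3)" -- sleep 30   # sample stacks
perf script | stackcollapse-perf.pl | flamegraph.pl > drop.svg
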
--
-------
inum: 883510009027723
sip: ***@sip2sip.info
xmpp: jungle-***@jit.si
J Decker
2016-04-20 23:37:46 UTC
Permalink
If you upload a file to google drive and later try to replace it with
a different version google drive often corrupts it. Always delete and
upload a new version.
Post by jungle Boogie
Post by Cecil Westerhof
This one I downloaded, unpacked and tried. It worked. So it should be
https://drive.google.com/file/d/0BzW5q7uL-6z0UjFhRWZWSENnZW8/
Windows 8.1, 4 gigs of RAM, Dell Latitude with an i5 processor from a few
years ago, lots of tabs and apps open; here's some results.
SQLite version 3.12.1 2016-04-08 15:09:49
sqlite> drop table testuniqueuuid;
Run Time: real 9.763 user 1.015625 sys 2.031250
sqlite> delete from testuniqueuuid;
Run Time: real 19.149 user 1.156250 sys 2.109375
I didn't bother with secure deletes in my case.
http://brendangregg.com/flamegraphs
--
-------
inum: 883510009027723
Cecil Westerhof
2016-04-21 05:50:13 UTC
Permalink
Post by jungle Boogie
Post by Cecil Westerhof
This one I downloaded, unpacked and tried. It worked. So it should be
https://drive.google.com/file/d/0BzW5q7uL-6z0UjFhRWZWSENnZW8/
Windows 8.1, 4 gigs of RAM, Dell Latitude with an i5 processor from a few
years ago, lots of tabs and apps open; here's some results.
I think it is an edge case. On my real system I only got this when there
were 1E8 records. I am now testing on very old (8-year-old) hardware while
travelling to and from work.
The processor is:
Intel(R) Atom(TM) CPU N270 @ 1.60GHz
with 1 GB of RAM. I am now generating a database with 1E6 records (will
take some time) and will then check with the small database. I expect that
I will not see this kind of thing then. That would explain why nobody sees
anything.

But I am afraid that uploading a file of 4 GB could be a challenge.
Post by jungle Boogie
http://brendangregg.com/flamegraphs
​I have to look into that.
--
Cecil Westerhof
Cecil Westerhof
2016-04-21 06:16:00 UTC
Permalink
Post by Cecil Westerhof
​I think it is an edge case. On my real system I only got this when there
were 1E8 records. I am now testing on very old (8 year) hardware to and
from work.
with 1 GB of RAM. I am now generating a database with 1E6 records (will
take some time) and will check then with the small database. I expect that
I then will not see this kind of things. That would explain why nobody sees
something.
As I expected: with a tenth of the records, the performance is as
expected: DROP is faster than DELETE and DROP. (Even faster than the DELETE.)


But I am afraid that uploading a file of 4 GB could be a challenge.
​I have an idea how to get this done, but I have to try some things out.
That is a little difficult while traveling. ;-)
--
Cecil Westerhof
Cecil Westerhof
2016-04-22 08:00:12 UTC
Permalink
​I think it is an edge case. On my real system I only got this when there
were 1E8 records. I am now testing on very old (8 year) hardware to and
from work.
with 1 GB of RAM. I am now generating a database with 1E6 records (will
take some time) and will check then with the small database. I expect that
I then will not see this kind of things. That would explain why nobody sees
something.
As I expected: with a tenth of the records, the performance is as
expected: DROP is faster than DELETE and DROP. (Even faster than the DELETE.)
But I am afraid that uploading a file of 4 GB could be a challenge.
​I have an idea how to get this done, but I have to try some things out.
That is a little difficult while traveling. ;-)
​Well, I did it. And I want to share the results. I think I found a way to
get the results reproduced on a ‘normal’ system.​


With createBigTable.sh it is possible to create the big table needed to
reproduce the problem. In my case multiplication by a factor of ten was
enough, but when more is needed the script is easily modified. (Just change
NR_OF_COPIES to 25, for example.)

What I find very interesting is that the user time and the sys time do
not increase significantly, but the real time does. Does this point to the
problem, or is this to be expected?


Then you have to cp checkUUID.sqlite to checkUUID.sqlite.bck and the
testing can begin.


For this I wrote performanceTest.sh. It does a drop for both tables and a
delete with drop for both tables. First with SECURE_DELETE=1 and then with
SECURE_DELETE=0
​
This shows that it is not only the size of the table, but also the size of
the database that is significant. The table which had no problems before
now also takes long. In all cases the DELETE and DROP is significantly
faster than the DROP only. The strange thing is that the difference is
bigger for the small table than for the big table.


I hope that it is now possible to reproduce the problem on ‘normal’
systems. When more information is needed: let me know.
--
Cecil Westerhof
Rowan Worth
2016-04-22 08:12:19 UTC
Permalink
Post by Cecil Westerhof
What I find very interesting is that the user time and the sys time do
not increase significantly, but the real time does. Does this point to the
problem, or is this to be expected?
It suggests the extra time is spent waiting for I/O (user/sys are measures
of cpu time).
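
A couple of hedged ways to watch that while the statement runs:

iostat -x 5   # from sysstat: high %util/await means the disk is the bottleneck
vmstat 5      # the 'wa' column is the percentage of CPU time spent waiting on I/O
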
-Rowan
Cecil Westerhof
2016-04-22 08:20:54 UTC
Permalink
Post by Rowan Worth
Post by Cecil Westerhof
What I find very interesting is that the user time and the sys time do
not increase significantly, but the real time does. Does this point to
the
Post by Cecil Westerhof
problem, or is this to be expected?
It suggests the extra time is spent waiting for I/O (user/sys are measures
of cpu time).
​OK, so that is expected/normal?
--
Cecil Westerhof
E.Pasma
2016-04-22 12:06:21 UTC
Permalink
​With createBigTable.sh ...
Can you paste the script in the message? Attachments are not sent.
Regards, E.Pasma
Cecil Westerhof
2016-04-22 12:23:01 UTC
Permalink
Post by E.Pasma
​With createBigTable.sh ...
Can you paste the script in the message? Attachments are not sent.
​createBigTable.sh:
#!/usr/bin/env bash

# An error should terminate the script
# An unset variable is also an error
set -o errexit
set -o nounset


declare -r INSERT_TEMPLATE="INSERT INTO testUniqueUUIDBig
SELECT uuid, %d FROM testUniqueUUID
;
"
declare -r NR_OF_COPIES=10

declare insert=""


function getInsertStr {
printf "${INSERT_TEMPLATE}" "${1}"
}


for i in $(seq "${NR_OF_COPIES}") ; do
insert+="$(getInsertStr ${i})
"
done

sqlite3 checkUUID.sqlite <<EOT
.echo ON
.timer ON
DROP TABLE IF EXISTS testUniqueUUIDBig;
CREATE TABLE testUniqueUUIDBig (
UUID blob,
count int,

PRIMARY KEY(UUID, count)
CHECK(TYPEOF(UUID) = 'blob' AND
LENGTH(UUID) = 16 AND
SUBSTR(HEX(UUID), 13, 1) == '4' AND
SUBSTR(HEX(UUID), 17, 1) IN ('8', '9', 'A', 'B')
)
);
${insert}
EOT


The logging:
.timer ON
DROP TABLE IF EXISTS testUniqueUUIDBig;
Run Time: real 293.257 user 6.708000 sys 28.844000
CREATE TABLE testUniqueUUIDBig (
UUID blob,
count int,

PRIMARY KEY(UUID, count)
CHECK(TYPEOF(UUID) = 'blob' AND
LENGTH(UUID) = 16 AND
SUBSTR(HEX(UUID), 13, 1) == '4' AND
SUBSTR(HEX(UUID), 17, 1) IN ('8', '9', 'A', 'B')
)
);
Run Time: real 0.277 user 0.000000 sys 0.000000
INSERT INTO testUniqueUUIDBig
SELECT uuid, 1 FROM testUniqueUUID
;
Run Time: real 89.930 user 48.872000 sys 28.196000
INSERT INTO testUniqueUUIDBig
SELECT uuid, 2 FROM testUniqueUUID
;
Run Time: real 133.674 user 56.416000 sys 43.032000
INSERT INTO testUniqueUUIDBig
SELECT uuid, 3 FROM testUniqueUUID
;
Run Time: real 269.029 user 59.520000 sys 48.840000
INSERT INTO testUniqueUUIDBig
SELECT uuid, 4 FROM testUniqueUUID
;
Run Time: real 356.622 user 61.196000 sys 51.956000
INSERT INTO testUniqueUUIDBig
SELECT uuid, 5 FROM testUniqueUUID
;
Run Time: real 398.048 user 61.924000 sys 57.540000
INSERT INTO testUniqueUUIDBig
SELECT uuid, 6 FROM testUniqueUUID
;
Run Time: real 413.252 user 61.684000 sys 59.816000
INSERT INTO testUniqueUUIDBig
SELECT uuid, 7 FROM testUniqueUUID
;
Run Time: real 464.911 user 61.672000 sys 63.200000
INSERT INTO testUniqueUUIDBig
SELECT uuid, 8 FROM testUniqueUUID
;
Run Time: real 545.974 user 61.900000 sys 66.916000
INSERT INTO testUniqueUUIDBig
SELECT uuid, 9 FROM testUniqueUUID
;
Run Time: real 695.315 user 64.016000 sys 69.692000
INSERT INTO testUniqueUUIDBig
SELECT uuid, 10 FROM testUniqueUUID
;
Run Time: real 1129.854 user 64.428000 sys 76.704000


performanceTest.sh:
#!/usr/bin/env bash

# An error should terminate the script
# An unset variable is also an error
set -o errexit
set -o nounset


declare -r DB=checkUUIDSmall.sqlite
declare -r DEINIT=".timer OFF
.echo OFF"
declare -r INIT=".echo ON
.timer ON"
declare -r TABLE=testUniqueUUID
declare -r TABLE_BIG=testUniqueUUIDBig

declare -r DELETE_AND_DROP="DELETE FROM ${TABLE};
DROP TABLE ${TABLE};
DELETE FROM ${TABLE_BIG};
DROP TABLE ${TABLE_BIG};"
declare -r DROP="DROP TABLE ${TABLE};
DROP TABLE ${TABLE_BIG};"


function cpDB {
giveMessage "Copying database"
rm --force "${DB}"-journal
cp "${DB}".bck "${DB}"
}

function giveMessage {
printf "%s: %s\n" "$(date +%T)" "${1}"
}


cpDB
giveMessage "DROP only"
sqlite3 "${DB}" <<EOT
${INIT}
${DROP}
${DEINIT}
EOT

cpDB
giveMessage "DELETE and DROP"
sqlite3 "${DB}" <<EOT
${INIT}
${DELETE_AND_DROP}
${DEINIT}
EOT

cpDB
giveMessage "SECURE_DELETE=0; DROP only"
sqlite3 "${DB}" <<EOT
${INIT}
PRAGMA SECURE_DELETE=0;
${DROP}
PRAGMA SECURE_DELETE=1;
${DEINIT}
EOT

cpDB
giveMessage "SECURE_DELETE=0; DELETE and DROP"
sqlite3 "${DB}" <<EOT
${INIT}
PRAGMA SECURE_DELETE=0;
${DELETE_AND_DROP}
PRAGMA SECURE_DELETE=1;
${DEINIT}
EOT

giveMessage "Done"

The logging:
23:31:24: Copying database
23:31:50: DROP only
.timer ON
DROP TABLE testUniqueUUID;
Run Time: real 92.543 user 1.216000 sys 6.324000
DROP TABLE testUniqueUUIDBig;
Run Time: real 1474.282 user 12.812000 sys 63.832000
.timer OFF
.echo OFF
23:57:58: Copying database
23:58:23: DELETE and DROP
.timer ON
DELETE FROM testUniqueUUID;
Run Time: real 13.811 user 1.064000 sys 4.556000
DROP TABLE testUniqueUUID;
Run Time: real 0.258 user 0.000000 sys 0.004000
DELETE FROM testUniqueUUIDBig;
Run Time: real 1287.970 user 12.984000 sys 63.412000
DROP TABLE testUniqueUUIDBig;
Run Time: real 0.525 user 0.004000 sys 0.000000
.timer OFF
.echo OFF
00:20:06: Copying database
00:20:32: SECURE_DELETE=0; DROP only
.timer ON
PRAGMA SECURE_DELETE=0;
0
Run Time: real 0.000 user 0.000000 sys 0.000000
DROP TABLE testUniqueUUID;
Run Time: real 75.249 user 0.632000 sys 1.596000
DROP TABLE testUniqueUUIDBig;
Run Time: real 872.330 user 6.900000 sys 15.956000
PRAGMA SECURE_DELETE=1;
1
Run Time: real 0.001 user 0.000000 sys 0.000000
.timer OFF
.echo OFF
00:36:20: Copying database
00:36:45: SECURE_DELETE=0; DELETE and DROP
.timer ON
PRAGMA SECURE_DELETE=0;
0
Run Time: real 0.000 user 0.000000 sys 0.000000
DELETE FROM testUniqueUUID;
Run Time: real 3.085 user 0.452000 sys 0.324000
DROP TABLE testUniqueUUID;
Run Time: real 0.277 user 0.000000 sys 0.000000
DELETE FROM testUniqueUUIDBig;
Run Time: real 209.797 user 5.564000 sys 10.860000
DROP TABLE testUniqueUUIDBig;
Run Time: real 0.243 user 0.000000 sys 0.000000
PRAGMA SECURE_DELETE=1;
1
Run Time: real 0.001 user 0.000000 sys 0.000000
.timer OFF
.echo OFF
00:40:19: Done
--
Cecil Westerhof
Adam Devita
2016-04-22 13:24:54 UTC
Permalink
In general, CPUs got much faster than disk IO a long time ago, so it
is expected that a single-threaded, write-through-to-disk program would
have lots of time where the CPU is waiting for disk IO to complete.
(BTW: A common error of novice db programmers is using a disk-based db
to store variables instead of handling them in memory, thus reducing an
L1 cache or register operation to the speed of a disk.)

The technology of caching is an attempt to win performance with
smaller, faster caches closer to the CPU, exploiting temporal or
(memory) spatial locality so as to go all the way to the disk as
little as possible. The list has more than a few discussions of
people using SSDs to increase performance, or even caching,
mission-critical RAID controllers to win speed.

That said, why is the dropping of a table dependent on the size of
the table? Does Sqlite have to mark every block of memory it used as
dropped? (This is obvious for high security mode, but otherwise?)

regards,
Adam DeVita
--
--------------
VerifEye Technologies Inc.
151 Whitehall Dr. Unit 2
Markham, ON
L3R 9T1
Rowan Worth
2016-04-22 15:02:41 UTC
Permalink
Post by Adam Devita
That said, why is the dropping of a table dependent on the size of
the table? Does Sqlite have to mark every block of memory it used as
dropped? (This is obvious for high security mode, but otherwise?)
In rollback journal mode, every modified page of the database is first
written to the rollback journal. When the transaction is committed the
actual database pages are updated on disk and the rollback journal removed
to finalise the commit. Obviously the larger the table, the more database
pages required to store it.

I'm less familiar with WAL mode but I believe it's a similar story;
modified pages will be written to the write-ahead log. The i/o cost may be
spread out more in this mode though, as the database itself won't be
updated until the next checkpoint.

I've written this under the presumption that sqlite touches every database
page that was associated with a table during the delete/drop... I can think
of some optimisations allowing much of the i/o to be skipped (at least
when secure_delete isn't set), but I'm not sure exactly what sqlite
does.
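
For a rough sense of how many pages a DROP may have to touch, a minimal
sketch:

sqlite3 checkUUID.sqlite <<EOT
PRAGMA journal_mode;
PRAGMA page_size;
PRAGMA page_count;
EOT
# if the library was built with SQLITE_ENABLE_DBSTAT_VTAB, the pages of one
# table can be counted: SELECT count(*) FROM dbstat WHERE name='testUniqueUUID';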

-Rowan
Richard Hipp
2016-04-22 15:04:32 UTC
Permalink
Post by Rowan Worth
I've written this under the presumption that sqlite touches every database
page that was associated with a table during the delete/drop... I can think
of some optimisations allowing much of the i/o to be skipped (at least
least when secure_delete isn't set), but I'm not sure exactly what sqlite
does.
It does the optimizations.
--
D. Richard Hipp
***@sqlite.org
E.Pasma
2016-04-23 11:59:41 UTC
Permalink
Hello,
I tried the scripts but..

createBigTable.sh is beyond the capacity of my system. Instead I used
an SQL script like the one in
www.mail-archive.com/sqlite-users%40mailinglists.sqlite.org/msg08044.html

My point is that the definition of the table is a waste of capacity,
even though it works on many systems, because it has a primary key
and also a rowid, which are not the same. Is there any practical use of
retaining the rowid?

I tested the 1E7 case WITHOUT ROWID. The size of the database is then
reduced to 222M. Drop table is a matter of seconds (for me too now).

I may do further testing with more rows. Until then I have the feeling
that this will scale linearly and not show unstable timings any longer.

Below is the output of my tests.

Thanks, E Pasma


case 1, like original (sqlite version 3.12 and page size 4096 here)

.timer on
create table uuid (uuid blob, primary key (uuid))
;
insert into uuid
with r as (select 1 as i union all select i+1 from r where i<10000000)
select randomblob(16) from r
;
Run Time: real 6043.491 user 332.250625 sys 671.583469
.sys du -h west1.db*
505M west1.db
begin
;
drop table uuid
;
Run Time: real 40.378 user 2.259595 sys 5.500557
rollback
;
.quit



case 1, drop once again (completely different timing)

.sys du -h west1.db*
505M west1.db
begin
;
delete from uuid
;
Run Time: real 241.711 user 2.246336 sys 5.981215
drop table uuid
;
Run Time: real 0.000 user 0.000567 sys 0.000230
rollback
;
.quit




case 2, without rowid


.timer on
create table uuid (uuid blob, primary key (uuid)) without rowid
;
insert into uuid
with r as (select 1 as i union all select i+1 from r where i<10000000)
select randomblob(16) from r
;
Run Time: real 1141.098 user 294.535994 sys 573.902807
.sys du -h west2.db*
222M west2.db
begin
;
drop table uuid
;
Run Time: real 1.974 user 0.844361 sys 1.095968
rollback
;
begin
;
delete from uuid
;
Run Time: real 1.924 user 0.829793 sys 1.060908
drop table uuid
;
Run Time: real 0.006 user 0.000734 sys 0.002387
rollback
;
E.Pasma
2016-04-24 23:43:18 UTC
Permalink
Post by E.Pasma
Hello,
I tried the scripts but..
createBigTable.sh is beyond the capacity of my system. Instead I
used SQL script like in
www.mail-archive.com/sqlite-users%40mailinglists.sqlite.org/msg08044.html
My point is that the definition of the table is a waste of capacity,
even though it serves on many systems. Because it has a primary key
and also rowid which are not the same. Is there any practical use of
retaining the rowid?
I tested the 1E7 case WITHOUT ROWID. The size of the database is
then reduced to 222M. Drop table is a matter of seconds (for me too
now).
I may do further testing with more rows. Until then I have the
feeling that this will scale linearly and not show instable timings
any longer.
Below is the output of my tests.
Thanks, E Pasma
Testing with more rows taught me that the "WITHOUT ROWID" version
does not scale linearly either (case 2b below).
I started yet another version, where the uuid column is not defined as
primary key and a traditional unique index is added instead. This
DOES scale linearly (cases 3 and 3b).
I can't see why this new version (in case 3b) does not suffer from
excessive response times. After all, the UI is bigger than the PK-only
index.
But it is awesome that these tests are possible on a minimal system.
Thanks, E Pasma


case  indexes   #rows  DBsize  insert  drop
1     rowid+PK  1E7    505M    6043s   241s
2     PK only   1E7    222M    1141s   2s
3     rowid+UI  1E7    480M    304s    44s
2b    PK only   2E7    445M    58035s  886s
3b    rowid+UI  2E7    970M    802s    61s


log of case 1

create table uuid (uuid blob, primary key (uuid))
;
insert into uuid
with r as (select 1 as i union all select i+1 from r where i<10000000)
select randomblob(16) from r
;
Run Time: real 6043.491 user 332.250625 sys 671.583469
.sys du -h west1.db*
505M west1.db
begin
;
drop table uuid
;
Run Time: real 241.746 user 2.296482 sys 5.978103
rollback
;



log of case 2

create table uuid (uuid blob, primary key (uuid)) without rowid
;
insert into uuid
with r as (select 1 as i union all select i+1 from r where i<10000000)
select randomblob(16) from r
;
Run Time: real 1141.098 user 294.535994 sys 573.902807
.sys du -h west2.db*
222M west2.db
begin
;
drop table uuid
;
Run Time: real 1.974 user 0.844361 sys 1.095968
rollback
;



log of case 3

create table uuid (uuid blob)
;
insert into uuid
with r as (select 1 as i union all select i+1 from r where i<10000000)
select randomblob(16) from r
;
Run Time: real 65.923 user 44.141960 sys 2.980705
.sys du -h west3.db*
238M west3.db
create unique index uuidprime on uuid(uuid)
;
Run Time: real 304.453 user 70.280531 sys 11.903746
.sys du -h west3.db
480M west3.db
begin
;
drop table uuid
;
Run Time: real 44.634 user 2.050981 sys 2.988656
rollback
;



log of case 2b

create table uuid (uuid blob, primary key (uuid)) without rowid
;
insert into uuid
with r as (select 1 as i union all select i+1 from r where i<20000000)
select randomblob(16) from r
;
Run Time: real 58035.455 user 673.901377 sys 1518.352532
.sys du -h west2b.db*
445M west2b.db
begin
;
drop table uuid
;
Run Time: real 886.570 user 2.287946 sys 9.943783
rollback
;



log of case 3b

create table uuid (uuid blob)
;
insert into uuid
with r as (select 1 as i union all select i+1 from r where i<20000000)
select randomblob(16) from r
;
Run Time: real 119.143 user 88.404970 sys 6.073330
.sys du -h west3b.db*
479M west3b.db
create unique index uuidprime on uuid(uuid)
;
Run Time: real 683.631 user 145.164905 sys 25.329360
.sys du -h west3b.db
970M west3b.db
begin
;
drop table uuid
;
Run Time: real 61.205 user 4.096998 sys 5.253853
rollback
;

E.Pasma
2016-04-21 10:40:46 UTC
Permalink
Post by Cecil Westerhof
I think it is an edge case. On my real system I only got this when there
were 1E8 records. I am now testing on very old (8-year-old) hardware on
the way to and from work.
Hello,
the answer to Cecil's question is YES here. I tested on a computer with
just 512 MB RAM. It took almost an hour to drop the testUniqueUUID
table. Same result for delete. This is "very long" in terms of SQLite.
I also tested dropping a 500 MB table with only integer data, and that
took a minute. Below is the output of both tests.
Note that the integer test was done on a more recent SQLite version, so
as to use the series extension. I believe the version does not matter,
as I also started dropping the testUniqueUUID table on this later
version, but unfortunately interrupted that after 10 minutes.
Thanks for all the info, E. Pasma

$ uname -a
Darwin mac-mini-van-epasma.local 9.8.0 Darwin Kernel Version
9.8.0: Wed Jul 15 16:57:01 PDT 2009; root:xnu-1228.15.4~1/RELEASE_PPC
Power Macintosh

$ sqlite3 checkUUID.sqlite
SQLite version 3.8.11 2015-07-27 13:49:41
Enter ".help" for usage hints.
sqlite> begin;
sqlite> .timer on
sqlite> drop table testUniqueUUID;
Run Time: real 3846.582 user 5.418944 sys 25.144210
sqlite> rollback;
Run Time: real 36.348 user 0.036132 sys 0.218740
sqlite> .quit



$ sqlite3 westerhof2.db
SQLite version 3.12.0 2016-03-22 15:26:03
Enter ".help" for usage hints.
sqlite> create table t (t integer primary key);
sqlite> .load series
sqlite> insert into t select value from generate_series(1,60000000);
sqlite> .sys du -h westerhof2*
516M westerhof2.db
sqlite> begin;
sqlite> .timer on
sqlite> drop table t;
Run Time: real 52.705 user 6.383107 sys 5.870330
sqlite> rollback;
Run Time: real 0.085 user 0.001651 sys 0.006027
sqlite> .quit


Simon Slavin
2016-04-20 09:01:11 UTC
Permalink
Post by Cecil Westerhof
SECURE_DELETE=1, DROP ONLY 17 minutes
SECURE_DELETE=1, DELETE and DROP 14 minutes
SECURE_DELETE=0, DROP only 4½ minutes
SECURE_DELETE=0, DELETE and DROP ½ minute
[...]
I am baffled.
/You're/ baffled ?

There have been recent optimizations for "DELETE FROM myTable" with no WHERE clause. They now recognise that deleting all rows from a table is a special case and do it very efficiently. But, with settings remaining the same, there's no way that DELETE followed by DROP should take less time than DROP by itself. The development team /does/ read this list and I'm sure someone is looking at this. No need to post it as a bug anywhere.
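
For example, a minimal sketch of timing that special case in the shell
(table name as used earlier in this thread; timings will vary):

.timer on
BEGIN;
DELETE FROM testUniqueUUID;  -- no WHERE clause, so the truncate optimization can apply
DROP TABLE testUniqueUUID;   -- the table is now empty, so this should be nearly instant
ROLLBACK;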

My only guess is that somehow caching is playing a big part in this.

Simon.
R Smith
2016-04-19 13:29:38 UTC
Permalink
Post by Cecil Westerhof
PRAGMA compile_options; (has SQLITE_SECURE_DELETE been set for compiling
the library?)
​sqlite3 checkUUID.sqlite
SQLite version 3.8.7.1 2014-10-29 13:59:56
Enter ".help" for usage hints.
sqlite> PRAGMA compile_options;
ENABLE_COLUMN_METADATA
ENABLE_FTS3
ENABLE_RTREE
ENABLE_UNLOCK_NOTIFY
ENABLE_UPDATE_DELETE_LIMIT
MAX_SCHEMA_RETRY=25
OMIT_LOOKASIDE
SECURE_DELETE
SOUNDEX
SYSTEM_MALLOC
TEMP_STORE=1
THREADSAFE=1
Ahh, the riddle is solved... SECURE_DELETE is the culprit. See if you
can turn that off in your setup.
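
The compile-time SECURE_DELETE option only sets the default, so it
should be possible to override it per connection before dropping, e.g.:

PRAGMA secure_delete = OFF;  -- override the compile-time default
DROP TABLE testUniqueUUID;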

Good luck!
Ryan
Cecil Westerhof
2016-04-19 14:07:21 UTC
Permalink
Post by Olivier Mascia
PRAGMA compile_options; (has SQLITE_SECURE_DELETE been set for compiling
the library?)
​sqlite3 checkUUID.sqlite
SQLite version 3.8.7.1 2014-10-29 13:59:56
Enter ".help" for usage hints.
sqlite> PRAGMA compile_options;
ENABLE_COLUMN_METADATA
ENABLE_FTS3
ENABLE_RTREE
ENABLE_UNLOCK_NOTIFY
ENABLE_UPDATE_DELETE_LIMIT
MAX_SCHEMA_RETRY=25
OMIT_LOOKASIDE
SECURE_DELETE
SOUNDEX
SYSTEM_MALLOC
TEMP_STORE=1
THREADSAFE=1
Ahh, the riddle is solved... SECURE_DELETE is the culprit. See if you can
turn that off in your setup.
That is not completely certain, but it is very reasonable to expect. I
will try it (later) and let the list know the results.
--
Cecil Westerhof
Cecil Westerhof
2016-04-18 18:02:45 UTC
Permalink
Post by Dan Kennedy
Post by Cecil Westerhof
I am playing a bit with SQLite. I first had a table with 1E8 elements. When
trying to drop this it looked like SQLite got hung. I tried it from DB
Browser and a Java program.
I just tried it with a table of 1E7 elements. That was dropped in about 13
seconds.
I will try it again with 1E8 elements, but it takes 4½ hours to generate.
Is it possible that SQLite has trouble dropping very large tables? It was
5.2 GB. With 1E7 elements the table is 512 MB.
CREATE TABLE testUniqueUUID (
UUID blob,
PRIMARY KEY(UUID)
CHECK(TYPEOF(UUID) = 'blob' AND
LENGTH(UUID) = 16 AND
SUBSTR(HEX(UUID), 13, 1) == '4' AND
SUBSTR(HEX(UUID), 17, 1) IN ('8', '9', 'A', 'B')
)
);
Are there any other tables in the db with foreign keys that refer to the
table being dropped?
​No, it is the only table.​
--
Cecil Westerhof
Cecil Westerhof
2016-04-19 14:00:56 UTC
Permalink
Post by Simon Slavin
Post by Cecil Westerhof
Is it not strange that first DELETE and then DROP is so much more
efficient than just a DROP?
Yes.
Can you please try the same comparison in the SQLite shell tool ? You can
use
.timer ON
DELETE FROM myTable;
DROP TABLE myTable;
.timer OFF
It was done in the shell tool, but with timing done by Bash.
I have to go, but I will try this later.
--
Cecil Westerhof
Keith Medcalf
2016-04-20 22:49:20 UTC
Permalink
For me, source_id = 2016-04-18 15:46:14 eba27d4d17a76884292667d570d542e580ee3e77
Windows 10 1511 Pro, i7-3632QM 2.4Ghz, 16 GB Ram, 1 TB 850 Pro SSD

with secure_delete=0

either

drop table testuniqueuuid;

or the sum of both

delete from testuniqueuuid;
drop table testuniqueuuid;

takes under 2 seconds total real time.


with secure_delete=1

either

drop table testuniqueuuid;

or the sum of both

delete from testuniqueuuid;
drop table testuniqueuuid;

takes just over 10 seconds.

In the cases with separate delete from and drop, the majority of time is spent in delete from (where it presumably overwrites and frees the rows and index) and the drop takes just milliseconds. With only the drop, the total time is more or less the same. If the delete from and drop are run in a single transaction, the total time for the transaction is closer to the time for just the drop of the populated tables, but only by a few milliseconds.

Nothing seems out of sorts to me, at least with the database file.
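
A sketch of the single-transaction sequence described above, with
secure_delete set per run (table name as in this thread):

PRAGMA secure_delete = 0;    -- 1 for the secure-delete runs
.timer on
BEGIN;
DELETE FROM testuniqueuuid;  -- the bulk of the time is spent here
DROP TABLE testuniqueuuid;   -- just milliseconds once the table is empty
COMMIT;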
-----Original Message-----
Sent: Wednesday, 20 April, 2016 17:55
To: SQLite mailing list
Subject: Re: [sqlite] Is it possible that dropping a big table takes very
long
Post by Cecil Westerhof
https://drive.google.com/file/d/0BzW5q7uL-6z0SDdya2REaFNFUVE
I am very curious.
I downloaded this and opened it with sqlite3.exe shell version 3.11.0.
D:\>\bin\sqlite3.exe checkUUID.sqlite
SQLite version 3.11.0 2016-02-15 17:29:24
Enter ".help" for usage hints.
sqlite> .schema
sqlite>
It's a big enough file, but there is no schema present.
sqlite3 checkUUID.sqlite
SQLite version 3.8.7.1 2014-10-29 13:59:56
Enter ".help" for usage hints.
sqlite> .schema
CREATE TABLE testUniqueUUID (
UUID blob,
PRIMARY KEY(UUID)
CHECK(TYPEOF(UUID) = 'blob' AND
LENGTH(UUID) = 16 AND
SUBSTR(HEX(UUID), 13, 1) == '4' AND
SUBSTR(HEX(UUID), 17, 1) IN ('8', '9', 'A', 'B')
)
);
I am going to investigate.
Interesting: when I unzip the .gz file, I have the same problem.
Something went wrong. I will try to correct it.
This one I downloaded, unpacked and tried. It worked. So it should be
OK:
https://drive.google.com/file/d/0BzW5q7uL-6z0UjFhRWZWSENnZW8/
Made with zip.
--
Cecil Westerhof
Cecil Westerhof
2016-04-21 05:58:40 UTC
Permalink
Post by J Decker
If you upload a file to google drive and later try to replace it with
a different version google drive often corrupts it. Always delete and
upload a new version.
That was not the problem. When I unzipped the file I uploaded, it was
different from the original. I cannot remember having had this problem
before. Gremlins bite you when it matters most, I suppose.
But thanks for the tip. Something to keep in mind.
--
Cecil Westerhof