INSERT OR IGNORE - returning new or existing rowid

Discussion:

Krzysztof

2013-01-04 09:55:43 UTC

Hi,

When I use INSERT OR IGNORE, if insertion fail (record exists),
then sqlite3_last_insert_rowid does return nothing. Is exists similar
solution which:
1. If insert success then return new rowid
2. If insert fail (record exists) then return rowid of existing record

Can I get this information in one command?

Regards

Simon Slavin

2013-01-04 10:18:52 UTC

Permalink

Post by Krzysztof
When I use INSERT OR IGNORE, if insertion fail (record exists),
then sqlite3_last_insert_rowid does return nothing. Is exists similar
1. If insert success then return new rowid
2. If insert fail (record exists) then return rowid of existing record

No. You are correct about how it works. The best solution is to do the INSERT OR IGNORE as you're doing it already, then to use a SELECT to find the rowid.

Simon.

Patrik Nilsson

2013-01-04 11:27:43 UTC

Permalink

Maybe is faster to swap the commands:

1. Select. If found done.
2. Insert (not doing ignore to get an error if it fails) and then get
the rowid from sqlite3_last_insert_rowid.

Patrik

Post by Simon Slavin

No. You are correct about how it works. The best solution is to do the INSERT OR IGNORE as you're doing it already, then to use a SELECT to find the rowid.
Simon.
_______________________________________________
sqlite-users mailing list
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Krzysztof

2013-01-04 12:08:18 UTC

Permalink

Ok I have done this with two commnads where first check if record exist. I
tried do this with one command because I want to speed up importing from
csv. CSV has one table for example: name | surname | country. I want split
repeated columns to dictionary tables like:

table customers
- name
- surname
- id_country // FK to table countries

I'm looking for one command which before inserting to "customers" will look
to table countries and check if country exists, if not then insert and
return id, else return existing id.

Igor Tandetnik

2013-01-04 13:04:04 UTC

Permalink

Post by Krzysztof
CSV has one table for example: name | surname | country. I want split
table customers
- name
- surname
- id_country // FK to table countries
I'm looking for one command which before inserting to "customers" will look
to table countries and check if country exists, if not then insert and
return id, else return existing id.

Have your program keep a list of already-inserted countries, complete
with their IDs, in memory. Look up against this in-memory structure.
There are only about 200 countries in the world.

--
Igor Tandetnik

Simon Slavin

2013-01-04 13:08:49 UTC

Permalink

Have your program keep a list of already-inserted countries, complete with their IDs, in memory. Look up against this in-memory structure. There are only about 200 countries in the world.

Or just premake the table with the 200 countries in, and you should never come across any country not already in that table. It's ISO 3166, I think. Ah, here we go:

<http://www.iso.org/iso/country_names_and_code_elements>

248 countries, with the internally-agreed forms of their names.

Simon.

Krzysztof

2013-01-04 13:40:32 UTC

Permalink

"Country" was only example, there are much other "dictionary" tables. I
finally solved this by second command which first check if record exists. I
asked this question because I thought that maybe there is some "magic"
trick to do this in one command ;) For example PostgreSQL has INSERT
command which has also RETURNING clause which return any of inserted column
(but what I remember PostgreSQL doesn't have OR IGNORE clause, so same
problem :P ) . Anyway problem solved

BTW: About storing dictionary tables in memory. What is faster:
- Load all "contries" table to memory and locate it locally (I'm using Free
Pascal / Lazarus)
- Create unique index on country name and select from countries table on
each insert to customers

What consume more time if table has 100-200 record, sequence locate in
memory or firing index search?

Igor Tandetnik

2013-01-04 14:08:01 UTC

Permalink

Post by Krzysztof
- Load all "contries" table to memory and locate it locally (I'm using Free
Pascal / Lazarus)
- Create unique index on country name and select from countries table on
each insert to customers

My guess is that loading all countries into memory up front would be
faster, but either way the cost of inserting a row into Customers table
is greater by an order or two of magnitude, and will dwarf any
difference. So in the end, it wouldn't matter much. But if you want to
be sure, measure it both ways.

--
Igor Tandetnik

Simon Slavin

2013-01-04 13:05:14 UTC

Permalink

Post by Krzysztof
I want split
table customers
- name
- surname
- id_country // FK to table countries
I'm looking for one command which before inserting to "customers" will look
to table countries and check if country exists, if not then insert and
return id, else return existing id.

I think you do understand the problem correctly. Foreign keys have two sets of actions:

ON UPDATE
ON DELETE

What you want would be dealt with correctly if there was an 'ON CREATE' action which made a new related row when necessary. However there is no such type of action (it would be difficult to get right anyway) so you have to write that logic into your own code.

If just occurred to me that it might be possible to do what you want using an AFTER TRIGGER. I'm not sure exactly how to do it since it would have to do the INSERT OR IGNORE, then modify the newly created record with the right id_country. However, doing this in a TRIGGER would only be the same amount of work as doing it in your own application logic and I think it would take even more work than just writing your own two commands.

Simon.

BareFeetWare

2013-01-05 08:19:41 UTC

Permalink

This kind of question often results in answers along the lines of application logic such as: "If the row exists get the ID else create it and then get the ID". But the SQL way of doing things is subtly different: "insert a new row where it doesn't already exist. Get the ID of the row".

Based on your description, it seems you have a schema something like this:

create table Countries
( ID integer primary key not null
, Name text collate nocase
)
;
create table Customers
( ID integer primary key not null
, Name text collate nocase
, Surname text collate nocase
, ID_Country integer references Countries (ID) on delete restrict on update cascade
)
;

And it seems you have a long list of CSV data that could be imported like this:

insert into Countries (Name)
select @Country where @Country not in (select Name from Countries)
;
insert into Customers (Name, Surname, ID_Country)
select @Name, @Surname, (select ID from Countries where Name = @Country)
;

Possibly a faster way to do it is to create a temporary table, import all your raw flat data into it, then run a single transaction to import it all into your normalised tables. If you imported it into a table called "Import", eg:

create temp table Import (Name, Surname, Country)

then your transaction to insert it into your normalised tables would be:

begin immediate
;
insert into Countries (Name)
select Country from Import where Country not in (select Name from Countries)
;
insert into Customers (Name, Surname, ID_Country)
select Name, Surname, (select ID from Countries where Name = Country)
from Import
;
commit
;

Tom

Tom Brodhurst-Hill
BareFeetWare

--
iPhone/iPad/iPod and Mac software development, specialising in databases
developer-***@public.gmane.org
--
Follow us on Twitter: http://twitter.com/barefeetware/
Like us on Facebook: http://www.facebook.com/BareFeetWare

Post by Krzysztof
Ok I have done this with two commnads where first check if record exist. I
tried do this with one command because I want to speed up importing from
csv. CSV has one table for example: name | surname | country. I want split
table customers
- name
- surname
- id_country // FK to table countries
I'm looking for one command which before inserting to "customers" will look
to table countries and check if country exists, if not then insert and
return id, else return existing id.

Clemens Ladisch

2013-01-04 11:23:57 UTC

Permalink

Post by Krzysztof
When I use INSERT OR IGNORE, if insertion fail (record exists),
then sqlite3_last_insert_rowid does return nothing.

If your unique key is the rowid, then you already know the ID that
you tried to insert.
If your unique key is not the rowid, then why do you need the rowid?

Regards,
Clemens

Yuriy Kaminskiy

2013-01-04 12:12:08 UTC

Permalink

Post by Clemens Ladisch

Post by Krzysztof
When I use INSERT OR IGNORE, if insertion fail (record exists),
then sqlite3_last_insert_rowid does return nothing.

If your unique key is the rowid, then you already know the ID that
you tried to insert.
If your unique key is not the rowid, then why do you need the rowid?

To insert it into another table?

Jay A. Kreibich

2013-01-04 12:55:07 UTC

Permalink

Post by Krzysztof
Hi,
When I use INSERT OR IGNORE, if insertion fail (record exists),
then sqlite3_last_insert_rowid does return nothing. Is exists similar
1. If insert success then return new rowid
2. If insert fail (record exists) then return rowid of existing record
Can I get this information in one command?

No. Mainly because your assumption that one and only one row is
responsible for triggering the IGNORE conflict resolution is
incorrect. For example, if a table has two or more UNIQUE indexes,
the IGNORE resolution may be triggered by different rows through each
index.

-j

--
Jay A. Kreibich < J A Y @ K R E I B I.C H >

"Intelligence is like underwear: it is important that you have it,
but showing it to the wrong people has the tendency to make them
feel uncomfortable." -- Angela Johnson

Yuriy Kaminskiy

2013-01-04 21:14:47 UTC

Permalink

Post by Jay A. Kreibich

No. Mainly because your assumption that one and only one row is
responsible for triggering the IGNORE conflict resolution is
incorrect. For example, if a table has two or more UNIQUE indexes,

How sqlite handles case when e.g. (SELECT) expression returned more than one
row, and only one value is expected? (e.g. foo >= (SELECT))
It uses first row and silently ignores remaining ones.
Same applicable in this case: return any random rowid that results in insert
failure. And ignore the rest.
If sqlite decided to be sloppy, it can be at least /consistently/ sloppy.

Post by Jay A. Kreibich
the IGNORE resolution may be triggered by different rows through each
index.