Discussion:
cidr data type
John Stanton
2008-04-19 01:59:17 UTC
Permalink
Hi,
does SQLite have anything similar to PostgreSQL's cidr data type (see
http://www.postgresql.org/docs/8.3/interactive/datatype-net-types.html).
I would be particularly interested in being able to use a "contained in"
select inet '192.168.18.52' << cidr '192.168.0.0/16';
What would be the preferred way to do it in SQLite?
Currently, I am thinking of storing start and end IP addresses as a blob in
the database - that way I would be able to use the "between" operator in
selects, e.g.
select X'c0a81234' between X'c0a80000' and X'c0a8ffff';
Are there any other/better ideas?
Christof
Use a function and convert the IP address from dotted format to a 32 bit
unsigned integer. You can AND and OR these to establish inclusion and
exclusion.
Dennis Cote
2008-04-18 21:59:14 UTC
Permalink
does SQLite have anything similar to PostgreSQL's cidr data type (see
http://www.postgresql.org/docs/8.3/interactive/datatype-net-types.html).
I would be particularly interested in being able to use a "contained in"
select inet '192.168.18.52' << cidr '192.168.0.0/16';
What would be the preferred way to do it in SQLite?
Currently, I am thinking of storing start and end IP addresses as a blob in
the database - that way I would be able to use the "between" operator in
selects, e.g.
select X'c0a81234' between X'c0a80000' and X'c0a8ffff';
Are there any other/better ideas?
I would store the IP addresses, network addresses, and subnet width as
integers. Then create a few custom function to manipulate them.

You will need functions to convert between integer and standard dotted
quad text format. You will also need a function to do the containedIn
range check.

This last function can be implemented using bit manipulation operators
in SQL or in a custom function in C.

containedIn(ip_addr, network_addr, network_size)

can be replaced by

nework_addr == (ip_addr & (-1 << network_size))

which will be true if the IP address is in the network.

This can be done in a custom function as well which may be more efficient.

HTH
Dennis Cote
Christof Meerwald
2008-04-19 09:56:47 UTC
Permalink
Post by Dennis Cote
Currently, I am thinking of storing start and end IP addresses as a blob in
the database - that way I would be able to use the "between" operator in
selects, e.g.
select X'c0a81234' between X'c0a80000' and X'c0a8ffff';
[...]
Post by Dennis Cote
I would store the IP addresses, network addresses, and subnet width as
integers. Then create a few custom function to manipulate them.
Ok, that would work for IPv4 addresses, but if I ever wanted to use IPv6
addresses, then I would have to go back to using blobs - as integers are
limited to 64 bits in SQLite.


Christof
--
http://cmeerw.org sip:cmeerw at cmeerw.org
mailto:cmeerw at cmeerw.org xmpp:cmeerw at cmeerw.org
John Stanton
2008-04-20 06:18:45 UTC
Permalink
What stops you from building in the 128 bit address logic?
Post by Christof Meerwald
Post by Dennis Cote
Currently, I am thinking of storing start and end IP addresses as a blob in
the database - that way I would be able to use the "between" operator in
selects, e.g.
select X'c0a81234' between X'c0a80000' and X'c0a8ffff';
[...]
Post by Dennis Cote
I would store the IP addresses, network addresses, and subnet width as
integers. Then create a few custom function to manipulate them.
Ok, that would work for IPv4 addresses, but if I ever wanted to use IPv6
addresses, then I would have to go back to using blobs - as integers are
limited to 64 bits in SQLite.
Christof
Florian Weimer
2008-04-20 09:13:56 UTC
Permalink
Post by Dennis Cote
This last function can be implemented using bit manipulation operators
in SQL or in a custom function in C.
containedIn(ip_addr, network_addr, network_size)
can be replaced by
nework_addr == (ip_addr & (-1 << network_size))
which will be true if the IP address is in the network.
Is this Java or C? For C, this breaks if network_size == 32.
Jay A. Kreibich
2008-04-20 13:47:59 UTC
Permalink
Post by Florian Weimer
Post by Dennis Cote
This last function can be implemented using bit manipulation operators
in SQL or in a custom function in C.
containedIn(ip_addr, network_addr, network_size)
can be replaced by
nework_addr == (ip_addr & (-1 << network_size))
which will be true if the IP address is in the network.
Is this Java or C? For C, this breaks if network_size == 32.
It breaks for everything except network_size == 16.

You want something closer to (ip_addr & (~(~0 << network_size)))

Again, that only works for v4. Part of the beauty of the INET and
CIDR types in PostgreSQL is that they take both v4 and v6
addresses/networks and all the operations work on both address types
automatically.

-j
--
Jay A. Kreibich < J A Y @ K R E I B I.C H >

"'People who live in bamboo houses should not throw pandas.' Jesus said that."
- "The Ninja", www.AskANinja.com, "Special Delivery 10: Pop!Tech 2006"
John Stanton
2008-04-20 14:34:45 UTC
Permalink
Post by Jay A. Kreibich
Post by Florian Weimer
Post by Dennis Cote
This last function can be implemented using bit manipulation operators
in SQL or in a custom function in C.
containedIn(ip_addr, network_addr, network_size)
can be replaced by
nework_addr == (ip_addr & (-1 << network_size))
which will be true if the IP address is in the network.
Is this Java or C? For C, this breaks if network_size == 32.
It breaks for everything except network_size == 16.
You want something closer to (ip_addr & (~(~0 << network_size)))
Again, that only works for v4. Part of the beauty of the INET and
CIDR types in PostgreSQL is that they take both v4 and v6
addresses/networks and all the operations work on both address types
automatically.
-j
There is nothing to stop you adding a CIDR type to Sqlite by using the
declared type capability. You could lift the IP address handling code
from PostgreSQL if that made it simpler. Just intercept the CIDR type
in your wrapper.

You should always appreciate that Sqlite is not a database server, it is
a kit of tools to implement embedded SQL in any number of ingenious
ways. If you want PostgreSQL functionality out of the box why not just
use PostgreSQL?
Dennis Cote
2008-04-20 15:29:34 UTC
Permalink
Post by Jay A. Kreibich
It breaks for everything except network_size == 16.
Why do you say that?
Post by Jay A. Kreibich
You want something closer to (ip_addr & (~(~0 << network_size)))
In SQLite ~0 is -1.

sqlite> select ~0;
-1

So your inner expression is only a more complicated way of saying the
same thing.

By complementing the result of the inner expression you have generated
an invalid netmask pattern with zeros in the high bits and ones in the
low bit positions. This can' t be used to mask off the network portion
of an IP adresss. It would return the host address within the network
which can't be used to test if the original IP address is within a
particular network.
Post by Jay A. Kreibich
Again, that only works for v4.
It was only intended to work for IPv4 as shown in the OP.

Dennis Cote
Jay A. Kreibich
2008-04-20 23:39:52 UTC
Permalink
Post by Dennis Cote
Post by Jay A. Kreibich
It breaks for everything except network_size == 16.
Why do you say that?
Post by Jay A. Kreibich
You want something closer to (ip_addr & (~(~0 << network_size)))
In SQLite ~0 is -1.
sqlite> select ~0;
-1
So your inner expression is only a more complicated way of saying the
same thing.
Each is a simple literal with a unary operator. How is that more complex?

~0 is also a bit-level operator that works with both signed and
unsigned types, while -1 depends on a specific signed integer
representation.
Post by Dennis Cote
By complementing the result of the inner expression you have generated
an invalid netmask pattern with zeros in the high bits and ones in the
low bit positions.
Yeah, I screwed that up. I was too caught up on the other error.

In the original function (-1 << network_size) returns the wrong bit
mask. For example, 10.0.0.0/8 should return a netmask of 255.0.0.0,
or 0xFF000000. The original function will return 255.255.255.0,
which is a /24 mask.

My mind was thinking "you need to flip that", but did the wrong
thing. We're looking for (ip_addr & (~0 << (32 - network_size))).

Technically, you also need to mask the network side, as it is an
acceptable notation to have a network like "10.0.0.1/8".
Post by Dennis Cote
Post by Jay A. Kreibich
Again, that only works for v4.
It was only intended to work for IPv4 as shown in the OP.
Clearly, although in a followup to one of your earlier posts the
original poster said he was interested in both v4 and v6. It would
still be easy enough to build functions that can deal with those.



This thread got me thinking, although not really about IP addresses.
INET and CIDR are built-in types for Postgres, but part of the reason
Postgres has so many weird types is that it is very easy to build
user-defined types in the Postgres engine. SQLite already allows you
to put pretty much anything you want in the type field of a CREATE
TABLE statement. It also has the most flexible type systems of any
RDBMS environment that I've worked with. I'm wondering how hard it
would be to create a "USER DEFINED TYPE" affinity that could be
associated with a series of user-functions like "input text"->type,
or type->"display text." In many cases, I would assume the "storage
type" would be a blob, but SQLite's manifest typing means it could be
just about any native type. Some other affinity conversion functions and
collation functions and it might not be that hard to define arbitrary
user types. I don't understand the way SQLite tracks values types
internally enough to really understand the cost... it just got me
thinking.

Given some of the unique ways that SQLite is used, there might be
some value in user types.

-j
--
Jay A. Kreibich < J A Y @ K R E I B I.C H >

"'People who live in bamboo houses should not throw pandas.' Jesus said that."
- "The Ninja", www.AskANinja.com, "Special Delivery 10: Pop!Tech 2006"
Florian Weimer
2008-04-21 07:11:05 UTC
Permalink
Post by Jay A. Kreibich
Yeah, I screwed that up. I was too caught up on the other error.
In the original function (-1 << network_size) returns the wrong bit
mask. For example, 10.0.0.0/8 should return a netmask of 255.0.0.0,
or 0xFF000000. The original function will return 255.255.255.0,
which is a /24 mask.
My mind was thinking "you need to flip that", but did the wrong
thing. We're looking for (ip_addr & (~0 << (32 - network_size))).
That's why it's called network_size and not prefix_length, I think.
Florian Weimer
2008-04-20 19:38:51 UTC
Permalink
Post by Jay A. Kreibich
Post by Florian Weimer
Is this Java or C? For C, this breaks if network_size == 32.
It breaks for everything except network_size == 16.
I was alluding to the fact that a popular architecture implements
modulo-32 shifts for 32-bit integers (and modulo-64 shifts for 64-bit
integers), for example:

sqlite> SELECT 1 << 64;
1
sqlite>

(I wasn't aware that SQLite supports bitwise operators.)
Dennis Cote
2008-04-20 14:58:05 UTC
Permalink
Post by Florian Weimer
Is this Java or C? For C, this breaks if network_size == 32.
It is SQL (with SQLite extensions which are modeled after C). SQLite
has a 64 bit integer type so the bit shifting works as expected for all
values up to 63. The same logic can be used in other languages.

This was intended to work for IPv4 addresses only as shown in the OP
example. For these addresses the practical range of network width is
from 2, not really very practical, but a legal minimum subnet width, to
25, the entire range of a class A network.

Dennis Cote
Nicolas Williams
2008-04-18 21:34:07 UTC
Permalink
does SQLite have anything similar to PostgreSQL's cidr data type (see
http://www.postgresql.org/docs/8.3/interactive/datatype-net-types.html).
No.
Currently, I am thinking of storing start and end IP addresses as a blob in
the database - that way I would be able to use the "between" operator in
selects, e.g.
select X'c0a81234' between X'c0a80000' and X'c0a8ffff';
Are there any other/better ideas?
Write some user-defined functions and a user-defined collation.

You could have them use text in CIDR notation, but that could be pretty
slow.

You could have functions to convert to/from display notation, and then
the internal storage format could be an integer, or even as a bit string
encoded in text (10/8 -> '00001010', 10.128/9 -> '0000010101') -- not
space efficient, but you don't have to write a collation function this
way, and you can cut down on the number of user-defined functions you
need to write to, I think, just the conversion functions).

Incidentally, I find it interesting that there's no way (or did I miss
it) to enter numeric literals in bases other than 10. Also, CAST(<blob>
AS INTEGER) always returns 0.

If, in addition to BLOBs, SQLite had: a native BIT STRING type that
differs from BLOBs in that it can be an arbitrary string of bits, rather
than bytes, and a way to cast bit strings into integers (where they fit,
and/or multi-precision integers), then dealing with CIDR might be easier
still.

Nico
--
Nicolas Williams
2008-04-18 21:49:36 UTC
Permalink
Post by Nicolas Williams
You could have functions to convert to/from display notation, and then
the internal storage format could be an integer, or even as a bit string
^^^^^^^
Here I had in mind a 64-bit integer where half the bits encode the
address part and half the prefix length/mask, with all other bits being
zeros.
Christof Meerwald
2008-04-18 21:08:24 UTC
Permalink
Hi,

does SQLite have anything similar to PostgreSQL's cidr data type (see
http://www.postgresql.org/docs/8.3/interactive/datatype-net-types.html).

I would be particularly interested in being able to use a "contained in"
operator in selects - in PostgreSQL you are able to do:

select inet '192.168.18.52' << cidr '192.168.0.0/16';

What would be the preferred way to do it in SQLite?

Currently, I am thinking of storing start and end IP addresses as a blob in
the database - that way I would be able to use the "between" operator in
selects, e.g.

select X'c0a81234' between X'c0a80000' and X'c0a8ffff';


Are there any other/better ideas?


Christof
--
http://cmeerw.org sip:cmeerw at cmeerw.org
mailto:cmeerw at cmeerw.org xmpp:cmeerw at cmeerw.org
Nicolas Williams
2008-04-19 15:54:43 UTC
Permalink
Post by John Stanton
Use a function and convert the IP address from dotted format to a 32 bit
unsigned integer. You can AND and OR these to establish inclusion and
exclusion.
That doesn't get you the prefix length. If you represent the
mask/prefix length as a separate value then you have to be careful when
sorting. If you represent prefixes as bit strings then you lose that
problem, but without a way to cast them to integers you have to add
user-defined functions to do bit-wise operations on them -- or at least
functions to convert to/from integert (but that only works well for
IPv4).

BTW, this can be very useful when one stores data on networks and
aggregations, such as when building a model of routing, or a database of
network and aggregate allocations. It then becomes important to be able
to determine whether one prefix is inside the other -- that's almost the
most important operation -- and to be able to query for prefixes that
fit inside another, ...

Nico
--
Continue reading on narkive:
Loading...