single bit integer (TINYINT) revisited for 8.5

Lists: pgsql-hackers
From: Caleb Cushing <xenoterracide(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: single bit integer (TINYINT) revisited for 8.5
Date: 2009-07-01 15:19:48
Message-ID: 81bfc67a0907010819m32f39ac3m13e4e59b4c97e422@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I'd like to see this topic revisited since as far as I can see it
hasn't been seriously discussed in years. I believe the main arguments
against are why do we need more more numeric datatypes and increased
maintenance. It would seem to me that a tinyint datatype maintenance
wise would get all the same updates as the other int types, making it
only a slight increase in maintenance. I think there was 1 more reason
but I can't find the original thread now.

most (if not all?) of posgresql's major competitor's (mysql, sql
server, db2, etc) support a single bit integer datatype. it would
bring increased compatibility with existing mysql apps esp, making
them easier to port.

It (in theory?) should also bring a speed enhancement where usable
since it would take less disk space.

A couple of times I've been told "you don't need tinyint, use boolean"
which is not true, several projects I've worked on I've needed and
integer field that supports number within a small range 0-5 1-10 1-100
or something similar. I end up using smallint but it's range is huge
for the actual requirements.
--
Caleb Cushing

http://xenoterracide.blogspot.com


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Caleb Cushing" <xenoterracide(at)gmail(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: single bit integer (TINYINT) revisited for 8.5
Date: 2009-07-01 15:41:18
Message-ID: 4A4B3D4E02000025000281D4@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Caleb Cushing <xenoterracide(at)gmail(dot)com> wrote:

> most (if not all?) of posgresql's major competitor's (mysql, sql
> server, db2, etc) support a single bit integer datatype.

> A couple of times I've been told "you don't need tinyint, use
> boolean" which is not true, several projects I've worked on I've
> needed and integer field that supports number within a small range
> 0-5 1-10 1-100 or something similar.

I think you mean byte where you've said bit. Boolean would be
adequate for a single bit, and I haven't (so far) seen any database
which supports both a single-bit type and a boolean. Many databases
support a TINYINT type as a single-byte value, although I'm not sure
there's consistency on whether that's a signed or unsigned value.

-Kevin


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Caleb Cushing <xenoterracide(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: single bit integer (TINYINT) revisited for 8.5
Date: 2009-07-01 16:09:42
Message-ID: 4A4B8A46.8060209@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Caleb.

> I'd like to see this topic revisited since as far as I can see it
> hasn't been seriously discussed in years. I believe the main arguments
> against are why do we need more more numeric datatypes and increased
> maintenance. It would seem to me that a tinyint datatype maintenance
> wise would get all the same updates as the other int types, making it
> only a slight increase in maintenance. I think there was 1 more reason
> but I can't find the original thread now.

The main reason not to have one is that given byte-alignment, 95% of the
time using a tinyint would save no actual disk space or memory over just
using INT2 (or indeed INT4). I'll point out that the MySQLers are
enamored of the 3-byte integer, which even on MySQL saves zero space or
memory over using a 4-byte. Just because people like it doesn't mean it
works.

So the only real point in having an INT1 would be if you had a table
with a lot of them in a row; an unusual but not unheard-of design.
Still, that makes this feature less-than-critical for most users.

But ... the nice thing about PostgreSQL is that data types can be loaded
at runtime. Which means that you don't need INT1 in core for it to be
useful to you and others; just write the data type and put it on
pgFoundry. Then submit it for /contrib for 8.5, and we'll see how
popular the idea is.

Overall, I'm not keen on it. For the handful of times when
peformance-optimization-by-datatype makes sense, there's a large number
where it's develpers who have no idea what they're doing. We should be
moving in the direction of having the database engine take care of space
optimizations, not having the user do it.

--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Caleb Cushing <xenoterracide(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: single bit integer (TINYINT) revisited for 8.5
Date: 2009-07-01 16:23:59
Message-ID: 12382.1246465439@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Josh Berkus <josh(at)agliodbs(dot)com> writes:
> But ... the nice thing about PostgreSQL is that data types can be loaded
> at runtime. Which means that you don't need INT1 in core for it to be
> useful to you and others; just write the data type and put it on
> pgFoundry.

Yeah. The argument against that used to be that you couldn't integrate
it properly into the numeric type hierarchy, but I think that's no
longer the case now that add-on types can define which type category
they belong to. At the very least it would be useful to try to do it
as an add-on and see if there are still any obstacles to that.

Which of course brings up the other argument against doing it, which
was that it'd possibly mess up the rather delicate arrangement of
implicit promotions among the numeric types. That was based on very
old bad experiences, though, so I'm not certain if there's still a
problem. Again, if there is a reason why an add-on type couldn't solve
the problem while a built-in type could, I'd be more interested in
finding and fixing that problem than in the value of INT1 per se.

regards, tom lane


From: Caleb Cushing <xenoterracide(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: single bit integer (TINYINT) revisited for 8.5
Date: 2009-07-01 16:27:46
Message-ID: 81bfc67a0907010927j5d300053p4d5ae16528644fff@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Jul 1, 2009 at 11:41 AM, Kevin
Grittner<Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:
> I think you mean byte where you've said bit.

you're correct. I'm being a nerf.

>  Boolean would be
> adequate for a single bit, and I haven't (so far) seen any database
> which supports both a single-bit type and a boolean.

wasn't aware of that. I'm admittedly most familiar with sqlite,
postgres, and mysql

>  Many databases
> support a TINYINT type as a single-byte value, although I'm not sure
> there's consistency on whether that's a signed or unsigned value.

wouldn't any implementation in pg support both?

--
Caleb Cushing

http://xenoterracide.blogspot.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Caleb Cushing <xenoterracide(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: single bit integer (TINYINT) revisited for 8.5
Date: 2009-07-01 16:35:46
Message-ID: 12600.1246466146@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Caleb Cushing <xenoterracide(at)gmail(dot)com> writes:
> On Wed, Jul 1, 2009 at 11:41 AM, Kevin
> Grittner<Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:
>> Many databases
>> support a TINYINT type as a single-byte value, although I'm not sure
>> there's consistency on whether that's a signed or unsigned value.

> wouldn't any implementation in pg support both?

Introducing unsigned types into PG is a whole different discussion.
The problem there is designing reasonable automatic promotion rules.
Considering that C's rules still confuse people after nigh 40 years,
I'm not enthusiastic about it ...

regards, tom lane


From: Caleb Cushing <xenoterracide(at)gmail(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: single bit integer (TINYINT) revisited for 8.5
Date: 2009-07-01 17:03:19
Message-ID: 81bfc67a0907011003t12455490n35ddcf0e8fd57066@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Jul 1, 2009 at 12:09 PM, Josh Berkus<josh(at)agliodbs(dot)com> wrote:
> The main reason not to have one is that given byte-alignment, 95% of the
> time using a tinyint would save no actual disk space or memory over just
> using INT2 (or indeed INT4).  I'll point out that the MySQLers are enamored
> of the 3-byte integer, which even on MySQL saves zero space or memory over
> using a 4-byte.  Just because people like it doesn't mean it works.
>
> So the only real point in having an INT1 would be if you had a table with a
> lot of them in a row; an unusual but not unheard-of design. Still, that
> makes this feature less-than-critical for most users.

The primary type of field that I see it used as involves some sort of
statistics collection. like ratings, or maybe "strongly agree -
strongly disagree" (this could be in hundreds of colums), even
collecting gender's is appropriate if I recall.. as the iso for
genders is an integer with like 4 possible value's. At least this is
my thought, and a lot of these things end up in the same table.

It's not critical, but I'd suggest the benefits outweigh the costs.

> But ... the nice thing about PostgreSQL is that data types can be loaded at
> runtime.  Which means that you don't need INT1 in core for it to be useful
> to you and others; just write the data type and put it on pgFoundry.  Then
> submit it for /contrib for 8.5, and we'll see how popular the idea is.

I suppose I can see what I can do to this end (since it seems to be
the consensus) although it'll take me a while given I've no idea what
I'm doing... I can read C better than I write it, which doesn't say
much. On the other hand I'd think this would be on the list of
'easiest things to do' which means I should be able to get it done.

> Overall, I'm not keen on it.  For the handful of times when
> peformance-optimization-by-datatype makes sense, there's a large number
> where it's develpers who have no idea what they're doing.  We should be
> moving in the direction of having the database engine take care of space
> optimizations, not having the user do it.

eh... not sure I agree on that 100%... but to some degree that's what
sql is for... at the same time that kinda sounds like sqlite's more
dynamic like typing. but it is another discussion

--
Caleb Cushing

http://xenoterracide.blogspot.com


From: Greg Stark <gsstark(at)mit(dot)edu>
To: Caleb Cushing <xenoterracide(at)gmail(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: single bit integer (TINYINT) revisited for 8.5
Date: 2009-07-02 00:41:29
Message-ID: 407d949e0907011741w3ade5f5apa0468be32073be43@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Incidentally there *is* a single-byte integer data type in Postgres,
it's called "char" (the quote marks are necessary in SQL due to the
char(n) data type).

It's a bit weird though, mainly because its output format is to output
ascii characters -- kind of like how C's single-byte integer data type
is mainly used to hold ascii characters...


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Caleb Cushing <xenoterracide(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: single bit integer (TINYINT) revisited for 8.5
Date: 2009-07-02 16:40:49
Message-ID: 1246552849.27964.502.camel@dn-x300-willij
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On Wed, 2009-07-01 at 11:19 -0400, Caleb Cushing wrote:
> I'd like to see this topic revisited since as far as I can see it
> hasn't been seriously discussed in years. I believe the main arguments
> against are why do we need more more numeric datatypes and increased
> maintenance. It would seem to me that a tinyint datatype maintenance
> wise would get all the same updates as the other int types, making it
> only a slight increase in maintenance. I think there was 1 more reason
> but I can't find the original thread now.
>
> most (if not all?) of posgresql's major competitor's (mysql, sql
> server, db2, etc) support a single bit integer datatype. it would
> bring increased compatibility with existing mysql apps esp, making
> them easier to port.
>
> It (in theory?) should also bring a speed enhancement where usable
> since it would take less disk space.
>
> A couple of times I've been told "you don't need tinyint, use boolean"
> which is not true, several projects I've worked on I've needed and
> integer field that supports number within a small range 0-5 1-10 1-100
> or something similar. I end up using smallint but it's range is huge
> for the actual requirements.

Completely agree.

I'm most or the way through working on this as an add-on module, rather
than a new datatype in core. I don't see much reason to include it in
core: its not an SQL standard datatype, it complicates catalog entries
and most people don't need or want it.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Caleb Cushing <xenoterracide(at)gmail(dot)com>
Subject: Re: single bit integer (TINYINT) revisited for 8.5
Date: 2009-07-03 17:38:11
Message-ID: 200907031338.11653.xzilla@users.sourceforge.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thursday 02 July 2009 12:40:49 Simon Riggs wrote:
> On Wed, 2009-07-01 at 11:19 -0400, Caleb Cushing wrote:
> > A couple of times I've been told "you don't need tinyint, use boolean"
> > which is not true, several projects I've worked on I've needed and
> > integer field that supports number within a small range 0-5 1-10 1-100
> > or something similar. I end up using smallint but it's range is huge
> > for the actual requirements.
>
> Completely agree.
>

Blech. More often than not, I find people using all these granular types to be
nothing more than premature optimization. And if you really do need a single
byte type, you can use "char" (though again I'm not a big fan of that)

> I'm most or the way through working on this as an add-on module, rather
> than a new datatype in core. I don't see much reason to include it in
> core: its not an SQL standard datatype, it complicates catalog entries
> and most people don't need or want it.
>

That's too bad. I'd much rather see someone implement something closer to
Oracle's number type.

--
Robert Treat
Conjecture: http://www.xzilla.net
Consulting: http://www.omniti.com


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org, Caleb Cushing <xenoterracide(at)gmail(dot)com>
Subject: Re: single bit integer (TINYINT) revisited for 8.5
Date: 2009-07-04 09:34:09
Message-ID: 1246700050.27964.753.camel@dn-x300-willij
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On Fri, 2009-07-03 at 13:38 -0400, Robert Treat wrote:
> On Thursday 02 July 2009 12:40:49 Simon Riggs wrote:
> > On Wed, 2009-07-01 at 11:19 -0400, Caleb Cushing wrote:
> > > A couple of times I've been told "you don't need tinyint, use boolean"
> > > which is not true, several projects I've worked on I've needed and
> > > integer field that supports number within a small range 0-5 1-10 1-100
> > > or something similar. I end up using smallint but it's range is huge
> > > for the actual requirements.
> >
> > Completely agree.
> >
>
> Blech. More often than not, I find people using all these granular types to be
> nothing more than premature optimization. And if you really do need a single
> byte type, you can use "char" (though again I'm not a big fan of that)

I agree that many optimizations are used inappropriately. Another reason
for making it an add-on module.

I'm aware of "char" and it doesn't do all I would wish.

> > I'm most or the way through working on this as an add-on module, rather
> > than a new datatype in core. I don't see much reason to include it in
> > core: its not an SQL standard datatype, it complicates catalog entries
> > and most people don't need or want it.
> >
>
> That's too bad. I'd much rather see someone implement something closer to
> Oracle's number type.

Please explain what you mean?

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support