Re: Storing null bytes in bytea

Lists: pgsql-sql
From: Andy Shellam <andy-lists(at)networkmail(dot)eu>
To: pgsql-sql(at)postgresql(dot)org
Subject: Storing null bytes in bytea
Date: 2009-04-27 17:49:30
Message-ID: 49F5F02A.1070505@networkmail.eu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-sql

Hi all,

I was going to post this on the pgsql-php list but I think the issue is
more on the PostgreSQL side of things.

I'm using PHP 5.2.9 connected to a PostgreSQL 8.3.7 server running on
Solaris 10 to try to store the session data for an application using a
custom session handler class. The session data (objects/class instances
etc) is serialized into raw bytes in PHP and contains a few nul/zero
bytes (which are meaningful to PHP when it comes to deserializing the data.)

Because of the nul bytes, I've set the session_data column to be a bytea
column in my database table. However I cannot get PostgreSQL to read
past the first nul byte on an insert, so the unserialize call fails when
it reads it back out the database and the remaining data is omitted.

An example of such query is this:

INSERT INTO system.session (user_id, session_key, session_name,
client_browser, date_created, date_expires, ip_address, session_data)
VALUES (NULL, '4pc4sjciahoc4fuk1bt4kohe91'::character varying(32),
'AppName'::character varying(50), 'Mozilla/5.0 (Windows; U; Windows NT
6.0; en-GB; rv:1.9.0.9) Gecko/2009040821 Firefox/3.0.9 (.NET CLR
3.5.30729)'::character varying(200),
public.get_pg_timestamp(1240853862::integer),
public.get_pg_timestamp(1240854162::integer), '192.168.0.8'::inet,
E'IsLoggedIn|b:1;CurrentUser|O:17:"Class_SystemUser":4:{s:26:"\\000Class_SystemUser}'::bytea);

All other columns are fine, but when it comes to the session_data
column, all I end up with is
'IsLoggedIn|b:1;CurrentUser|O:17:"Class_SystemUser":4:{s:26:"'.
Everything past the first "\\0000" byte sequence is ignored.

I've tried this with and without the 'E' at the beginning of the value
string.

Any pointers as to what I'm doing wrong?

Thanks,
Andy


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andy Shellam <andy-lists(at)networkmail(dot)eu>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: Storing null bytes in bytea
Date: 2009-04-27 20:29:09
Message-ID: 11364.1240864149@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-sql

Andy Shellam <andy-lists(at)networkmail(dot)eu> writes:
> Because of the nul bytes, I've set the session_data column to be a bytea
> column in my database table. However I cannot get PostgreSQL to read
> past the first nul byte on an insert, so the unserialize call fails when
> it reads it back out the database and the remaining data is omitted.

Your example works fine in psql:

regression=# create table t1 (f1 bytea);
CREATE TABLE
regression=# insert into t1 values (E'IsLoggedIn|b:1;CurrentUser|O:17:"Class_SystemUser":4:{s:26:"\\000Class_SystemUser}'::bytea);
INSERT 0 1
regression=# select * from t1;
f1
-----------------------------------------------------------------------------------
IsLoggedIn|b:1;CurrentUser|O:17:"Class_SystemUser":4:{s:26:"\000Class_SystemUser}
(1 row)

I suspect what is happening is that some layer on the client side is
doubling (or perhaps undoubling?) the backslashes for you. Exactly
what are you doing with that literal as you build the query? It might
help to turn on log_statements so that you can see just what the
server is getting.

regards, tom lane


From: Andy Shellam <andy-lists(at)networkmail(dot)eu>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: Storing null bytes in bytea
Date: 2009-04-28 16:48:56
Message-ID: 49F73378.7000100@networkmail.eu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-sql

Hi Tom,
> Your example works fine in psql:
>
> regression=# create table t1 (f1 bytea);
> CREATE TABLE
> regression=# insert into t1 values (E'IsLoggedIn|b:1;CurrentUser|O:17:"Class_SystemUser":4:{s:26:"\\000Class_SystemUser}'::bytea);
> INSERT 0 1
> regression=# select * from t1;
> f1
> -----------------------------------------------------------------------------------
> IsLoggedIn|b:1;CurrentUser|O:17:"Class_SystemUser":4:{s:26:"\000Class_SystemUser}
> (1 row)
>
>
> I suspect what is happening is that some layer on the client side is
> doubling (or perhaps undoubling?) the backslashes for you. Exactly
> what are you doing with that literal as you build the query? It might
> help to turn on log_statements so that you can see just what the
> server is getting.
>

Many thanks for your reply. I was a bit naive about the quality of the
client I was using (Navicat 8.1.) It turns out in the "Memo" view it
doesn't show anything past the first zero byte. However switch it to
hex view and you see the full bytes.

I think my issue that it wasn't working on the client side (and caused
me to examine it in more detail in the database) was because it wasn't
unescaping correctly in the application code when it read the data back
out of the database.

Next time I won't be so lazy and try it out in psql first...

Regards,
Andy