Re: pg_[un]escape_bytea, pgsql 8.2.1, php 5.1.6, Linux

Lists: pgsql-php
From: "Gary Chambers" <gwchamb(at)gmail(dot)com>
To: pgsql-php(at)postgresql(dot)org
Subject: pg_[un]escape_bytea, pgsql 8.2.1, php 5.1.6, Linux
Date: 2007-02-02 02:27:23
Message-ID: 302670f20702011827x28d9a689m388b2c8a703f31af@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-php

All...

Apache 2.2.3, PostgreSQL 8.2.1, PHP 5.1.6, Linux

I have inserted (via pg_query_params) into a bytea field some binary
data (a JPEG image in this case) which I have escaped using
pg_escape_bytea. It appears, however, that the extracted data is
corrupt (NOT unescaped, more precisely), even after unescaping it with
pg_unescape_bytea. If I perform another (a subsequent)
pg_unescape_bytea, it appears to be partially unescaped, but there
still remain errors because the rest of the image is severely
distorted -- but minimally recognizeable as part of the original
image. What am I missing? I'm using the lo_* functions as an
alternative, but it's hard to dismiss the ease with which it appears
to deal with binary data with a bytea field.

Many thanks in advance!

-- Gary Chambers

// Nothing fancy and nothing Microsoft!


From: ellis(at)spinics(dot)net
To: pgsql-php(at)postgresql(dot)org
Subject: Re: pg_[un]escape_bytea, pgsql 8.2.1, php 5.1.6, Linux
Date: 2007-02-02 05:32:59
Message-ID: 1170394379.757745@no.spam
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-php

In article <302670f20702011827x28d9a689m388b2c8a703f31af(at)mail(dot)gmail(dot)com>,
Gary Chambers <gwchamb(at)gmail(dot)com> wrote:

>I have inserted (via pg_query_params) into a bytea field some
>binary data (a JPEG image in this case) which I have escaped
>using pg_escape_bytea.

I just use a text field and base64 encode the image. It works great.

--
http://yosemitephotos.net/


From: ljb <lbayuk(at)pobox(dot)com>
To: pgsql-php(at)postgresql(dot)org
Subject: Re: pg_[un]escape_bytea, pgsql 8.2.1, php 5.1.6, Linux
Date: 2007-02-03 02:50:48
Message-ID: eq0ta7$15us$1@news.hub.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-php

gwchamb(at)gmail(dot)com wrote:
> Apache 2.2.3, PostgreSQL 8.2.1, PHP 5.1.6, Linux
>
> I have inserted (via pg_query_params) into a bytea field some binary
> data (a JPEG image in this case) which I have escaped using
> pg_escape_bytea. It appears, however, that the extracted data is
> corrupt (NOT unescaped, more precisely), even after unescaping it with
> pg_unescape_bytea. If I perform another (a subsequent)
> pg_unescape_bytea, it appears to be partially unescaped, but there
> still remain errors because the rest of the image is severely
> distorted -- but minimally recognizeable as part of the original
> image. What am I missing? I'm using the lo_* functions as an
> alternative, but it's hard to dismiss the ease with which it appears
> to deal with binary data with a bytea field.

Interesting problem.

pg_query_params() should have been made binary-safe, but it isn't. It only
accepts and passes 'text' mode arguments to PostgreSQL. So you cannot put
raw bytea data into a query parameter.

But you cannot use pg_escape_bytea() on the data either. pg_escape_bytea()
escapes the data in preparation for two levels of parsing/unescaping: once
by the SQL parser, and once by the bytea-type input function. This is what
you need for a non-parameterized query, like "INSERT INTO mytable (bd)
VALUES ('$data')" where bd is a bytea column, and $data went through
pg_escape_bytea().

The escaping done by pg_escape_bytea() is wrong for parameterized queries.
With a binary-mode query parameter (which pg_query_params() can't do
anyway), you want no escaping at all. With a text-mode parameter (as
pg_query_params() does), you need to escape for only the bytea-input
parsing, not the SQL parsing. So for example if your data has a byte with
value 1, you need to pass that as the 4 characters: \001.
pg_escape_bytea() returns that as the 5 characters: \\001 (unless the new
'standard conforming strings' is on), so it won't work. Nor can I think of
another PHP escaping function that does work here.

To me, this means that you should probably do non-parameterized queries
instead, with pg_query() and pg_escape_bytea(), with your bytea data.


From: "Gary Chambers" <gwchamb(at)gmail(dot)com>
To: ljb <lbayuk(at)pobox(dot)com>
Cc: pgsql-php(at)postgresql(dot)org
Subject: Re: pg_[un]escape_bytea, pgsql 8.2.1, php 5.1.6, Linux
Date: 2007-02-03 04:09:08
Message-ID: 302670f20702022009x417a210fm5e0fa36d860a225@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-php

Thanks for the reply!

> pg_query_params() should have been made binary-safe, but it isn't. It only
> accepts and passes 'text' mode arguments to PostgreSQL. So you cannot put
> raw bytea data into a query parameter.

Hmmm... Disappointing. Will pg_query_params ever become binary safe?
I'm evaluating Postgres as an alternative to Oracle, so that's where
the majority of my experience lies.

> you need for a non-parameterized query, like "INSERT INTO mytable (bd)
> VALUES ('$data')" where bd is a bytea column, and $data went through
> pg_escape_bytea().

Understood. I do not like for several reasons that method of
inserting data. It exposes me to SQL injection attacks, it's very
inefficient (in Oracle, anyway -- perhaps you can correct me where
Postgres is concerned), it seems uncharacteristic of a database with
the qualities of Postgres, I can't have all my queries in a single
source file, and I can't take advantage of the ease with which I can
handle binary data with a bytea field.

> To me, this means that you should probably do non-parameterized queries
> instead, with pg_query() and pg_escape_bytea(), with your bytea data.

Would there be any advantage to simply using a text field and base64
encoding and decoding the binary data? I really don't want to use
non-parameterized queries.

-- Gary Chambers

// Nothing fancy and nothing Microsoft!


From: Vincent de Phily <vdephily(at)bluemetrix(dot)com>
To: pgsql-php(at)postgresql(dot)org
Subject: Re: pg_[un]escape_bytea, pgsql 8.2.1, php 5.1.6, Linux
Date: 2007-02-05 11:01:40
Message-ID: 200702051101.40315.vdephily@bluemetrix.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-php

On Saturday 03 February 2007, Gary Chambers wrote:
> > you need for a non-parameterized query, like "INSERT INTO mytable (bd)
> > VALUES ('$data')" where bd is a bytea column, and $data went through
> > pg_escape_bytea().
>
> Understood. I do not like for several reasons that method of
> inserting data. It exposes me to SQL injection attacks, it's very
> inefficient (in Oracle, anyway -- perhaps you can correct me where
> Postgres is concerned), it seems uncharacteristic of a database with
> the qualities of Postgres, I can't have all my queries in a single
> source file, and I can't take advantage of the ease with which I can
> handle binary data with a bytea field.

This would be a problem related to php, not postgres. I'm handling binary data
in parameterized and COPY queries just fine with c++.

> > To me, this means that you should probably do non-parameterized queries
> > instead, with pg_query() and pg_escape_bytea(), with your bytea data.
>
> Would there be any advantage to simply using a text field and base64
> encoding and decoding the binary data? I really don't want to use
> non-parameterized queries.

base64 would solve your binary problem, but it is costly (disk space and cpu).

I think you can instead use prepared statements via SQL directly (as php
probably does in the end) :

// initialisation
pg_query('PREPARE mystatement (bytea) AS INSERT INTO mytable (bd) VALUES
($1);');
// insert loop
pg_query("EXECUTE mystatement (' . pg_escape_bytea($data) . "');");

Annoying to have to do all this yourself, but it should work (and it *is* a
parameterized query).

BTW, if you're doing bulk inserts, consider pg_copy_from() instead.
--
Vincent de Phily


From: ljb <lbayuk(at)pobox(dot)com>
To: pgsql-php(at)postgresql(dot)org
Subject: Re: pg_[un]escape_bytea, pgsql 8.2.1, php 5.1.6, Linux
Date: 2007-02-06 02:20:51
Message-ID: eq8om3$4om$1@news.hub.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-php

vdephily(at)bluemetrix(dot)com wrote:
>...
> I think you can instead use prepared statements via SQL directly (as php
> probably does in the end) :
>
> // initialisation
> pg_query('PREPARE mystatement (bytea) AS INSERT INTO mytable (bd) VALUES
> ($1);');
> // insert loop
> pg_query("EXECUTE mystatement (' . pg_escape_bytea($data) . "');");
>
> Annoying to have to do all this yourself, but it should work (and it *is* a
> parameterized query).

It's sort of a parameterized query, but not really in the sense the
original poster wants. That is, it does not protect against SQL injection
attacks the way a true parameterized query does (with the variable data
passed outside of the SQL statement itself, and not subject to SQL
parsing). If some way around pg_escape_bytea were to be found (as perhaps
happened before with multi-byte characters and PQescapeString), the above
could be vulnerable. In fact I don't see where it is any safer than
just doing pg_query("INSERT ... '" . pg_escape_bytea($data) . "')");


From: "Gary Chambers" <gwchamb(at)gmail(dot)com>
To: "Vincent de Phily" <vdephily(at)bluemetrix(dot)com>
Cc: pgsql-php(at)postgresql(dot)org
Subject: Re: pg_[un]escape_bytea, pgsql 8.2.1, php 5.1.6, Linux
Date: 2007-02-07 14:58:34
Message-ID: 302670f20702070658vb0913e2xa51d153ddbe5b34d@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-php

Vincent...

> This would be a problem related to php, not postgres. I'm handling binary data
> in parameterized and COPY queries just fine with c++.

I'm not ruling-out PHP, but I've seen claims of pg_[un]escape_bytea
being only wrappers to the Pg functions.

-- Gary Chambers

// Nothing fancy and nothing Microsoft!


From: ljb <lbayuk(at)pobox(dot)com>
To: pgsql-php(at)postgresql(dot)org
Subject: Re: pg_[un]escape_bytea, pgsql 8.2.1, php 5.1.6, Linux
Date: 2007-02-08 02:08:41
Message-ID: eqe0n8$1ohp$1@news.hub.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-php

gwchamb(at)gmail(dot)com wrote:
> Vincent...
>
>> This would be a problem related to php, not postgres. I'm handling binary data
>> in parameterized and COPY queries just fine with c++.
>
> I'm not ruling-out PHP, but I've seen claims of pg_[un]escape_bytea
> being only wrappers to the Pg functions.

That is correct, the PHP function relies on the PostgreSQL library to do
the work. But the problem of using a parameterized query with bytea data
is specific to PHP. PostgreSQL allows you to either escape the bytea
data for use as a text-mode parameter, or pass it 'raw' as a binary-mode
parameter. I don't know a good way to do the first in PHP, and the PHP
PostgreSQL interface doesn't support the second at all.


From: Vincent de Phily <vdephily(at)bluemetrix(dot)com>
To: pgsql-php(at)postgresql(dot)org
Subject: Re: pg_[un]escape_bytea, pgsql 8.2.1, php 5.1.6, Linux
Date: 2007-02-08 12:12:43
Message-ID: 200702081212.43200.vdephily@bluemetrix.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-php

On Thursday 08 February 2007, ljb wrote:
> gwchamb(at)gmail(dot)com wrote:
> > Vincent...
> >
> >> This would be a problem related to php, not postgres. I'm handling
> >> binary data in parameterized and COPY queries just fine with c++.
> >
> > I'm not ruling-out PHP, but I've seen claims of pg_[un]escape_bytea
> > being only wrappers to the Pg functions.
>
> That is correct, the PHP function relies on the PostgreSQL library to do
> the work. But the problem of using a parameterized query with bytea data
> is specific to PHP. PostgreSQL allows you to either escape the bytea
> data for use as a text-mode parameter, or pass it 'raw' as a binary-mode
> parameter. I don't know a good way to do the first in PHP, and the PHP
> PostgreSQL interface doesn't support the second at all.

This sparked my interest, so I checked the libpq docs
(http://www.postgresql.org/docs/8.2/interactive/libpq-exec.html) and wrote
some php tests.

The postgres doc say that the escape functions should only be used "for
Inclusion in SQL Commands", and that "it is not necessary nor correct to do
escaping when a data value is passed as a separate parameter in
PQexecParams".
But when I try to pass the raw data using pg_query_params(), postgres
complains about UTF8 (on a supposedly bytea field). Makes you wonder which
postgres function does pg_query_params() really wrap around...

I reallize I went all this way to just re-discover what has been said in this
thread, but if I needed the extra research, maybe other will find it
insightfull.
--
Vincent de Phily


From: Marco Colombo <pgsql(at)esiway(dot)net>
To: ljb <lbayuk(at)pobox(dot)com>
Cc: pgsql-php(at)postgresql(dot)org
Subject: Re: pg_[un]escape_bytea, pgsql 8.2.1, php 5.1.6, Linux
Date: 2007-02-08 13:48:15
Message-ID: 45CB2A1F.4080800@esiway.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-php

ljb wrote:
> gwchamb(at)gmail(dot)com wrote:
>> Apache 2.2.3, PostgreSQL 8.2.1, PHP 5.1.6, Linux
>>
>> I have inserted (via pg_query_params) into a bytea field some binary
>> data (a JPEG image in this case) which I have escaped using
>> pg_escape_bytea. It appears, however, that the extracted data is
>> corrupt (NOT unescaped, more precisely), even after unescaping it with
>> pg_unescape_bytea. If I perform another (a subsequent)
>> pg_unescape_bytea, it appears to be partially unescaped, but there
>> still remain errors because the rest of the image is severely
>> distorted -- but minimally recognizeable as part of the original
>> image. What am I missing? I'm using the lo_* functions as an
>> alternative, but it's hard to dismiss the ease with which it appears
>> to deal with binary data with a bytea field.
>
> Interesting problem.
...
> The escaping done by pg_escape_bytea() is wrong for parameterized queries.
> With a binary-mode query parameter (which pg_query_params() can't do
> anyway), you want no escaping at all. With a text-mode parameter (as
> pg_query_params() does), you need to escape for only the bytea-input
> parsing, not the SQL parsing. So for example if your data has a byte with
> value 1, you need to pass that as the 4 characters: \001.
> pg_escape_bytea() returns that as the 5 characters: \\001 (unless the new
> 'standard conforming strings' is on), so it won't work. Nor can I think of
> another PHP escaping function that does work here.

Isn't it possible to partly undo the effect of pg_escape_bytea()?

Something like this:

<?php
/* some binary data */
$data = implode("", range("\000", "\377"));

echo "pg_escape_bytea():\n";
echo pg_escape_bytea($data), "\n\n";

echo "pg_escape_bytea(), corrected:\n";
echo str_replace("\\\\", "\\", pg_escape_bytea($data)), "\n\n";
?>

Here's the result:

$ php -f test.php | fold -w 70
pg_escape_bytea():
\\000\\001\\002\\003\\004\\005\\006\\007\\010\\011\\012\\013\\014\\015
\\016\\017\\020\\021\\022\\023\\024\\025\\026\\027\\030\\031\\032\\033
\\034\\035\\036\\037 !"#$%&''()*+,-./0123456789:;<=>?(at)ABCDEFGHIJKLMNOP
QRSTUVWXYZ[\\\\]^_`abcdefghijklmnopqrstuvwxyz{|}~\\177\\200\\201\\202\
\203\\204\\205\\206\\207\\210\\211\\212\\213\\214\\215\\216\\217\\220\
\221\\222\\223\\224\\225\\226\\227\\230\\231\\232\\233\\234\\235\\236\
\237\\240\\241\\242\\243\\244\\245\\246\\247\\250\\251\\252\\253\\254\
\255\\256\\257\\260\\261\\262\\263\\264\\265\\266\\267\\270\\271\\272\
\273\\274\\275\\276\\277\\300\\301\\302\\303\\304\\305\\306\\307\\310\
\311\\312\\313\\314\\315\\316\\317\\320\\321\\322\\323\\324\\325\\326\
\327\\330\\331\\332\\333\\334\\335\\336\\337\\340\\341\\342\\343\\344\
\345\\346\\347\\350\\351\\352\\353\\354\\355\\356\\357\\360\\361\\362\
\363\\364\\365\\366\\367\\370\\371\\372\\373\\374\\375\\376\\377

pg_escape_bytea(), corrected:
\000\001\002\003\004\005\006\007\010\011\012\013\014\015\016\017\020\0
21\022\023\024\025\026\027\030\031\032\033\034\035\036\037 !"#$%&''()*
+,-./0123456789:;<=>?(at)ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmno
pqrstuvwxyz{|}~\177\200\201\202\203\204\205\206\207\210\211\212\213\21
4\215\216\217\220\221\222\223\224\225\226\227\230\231\232\233\234\235\
236\237\240\241\242\243\244\245\246\247\250\251\252\253\254\255\256\25
7\260\261\262\263\264\265\266\267\270\271\272\273\274\275\276\277\300\
301\302\303\304\305\306\307\310\311\312\313\314\315\316\317\320\321\32
2\323\324\325\326\327\330\331\332\333\334\335\336\337\340\341\342\343\
344\345\346\347\350\351\352\353\354\355\356\357\360\361\362\363\364\36
5\366\367\370\371\372\373\374\375\376\377

Isn't the latter ready for pg_query_params()?

The only doubt I have is about the double '. Is it needed for bytea
parsing or sql parsing? You may have to replace it with a single ', if
it's only for SQL:

str_replace(array("\\\\", "''"), array("\\", "'"),
pg_escape_bytea($data)), "\n\n";

See http://www.postgresql.org/docs/8.2/interactive/datatype-binary.html.
The man page is not clear, the single quote is listed among the escaped
chars, but from the examples is seems that only SQL escaping is
performed. Compare with \:

SELECT E'\\\\'::bytea; -- after SQL parsing, you get \\
SELECT E'\''::bytea; -- after SQL parsing, you get '

it seems that the bytea parser is feeded with a double slash, but a
single quote. If so, the octal value 39 should be removed from the table
in the manual page, since it requires no escaping at bytea level.
pg_escape_bytea() seems to confirm this, it returns '' for ' but \\\\
for \. \ is escaped twice, ' only once.

Unfortunately I can't try it now with pg_query_params(), but I bet on
the second form... well no, maybe I can, yes, here we go:

<?php
$data = implode("", range("\000", "\377"));
$data_escaped = str_replace(
array("\\\\", "''"),
array("\\", "'"),
pg_escape_bytea($data));

/* fill in the conn string, if required */
$db = pg_connect("");
pg_query($db, "CREATE TEMP TABLE atest (data bytea);");
pg_query_params($db, "INSERT INTO atest VALUES ($1)", array($data_escaped));

$res = pg_query($db, "SELECT data FROM atest");
$data2 = pg_unescape_bytea(pg_fetch_result($res, 0));

echo ($data2 == $data) ? "OK!" : "PANIC!", "\n";
?>

$ php -f test.php
OK!

Seems to work...
.TM.