Re: Keywords in pg_hba.conf should be field-specific

Lists: pgsql-hackers
From: Brendan Jurd <direvus(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Keywords in pg_hba.conf should be field-specific
Date: 2010-10-09 02:13:07
Message-ID: AANLkTi=q8DZj79OKrWc-kE9zg-rH-1tcQdqbsbKfO1zF@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi folks,

One of the speedbumps I hit when setting up HS+SR was naming the user
the slave would connect as for streaming replication. At first I
picked 'replication', which seemed quite natural to me (and I don't
doubt will seem natural to others as well).

When I started up the slave, I got this error:

FATAL: could not connect to the primary server: FATAL: no
pg_hba.conf entry for replication connection from host
"192.168.21.10", user "replication", SSL on

Bzzzt. Wrong. There *was* such an entry in pg_hba.conf. I wasted a
lot of time checking my conf files for typos before I wondered whether
there might be something wrong with using 'replication' as a username.
I changed the username to 'streamrep' and it all started working
perfectly.

I understand that 'replication' is a keyword as far as the database
name is concerned, but I was surprised to find that it was treated as
a keyword in the username field also. I had a look in
src/backend/libpq/hba.c, and next_token() appears to be completely
naive about this. 'replication' (along with 'all', 'sameuser',
'samegroup' and 'samerole') is treated as a keyword wherever it
appears, not just in the field where it is relevant. next_token()
appends a newline to the end of the 'replication' username token, and
that's why the entry doesn't match my connection attempt.

I suspect this is going to trip a lot of people up. We could just
document it and tell people that if they want to use 'replication' as
a username, they'd better quote it in pg_hba.conf. But I'd prefer to
actually solve the problem. We could do this by teaching next_token
to be sensitive to which field it is parsing, and limit keyword
detection appropriately. Or, if that's awkward, we could teach
parse_hba_line to ignore keyword markers where the keyword is not
applicable. The more I think about it, the more I incline towards the
latter.

Thoughts?

Cheers,
BJ


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Brendan Jurd <direvus(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Keywords in pg_hba.conf should be field-specific
Date: 2010-10-09 03:08:05
Message-ID: AANLkTinrkTNkiCcdb0u8Nuzj12oiXNeqW8_=apRupvtf@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Oct 8, 2010 at 10:13 PM, Brendan Jurd <direvus(at)gmail(dot)com> wrote:
> One of the speedbumps I hit when setting up HS+SR was naming the user
> the slave would connect as for streaming replication.  At first I
> picked 'replication', which seemed quite natural to me (and I don't
> doubt will seem natural to others as well).
>
> When I started up the slave, I got this error:
>
> FATAL:  could not connect to the primary server: FATAL:  no
> pg_hba.conf entry for replication connection from host
> "192.168.21.10", user "replication", SSL on
>
> Bzzzt.  Wrong.  There *was* such an entry in pg_hba.conf.  I wasted a
> lot of time checking my conf files for typos before I wondered whether
> there might be something wrong with using 'replication' as a username.
>  I changed the username to 'streamrep' and it all started working
> perfectly.
>
> I understand that 'replication' is a keyword as far as the database
> name is concerned, but I was surprised to find that it was treated as
> a keyword in the username field also.

Yikes. That does seems surprising.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Brendan Jurd <direvus(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Keywords in pg_hba.conf should be field-specific
Date: 2010-10-12 13:28:16
Message-ID: AANLkTi=4nCkLi8p29up2gOuM1nNcotXwr49AopnjObjp@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, Oct 9, 2010 at 11:13 AM, Brendan Jurd <direvus(at)gmail(dot)com> wrote:
> I understand that 'replication' is a keyword as far as the database
> name is concerned, but I was surprised to find that it was treated as
> a keyword in the username field also.  I had a look in
> src/backend/libpq/hba.c, and next_token() appears to be completely
> naive about this.  'replication' (along with 'all', 'sameuser',
> 'samegroup' and 'samerole')  is treated as a keyword wherever it
> appears, not just in the field where it is relevant.  next_token()
> appends a newline to the end of the 'replication' username token, and
> that's why the entry doesn't match my connection attempt.
>
> I suspect this is going to trip a lot of people up.  We could just
> document it and tell people that if they want to use 'replication' as
> a username, they'd better quote it in pg_hba.conf.  But I'd prefer to
> actually solve the problem.

Agreed. We should address that.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


From: Brendan Jurd <direvus(at)gmail(dot)com>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Keywords in pg_hba.conf should be field-specific
Date: 2010-10-16 14:52:05
Message-ID: AANLkTi=RivP3iq_+OfeqZs5L_ZYk2auvJ0y_LBNH_TYL@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 13 October 2010 00:28, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> On Sat, Oct 9, 2010 at 11:13 AM, Brendan Jurd <direvus(at)gmail(dot)com> wrote:
>> I understand that 'replication' is a keyword as far as the database
>> name is concerned, but I was surprised to find that it was treated as
>> a keyword in the username field also.  I had a look in
>> src/backend/libpq/hba.c, and next_token() appears to be completely
>> naive about this.  'replication' (along with 'all', 'sameuser',
>> 'samegroup' and 'samerole')  is treated as a keyword wherever it
>> appears, not just in the field where it is relevant.  next_token()
>> appends a newline to the end of the 'replication' username token, and
>> that's why the entry doesn't match my connection attempt.
>>
>> I suspect this is going to trip a lot of people up.  We could just
>> document it and tell people that if they want to use 'replication' as
>> a username, they'd better quote it in pg_hba.conf.  But I'd prefer to
>> actually solve the problem.
>
> Agreed. We should address that.
>

Hi folks,

Per the above discussion, I've prepared a patch to make keywords in
pg_hba.conf field-specific.

This patch takes a least-resistance approach to solving the problem.
next_token() continues to blithely append a newline character to any
token which *might* be a keyword, but the patch teaches
hba_parse_line() to be a bit more savvy about whether to take
next_token's word for it.

In the Database field, hba_parse_line() acknowledges 'all',
'sameuser', 'samegroup', 'samerole' and 'replication' as legitimate
keywords. Anything else is considered not a keyword, and if there is
a trailing newline it is stripped from the token (by just reassigning
it to NUL).

Likewise, in the Role field, only 'all' is considered legitimate.

In the Host field, we allow 'samehost' and 'samenet'.

The bottom line is that this makes 'replication' safe to use as a
username, no quoting required.

It seemed a little bit inelegant to allow next_token() to mark up a
keyword, only to ignore that marking later on, but as the number of
fields in each line is not decided until we parse it, there was no
sensible way to teach next_token() which field it was evaluating.

Added to the November CF.

Cheers,
BJ


From: Brendan Jurd <direvus(at)gmail(dot)com>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Keywords in pg_hba.conf should be field-specific
Date: 2010-10-16 14:53:31
Message-ID: AANLkTi=khcyyPuv4jmKhBG-QT5o3spdGGhanA5Ck9sCa@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 17 October 2010 01:52, Brendan Jurd <direvus(at)gmail(dot)com> wrote:
> Per the above discussion, I've prepared a patch to make keywords in
> pg_hba.conf field-specific.
>

Try New and Improved This Message (tm), now with attachment!

Cheers,
BJ

Attachment Content-Type Size
hba-keywords.diff text/plain 2.0 KB

From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Brendan Jurd <direvus(at)gmail(dot)com>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Keywords in pg_hba.conf should be field-specific
Date: 2010-10-16 15:27:17
Message-ID: 1287242720-sup-7900@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Excerpts from Brendan Jurd's message of sáb oct 16 11:53:31 -0300 2010:
> On 17 October 2010 01:52, Brendan Jurd <direvus(at)gmail(dot)com> wrote:
> > Per the above discussion, I've prepared a patch to make keywords in
> > pg_hba.conf field-specific.
>
> Try New and Improved This Message (tm), now with attachment!

Hmm. Would it be possible to list keywords _applicable_ to each field,
and have these passed down to next_token by the caller instead? This
seems backwards, but I'm not sure if the other way is really workable.

--
Álvaro Herrera <alvherre(at)commandprompt(dot)com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: Brendan Jurd <direvus(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Keywords in pg_hba.conf should be field-specific
Date: 2010-10-16 16:07:13
Message-ID: AANLkTi=heyzyVKFqAYwCAZfvcVVdpT_fUm20p-Xm7XOK@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 17 October 2010 02:27, Alvaro Herrera <alvherre(at)commandprompt(dot)com> wrote:
> Hmm.  Would it be possible to list keywords _applicable_ to each field,
> and have these passed down to next_token by the caller instead?  This
> seems backwards, but I'm not sure if the other way is really workable.
>

Short answer: I don't think it is workable, or I would have done it
that way in the first place.

Full answer: The problem is that pg_hba.conf doesn't have a fixed
structure. Each line can be 4, 5 or 6 fields (not including the final
'options' field) long, and which of these structures apply to any
given line isn't decided until parse_hba_line goes to work on it.

At the time that next_token gets called, we have no way of knowing
which field is currently being tokenised, at least not without doing
some serious rearrangement of hba.c, so that it tokenises and then
parses one token at a time, instead of tokenising the whole file and
then parsing all the lines.

Cheers,
BJ


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Brendan Jurd <direvus(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Keywords in pg_hba.conf should be field-specific
Date: 2010-10-16 22:55:23
Message-ID: 12968.1287269723@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
> Excerpts from Brendan Jurd's message of sb oct 16 11:53:31 -0300 2010:
>> Try New and Improved This Message (tm), now with attachment!

> Hmm. Would it be possible to list keywords _applicable_ to each field,
> and have these passed down to next_token by the caller instead?

+1. That newline thing is a crude hack and always has been. If people
are dissatisfied with it, it's time to start over rather than piling
additional crockery on top.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Brendan Jurd <direvus(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Keywords in pg_hba.conf should be field-specific
Date: 2010-10-16 22:59:40
Message-ID: 13033.1287269980@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Brendan Jurd <direvus(at)gmail(dot)com> writes:
> Full answer: The problem is that pg_hba.conf doesn't have a fixed
> structure. Each line can be 4, 5 or 6 fields (not including the final
> 'options' field) long, and which of these structures apply to any
> given line isn't decided until parse_hba_line goes to work on it.

> At the time that next_token gets called, we have no way of knowing
> which field is currently being tokenised, at least not without doing
> some serious rearrangement of hba.c, so that it tokenises and then
> parses one token at a time, instead of tokenising the whole file and
> then parsing all the lines.

Good point. Maybe the correct fix is to remember whether each token was
quoted or not, so that keyword detection can be done safely after the
initial lexing. I still think that the current method is impossibly
ugly ...

regards, tom lane


From: Brendan Jurd <direvus(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Keywords in pg_hba.conf should be field-specific
Date: 2010-10-17 04:54:08
Message-ID: AANLkTi=CEXnzxgWHcThQaKC5SMR9JKrY7G1FxM9Nsyqr@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 17 October 2010 09:59, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Brendan Jurd <direvus(at)gmail(dot)com> writes:
>> At the time that next_token gets called, we have no way of knowing
>> which field is currently being tokenised, at least not without doing
>> some serious rearrangement of hba.c, so that it tokenises and then
>> parses one token at a time, instead of tokenising the whole file and
>> then parsing all the lines.
>
> Good point.  Maybe the correct fix is to remember whether each token was
> quoted or not, so that keyword detection can be done safely after the
> initial lexing.  I still think that the current method is impossibly
> ugly ...
>

I'm happy to revise the patch on that basis. Any suggestions about
how to communicate the 'quotedness' of each token? We could make each
token a struct consisting of the token itself, plus a boolean flag to
indicate whether it had been quoted. Does that work for you?

Cheers,
BJ


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Brendan Jurd <direvus(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Keywords in pg_hba.conf should be field-specific
Date: 2010-10-17 14:19:10
Message-ID: 8221.1287325150@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Brendan Jurd <direvus(at)gmail(dot)com> writes:
> On 17 October 2010 09:59, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Good point. Maybe the correct fix is to remember whether each token was
>> quoted or not, so that keyword detection can be done safely after the
>> initial lexing. I still think that the current method is impossibly
>> ugly ...

> I'm happy to revise the patch on that basis. Any suggestions about
> how to communicate the 'quotedness' of each token? We could make each
> token a struct consisting of the token itself, plus a boolean flag to
> indicate whether it had been quoted. Does that work for you?

Seems reasonable. I had the idea of a parallel list of booleans in the
back of my mind, but a list of structs is probably easier to understand,
and to extend further if necessary.

regards, tom lane


From: Brendan Jurd <direvus(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Keywords in pg_hba.conf should be field-specific
Date: 2010-10-28 22:59:37
Message-ID: AANLkTimRrdLHNGp76o8ff4G8vZW-OntzX9FXNZrwiuU2@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 18 October 2010 01:19, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Brendan Jurd <direvus(at)gmail(dot)com> writes:
>> On 17 October 2010 09:59, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> Good point.  Maybe the correct fix is to remember whether each token was
>>> quoted or not, so that keyword detection can be done safely after the
>>> initial lexing.  I still think that the current method is impossibly
>>> ugly ...
>
>> I'm happy to revise the patch on that basis.  Any suggestions about
>> how to communicate the 'quotedness' of each token?  We could make each
>> token a struct consisting of the token itself, plus a boolean flag to
>> indicate whether it had been quoted.  Does that work for you?
>
> Seems reasonable.  I had the idea of a parallel list of booleans in the
> back of my mind, but a list of structs is probably easier to understand,
> and to extend further if necessary.
>

Okay, I've taken the red pill and I'm finding out how deep the rabbit
hole goes ...

The logical structure of pg_hba.conf is a set of lines, each line
containing a set of fields, each field containing a set of tokens.
The way the existing implementation handles this is to create a list
of lines containing sublists of fields, containing comma-separated
strings for the set of tokens, with newlines embedded next to tokens
which might be keywords.

The tokeniser breaks apart the comma-separated tokens ... and then
reassembles them into a comma-separated string. Which the db/role
matching functions then have to break apart *again*.

In order to keep track of whether each individual token was quoted, I
first need to impose some sanity here. Rather than using a magical
string for each field, I intend to use a List of HbaToken structs
which explicitly note whether quoting was used.

Introducing an extra List level does mean a bit more work copying and
freeing, and it makes the patch really quite intrusive. I have to
touch a lot of lines in hba.c, but I think the additional clarity is
worth it. If nobody dissuades me from this approach I hope to post a
patch in a couple of days.

Cheers,
BJ


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Brendan Jurd <direvus(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Keywords in pg_hba.conf should be field-specific
Date: 2011-02-26 07:06:25
Message-ID: 201102260706.p1Q76PR05021@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


Any progress on this?

---------------------------------------------------------------------------

Brendan Jurd wrote:
> On 18 October 2010 01:19, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > Brendan Jurd <direvus(at)gmail(dot)com> writes:
> >> On 17 October 2010 09:59, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >>> Good point. ?Maybe the correct fix is to remember whether each token was
> >>> quoted or not, so that keyword detection can be done safely after the
> >>> initial lexing. ?I still think that the current method is impossibly
> >>> ugly ...
> >
> >> I'm happy to revise the patch on that basis. ?Any suggestions about
> >> how to communicate the 'quotedness' of each token? ?We could make each
> >> token a struct consisting of the token itself, plus a boolean flag to
> >> indicate whether it had been quoted. ?Does that work for you?
> >
> > Seems reasonable. ?I had the idea of a parallel list of booleans in the
> > back of my mind, but a list of structs is probably easier to understand,
> > and to extend further if necessary.
> >
>
> Okay, I've taken the red pill and I'm finding out how deep the rabbit
> hole goes ...
>
> The logical structure of pg_hba.conf is a set of lines, each line
> containing a set of fields, each field containing a set of tokens.
> The way the existing implementation handles this is to create a list
> of lines containing sublists of fields, containing comma-separated
> strings for the set of tokens, with newlines embedded next to tokens
> which might be keywords.
>
> The tokeniser breaks apart the comma-separated tokens ... and then
> reassembles them into a comma-separated string. Which the db/role
> matching functions then have to break apart *again*.
>
> In order to keep track of whether each individual token was quoted, I
> first need to impose some sanity here. Rather than using a magical
> string for each field, I intend to use a List of HbaToken structs
> which explicitly note whether quoting was used.
>
> Introducing an extra List level does mean a bit more work copying and
> freeing, and it makes the patch really quite intrusive. I have to
> touch a lot of lines in hba.c, but I think the additional clarity is
> worth it. If nobody dissuades me from this approach I hope to post a
> patch in a couple of days.
>
> Cheers,
> BJ
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +


From: Brendan Jurd <direvus(at)gmail(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Keywords in pg_hba.conf should be field-specific
Date: 2011-02-26 09:57:13
Message-ID: AANLkTi=E1gR7TLYmXWymGFNzi3ihxqayZYisFDvvxx-6@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 26 February 2011 18:06, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
>
> Any progress on this?
>

I ended up doing most of the work, but never got around to finishing
it off. Thanks for the reminder, though. I'll get that one ready and
drop it onto the next CF.

Cheers,
BJ


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Brendan Jurd <direvus(at)gmail(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Keywords in pg_hba.conf should be field-specific
Date: 2011-02-26 11:07:37
Message-ID: AANLkTimFhNjc-CktWPUzo=otcDt8Mic6bAWOvfHHfned@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, Feb 26, 2011 at 10:57, Brendan Jurd <direvus(at)gmail(dot)com> wrote:
> On 26 February 2011 18:06, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
>>
>> Any progress on this?
>>
>
> I ended up doing most of the work, but never got around to finishing
> it off.  Thanks for the reminder, though.  I'll get that one ready and
> drop it onto the next CF.

Just as a point, which i proably moot if you've done most of the work
already :) But at some point it might be worthwhile to see if our life
would be mad easier by using flex (with or without bison) to parse
pg_hba.conf as wel, since we do it for postgresql.conf and other
things. If it makes maintenance easier, it's probably worth it.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Brendan Jurd <direvus(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Keywords in pg_hba.conf should be field-specific
Date: 2011-02-27 08:14:29
Message-ID: 201102270814.p1R8EUm00170@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Brendan Jurd wrote:
> On 26 February 2011 18:06, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> >
> > Any progress on this?
> >
>
> I ended up doing most of the work, but never got around to finishing
> it off. Thanks for the reminder, though. I'll get that one ready and
> drop it onto the next CF.

Added to TODO:

Have pg_hba.conf consider "replication" special only in the database
field

* http://archives.postgresql.org/pgsql-hackers/2010-10/msg00632.php

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +


From: Brendan Jurd <direvus(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Keywords in pg_hba.conf should be field-specific
Date: 2011-04-01 05:51:54
Message-ID: AANLkTin8p0SoN1YJeXO3cgiDLxev67oh4c7VtJ7e0h4O@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 29 October 2010 09:59, Brendan Jurd <direvus(at)gmail(dot)com> wrote:
> On 18 October 2010 01:19, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Brendan Jurd <direvus(at)gmail(dot)com> writes:
>>> On 17 October 2010 09:59, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>>> Good point.  Maybe the correct fix is to remember whether each token was
>>>> quoted or not, so that keyword detection can be done safely after the
>>>> initial lexing.  I still think that the current method is impossibly
>>>> ugly ...
>>
>>> I'm happy to revise the patch on that basis.  Any suggestions about
>>> how to communicate the 'quotedness' of each token?  We could make each
>>> token a struct consisting of the token itself, plus a boolean flag to
>>> indicate whether it had been quoted.  Does that work for you?
>>
>> Seems reasonable.  I had the idea of a parallel list of booleans in the
>> back of my mind, but a list of structs is probably easier to understand,
>> and to extend further if necessary.
>>
>
> Okay, I've taken the red pill and I'm finding out how deep the rabbit
> hole goes ...
>
> The logical structure of pg_hba.conf is a set of lines, each line
> containing a set of fields, each field containing a set of tokens.
> The way the existing implementation handles this is to create a list
> of lines containing sublists of fields, containing comma-separated
> strings for the set of tokens, with newlines embedded next to tokens
> which might be keywords.
>
> The tokeniser breaks apart the comma-separated tokens ... and then
> reassembles them into a comma-separated string.  Which the db/role
> matching functions then have to break apart *again*.
>
> In order to keep track of whether each individual token was quoted, I
> first need to impose some sanity here.  Rather than using a magical
> string for each field, I intend to use a List of HbaToken structs
> which explicitly note whether quoting was used.
>
> Introducing an extra List level does mean a bit more work copying and
> freeing, and it makes the patch really quite intrusive.  I have to
> touch a lot of lines in hba.c, but I think the additional clarity is
> worth it.  If nobody dissuades me from this approach I hope to post a
> patch in a couple of days.
>

Hi folks,

I've finally managed to get a patch together for the above. It ended
up being a much more difficult project than I anticipated.

Just to recap, the HEAD code in hba.c parses pg_hba.conf (and
ident.conf) into a List of lines, each line being a List of strings.
These strings have various magical properties, including commas to
separate individual tokens and newlines to mark up keywords.

I have replaced this behaviour with something which more closely
represents the true logical structure of the file, and is (hopefully)
much more intelligible. The patch introduces the HbaToken struct,
which consists of a string and a boolean to indicate whether the
string was parsed inside quotes. The parser now produces a
triple-nested list -- a List of lines, each line being a List of
fields, each field being a List of tokens.

It sounds straightforward enough, but to make this happen I had to
change a very substantial fraction of pg_hba.c.

I have tested a few basic pg_hba.conf setups and everything seems to
be working as intended. There is much more testing to be done, it
probably leaks memory and there is a lot more room for cleanup in
hba.c, but I would like to open the patch up for review and comment as
a WIP.

Posting it to the 2011-Next CF.

Cheers,
BJ

Attachment Content-Type Size
hba-keywords.diff.bz2 application/x-bzip2 9.6 KB