New option in pg_basebackup to exclude pg_log files during base backup

Lists: pgsql-hackers
From: "Prabakaran, Vaishnavi" <vaishnavip(at)fast(dot)au(dot)fujitsu(dot)com>
To: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: New option in pg_basebackup to exclude pg_log files during base backup
Date: 2014-04-09 00:06:30
Message-ID: 82897A1301080E4B8E461DDAA0FFCF142A1B2660@SYD1216

Hi all,

Following the discussion in message ID CAHGQGwFFMOr4EcugWHZpAaPYQbsEKDg66VmJ1rveJ6Z-EgaqAg(at)mail(dot)gmail(dot)com, I have developed a patch that gives the user an option to exclude the contents of the pg_log directory in pg_basebackup.

[Current situation]
During pg_basebackup, all files in the pg_log directory are copied to the new backup directory.

[Design]
- Added a new optional switch "-S/--skip-log-dir" to pg_basebackup.
- If "--skip-log-dir" is specified in the pg_basebackup command, the base backup skips the log files in the standard "pg_log" directory and in any other directory specified by the log_directory GUC variable. (An empty "pg_log"/$log_directory folder is still created.)
- If pg_log/$log_directory is a symbolic link, an empty folder is created.
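
To illustrate the design above, here is a minimal sketch in Python. It is not the patch's C code; the function name plan_base_backup and its parameters are hypothetical, and it assumes the log directory is given as a path relative to the data directory. It lists what a backup would copy, keeping the skipped directory itself (so an empty folder is created) while omitting everything beneath it.

```python
import os


def plan_base_backup(data_dir, skip_dirs):
    """Return (directories, files) to copy, as paths relative to data_dir.

    Directories named in skip_dirs are still listed, so an empty directory
    is created in the backup, but nothing beneath them is included.
    """
    skip = {os.path.normpath(d) for d in skip_dirs}
    directories, files = [], []
    # followlinks defaults to False: a symlinked directory is listed in
    # subdirs but never descended into.
    for root, subdirs, filenames in os.walk(data_dir):
        rel_root = os.path.relpath(root, data_dir)
        kept = []
        for name in subdirs:
            rel = os.path.normpath(os.path.join(rel_root, name))
            directories.append(rel)
            if rel not in skip:
                kept.append(name)
        subdirs[:] = kept  # prune in place so os.walk skips excluded dirs
        for name in filenames:
            files.append(os.path.normpath(os.path.join(rel_root, name)))
    return sorted(directories), sorted(files)
```

Calling plan_base_backup(data_dir, ["pg_log"]) would list "pg_log" among the directories but none of the files under it.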

[Advantage]
It gives the user an option to avoid copying large log files if they don't wish to, and hence can save disk space.

The patch is attached.

Thanks & Regards,
Vaishnavi
Fujitsu Australia

Attachment: pgbasebackup_excludes_pglog_v1.patch
Content-Type: application/octet-stream
Size: 8.8 KB

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: "Prabakaran, Vaishnavi" <vaishnavip(at)fast(dot)au(dot)fujitsu(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New option in pg_basebackup to exclude pg_log files during base backup
Date: 2014-04-09 14:45:23
Message-ID: CABUevEw3h93DzsH-CY==bmprYXkC9v4S1Zz51Bh8-Pu2jMeDcA@mail.gmail.com

On Wed, Apr 9, 2014 at 2:06 AM, Prabakaran, Vaishnavi <
vaishnavip(at)fast(dot)au(dot)fujitsu(dot)com> wrote:

> Hi all,
>
>
>
> Following the discussion in message id -
> CAHGQGwFFMOr4EcugWHZpAaPYQbsEKDg66VmJ1rveJ6Z-EgaqAg(at)mail(dot)gmail(dot)com , I
> have developed the patch which gives option to user to exclude pg_log
> directory contents in pg_basebackup.
>
>
>
> [Current situation]
>
> During pg_basebackup, all files in pg_log directory will be copied to new
> backup directory.
>
>
>
> [Design]
>
> - Added new non-mandatory option "-S/--skip-log-dir" to pg_basebackup .
>
> - If "skip-log-dir" is specified in pg_basebackup command, then in
> basebackup, exclude copying log files from standard "pg_log" directory and
> any other directory specified in Log_directory guc variable. (Still empty
> folder "pg_log"/$Log_directory will be created)
>
> - In case, pg_log/$Log_directory is symbolic link, then an empty folder
> will be created
>
>
>
> [Advantage]
>
> It gives the user an option to avoid copying large log files if they
> don't wish to, and hence can save disk space.
>
>
>
>
>
While pg_log is definitely the most common one being the default on many
platforms, we'll still be missing other ones. Should we really hardcode it,
or should we somehow derive it from the settings for log_directory instead?

As a more general discussion, is this something we might want to expose as
a more general facility rather than hardcode it to the log directory only?
And is it perhaps something we'd rather have configured at the server than
specified in pg_basebackup - like a guc saying which directories should
always be excluded from a basebackup? So you don't have to remember it
every time?

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: "Prabakaran, Vaishnavi" <vaishnavip(at)fast(dot)au(dot)fujitsu(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New option in pg_basebackup to exclude pg_log files during base backup
Date: 2014-04-09 14:55:50
Message-ID: 20140409145550.GU5822@eldon.alvh.no-ip.org

Magnus Hagander wrote:

> While pg_log is definitely the most common one being the default on many
> platforms, we'll still be missing other ones. Should we really hardcode it,
> or should we somehow derive it from the settings for log_directory instead?
>
> As a more general discussion, is this something we might want to expose as
> a more general facility rather than hardcode it to the log directory only?
> And is it perhaps something we'd rather have configured at the server than
> specified in pg_basebackup - like a guc saying which directories should
> always be excluded from a basebackup? So you don't have to remember it
> every time?

So it'd be an array, and by default you'd have something like:
basebackup_skip_path = $log_directory
?

Maybe use it to skip backup labels by default as well.
basebackup_skip_path = $log_directory, $backup_label_files
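
As a rough illustration of how such a list setting could be interpreted, here is a minimal Python sketch. The function name expand_skip_paths and the variables mapping are hypothetical, and basebackup_skip_path is only the setting proposed here, not an existing GUC. Entries starting with "$" are looked up in a caller-supplied mapping; anything else is taken as a literal path.

```python
def expand_skip_paths(setting, variables):
    """Expand a comma-separated skip list into concrete paths.

    setting   -- e.g. "$log_directory, some/dir" (hypothetical syntax)
    variables -- maps placeholder names (without "$") to lists of paths
    """
    paths = []
    for item in setting.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate empty entries such as a trailing comma
        if item.startswith("$"):
            name = item[1:]
            if name not in variables:
                raise KeyError("unknown placeholder: " + item)
            paths.extend(variables[name])
        else:
            paths.append(item)
    return paths
```

For example, expand_skip_paths("$log_directory", {"log_directory": ["pg_log"]}) yields a one-element list containing "pg_log". What a placeholder such as $backup_label_files would expand to is left to the caller, since it is not spelled out here.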

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: "Prabakaran, Vaishnavi" <vaishnavip(at)fast(dot)au(dot)fujitsu(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New option in pg_basebackup to exclude pg_log files during base backup
Date: 2014-04-09 14:57:32
Message-ID: CABUevExdLQw8VYH2OF5CeFgEUnGH1e-5_MRegjUJqiN+fyBnjg@mail.gmail.com

On Wed, Apr 9, 2014 at 4:55 PM, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:

> Magnus Hagander wrote:
>
> > While pg_log is definitely the most common one being the default on many
> > platforms, we'll still be missing other ones. Should we really hardcode it,
> > or should we somehow derive it from the settings for log_directory instead?
> >
> > As a more general discussion, is this something we might want to expose as
> > a more general facility rather than hardcode it to the log directory only?
> > And is it perhaps something we'd rather have configured at the server than
> > specified in pg_basebackup - like a guc saying which directories should
> > always be excluded from a basebackup? So you don't have to remember it
> > every time?
>
> So it'd be an array, and by default you'd have something like:
> basebackup_skip_path = $log_directory
> ?
>
> Maybe use it to skip backup labels by default as well.
> basebackup_skip_path = $log_directory, $backup_label_files
>

I hadn't considered any details, but yes, something along that line. And
then you could also include arbitrary filenames or directories should you
want. E.g. if you use the data directory to store your torrents or
something.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: "Prabakaran, Vaishnavi" <vaishnavip(at)fast(dot)au(dot)fujitsu(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New option in pg_basebackup to exclude pg_log files during base backup
Date: 2014-04-09 15:14:47
Message-ID: 20140409151447.GV5822@eldon.alvh.no-ip.org

Magnus Hagander wrote:
> On Wed, Apr 9, 2014 at 4:55 PM, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:

> > So it'd be an array, and by default you'd have something like:
> > basebackup_skip_path = $log_directory
> > ?
> >
> > Maybe use it to skip backup labels by default as well.
> > basebackup_skip_path = $log_directory, $backup_label_files
> >
>
> I hadn't considered any details, but yes, something along that line. And
> then you could also include arbitrary filenames or directories should you
> want. E.g. if you use the data directory to store your torrents or
> something.

Man, that's a great idea. Database servers have lots of disk space in
that partition, so it should work really well. Thanks!

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: "Prabakaran, Vaishnavi" <vaishnavip(at)fast(dot)au(dot)fujitsu(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New option in pg_basebackup to exclude pg_log files during base backup
Date: 2014-04-10 01:40:12
Message-ID: 82897A1301080E4B8E461DDAA0FFCF142A1B327F@SYD1216

On Thursday, Apr 10, 2014 at 1:15 AM, Álvaro Herrera wrote:
>Magnus Hagander wrote:
>>On Wed, Apr 9, 2014 at 4:55 PM, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:

>> > So it'd be an array, and by default you'd have something like:
>> > basebackup_skip_path = $log_directory ?
>> >
>> > Maybe use it to skip backup labels by default as well.
>> > basebackup_skip_path = $log_directory, $backup_label_files
>> >
>>
>> I hadn't considered any details, but yes, something along that line.
>> And then you could also include arbitrary filenames or directories
>> should you want. E.g. if you use the data directory to store your
>> torrents or something.

>Man, that's a great idea. Database servers have lots of diskspace in that partition, so it should work really well. Thanks!

Yes, it sounds like a good idea. I will look into this and start working on it in some time.

Thanks & Regards,
Vaishnavi
Fujitsu Australia


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Magnus Hagander <magnus(at)hagander(dot)net>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: "Prabakaran, Vaishnavi" <vaishnavip(at)fast(dot)au(dot)fujitsu(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New option in pg_basebackup to exclude pg_log files during base backup
Date: 2014-04-16 03:16:46
Message-ID: 534DF61E.8020708@gmx.net

On 4/9/14, 10:57 AM, Magnus Hagander wrote:
> > So it'd be an array, and by default you'd have something like:
> > basebackup_skip_path = $log_directory
> > ?
> >
> > Maybe use it to skip backup labels by default as well.
> > basebackup_skip_path = $log_directory, $backup_label_files
>
> I hadn't considered any details, but yes, something along that line. And
> then you could also include arbitrary filenames or directories should
> you want.

What are the use cases for excluding anything else?

pg_basebackup ought to have some intelligence about what files are
appropriate to include or exclude, depending on what the user is trying
to do. It shouldn't become a general file copying tool.