Re: Configuration include directory

From: Noah Misch <noah(at)leadboat(dot)com>
To: Greg Smith <greg(at)2ndQuadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Configuration include directory
Date: 2011-12-13 18:28:49
Message-ID: 20111213182849.GD5736@tornado.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Dec 12, 2011 at 01:34:24PM -0500, Greg Smith wrote:

[various things I agree with]

> -Don't bother trying to free individual bits of memory now that it's all
> in the same context. Saves some lines of code, and I do not miss the
> asserts I am no longer triggering.

In the postmaster, this context is the never-reset PostmasterContext. Thus,
these yield permanent leaks. The rest of the ProcessConfigFile() code makes
an effort to free everything it allocates, so let's do the same here. (I'd
favor, as an independent effort, replacing some of the explicit pfree()
activity in guc-file.l with a temporary memory context create/delete.)

> The only review suggestion I failed to incorporate was this one from Noah:
>
> > > > + if (strcmp(de->d_name + strlen(de->d_name) - 5, ".conf")
> != 0)
> > > + continue;
> > That may as well be memcmp().
>
> While true, his idea bothers both my abstraction and unexpected bug
> concern sides for reasons I can't really justify. I seem to have
> received too many past beatings toward using the string variants of all
> these functions whenever operating on things that are clearly strings.
> I'll punt this one toward whoever looks at this next to decide, both
> strcmp and strncmp are used in this section now.

Okay.

> diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
> index d1e628f..5df214e 100644
> *** a/doc/src/sgml/config.sgml
> --- b/doc/src/sgml/config.sgml

> *************** SET ENABLE_SEQSCAN TO OFF;
> *** 178,184 ****
> any desired selection condition. It also contains more information about
> what values are allowed for the parameters.
> </para>
> ! </sect1>
>
> <sect1 id="runtime-config-file-locations">
> <title>File Locations</title>
> --- 161,273 ----
> any desired selection condition. It also contains more information about
> what values are allowed for the parameters.
> </para>
> !
> ! <sect2 id="config-includes">
> ! <title>Configuration file includes</title>

Our section names use title case.

> ! <para>
> ! <indexterm>
> ! <primary><literal>include</></primary>
> ! <secondary>in configuration file</secondary>
> ! </indexterm>
> ! In addition to parameter settings, the <filename>postgresql.conf</>
> ! file can contain <firstterm>include directives</>, which specify
> ! another file to read and process as if it were inserted into the
> ! configuration file at this point. Include directives simply look like:
> ! <programlisting>
> ! include 'filename'
> ! </programlisting>
> ! If the file name is not an absolute path, it is taken as relative to
> ! the directory containing the referencing configuration file.
> ! All types of inclusion directives can be nested.
> ! </para>
> !
> ! <para>
> ! <indexterm>
> ! <primary><literal>include_dir</></primary>
> ! <secondary>in configuration file</secondary>
> ! </indexterm>
> ! The <filename>postgresql.conf</> file can also contain
> ! <firstterm>include_dir directives</>, which specify an entire directory
> ! of configuration files to include. It is used similarly:
> ! <programlisting>
> ! include_dir 'directory'
> ! </programlisting>
> ! Non-absolute directory names follow the same rules as single file include
> ! directives: they are relative to the directory containing the referencing
> ! configuration file. Within that directory, only files whose names end

Consider the wording "Within that directory, only non-directory files whose
names ..." to communicate that we ignore all subdirectories.

> ! with the suffix <literal>.conf</literal> will be included. File names
> ! that start with the <literal>.</literal> character are also excluded,
> ! to prevent mistakes as they are hidden on some platforms. Multiple files
> ! within an include directory are ordered by an alphanumeric sorting, so
> ! that ones beginning with numbers are considered before those starting
> ! with letters.
> ! </para>
> !
> ! <para>
> ! Include files or directories can be used to logically separate portions
> ! of the database configuration, rather than having a single large
> ! <filename>postgresql.conf</> file. Consider a company that has two
> ! database servers, each with a different amount of memory. There are likely
> ! elements of the configuration both will share, for things such as logging.
> ! But memory-related parameters on the server will vary between the two. And
> ! there might be server specific customizations too. One way to manage this

I suggest the punctuation "server-specific customizations, too."

> ! situation is to break the custom configuration changes for your site into
> ! three files. You could add this to the end of your
> ! <filename>postgresql.conf</> file to include them:
> ! <programlisting>
> ! include 'shared.conf'
> ! include 'memory.conf'
> ! include 'server.conf'
> ! </programlisting>
> ! All systems would have the same <filename>shared.conf</>. Each server
> ! with a particular amount of memory could share the same
> ! <filename>memory.conf</>; you might have one for all servers with 8GB of RAM,
> ! another for those having 16GB. And finally <filename>server.conf</> could
> ! have truly server-specific configuration information in it.
> ! </para>
> !
> ! <para>
> ! Another possibility for this same sort of organization is to create a
> ! configuration file directory and put this information into files there.
> ! Other programs such as <productname>Apache</productname> use a
> ! <filename>conf.d</> directory for this purpose. And using numbered names

This specific use of conf.d is a distribution-driven pattern; the upstream
Apache HTTP Server distribution never suggests it directly.

> ! to enforce an ordering is common in UNIX startup script directories named
> ! like <filename>/etc/rcN.d</filename>, where <literal>N</> gives the
> ! associated system runlevel. Both of these ideas might be combined by

This text suggests to me that the N in question is used for ordering. It's
not, though the order of names _inside_ each such directory does drive the
order of execution.

Overall, I'd probably just remove these comparisons to other projects.

> ! adding a <filename>conf.d</> directory where the
> ! <filename>postgresql.conf</> file is at, then referencing it at the end:
> ! <programlisting>
> ! include_dir 'conf.d'
> ! </programlisting>
> ! Then you could name the files in the <filename>conf.d</> directory like this:
> ! <programlisting>
> ! include '00shared.conf'
> ! include '01memory.conf'
> ! include '02server.conf'
> ! </programlisting>

This and ...

> ! This shows a clear order in which these files will be loaded. This is
> ! important because only the last setting encountered when the server is
> ! reading its configuration will be used. Something set in
> ! <filename>conf.d/02server.conf</> in this example would override a value
> ! set in <filename>conf.d/01memory.conf</>.
> ! </para>
> !
> ! <para>
> ! You might instead use this configuration directory approach while naming
> ! these files more descriptively:
> ! <programlisting>
> ! 00shared.conf
> ! 01memory-8GB.conf
> ! 02server-foo.conf
> ! </programlisting>

... this should be a <screen> rather than a <programlisting>.

> ! This sort of arrangement gives a unique name for each configuration file
> ! variation. This can help eliminate ambiguity when several servers have
> ! their configurations all stored in one place, such as in a version
> ! control repository. (Storing database configuration files under version
> ! control is another good practice to consider).
> ! </para>
> ! </sect2>
> ! </sect1>
>
> <sect1 id="runtime-config-file-locations">
> <title>File Locations</title>
> diff --git a/src/backend/utils/misc/guc-file.l b/src/backend/utils/misc/guc-file.l
> index a094c7a..ae33fdd 100644
> *** a/src/backend/utils/misc/guc-file.l
> --- b/src/backend/utils/misc/guc-file.l

> --- 421,427 ----
> return false;
> }
>
> ! config_file=AbsoluteConfigLocation(config_file,calling_file);

Fix whitespace.

> *************** ParseConfigFp(FILE *fp, const char *conf
> *** 599,604 ****
> --- 627,747 ----
> return OK;
> }
>
> + /*
> + * String comparison code for use by qsort. Borrowed from
> + * src/backend/tsearch/ts_utils.c
> + */
> + static int
> + comparestr(const void *a, const void *b)
> + {
> + return strcmp(*(char **) a, *(char **) b);
> + }
> +
> + /*
> + * Read and parse all config files in a subdirectory in alphabetical order
> + */
> + bool
> + ParseConfigDirectory(const char *includedir,
> + const char *calling_file,
> + int depth, int elevel,
> + ConfigVariable **head_p,
> + ConfigVariable **tail_p)
> + {
> + char *directory;
> + DIR *d;
> + struct dirent *de;
> + char **filenames = NULL;
> + int num_filenames = 0;
> + int size_filenames = 0;
> + bool status;
> +
> + directory=AbsoluteConfigLocation(includedir,calling_file);

Our normal pgindent runs do not visit flex input files. Manually run pgindent
on guc-file.l and merge the reindented versions of ParseConfigDirectory() and
AbsoluteConfigLocation() into your patch.

> + d = AllocateDir(directory);
> + if (d == NULL)
> + {
> + ereport(elevel,
> + (errcode_for_file_access(),
> + errmsg("could not open configuration directory \"%s\": %m",
> + directory)));
> + return false;
> +
> + }
> +
> + /*
> + * Read the directory and put the filenames in an array, so we can sort
> + * them prior to processing the contents.
> + */
> + while ((de = ReadDir(d, directory)) != NULL)
> + {
> + struct stat st;
> + char filename[MAXPGPATH];
> +
> + /*
> + * Only parse files with names ending in ".conf". Explicitly reject
> + * files starting with ".". This excludes things like "." and "..",
> + * as well as typical hidden files, backup files, and editor debris.
> + */
> + if (strlen(de->d_name) < 6)
> + continue;
> + if (strncmp(de->d_name, ".", 1) == 0)

I'd write this as "if (de->d_name[0] == '.')", but it's subjective.

> + continue;
> + if (strcmp(de->d_name + strlen(de->d_name) - 5, ".conf") != 0)
> + continue;
> +
> + join_path_components(filename, directory, de->d_name);

You need a canonicalize_path(), too. If the original config directory was
absolute, it has not yet seen canonicalization.

> + if (stat(filename, &st) == 0)
> + {
> + if (!S_ISDIR(st.st_mode))
> + {
> + /* Add file to list, increasing its size in blocks of 32 */
> + if (num_filenames == size_filenames)
> + {
> + size_filenames += 32;
> + if (num_filenames == 0)
> + /* Must initialize, repalloc won't take NULL input */
> + filenames = palloc(size_filenames * sizeof(char *));
> + else
> + filenames = repalloc(filenames, size_filenames * sizeof(char *));
> + }
> + filenames[num_filenames] = pstrdup(filename);
> + num_filenames++;
> + }
> + else

This "else" block connects to the wrong "if" block.

> + /*
> + * stat does not care about permissions, so the most likely reason
> + * a file can't be accessed now is if it was removed between the
> + * directory listing and now.
> + */
> + {

This comment belongs inside the brace.

There are a few other improbable causes. For one, the postmaster could have
read permission, but not search permission, on the include dir.

> + ereport(elevel,
> + (errcode_for_file_access(),
> + errmsg("configuration file \"%s\" can no longer be accessed: %m",
> + filename)));

Due to the diversity of possible causes, most of them unlikely, I would use
"could not stat file \"%s\": %m". That also avoids adding a string requiring
translation.

> + return false;
> + }
> + }
> + }
> +
> + if (num_filenames > 0)
> + {
> + int i;
> + qsort(filenames, num_filenames, sizeof(char *), comparestr);
> + for (i = 0; i < num_filenames; i++)
> + {
> + if (!ParseConfigFile(filenames[i], NULL,
> + 0, elevel,head_p, tail_p))

Need to pass "depth", not 0. I created a circular include structure and got
this error instead of the expected one about the nesting limit:

FATAL: exceeded MAX_ALLOCATED_DESCS while trying to open directory "/home/nm/test-pg/baz"

Thanks,
nm

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Greg Smith 2011-12-13 18:57:01 Re: logging in high performance systems.
Previous Message Lionel Elie Mamane 2011-12-13 18:17:05 Re: LibreOffice driver 1: Building libpq with Mozilla LDAP instead of OpenLDAP