patch: General purpose utility functions used by the JSON data type

From: Joseph Adams <joeyadams3(dot)14159(at)gmail(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: patch: General purpose utility functions used by the JSON data type
Date: 2010-08-13 09:45:23
Message-ID: AANLkTimKGJgoY+03aFPwyzxGKoECvMQG1q_s8tCXX1aO@mail.gmail.com
Lists: pgsql-hackers

I factored out the general-purpose utility functions in the JSON data
type code into a patch against HEAD. I have made a few changes to
them since I posted about them earlier (
http://archives.postgresql.org/pgsql-hackers/2010-08/msg00692.php ).

A summary of the utility functions along with some of my own thoughts
about them:

getEnumLabelOids
* Useful-ometer: ()-----------------------------------o
* Rationale: There is currently no streamlined way to return a custom
enum value from a PostgreSQL function written in C. This function
performs a batch lookup of enum OIDs, which can then be cached with
fn_extra. This should be reasonably efficient, and it's quite elegant
to use.

FN_EXTRA, FN_EXTRA_ALLOC, FN_MCXT
* Useful-ometer: ()--------------------o
* Rationale: Using fcinfo->flinfo->fn_extra takes a lot of
boilerplate. These macros help cut down the boilerplate, and the
comment explains what fn_extra is all about.
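
To illustrate the caching pattern these macros are meant to shorten, here is a minimal standalone sketch. It is not the code from the patch: the Mock* structs are stand-ins that model only the fn_extra field of the real fmgr.h types, MOCK_FN_EXTRA is a hypothetical shorthand, and calloc stands in for allocating in the function's memory context (fn_mcxt).

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-ins for the fmgr.h types; only the field used here is modeled. */
typedef struct MockFmgrInfo
{
    void *fn_extra;             /* per-call-site scratch pointer */
} MockFmgrInfo;

typedef struct MockFunctionCallInfo
{
    MockFmgrInfo *flinfo;
} MockFunctionCallInfo;

/* Hypothetical shorthand; the macros in the patch may be defined differently. */
#define MOCK_FN_EXTRA(fcinfo) ((fcinfo)->flinfo->fn_extra)

typedef struct CallCache
{
    int lookups_done;           /* how many times the lookup ran */
    int value;                  /* stands in for an expensive lookup result */
} CallCache;

/* Return the cache for this call site, building it on first use only. */
CallCache *get_cache(MockFunctionCallInfo *fcinfo)
{
    CallCache *cache = MOCK_FN_EXTRA(fcinfo);

    if (cache == NULL)
    {
        /* A real function would allocate in fn_mcxt rather than call calloc. */
        cache = calloc(1, sizeof(CallCache));
        assert(cache != NULL);
        cache->value = 42;
        cache->lookups_done = 1;
        MOCK_FN_EXTRA(fcinfo) = cache;
    }
    return cache;
}
```

The second and later calls through the same call site find fn_extra already set and skip the lookup, which is the boilerplate the macros are intended to hide.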

getTypeInfo
* Useful-ometer: ()---------------------------o
* Rationale: The get_type_io_data "six-fer" function is very
cumbersome to use, since one has to declare all the output variables.
getTypeInfo puts the results in a structure instead. It also performs
the fmgr_info_cxt step, which follows every use of get_type_io_data
in the PostgreSQL code.
* Other thoughts: getTypeInfo also retrieves typcategory (and
typispreferred), which is rather ad hoc. This benefits the JSON code
because to_json() uses the typcategory to figure out what type of JSON
value to convert something to (for instance, things in category 'A'
become JSON arrays). Other data types have no use for the
typcategory. Should getTypeInfo leave that step out?

pg_substring, pg_encoding_substring
* Useful-ometer: ()-------o
* Rationale: The JSONPath code uses it / will use it for extracting
substrings, which is probably not a very useful feature (but who am I
to say that). This function could probably benefit the
text_substring() function in varlena.c, but it would take a bit of
work to ensure it continues to comply with standards.
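
For readers unfamiliar with the problem, here is a rough standalone sketch of character-based substring extraction on a UTF-8 buffer. It is not the pg_substring code from the patch: the names utf8_char_len, utf8_next and utf8_substring are hypothetical, the input is assumed to be valid UTF-8, and a real implementation would go through the server's encoding routines rather than hard-code UTF-8.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Byte length of the UTF-8 character whose lead byte is c. */
size_t utf8_char_len(unsigned char c)
{
    if (c < 0x80)
        return 1;
    if ((c & 0xE0) == 0xC0)
        return 2;
    if ((c & 0xF0) == 0xE0)
        return 3;
    if ((c & 0xF8) == 0xF0)
        return 4;
    return 1;                   /* invalid lead byte: step over one byte */
}

/* Advance p by one character without running past end. */
const char *utf8_next(const char *p, const char *end)
{
    size_t n = utf8_char_len((unsigned char) *p);
    size_t left = (size_t) (end - p);

    return p + (n < left ? n : left);
}

/*
 * Locate `count` characters starting at 0-based character index `start`.
 * Stores the start of the range in *out and returns its length in bytes.
 * The range is clamped to the end of the string.
 */
size_t utf8_substring(const char *str, size_t len,
                      size_t start, size_t count, const char **out)
{
    const char *p = str;
    const char *end = str + len;
    const char *sub_start;

    assert(str != NULL && out != NULL);

    for (; start > 0 && p < end; start--)
        p = utf8_next(p, end);
    sub_start = p;
    for (; count > 0 && p < end; count--)
        p = utf8_next(p, end);

    *out = sub_start;
    return (size_t) (p - sub_start);
}
```

The function returns a pointer and byte length into the original buffer instead of copying, so the caller decides whether and where to allocate.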

server_to_utf8, utf8_to_server, text_to_utf8_cstring,
utf8_cstring_to_text, utf8_cstring_to_text_with_len
* Useful-ometer: ()--------------o
* Rationale: The JSON data type operates in UTF-8 rather than the
server encoding because it needs to deal with Unicode escapes, but
individual Unicode characters can't be converted to/from the server
encoding simply and efficiently (as far as I know). These routines
made the conversions done by the JSON data type vastly simpler, and
they could simplify other data types in the future (XML does a lot of
server<->UTF-8 conversions too).
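
As an illustration of why working in UTF-8 makes Unicode escapes easy to handle, here is a small standalone sketch that turns the code point from a \uXXXX escape into UTF-8 bytes. It is not taken from the patch; codepoint_to_utf8 and combine_surrogates are hypothetical names, and parsing the hex digits of the escape is left out.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Encode code point cp as UTF-8 into buf, which must hold at least 4 bytes.
 * Returns the number of bytes written, or 0 if cp is a lone surrogate or
 * lies beyond U+10FFFF.
 */
size_t codepoint_to_utf8(unsigned long cp, unsigned char *buf)
{
    assert(buf != NULL);

    if (cp < 0x80)
    {
        buf[0] = (unsigned char) cp;
        return 1;
    }
    if (cp < 0x800)
    {
        buf[0] = (unsigned char) (0xC0 | (cp >> 6));
        buf[1] = (unsigned char) (0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp >= 0xD800 && cp <= 0xDFFF)
        return 0;               /* surrogates are not valid on their own */
    if (cp < 0x10000)
    {
        buf[0] = (unsigned char) (0xE0 | (cp >> 12));
        buf[1] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
        buf[2] = (unsigned char) (0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF)
    {
        buf[0] = (unsigned char) (0xF0 | (cp >> 18));
        buf[1] = (unsigned char) (0x80 | ((cp >> 12) & 0x3F));
        buf[2] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
        buf[3] = (unsigned char) (0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}

/* Combine a UTF-16 surrogate pair, as written by two adjacent \uXXXX escapes. */
unsigned long combine_surrogates(unsigned int hi, unsigned int lo)
{
    assert(hi >= 0xD800 && hi <= 0xDBFF);
    assert(lo >= 0xDC00 && lo <= 0xDFFF);

    return 0x10000UL + ((unsigned long) (hi - 0xD800) << 10) + (lo - 0xDC00);
}
```

Converting the resulting bytes to the server encoding is a separate step, which is where routines like utf8_to_server would come in.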

This patch doesn't include tests. How would I go about writing them?

I have made the JSON data type built-in, and I will post that patch
shortly (it depends on this one). The built-in JSON data type uses
all of these utility functions, and the tests for the JSON data type
pass.

Attachment: json-util-01.patch (application/octet-stream, 17.7 KB)
