patch: Add JSON datatype to PostgreSQL (GSoC, WIP)

From: Joseph Adams <joeyadams3(dot)14159(at)gmail(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: patch: Add JSON datatype to PostgreSQL (GSoC, WIP)
Date: 2010-07-23 06:18:46
Message-ID: AANLkTinqw2jXyS6v0vc_1PCd94L2NF2HvvRQJFU2j90V@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

This is a work-in-progress patch of my GSoC project: Add JSON datatype
to PostgreSQL. It provides the following:

* JSON datatype: A TEXT-like datatype for holding JSON-formatted
text. Although the JSON RFC decrees that a JSON text be an "object or
array" (meaning '"hello"' is considered invalid JSON text), this
datatype lets you store any JSON "value" (meaning '"hello"'::JSON is
allowed).
* Validation: Content is validated when a JSON datum is constructed,
but JSON validation can also be done programmatically with the
json_validate() function.
* Conversion to/from JSON for basic types. Conversion functions are
needed because casting will not unwrap JSON-encoded values. For
instance, json('"string"')::text is '"string"', while
from_json('"string"') is 'string'. Also, to_json can convert
PostgreSQL arrays to JSON arrays, providing a nice option for dealing
with arrays client-side. from_json currently can't handle JSON
arrays/objects yet (how they should act is rather unclear to me,
except when array dimensions and element type are consistent).
* Retrieving/setting values in a JSON node (via selectors very
similar to, but not 100% like, JSONPath as described at
http://goessner.net/articles/JsonPath/ ).
* Miscellaneous functions json_condense and json_type.

This is a patch against CVS HEAD. This module compiles, installs, and
passes all 8 tests successfully on my Ubuntu 9.10 system. It is
covered pretty decently with regression tests. It also has SGML
documentation (the generated HTML is attached for convenience).

Although I am aware of many problems in this patch, I'd like to put it
out sooner rather than later so it can get plenty of peer review.
Problems I'm aware of include:
* Probably won't work properly when the encoding (client or server?)
is not UTF-8. When encoding (e.g. with json_condense), it should (but
doesn't) use \uXXXX escapes for characters the target encoding doesn't
support.
* json.c is rather autarkic. It has its own string buffer system
(rather than using StringInfo) and UTF-8 validator (rather than using
pg_verify_mbstr_len(?) ).
* Some functions/structures are named suggestively, as if they belong
to (and would be nice to have in) PostgreSQL's utility libraries.
They are:
- TypeInfo, initTypeInfo, and getTypeInfo: A less cumbersome
wrapper around get_type_io_data.
- FN_EXTRA and FN_EXTRA_SZ: Macros to make working with
fcinfo->flinfo->fn_extra easier.
- enumLabelToOid: Look up the Oid of an enum label; needed to
return an enum that isn't built-in.
- utf8_substring: Extract a range of UTF-8 characters out of a UTF-8 string.
* Capitalization and function arrangement are rather inconsistent.
Braces are K&R-style.
* json_cleanup and company aren't even used.
* The sql/json.sql test case should be broken into more files.

P.S. The patch is gzipped because it expands to 2.6 megabytes.

Joey Adams

Attachment Content-Type Size
json.html text/html 8.4 KB
json-datatype-wip-01.diff.gz application/x-gzip 61.0 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2010-07-23 06:22:12 Re: patch (for 9.1) string functions
Previous Message Pavel Stehule 2010-07-23 06:16:03 Re: SQL/MED security