Re: [PATCH] Add transforms feature

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] Add transforms feature
Date: 2014-04-04 22:21:47
Message-ID: 20140404222147.GB13431@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2014-01-15 21:13:18 -0500, Peter Eisentraut wrote:
> The attached patch will probably fail to apply because of the pg_proc
> changes. So if you want to try it out, look into the header for the Git
> hash it was based off. I'll produce a properly merged version when this
> approach is validated. Also, some implementation details could probably
> take some revising after a couple of nights of sleep. ;-)

You've slept since? ;)

So, I am only doign a look through the patch, to see where it has gone
in the past year.

> index 4e476c3..5ac9f05 100644
> --- a/src/Makefile.shlib
> +++ b/src/Makefile.shlib
> @@ -133,7 +133,7 @@ ifeq ($(PORTNAME), darwin)
> else
> # loadable module
> DLSUFFIX = .so
> - LINK.shared = $(COMPILER) -bundle -multiply_defined suppress
> + LINK.shared = $(COMPILER) -bundle -multiply_defined suppress -Wl,-undefined,dynamic_lookup
> endif
> BUILD.exports = $(AWK) '/^[^\#]/ {printf "_%s\n",$$1}' $< >$@
> exports_file = $(SHLIB_EXPORTS:%.txt=%.list)

Hm. Do the linkers on other platforms support this behaviour? Linux
does, by default, but I have zap clue about the rest.

Why do we need this? I guess because a transform from e.g. hstore to $pl
needs symbols out of hstore.so and the $pl?

I wonder if it woudln't be better to rely on explicit symbol lookups for
this. That'd avoid the need for the order dependency and linker changes
like this.

> + case OBJECT_TRANSFORM:
> + {
> + TypeName *typename = (TypeName *) linitial(objname);
> + char *langname = (char *) linitial(objargs);
> + Oid type_id = typenameTypeId(NULL, typename);
> + Oid lang_id = get_language_oid(langname, false);
> +
> + address.classId = TransformRelationId;
> + address.objectId =
> + get_transform_oid(type_id, lang_id, missing_ok);
> + address.objectSubId = 0;
> + }
> + break;

Hm. I wonder if missing_ok should be forwarded to get_language_oid() and
(by changing the way things are done atm) to typenameTypeId?

> + case OCLASS_TRANSFORM:
> + {
> + HeapTuple trfTup;
> + Form_pg_transform trfForm;
> +
> + trfTup = SearchSysCache1(TRFOID,
> + ObjectIdGetDatum(object->objectId));
> + if (!HeapTupleIsValid(trfTup))
> + elog(ERROR, "could not find tuple for transform %u",
> + object->objectId);
> +
> + trfForm = (Form_pg_transform) GETSTRUCT(trfTup);
> +
> + appendStringInfo(&buffer, _("transform for %s language %s"),
> + format_type_be(trfForm->trftype),
> + get_language_name(trfForm->trflang, false));
> +
> + ReleaseSysCache(trfTup);
> + break;
> + }
> +

Why deviate from the usual 'cache lookup failed for ..'? elog doesn't
translate so it's not particular relevant, but ...

> referenced.objectSubId = 0;
> recordDependencyOn(&myself, &referenced, DEPENDENCY_NORMAL);
>
> + /* dependency on transform used by return type, if any */
> + if ((trfid = get_transform_oid(returnType, languageObjectId, true)))
> + {
> + referenced.classId = TransformRelationId;
> + referenced.objectId = trfid;
> + referenced.objectSubId = 0;
> + recordDependencyOn(&myself, &referenced, DEPENDENCY_NORMAL);
> + }
> +

Should be compared to InvalidOid imo, rather than implicitly assuming
that InvalidOid evaluates to false.

> +/*
> + * CREATE TRANSFORM
> + */
> +Oid
> +CreateTransform(CreateTransformStmt *stmt)
> +{
...
> + if (!pg_type_ownercheck(typeid, GetUserId()))
> + aclcheck_error_type(ACLCHECK_NOT_OWNER, typeid);
> +
> + aclresult = pg_type_aclcheck(typeid, GetUserId(), ACL_USAGE);
> + if (aclresult != ACLCHECK_OK)
> + aclcheck_error_type(aclresult, typeid);
> +
> + /*
> + * Get the language
> + */
> + langid = get_language_oid(stmt->lang, false);
> +
> + aclresult = pg_language_aclcheck(langid, GetUserId(), ACL_USAGE);
> + if (aclresult != ACLCHECK_OK)
> + aclcheck_error(aclresult, ACL_KIND_LANGUAGE, stmt->lang);

Hm. Is USAGE really sufficient here? Should we possibly make it
dependant on lanpltrusted like CreateFunction() does?

> + if (stmt->fromsql)
> + {
> + fromsqlfuncid = LookupFuncNameTypeNames(stmt->fromsql->funcname, stmt->fromsql->funcargs, false);
> +
> + if (!pg_proc_ownercheck(fromsqlfuncid, GetUserId()))
> + aclcheck_error(ACLCHECK_NOT_OWNER, ACL_KIND_PROC, NameListToString(stmt->fromsql->funcname));

Why isn't EXECUTE sufficient here?

> + aclresult = pg_proc_aclcheck(fromsqlfuncid, GetUserId(), ACL_EXECUTE);
> + if (aclresult != ACLCHECK_OK)
> + aclcheck_error(aclresult, ACL_KIND_PROC, NameListToString(stmt->fromsql->funcname));
> +
> + tuple = SearchSysCache1(PROCOID, ObjectIdGetDatum(fromsqlfuncid));
> + if (!HeapTupleIsValid(tuple))
> + elog(ERROR, "cache lookup failed for function %u", fromsqlfuncid);
> + procstruct = (Form_pg_proc) GETSTRUCT(tuple);
> + if (procstruct->prorettype != INTERNALOID)
> + ereport(ERROR,
> + (errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
> + errmsg("return data type of FROM SQL function must be \"internal\"")));
> + check_transform_function(procstruct);
> + ReleaseSysCache(tuple);

So, this can be used to call a function that takes INTERNAL, and returns
INTERNAL. Isn't that normally reserved for superusers? I think this at
the very least needs to be an ownercheck on the function?

> @@ -66,6 +66,7 @@ CATALOG(pg_proc,1255) BKI_BOOTSTRAP BKI_ROWTYPE_OID(81) BKI_SCHEMA_MACRO
> text proargnames[1]; /* parameter names (NULL if no names) */
> pg_node_tree proargdefaults;/* list of expression trees for argument
> * defaults (NULL if none) */
> + Oid protrftypes[1] /* types for which to apply transforms */
> text prosrc; /* procedure source text */
> text probin; /* secondary procedure info (can be NULL) */
> text proconfig[1]; /* procedure-local GUC settings */

I wonder if this shouldn't rather be a array that lists the transform to
be used for every single column akin to proargtypes or such. That's
going to make life easier for pl developers.
> /**********************************************************************
> @@ -1260,6 +1264,7 @@ static SV *plperl_call_perl_func(plperl_proc_desc *desc,
> bool *isnull)
> {
> FmgrInfo tmp;
> + Oid funcid;
>
> /* we might recurse */
> check_stack_depth();
> @@ -1283,6 +1288,8 @@ static SV *plperl_call_perl_func(plperl_proc_desc *desc,
> /* must call typinput in case it wants to reject NULL */
> return InputFunctionCall(finfo, NULL, typioparam, typmod);
> }
> + else if ((funcid = get_transform_tosql(typid, current_call_data->prodesc->lang_oid)))
> + return OidFunctionCall1(funcid, PointerGetDatum(sv));
> else if (SvROK(sv))
> {
> /* handle references */

Am I missing something here? You're not looking at the proc's
transforms, but just lookup the general ones? Same for output and such.

Looks like you forgot to update this.

> for (i = 0; i < desc->nargs; i++)
> {
> if (fcinfo->argnull[i])
> @@ -2055,9 +2075,16 @@ static SV *plperl_call_perl_func(plperl_proc_desc *desc,
> else
> {
> SV *sv;
> + Oid funcid;
>
> if (OidIsValid(desc->arg_arraytype[i]))
> sv = plperl_ref_from_pg_array(fcinfo->arg[i], desc->arg_arraytype[i]);
> + else if (list_member_oid(current_call_data->prodesc->trftypes, argtypes[i]))
> + {
> + funcid = get_transform_fromsql(argtypes[i], current_call_data->prodesc->lang_oid);
> + Assert(funcid); // TODO
> + sv = (SV *) DatumGetPointer(OidFunctionCall1(funcid, fcinfo->arg[i]));
> + }

This is the behaviour I'd really would like to avoid. Searching an array
for every parameter sucks. I think this really should be a vector with
one element per argument. Yes, that's going to require special handling
of the return type, but whatever.

So, I lost my motiviation here. I like this version *much* more than the
state of a year ago. I think there's a fair amount of work required
here, but it seems to be dilligence that's required, not redesign. But I
still don't think that's doable within the next couple of days?

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2014-04-04 22:43:43 Re: [review] PostgreSQL Service on Windows does not start if data directory given is relative path
Previous Message Greg Stark 2014-04-04 22:13:40 Idea for aggregates