Is SPI safe to use in multi-threaded PL/Java?

Lists: pgsql-hackers
From: "MauMau" <maumau307(at)gmail(dot)com>
To: <pgsql-hackers(at)postgresql(dot)org>
Subject: Is SPI safe to use in multi-threaded PL/Java?
Date: 2014-03-08 09:31:24
Message-ID: D854220B41C34FBB94733445CDC62028@maumau

Hello,

Is PL/Java safe to use in terms of its threading design? I'm going to ask
the PL/Java community about this too, but I'd ask for opinions here because
I believe people in this community have seasoned knowledge of OS and SPI.

To put the question in other words, is it safe to load a multi-threaded PL
library in the single-threaded backend process, if the PL only calls SPI in
the main thread?

PL/Java (pljava.so) is linked with the JNI (Java Native Interface) library,
libjvm.so, in JRE. libjvm.so is linked with libpthread.so, because Java VM
is multi-threaded. So, "ldd pljava.so" shows libjvm.so and libpthread.so.
pljava.so doesn't seem to be built for multi-threading --- none
of -mt, -D_REENTRANT or -D_POSIX_C_SOURCE is specified when building it.

When the application calls a Java stored function, pljava.so calls a function
in libjvm.so to create a JVM in the backend process, then invokes the
user-defined Java method in the main thread. The user-defined Java method
calls JDBC methods to access the database. The JDBC method calls are translated
to backend SPI function calls through JNI.

The main thread can create Java threads using the Java Thread API, and those
threads can call JDBC methods. However, PL/Java intercepts JDBC method
calls and serializes SPI calls. So, only one thread calls SPI functions at
a time. I'm wondering if this is the reason why PL/Java is safe for use.
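
As an illustration of the serialization described above, here is a minimal sketch in plain Java. It is not PL/Java's actual code: the names BackendGate, BACKEND_LOCK, callBackend and demo are hypothetical, and a lambda stands in for the JDBC-to-SPI call. It only shows that a single process-wide lock keeps concurrent Java threads from being inside the "backend" at the same time.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical illustration; not PL/Java's actual implementation.
final class BackendGate {
    // Single process-wide lock: only the holder may call into the backend.
    private static final Object BACKEND_LOCK = new Object();

    // Bookkeeping used only to observe how many threads are inside at once.
    private static final AtomicInteger inside = new AtomicInteger();
    private static final AtomicInteger maxInside = new AtomicInteger();

    // Runs a stand-in for an SPI call while holding the lock.
    static <T> T callBackend(Supplier<T> spiCall) {
        synchronized (BACKEND_LOCK) {
            int now = inside.incrementAndGet();
            maxInside.accumulateAndGet(now, Math::max);
            try {
                return spiCall.get();
            } finally {
                inside.decrementAndGet();
            }
        }
    }

    // Starts `threads` Java threads that each make `calls` serialized calls,
    // and returns the highest concurrency ever observed inside the lock.
    static int demo(int threads, int calls) {
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < calls; j++) {
                    callBackend(() -> 42); // stand-in for a JDBC -> SPI call
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return maxInside.get();
    }

    public static void main(String[] args) {
        System.out.println("max threads inside backend at once: " + demo(4, 1000));
    }
}
```

Note what the lock does and does not give: at most one thread is inside callBackend at a time, but that thread is not necessarily the backend's original main thread.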

What I'm concerned about is whether multi-threaded code (the Java VM) can run
safely inside single-threaded code (postgres). I don't know of a
particular problem with PL/Java, but in general, mixing
single-threaded and multi-threaded code seems to cause trouble around
the handling of errno, memory and file handles/pointers.

FYI, the JNI specification says that code called from the Java VM should be
built for multi-threading, as follows. But postgres is not.

http://docs.oracle.com/javase/7/docs/technotes/guides/jni/spec/design.html#wp9502

[Excerpt]
Compiling, Loading and Linking Native Methods
Since the Java VM is multithreaded, native libraries should also be compiled
and linked with multithread aware native compilers. For example, the -mt
flag should be used for C++ code compiled with the Sun Studio compiler. For
code compiled with the GNU gcc compiler, the flags -D_REENTRANT
or -D_POSIX_C_SOURCE should be used. For more information please refer to
the native compiler documentation.

Regards
MauMau


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "MauMau" <maumau307(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Is SPI safe to use in multi-threaded PL/Java?
Date: 2014-03-08 16:28:21
Message-ID: 17633.1394296101@sss.pgh.pa.us

"MauMau" <maumau307(at)gmail(dot)com> writes:
> Is PL/Java safe to use in terms of its threading design? I'm going to ask
> the PL/Java community about this too, but I'd ask for opinions here because
> I believe people in this community have seasoned knowledge of OS and SPI.

> To put the question in other words, is it safe to load a multi-threaded PL
> library in the single-threaded backend process, if the PL only calls SPI in
> the main thread?

When it breaks, we're not going to be concerned.

regards, tom lane


From: "MauMau" <maumau307(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Is SPI safe to use in multi-threaded PL/Java?
Date: 2014-03-11 11:11:38
Message-ID: 7E4EA2768DA348879F571FE8193A2503@maumau

From: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
> "MauMau" <maumau307(at)gmail(dot)com> writes:
>> To put the question in other words, is it safe to load a multi-threaded PL
>> library in the single-threaded backend process, if the PL only calls SPI in
>> the main thread?
>
> When it breaks, we're not going to be concerned.

I may not understand your nuance. Which of the following do you mean?

* PL/Java's design is dangerous in terms of the mixture of single- and
multi-threading, and we cannot be 100% sure whether there's really no
problem.

* SPI must not be used in a multi-threaded process, even if only one thread
calls SPI functions at a time. So what we can say is that PL/Java is not
safe theoretically in terms of SPI.

Regards
MauMau


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "MauMau" <maumau307(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Is SPI safe to use in multi-threaded PL/Java?
Date: 2014-03-11 13:51:59
Message-ID: 11347.1394545919@sss.pgh.pa.us

"MauMau" <maumau307(at)gmail(dot)com> writes:
> From: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
>> When it breaks, we're not going to be concerned.

> I may not understand your nuance. Which of the following do you mean?

> * PL/Java's design is dangerous in terms of the mixture of single- and
> multi-threading, and we cannot be 100% sure whether there's really no
> problem.

That, more or less. There is exactly zero provision in the Postgres
code for multiple threads to exist inside a backend process. It's
possible that PL/Java manages to completely insulate the Java world
from the C world, so that the C code never sees more than one thread.
But any leakage at all in that abstraction is probably going to cause
bugs; and as I said, we (PG hackers) are not going to consider such
bugs to be our problem.
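
To make "the C code never sees more than one thread" concrete, one possible form of insulation is a thread-affinity check on the Java side of every native call. The sketch below is hypothetical (BackendThreadGuard, requireBackendThread and otherThreadIsRejected are made-up names, and nothing here is taken from PL/Java's source). It only demonstrates rejecting calls that arrive on any thread other than the one that entered from the backend.

```java
// Hypothetical illustration; the class and method names are made up.
final class BackendThreadGuard {
    private final Thread backendThread;

    // Remember the thread that entered from the C side.
    BackendThreadGuard(Thread backendThread) {
        this.backendThread = backendThread;
    }

    // Intended to be called at the top of every method that crosses into C.
    void requireBackendThread() {
        if (Thread.currentThread() != backendThread) {
            throw new IllegalStateException(
                "backend call attempted from thread "
                    + Thread.currentThread().getName());
        }
    }

    // Returns true if a call made from a freshly started thread is rejected.
    static boolean otherThreadIsRejected() {
        BackendThreadGuard guard = new BackendThreadGuard(Thread.currentThread());
        guard.requireBackendThread(); // passes: we are the owning thread

        boolean[] rejected = {false};
        Thread worker = new Thread(() -> {
            try {
                guard.requireBackendThread();
            } catch (IllegalStateException expected) {
                rejected[0] = true;
            }
        });
        worker.start();
        try {
            worker.join(); // join makes the worker's write visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return rejected[0];
    }

    public static void main(String[] args) {
        System.out.println("call from another thread rejected: "
            + otherThreadIsRejected());
    }
}
```

A guard like this covers only the calls that are routed through it. It does nothing about threads the JVM itself creates, so on its own it cannot show that the abstraction is leak-free.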

On platforms where the standard libc supports threading (which is most,
these days), I'd be particularly worried about leakage along the path
java -> libc -> postgres. If libc becomes aware that there are multiple
threads executing inside the process, it's likely to change behaviors.

regards, tom lane


From: "MauMau" <maumau307(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Is SPI safe to use in multi-threaded PL/Java?
Date: 2014-03-12 11:16:00
Message-ID: 3E82FCBA4F7242FB826536A8B6390945@maumau

From: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
> That, more or less. There is exactly zero provision in the Postgres
> code for multiple threads to exist inside a backend process. It's
> possible that PL/Java manages to completely insulate the Java world
> from the C world, so that the C code never sees more than one thread.
> But any leakage at all in that abstraction is probably going to cause
> bugs; and as I said, we (PG hackers) are not going to consider such
> bugs to be our problem.
>
> On platforms where the standard libc supports threading (which is most,
> these days), I'd be particularly worried about leakage along the path
> java -> libc -> postgres. If libc becomes aware that there are multiple
> threads executing inside the process, it's likely to change behaviors.

I see... even Tom-san is suspicious about PL/Java's design, or the use
of SPI from code linked with libpthread.so. I'll communicate this to the
PL/Java community.

Regards
MauMau