Re: [v9.1] sepgsql - userspace access vector cache

From: Kohei Kaigai <Kohei(dot)Kaigai(at)EMEA(dot)NEC(dot)COM>
To: Yeb Havinga <yebhavinga(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: PgHacker <pgsql-hackers(at)postgresql(dot)org>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>, Kohei Kaigai <Kohei(dot)Kaigai(at)EMEA(dot)NEC(dot)COM>
Subject: Re: [v9.1] sepgsql - userspace access vector cache
Date: 2011-07-21 09:29:40
Message-ID: D0C1A1F8BF513F469926E6C71461D9EC01E24C@EX10MBX02.EU.NEC.COM
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

The attached patch is revised userspace-avc patch.

List of updates:
- The GUC of sepgsql.avc_threshold was removed.
- "char *ucontext" of avc_cache was replaced by "bool tcontext_is_valid".
- Comments added onto static variables
- Comments of sepgsql_avc_unlabeled() was revised.
- Comments of sepgsql_avc_compute() was simplified.
- Comments of sepgsql_avc_check_perms_label() also mention about
permissive domain, that performs similar to system's permissive mode.
- selinux_status_close() become invoked on on_proc_exit() hook.

Thanks,
--
NEC Europe Ltd, SAP Global Competence Center
KaiGai Kohei <kohei(dot)kaigai(at)emea(dot)nec(dot)com>

> -----Original Message-----
> From: Kohei Kaigai
> Sent: 20. Juli 2011 17:04
> To: 'Yeb Havinga'; Kohei KaiGai
> Cc: Robert Haas; PgHacker
> Subject: RE: [HACKERS] [v9.1] sepgsql - userspace access vector cache
>
> Yeb, Thanks for your volunteering.
>
> > On 2011-07-14 21:46, Kohei KaiGai wrote:
> > > Sorry, the syscache part was mixed to contrib/sepgsql part
> > > in my previous post.
> > > Please see the attached revision.
> > >
> > > Although its functionality is enough simple (it just reduces
> > > number of system-call invocation), its performance
> > > improvement is obvious.
> > > So, I hope someone to volunteer to review these patches.
> > This is a review of the two userspace access vector cache patches for
> > SE-PostgreSQL. Proofreading the source I've seen mostly cosmetic things.
> > Though I have a few questions, I think the overal shape of the patch is
> > good enough to mark it ready for comitter.
> >
> > Remarks:
> >
> > * The patches apply cleanly, compile cleanly on Fedora 15. It depends on
> > libselinux-2.0.99, and that is checked on by the configure script: good.
> >
> > * I run SELECT sepgsql_restorecon(NULL) and saw comparable numbers in
> > speed gain and also backend process memory increase as indicated by
> > KaiGai-san. The vmsize without the patch increases from running
> > restorecon 3864kB, with the patch is increases 8160kB, a difference of
> > ~5MB. Where this change comes from is unclear to me. The avc_cache holds
> > only 7 entries, and the avc memory context stats indicate it's only at
> > 8kB. Investigation into the SECLABELOID syscache revealed that even when
> > that cache was set to a #buckets of 2, still after restorecon() the
> > backend's vmsize increased about the ~5MB more.
> >
> The sepgsql_restorecon(NULL) assigns default security label on all the
> database objects being controlled, thus, its workload caches security
> label (including text data) of these objects.
> So, ~5MB of difference is an upper limit of syscache usage because of
> SECLABELOID.
> The number of buckets is independent from the memory consumption.
>
> > * The size of SECLABELOID is currently set to 2048, which is equal to
> > sizes of the pg_proc and pg_attribute caches and hence makes sense. The
> > initial contents of pg_seclabel is 3346 entries. Just to get some idea
> > what the cachesize means for restorecon performance I tested some
> > different values (debugging was on). Until a size of 256, restorecon had
> > comparable performance. Small drop from ~415 to ~425 from 128 to 64.
> > Cache of 32 had ~435ms performance. Cache of 2 had 680ms. Without
> > debugging symbols, I also got a slightly better performance on the
> > restorecon call with a 1024 syscache size. This might be something to
> > tweak in later versions, when there is more experience what this cache
> > size means for performance on real databases.
> >
> Thanks for your research.
> The reason why I set 2048 is just a copy from the catalog which may
> hold biggest number of entries (pg_proc). It may be improved later.
>
> > * The cache is called userspace, presumably because it isn't cached in
> > shared memory? If sepgsql is targeted at installations with many
> > different kinds of clients (having different subject contexts), or
> > access different objects, this is a good choice.
> >
> SELinux's kernel implementation also has access vector cache:
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=blob;f=security/selinux/avc.c
>
> The reason why I didn't choose shared memory is the length of security
> label text is variable, so it is not feasible to acquire a fixed length
> region on shared memory in startup time.
>
> > * Checked that the cache keeps working after errors, ok.
> >
> > * uavc.c defines some static variables like lru_hint, threshold and
> > unlabeled. It would be nice to have a small comment with each one that
> > says what the variable represents (like is done for the avc_cache struct
> > members right above it)
> >
> OK, I'll add comments here.
>
> > * I had to google SELINUX_AVD_FLAGS_PERMISSIVE to understand what was
> > going on. Since the goal why it is recorded a domain is permissive, is
> > to prevent flooding the log, maybe renaming avc_cache.permissive into
> > avc_cache.log_violations_once would explain itself.
> >
> It is a bit concern for me, because the permissive domain means it shall
> be performed like as system is configured as permissive mode.
> The behavior of log-violation-at-once is a property of permissive mode.
> So, how about an idea to add comments about permissive domain around
> the code to modify cache->allowed?
>
> > * selinux_status_open() is called and it's man page mentions
> > selinux_status_close(). It currently unmaps memory and closes a file
> > descriptor, which is done on process exit anyway. I don't have enough
> > experience with these kinds of system calls to have a feeling whether it
> > might change in the future (and in such cases I'd have added a call to
> > selinux_status_close() on proc_exit, just to be on the safe side).
> >
> I omitted it because OS will handle these things correctly on process
> Exit time, however, it is good idea to move on safe side.
>
> > * sepgsel_avs_unlabeled(). I was confused a bit because objects can have
> > a label 'unlabeled', and the comments use wording like 'when unlabeled'.
> > Does this mean with no label, or with a label 'unlabeled'?
> >
> It means object has no label. I'll fix this comment.
>
> > * sepgsql_avc_compute(): the large comment in the beginning is hard to
> > follow since sentences seem to have been a bit mixed up.
> >
> OK, I'll simplify the comment. How about your feeling about?
>
> /*
> * Validation check of the supplied security context.
> * Because it always invoke system-call, frequent check should be avoided.
> * Unless security policy is reloaded, validation status shall be kept,
> * so we also cache whether the supplied security context was valid, or not.
> */
>
> > * question: why is ucontext stored in avc_cache?
> >
> The 'tcontext' has to be kept on avc_cache as-is, even if the security label
> is not valid according to security_check_context_raw(), because it is used to
> hash-key.
> If your point is that validation status should be kept in bool, not char *,
> it is fair enough for me.
>
> > * there is a new guc parameter for the user: avc_threshold, which is
> > also documented. My initial question after reading the description was:
> > what are the 'items' and how can I see current usage / what are numbers
> > to expect? Without that knowledge any educated change of that parameter
> > would be impossible. Would it be possible to remove this guc?
> >
> Hmm, it might require users deep knowledge about inside of sepgsql,
> although it is useful to few situations.
> OK, I'll remove it in this version.
>
> > * avc_init has magic numbers 384, 4096. Maybe these can be defines (with
> > a small comment what it is?)
> >
> I'll add a comment around static variable of avc_threshold.
>
> > * Finally, IMO the call sites, e.g. check_relation_priviliges, have
> > improved in readability with this patch.
> >
>
> Regarding to the point you suggested, I'll revise my patch within a day.
>
> Thanks,
> --
> NEC Europe Ltd, SAP Global Competence Center
> KaiGai Kohei <kohei(dot)kaigai(at)emea(dot)nec(dot)com>

Attachment Content-Type Size
pgsql-v9.2-uavc-selinux.v5.patch application/octet-stream 36.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kohei Kaigai 2011-07-21 10:16:06 Environment checks prior to regression tests?
Previous Message Yeb Havinga 2011-07-21 08:00:36 Re: [v9.1] sepgsql - userspace access vector cache