Another way to look at the problem is: How do I sample a subset of size
K efficiently? A query like
SAMPLE 1000 OF
(SELECT * FROM mydata WHERE <some condition>)
should return 1000 random rows from the result of the SELECT statement, so
that two consecutive evaluations of the query are very unlikely to return
the same 1000 rows.
(Yes, I know that "SAMPLE 1000 OF" is not valid SQL.)
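
To make the intended semantics concrete, here is a minimal sketch of one possible approach on the client side: reservoir sampling (Algorithm R), which keeps K uniformly chosen rows while reading the result set once. This is only an illustration, not something the database does for you; the function name `reservoir_sample` and its parameters are hypothetical, and `rows` stands in for whatever iterator your database driver returns for the inner SELECT.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(rows: Iterable[T], k: int,
                     rng: Optional[random.Random] = None) -> List[T]:
    """Return up to k items chosen uniformly at random from rows.

    Hypothetical helper: reads rows once and keeps only k items in memory.
    If rows has fewer than k items, all of them are returned.
    """
    if rng is None:
        rng = random.Random()
    reservoir: List[T] = []
    for i, row in enumerate(rows):
        if i < k:
            # Fill the reservoir with the first k rows.
            reservoir.append(row)
        else:
            # Pick a slot in [0, i]; the new row replaces an old one
            # only if the slot falls inside the reservoir.
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = row
    return reservoir


# Usage: range(100000) stands in for the rows of the inner SELECT.
sample = reservoir_sample(range(100000), 1000)
print(len(sample))
```

The trade-off is that every matching row still has to be read once, so this saves memory rather than I/O. Two runs will almost certainly give different samples, because each run draws fresh random numbers.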