Re: Stock Market Price Data & postgreSQL? HELLPPP Please

Lists: pgsql-general
From: Adrian Klaver <aklaver(at)comcast(dot)net>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: pgsql-general(at)postgresql(dot)org, Sanjay Arora <sanjay(dot)k(dot)arora(at)gmail(dot)com>
Subject: Re: Stock Market Price Data & postgreSQL? HELLPPP Please
Date: 2009-08-19 21:41:38
Message-ID: 350809696.1795001250718098536.JavaMail.root@sz0030a.emeryville.ca.mail.comcast.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general


----- "Greg Stark" <gsstark(at)mit(dot)edu> wrote:

> On Wed, Aug 19, 2009 at 9:22 PM, Sanjay
> Arora<sanjay(dot)k(dot)arora(at)gmail(dot)com> wrote:
> > - This is Time Series Data (I don't know what that is except that
> it
> > relates to data marked/related to time) and not suited to a RDBMS.
> > - You need it in Esper (a CEP engine used by Marketcetera, an open
> > source trading platform) which needs to get a data stream, whereas
> an
> > RDBMS will return a recordset.
> > - A RDBMS including postgreSQL simply cannot handle the amount of
> > real-time data coming in from a stock exchange.
> > - A RDBMS simply cannot provide data, even if you build streams
> from
> > recordsets returned, at the speeds required by any technical
> analysis
> > charting program.
>
> There are whole companies doing nothing but providing streaming RDBMS
> used by traders, airline reservation systems, etc. c.f Streambase,
> and
> Truviso amongst others.
>
> I'm not aware of any open source streaming databaes so for a
> shoestring budget you're going to be limited to using existing tools.
> There are certainly people who scale up Postgres to high oltp traffic
> but it does require time and money as you scale up, TANSTAAFL... I'm
> sure it can be done, Postgres is very flexible, though whether it's a
> perfect fit with your needs or not I can't say.
>
> You might also consider whether one of the existing streaming
> database
> system providers would give you a development license for free or on
> a
> trial basis if you hope to be making big money down the road.
>
> --
> greg
> http://mit.edu/~gsstark/resume.pdf

In the its a small world category I just ran across a streaming database called Telegraph(http://telegraph.cs.berkeley.edu/telegraphcq/v2.1/) yesterday. It is based on Postgres 7.3.2, so approach with caution.

Adrian Klaver
aklaver(at)comcast(dot)net


From: Sanjay Arora <sanjay(dot)k(dot)arora(at)gmail(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Stock Market Price Data & postgreSQL? HELLPPP Please
Date: 2009-08-19 22:00:23
Message-ID: 7e41ba8f0908191500p4017fc8em2c50d7ae2db2cdc6@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On Thu, Aug 20, 2009 at 3:11 AM, Adrian Klaver<aklaver(at)comcast(dot)net> wrote:
>
>
> In the its a small world category I just ran across a streaming database called Telegraph(http://telegraph.cs.berkeley.edu/telegraphcq/v2.1/) yesterday. It is based on Postgres 7.3.2, so approach with caution.
>
Thanks Adrian,

Very interesting. The pages seem to have moved. I had to read them in
google cache. Will email them regarding the project roadmap and if
they will be pursuing it for some time to come or is it only a minor
research project, that will stop as soon as the research paper is
submitted. Mailing list of the project though shows mails are
decreasing to almost one per day.

However, it threw a very interesting question for me. It was based on
postgres itself, v. 7.2 but thats beside the point.

What exactly the difference in layman sys admin terms between our
everyday postgres and this time series data. postgres supports geo
data types and many others. Whats the big issue in not having this
sort of time series data type and query as a normal support by
postgres?

It would have a big feasibility in stock market data storage and
analysis. I am sure there would be some other major scientific
applications like real time data acquisition & analysis & etc.

So why is this sort of application not supported in postgres natively?
First time, I've come across a db application that I cannot handle
with postgres in my 9 years of using postgres..

Best regards.
Sanjay.


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Adrian Klaver <aklaver(at)comcast(dot)net>
Cc: Greg Stark <gsstark(at)mit(dot)edu>, pgsql-general(at)postgresql(dot)org, Sanjay Arora <sanjay(dot)k(dot)arora(at)gmail(dot)com>
Subject: Re: Stock Market Price Data & postgreSQL? HELLPPP Please
Date: 2009-08-20 00:00:10
Message-ID: 20090820000009.GK4894@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Adrian Klaver escribió:

> In the its a small world category I just ran across a streaming database called Telegraph(http://telegraph.cs.berkeley.edu/telegraphcq/v2.1/) yesterday. It is based on Postgres 7.3.2, so approach with caution.

TelegraphCQ became Truviso.

I clearly remember somebody talking about how Truviso was all set up to
solve the stock ticker problem, but if you go to their site now they
don't mention stock at all. Does this mean anything?

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: Adrian Klaver <aklaver(at)comcast(dot)net>
Cc: Greg Stark <gsstark(at)mit(dot)edu>, pgsql-general(at)postgresql(dot)org, Sanjay Arora <sanjay(dot)k(dot)arora(at)gmail(dot)com>
Subject: Re: Stock Market Price Data & postgreSQL? HELLPPP Please
Date: 2009-08-20 00:01:31
Message-ID: alpine.GSO.2.01.0908191932420.25972@westnet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On Wed, 19 Aug 2009, Adrian Klaver wrote:

> In the its a small world category I just ran across a streaming database
> called Telegraph(http://telegraph.cs.berkeley.edu/telegraphcq/v2.1/)
> yesterday. It is based on Postgres 7.3.2, so approach with caution.

TelegraphCQ was the research project that Truviso, the company where I
work, originated from. Having taken considerably more mature code of this
type than what's posted there and applied it to market data, I can tell
you that just having a streaming database doesn't make that many of the
problems go away. Streaming database technology certainly helps compute
technical indicators and such more efficiently than having to pull the
data back out of the database, but that's only a small subset of what you
have to do in order to build a complete trading system.

As for Sanjay's project, it sounds like it's underfunded by between two
and four orders of magnitude. Before getting involved with Truviso, I
once wrote a PostgreSQL based automatic trading system that pulled in
streaming market data and made trading decisions based on it, with
backtesting and everything. Took about six months of full time work to
get the first version working, and I doubt that approach would scale up to
more than a few hundred active symbols even on modern hardware.

The libraries provided by Marketcetera would give you a big head start on
the coding compared to where I began at, but without a serious development
budget you're going to be stuck taking somebody's entire software stack
as-is. You should be asking programatic traders where there are complete
applications ready to go here, not asking about the database details
because you're really not going to have any choice besides just using
whatever the app is designed around. Given your lack of developent
background, you sound more like a Tradestation or Metastock customer than
someone who is going to assemble this app yourself.

Sorry to sound like a downer here, but your problem is much bigger than
you think it is and I'm not sure where to even begin sorting out the bits
you're going to need but don't have yet.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD


From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: Sanjay Arora <sanjay(dot)k(dot)arora(at)gmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Stock Market Price Data & postgreSQL? HELLPPP Please
Date: 2009-08-20 00:15:12
Message-ID: alpine.GSO.2.01.0908192003470.25972@westnet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On Thu, 20 Aug 2009, Sanjay Arora wrote:

> What exactly the difference in layman sys admin terms between our
> everyday postgres and this time series data.

There is a good introduction to where regular databases fail to solve
time-series data problems at http://cs.nyu.edu/shasha/papers/jagtalk.html

As mentioned there, one of the very hard problems to solve in a general
way is how to deal with all the gaps in your data when the market isn't
open, which really complicates how you compute indicators. For example,
in regular SQL you might compute an average over some period using
something like:

select sum(x) / count(*) where ts>=t1 and ts<=t2

You might think that you could pick t1 and t2 here based on the number of
samples you want to average; let's say you want an average over the last
minute of data, so you try this:

t1=<something>
t2=<something> + interval('1 minute')

This completely falls apart when when the data isn't continuous. If the
market was closed for some period between t1 and t2, you need to use a
rolling window over the data you've got instead. As of PostgreSQL 8.4's
support for SQL window fuctions, it's easier to compute this sort of
thing, but there's still plenty of rough spots to get nailed by.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD