[ts-gen] Collecting opening prices for the SP500 [Was: Re: ... subtype 14]

Bill Pippin pippin at owlriver.net
Wed Jul 22 16:22:28 EDT 2009


My previous post dealt with support for price message tick subtype 14,
open price messages; now I'll respond to the query that led me to add
support for that feature.

I'll first interject some administrivia, then reply directly and at
some length to the question of using the shim to collect opening
prices, and finally summarize at the end.

An off list correspondent has emailed me directly, using the email
address listed in the trading-shim sources.  Before I get to his
comments and question, let me note I'm delighted to respond to
ts-gen email, and that other email related to the trading-shim will
be kept private if the sender wishes to use our commercial support
email address, support at trading-shim.com .

Also please note the copyright notice at the top of the sources,
where I indicate that for email to pippin at trading-shim.com,
messages may indeed be extracted and relayed to our ts-gen
mailing list:

/*
 * shim: dbms-augmented command interpreter for Interactive Brokers' ...
 * ...  pippin at trading-shim.com, msgs may gate to the list
 *                                  ^^^^^^^^^^^^^^^^^^^^^^^^^
 * copyright (c) 2005-2008 Trading-shim.com, LLC  Columbus, OH
 * GPL version 3 or later, see COPYING for details
 */

I'm glad to file off email identification headers such as From->To,
In-reply-to, References, and Comments, but the text itself is fodder
for our mailing list, and I feel free to quote it below.  That's the
price of my reply; I want other people, including those who read the
archives, to be able to benefit from all our accumulated efforts to
make the trading-shim useful.

Now, to the off list poster's message, which comments on the shim and
asks about collecting opening prices:
 
> I'm looking at the documentation for Shim. My initial impression, is
> that shim is a very potent product!

Thanks very much.  Please understand that the docs are very outdated, and
that the best documentation for how to use the shim is the example scripts
in the directory exs, the NEWS file, and this mailing list.
 
> I'm facing a particular task, and am trying to gauge whether shim is
> the right tool for this task. Specifically, I need to somehow retrieve
> the opening valuations of all SP500 stocks as close to the opening
> bell of the stock exchange, as possible (preferably a few seconds or
> so past the bell). Ideally, the valuations I retrieve should match
> historical data, as published by Yahoo on the next day.

> This task seems to have two aspects to it:
 
> 1. Trading data retrieval.
> 2. Opening valuation reconstruction off the trading data.

Your question raises many points, and I'll try to summarize at the
end of this post with an abbreviated decision tree.  For now, I'll
take the issues in turn.

If you are happy with the data provided by the IB tws api,
the most recent release of the shim can be used to solve your data
retrieval problem, as suggested by my previous post.

The second problem is the harder, as you indicate below, and so for
that let me point you to what I believe to be the best source for
more info, the TWSAPI mailing list, TWSAPI at yahoogroups.com .  If you
search the archives of that list, you will see that this issue has
been raised repeatedly.

As long as you decide the IB tws api suits your purposes as your
data provider --- which is by no means necessarily your best bet,
another topic often addressed on the yahoo list --- you then
would need to decide what part of the IB tws api to use to collect
your opening price info, in particular whether to use market data
or history requests.

This brings me to the third issue, immediacy, which is implied by
your goal statement.  Both request types require you to break your
use case query for the opening prices to the SP500 into individual
per-symbol requests, only a limited number of which can be active
at one time, and so this process will take some time.

One key issue for your use case is that your account will probably
be limited to 100 market data lines at any one time, so that if you
use market data requests, you'll need to cycle through the SP500,
cancelling some requests before you can obtain the rest of the
opening price info.
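
To make the cycling concrete, here is a minimal ruby sketch of splitting
a symbol list into batches that fit under the line limit.  The limit of
100 is the figure mentioned above, not something the code can verify for
your account, and the symbol names are placeholders rather than real
SP500 tickers.

```ruby
# Assumed limit, from the "probably ... 100 market data lines" remark above.
MAX_LINES = 100

# Split a symbol list into batches that each fit within the line limit.
def batches(symbols, max_lines = MAX_LINES)
  symbols.each_slice(max_lines).to_a
end

# Placeholder symbols standing in for the real SP500 list.
symbols = (1..500).map { |i| format("SYM%03d", i) }
groups  = batches(symbols)

puts "#{groups.size} batches, largest has #{groups.map(&:size).max} symbols"
```

Each batch would be subscribed, drained of its open prices, and
cancelled before the next one starts.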

Or, if you use history queries, you'll be rate limited to about one
request every 10 seconds, and at that rate 500 symbols would take a
bit more than an hour and twenty minutes to collect.
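
The arithmetic behind that estimate, as a small ruby sketch.  The 10
second pacing interval is the approximate figure given above, so treat
the result as a rough lower bound rather than a promise.

```ruby
# Assumed pacing interval: roughly one history request every 10 seconds.
SECONDS_PER_REQUEST = 10

# Total wall-clock seconds to issue one history request per symbol.
def history_collection_seconds(symbol_count, pacing = SECONDS_PER_REQUEST)
  symbol_count * pacing
end

total = history_collection_seconds(500)
hours, rem       = total.divmod(3600)
minutes, seconds = rem.divmod(60)

puts format("%d s = %dh %02dm %02ds", total, hours, minutes, seconds)
# prints "5000 s = 1h 23m 20s"
```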

Ideally, you would be able to collect data for 500 symbols via market
data requests in three minutes or less.  Here too you are at the mercy
of IB's upstream servers, though, so if it takes longer you're back to
the question of IB as your data provider.

> The second aspect is surprisingly non-trivial: Yahoo data provider
> seems to be using some kind of smoothing/filtering techniques to
> obtain the opening valuation of a stock.

I'll recap here the three obvious methods to determine open price using
the IB tws api.  (There may well be others; again, see the yahoo list
archives.)  The first two, one based on market data and the other on
history, require that you collect and analyze api messages spanning the
open time, looking at both time and volume to choose which message
determines the open price.  The third requires only that you trust IB's
own calculation, its counterpart to the Yahoo analysis you mention
above, which the IB tws can provide as a result of a market data
request.

The async return messages resulting from a market data request have
various subtypes, and selected instances, including the first 10
which have always been supported by the shim, are as follows (note
the zero based indexing):

   msg type     subtype index and name
   --------     ----------------------
    2. size     0.  bid size      
    1. price    1.  bid price     
    1. price    2.  ask price     
    2. size     3.  ask size      
    1. price    4.  last price    
    2. size     5.  last size     
    1. price    6.  high
    1. price    7.  low     
    2. size     8.  volume   
    1. price    9.  close   
        ...
    1. price   14.  open
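
For script writers, the rows listed above can be kept as a small ruby
lookup table.  This is only a sketch: it covers just the subtypes shown
in the table, and the helper name describe_tick is my own invention for
illustration, not part of the shim.

```ruby
# Message type numbers as given in the table above.
MSG_TYPES = { 1 => :price, 2 => :size }.freeze

# Tick subtype index => [message type number, name]; listed rows only.
TICK_SUBTYPES = {
  0  => [2, "bid size"],
  1  => [1, "bid price"],
  2  => [1, "ask price"],
  3  => [2, "ask size"],
  4  => [1, "last price"],
  5  => [2, "last size"],
  6  => [1, "high"],
  7  => [1, "low"],
  8  => [2, "volume"],
  9  => [1, "close"],
  14 => [1, "open"]
}.freeze

OPEN_SUBTYPE = 14

# Hypothetical helper: describe a subtype index, or nil if it is not listed.
def describe_tick(subtype)
  type, name = TICK_SUBTYPES[subtype]
  return nil unless type
  "#{MSG_TYPES[type]} #{subtype}: #{name}"
end

puts describe_tick(OPEN_SUBTYPE)   # prints "price 14: open"
```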

My impression from the yahoo list is that analysis of market data
is the dominant approach to determining open price; start a market
data subscription prior to the opening bell, look for the volume to
indicate the true open, and average last price numbers for some
small interval around that time to determine the open price.
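
A rough ruby sketch of that approach follows.  It makes several
assumptions of my own: ticks have already been parsed into records with
a time in seconds, a last price, and a cumulative volume; the first
increase in volume marks the true open; and the averaging window is
five seconds either side.  The Tick struct and estimate_open are
illustrative names, not shim features, and the sample data is made up.

```ruby
# Hypothetical parsed tick record: time in seconds, last price, cumulative volume.
Tick = Struct.new(:time, :last_price, :volume)

# Estimate the open as the mean last price within `window` seconds of the
# first tick whose cumulative volume exceeds that of its predecessor.
def estimate_open(ticks, window: 5.0)
  sorted = ticks.sort_by(&:time)
  open_tick = sorted.each_cons(2).find { |a, b| b.volume > a.volume }&.last
  return nil unless open_tick   # volume never moved: no open seen yet

  near = sorted.select { |t| (t.time - open_tick.time).abs <= window }
  near.sum(&:last_price) / near.size
end

# Made-up sample: stale pre-open ticks, then volume starts at t = 30.
ticks = [
  Tick.new(0.0,  10.00, 0),
  Tick.new(20.0, 10.00, 0),
  Tick.new(30.0, 10.25, 500),
  Tick.new(32.0, 10.75, 900),
  Tick.new(60.0, 11.00, 1500)
]

puts estimate_open(ticks)   # prints 10.5
```

Whether a simple mean, a volume-weighted mean, or a single tick best
matches published opens is exactly the open question discussed on the
yahoo list, so expect to tune this.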

Note that this won't work across all 500 symbols at once for the
typical IB account holder, since only 100 market data lines can be in
use at any one time.  So, you'll probably want to consider accepting
IB's determination of the open price.  The alternative of using
history data is certainly feasible, though much more time consuming.

As for accepting IB's idea of the open price via market data
subtype 14, you'll need to be sensitive to the current limitations 
for higher level api support: restrict such sessions to tick data,
that is commands such as are found in the exs/tick script, and for
other api features use another session at the standard api level 23.

Now, about implementation, given that you're collecting tick
subtype 14 (open price) messages: I'd suggest using a ruby
script, and you can start by modifying exs/past.30.rb, which
demonstrates how to collect back history for a corpus of
symbols, in its case symbols from the djia.  For your app,
you presumably have the symbols for the SP500, and you would
want to trigger an IB tws api market data request for each of
them, using the syntax illustrated by exs/tick, being sure to
cancel earlier subscriptions once you've collected their open
prices, since otherwise the IB tws would reject subscriptions
past the 100th.
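
The bookkeeping for such a script might look like the ruby sketch
below.  It is not taken from exs/past.30.rb or exs/tick: the class name
and methods are hypothetical, and the actual shim commands for
subscribing and cancelling are deliberately left as caller-supplied
lambdas, since their syntax should come from exs/tick.  The sketch only
tracks which symbols are pending, active, or done, and keeps the active
count at or under the assumed limit of 100.

```ruby
# Hypothetical bookkeeping for a rolling window of tick subscriptions.
class OpenPriceCollector
  attr_reader :opens

  # subscribe/cancel are caller-supplied callables that would send the
  # real shim commands; here they are placeholders.
  def initialize(symbols, max_active: 100, subscribe:, cancel:)
    @pending    = symbols.dup
    @active     = {}
    @opens      = {}
    @max_active = max_active
    @subscribe  = subscribe
    @cancel     = cancel
  end

  def active
    @active.keys
  end

  # Start subscriptions until the window is full or nothing is pending.
  def fill
    while @active.size < @max_active && !@pending.empty?
      sym = @pending.shift
      @active[sym] = true
      @subscribe.call(sym)
    end
  end

  # Call when a subtype 14 (open) price arrives for sym: record it,
  # cancel that subscription, and reuse the freed line.
  def on_open_price(sym, price)
    return unless @active.delete(sym)
    @opens[sym] = price
    @cancel.call(sym)
    fill
  end

  def done?
    @pending.empty? && @active.empty?
  end
end

# Simulated run with placeholder symbols and made-up prices.
log = []
collector = OpenPriceCollector.new(
  (1..250).map { |i| "SYM#{i}" },
  subscribe: ->(s) { log << [:subscribe, s] },
  cancel:    ->(s) { log << [:cancel, s] }
)
collector.fill
peak = collector.active.size
until collector.done?
  collector.on_open_price(collector.active.first, 1.0)
  peak = [peak, collector.active.size].max
end
puts "collected #{collector.opens.size} opens, peak active #{peak}"
```

In a real script, on_open_price would be driven by the shim's output
stream rather than by the simulation loop shown here.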

> Given your experience with financial data retrieval, I would
> very much appreciate your comments on this matter. It would be
> particularly great, if you could advise me how to use shim for my
> task.

I'll summarize here:

As long as you decide to use IB and the IB tws api as your data
provider; accept the open price IB determines, via their tick
subtype 14 message; install the shim on a linux box, along with
mysql; create the symbols database the shim currently requires,
via say the sql/setup.sql script; write a simple ruby script
providing a table of the SP500 symbol names, generating
market data subscription/cancel requests and receiving the
replies, then: it should be feasible to collect opening prices
using the shim.

Note that such collection can be done at any time during the day;
the tick subtype 14 open price message seems to be provided whenever
you start a tick subscription.

There may well be some gotchas in working with so many subscription
requests, and some symbols may be missing from the database.  In
either case, please let us know on the list.

You'll probably also want to experiment with computing your own
open price, looking for the volume spike at the open to determine
which last price message to use, and you may want to compare the
results you get with IB's history data, though as noted above
for now you'll need to collect that in a different session, using
api version level 23.
 
Thanks,

Bill
_______________________________________________

Direct messages to my personal email may still
gate to the list.  For those desiring privacy,
please use our commercial support email address,
support at trading-shim.com .