[ts-gen] version update/database migration

Bill Pippin pippin at owlriver.net
Fri Aug 22 20:31:53 EDT 2008


Sam (<sam_backtester at yahoo.fr>) asks about upgrading the database:

> I'm using the shim release 080616 ... for getting intraday
> historical data.  ... [what happens] if I upgrade the Shim?

As long as you dump your data in the appropriate format, and then
are able to restore it, you're fine.

Three basic points, given that you used the default setup and create
scripts:

    1.  There are two databases, testing and trading, and you probably
        want to consolidate all your good data into one, say trading,
        duplicate it in testing, and unload it from testing using the
        load script near the end of this post.  Once the data has
        been consolidated and duplicated you can easily recover from
        mistakes.

    2.  Next, dump from testing to a file, about which more in the
        sequel.  This dump step gives another layer of safety.

    3.  After that you can run create.sql on testing, giving yourself
        two databases, one still holding all your data, and that you
        plan to leave alone for now, and the other with only the
        initial symbol load, where you can experiment to your heart's
        content.

Throughout the rest of these two recipes, I'll expect that you've
used the approach of step (1) to reduce the save/restore problem
from two databases to one.

Though for your database all you care about is HistoryBar, your
goal in general would be to save eight tables:

    HistoryBar  AccountCode
    CreateEvent ChangeOrder
    OrderStatus ActiveOrder
    OrderResult OrderReport

The overall approach is to dump the tables to be saved, recreate
the database, and then load the data back in.  There are two
cases A and B as there have been significant changes to the structure
of any of the tables to be saved or not.

First, for the simple case where the tables' attribute names and
index relationships are consistent from the old version to the new,
save/restore boils down to:

    A.1  use the dump script below to save

    A.2  use create.sql to build a new, fresh database

    A.3  use the load script below to load

After using this approach, and running the dump script again to
save to a new file name, diff tells me that the only difference
is in the datestamps:

< -- Dump completed on 2008-08-22 22:31:42
---
> -- Dump completed on 2008-08-22 22:38:03

Unfortunately, from June until now there have been signficant
changes to the database design, including in particular to
HistoryBar, so that the tables' attribute names and index
relationships are *not* consistent from the old version to the
new, and case (B) applies.

In what follows I'll concentrate on HistoryBar, since that is
the example at hand; the generalization to all eight tables
is bulky but straight-forward.

The 080616 system HistoryBar create table statement defines the
contract minimally, though with a dependence on foreign key
stability:

>     sym         int unsigned not null references     Product(uid),
>     ecn         int unsigned not null references    Exchange(uid),

More recently, we've denormalized the data slightly, to add
redundancy and protect against product renumbering as symbols are
added or deleted, to help ensure that the history data and the
journal have stable, consistent meanings over time.  For HistoryBar,
this meant adding the IB contract id as an attribute.

>     ibc         int unsigned not null references      Product(ibc),
>     sym         int unsigned not null references      Product(uid),
>     ecn         int unsigned not null references     Exchange(uid),

For the second table design, and since Product has a one-to-one
unique key relationship between the IB contract id, ibc, and the
Product record uid, here the foreign key "sym", as long as IB
doesn't change their contract ids, saved data can be recovered
even when symbols are added and deleted.
 
The general approach given such non-trivial table changes is to:

    B.1  Provide new table names via the alter table
         statement, for HistoryBar say to HistoryTmp.

    B.2  Ensure that HistoryTmp can be added to your new
         database by leaving the --no-create-info
         option *out* of the dump script below, so that
         the dump'd file has a complete create table
         definition inside;

    B.3  Dump the data from the table with the altered name,
         here HistoryTmp.

    B.4  Run create.sql on testing to get a fresh database.

    B.5  Obtain a newly created table HistoryTmp loaded
         with all your old 1-minute bars by loading the
         dump file.

    B.6  Insert the data into the new, empty HistoryBar table
         via a statement similar to the following:

         insert into HistoryBar(ibc, sym,    ecn, bid, time, open,
                                high, low, close, vol, wap, has_gaps)

         select                 Product.ibc,
                                Product.uid, ecn, bid, time, open,
                                high, low, close, vol, wap, has_gaps

           from HistoryTmp,      Product
          where HistoryTmp.ibc = Product.ibc;

Note that from the first to the second recipes, steps A.1, A.2,
and A.3 correspond loosely to B.3, B.4, and B.5.  The history table
name has changed, and the load script has been modified by the
deletion of the --no-create-info option.

The dump and load scripts are themselves simple to the point of
being trivial, although as explained above you will need to make:
major changes if database version changes have been sufficiently
extensive; and in any case, minor changes depending on your setup.
Note in particular that the single-dash options, that is -h, -u, -p
and the like, are likely to change.

I've include these scripts in the newest release;
herewith sql/bin/dump 
     and sql/bin/load:

-------- cut-here --------
#!/bin/bash

# ::::::::::::::
# load
# ::::::::::::::
#
# Import journal and other accumulated shim data.
# Adapt as needed for database version changes, and
# in particular, correct the Load and Host values.

Load=journal.080822.sql
Host=xps400

mysql     -v -p -h $(Host) -u root       testing	< $Load

-------- cut-here --------
#!/bin/bash

# ::::::::::::::
# dump
# ::::::::::::::
#
# Export journal and other accumulated shim data.
# Adapt as needed for database version changes, and
# in particular, change the Host value to suit.

Host=xps400
Save=$(date +%y%m%d)
Opts="--skip-opt --add-locks --allow-keywords		\
--create-options --extended-insert --single-transaction	\
--no-create-info --tz-utc"

mysqldump -v -p -h $(Host) -u root $Opts testing	\
	HistoryBar  AccountCode				\
        CreateEvent ChangeOrder				\
        OrderStatus ActiveOrder				\
        OrderResult OrderReport				> journal.${Save}.sql 

echo    saved:						  journal.${Save}.sql 
-------- cut-here --------

Cheers,

Bill


More information about the ts-general mailing list