[ts-gen] cross session orders, client ids

pippin at owlriver.net
Thu Dec 18 12:01:05 EST 2008


Three updates from my message of yesterday:

1.  If you are still seeing problems, and in particular if your session
is ending with a 524 exit, please consider running the shim-081218.tgz
tarball I just released; I've added print statements at the point
where failure to read in account data is detected, to display some of
the program state at that point.  And, as I wrote yesterday, I would
appreciate it if you would post:

    a.  the text the shim displays when it exits; and
    b.  notes about any exceptional state
        with respect to the log files, e.g., 
        missing log directory, or missing log files; and/or
    c.  the results of running:
        bin/req.filter < log/shim2tws.bin
        bin/msg.filter < log/tws2shim.bin

2.  I claimed in my post of yesterday that the IB tws output:

    21:08:14:690 JTS-EWriter24: [8:23:39:1:0:0:0:ERR]
    Unable write to socket client{8} -

indicated a problem with the IB tws; my claim was based on the
likely false belief that you were seeing such a message before
the shim exits.  If, as I now suspect is more likely, it appears
after the shim exits, please keep in mind that this message is
a perfectly natural response to the socket close by the shim at
termination, and so is to be expected.  In this case, it's still
unclear whether there is any problem with the IB tws.

3.  You may be considering the use of tcpdump to diagnose
connection problems, based on offlist advice.  You are of course
free to use such tools as you wish in the course of fault analysis,
but please do not feel that I expect, or am asking you, to take
such an approach.  What I need is a copy of the text the shim
prints at termination, a note on whether the log directory and
the files in it exist, and the text of the binary log images
after translation by the filter scripts.
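
As a convenience, the log-related items above could be gathered
in one pass with something like the following.  This is only a
rough sketch: the function names collect_report and run_filter
are made up for illustration and are not part of the shim
distribution; the bin/ and log/ paths are the ones quoted earlier
in this message.  The text the shim prints at exit still has to
be copied from your terminal separately.

```shell
# Hypothetical helpers, not shipped with the shim.

# run_filter FILTER IMAGE: translate one binary log image, or say
# why that was not possible.
run_filter() {
    echo "== $1 < $2 =="
    if [ -r "$2" ]; then
        "$1" < "$2" 2>&1
    else
        echo "cannot read $2"
    fi
}

# collect_report [DIR]: DIR is the directory holding bin/ and log/
# (default: current directory).  The body runs in a subshell so the
# caller's working directory is left alone.
collect_report() (
    cd "${1:-.}" || exit 1
    echo "== log directory =="
    if [ -d log ]; then
        ls -l log
    else
        echo "log directory is missing"
    fi
    run_filter bin/req.filter log/shim2tws.bin
    run_filter bin/msg.filter log/tws2shim.bin
)
```

After loading these definitions into your shell, and from the
directory you run the shim in, usage would be along the lines of:

    collect_report . > report.txt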

I suspect that tcpdump will cost you more time than it's
worth.  If the shim is unable to open the binary image
logs, that in itself is a failure worthy of note, and
either a permissions problem on your end, or a bug in
the shim.  If those files are opened, and empty, that is
either an accurate picture of the socket stream received
to date, or again a bug in the shim, which latter case
I can check by the exit trace message.  If the files are
non-empty, then whether or not they're accurate --- and
inaccuracy is, again, worthy of note --- I can begin
debugging there.
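
The cases in the preceding paragraph can be told apart with
ordinary shell file tests.  A minimal sketch follows; the
function name classify_log is hypothetical, and the one-word
answers are just labels for the cases described above.

```shell
# Hypothetical helper, not shipped with the shim.
# classify_log FILE: print one word describing the state of FILE.
classify_log() {
    if [ ! -e "$1" ]; then
        echo "missing"        # never created: permissions or shim bug
    elif [ ! -r "$1" ]; then
        echo "unreadable"     # exists, but you cannot read it
    elif [ ! -s "$1" ]; then
        echo "empty"          # opened, nothing logged so far
    else
        echo "non-empty"      # ready for the filter scripts
    fi
}
```

For example:

    classify_log log/shim2tws.bin
    classify_log log/tws2shim.bin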

Use of tcpdump is often the best starting point for sysadmins.
It makes sense when you lack sources, or wish to assign blame.
Since I can't do much about the IB tws, and intend to fix
whatever problem I can find in the shim, that is not my focus here.

Also, if you do run tcpdump with the IB tws and the shim on the
same machine, keep in mind that on Linux their traffic travels
over the loopback interface, so a capture on your ethernet
interface will show nothing.  Please do not move the IB tws to
another machine to work around this; that is itself a change in
the test environment that may well hide/change the failure
symptoms you see.  I'd like to ask that you keep your test
approach as before, so I can continue to focus on your existing
problem.

Hope that you haven't lost too much time on a wild goose chase.
I'm sure that, once given good traces, I can track this problem
down.  I suspect that it's a combination of dilatory answers from 
the IB tws, and overly sensitive behavior by the shim.  We'll
just have to endure the former, and I intend to fix the latter.




