Sunday, September 16, 2007

Tracing Events, question from Statistics board @mitbbs.com

Update:
This problem turns out to be a Deep First Search (DFS) problem for graph structure. For a general solution of DFS, see my blog post @ Here .

**********************************************************************************
Tracing events.
The objective of this work is to identify total length of streams identified by ID that are linked by nodes, reversely from TONODE to FROMNODE, all the way to the final node that are not connected with any other streams. If STARTFLAG or BRANCH is 1, no further reverse travel is necessary.

data:

ID LENGTHKM FROMNODE TONODE STARTFLAG BRANCH

T 0.43 773501327 773500764 0 0
kk 1.19 773501293 773501327 0 1
S 0.82 773500791 773501327 1 0
R 4.53 773500796 773500769 0 0
P 3.47 773500809 773500796 0 0
O 1.02 773500821 773500809 0 0
nn 0.86 773500836 773500821 0 1
N 1.08 773500836 773500821 0 0
M 1.14 773500218 773500836 1 0
hh 0.89 773500807 773500809 0 1
Q 0.29 773500778 773500796 0 0
ll 0.18 773501344 773500778 0 1
L 0.62 773501344 773500778 0 0
K 0.09 773501293 773501344 0 0
J 1.78 773500800 773501293 0 0
H 0.43 773500807 773500800 0 0
ff 0.40 773500830 773500807 0 1
I 0.31 773500814 773500800 0 0
gg 0.42 773500813 773500814 0 1
G 0.56 773500813 773500814 0 0
F 0.17 773500830 773500813 0 0
E 2.10 773500833 773500830 0 0
dd 0.14 773500845 773500833 0 1
D 0.44 773500845 773500833 0 0
C 1.36 773500879 773500845 0 0
B 3.42 773500403 773500879 1 0
A 3.79 773500885 773500879 1 0


sample of expected results:

Total T = T + kk + S;
Total R = sum (R ---- A);
Total O = O + N + nn + M;
Total P = P + hh + total O;
Total H = H + ff;
Total I = G + gg + F + E --- A;

My solution using hash object is below:




data



program



result


After all streams that are from the same source are identified and flagged, we can simply calculate the sum of length by ID group.

0 comments: