Saturday, September 22, 2007

Table lookup, all techniques

Table lookup is a fundamental task SAS programmers are performing everyday. Here I demonstrate all possible cases in Table Lookup and corresponding efficient SAS code.

1. The simplest case is that lookup table table has unique ID variable while master table is pre-sorted. In such case, a MERGE BY statement is more efficient than enough. Alternatively, if lookup table has unique ID variable but master table is too large that pre-sort is not a good idea (such as a billion-record master table), then if you have SAS v9, hash table is a good choice, or we can use KEY= option on the lookup table in a nested SET statement, which is the best solution in both versions because hash table is slower than setting an indexed data.

Using hash table, we have:
The data:

uniq_data

Using DataStep Component Object:

uniq_component_obj

Using SET KEY=:

uniq_set_key


(to be continued...)

0 comments: