Tuesday, January 31, 2012

Multi-Threaded Principal Component Analysis




SAS used to not support multithreading in PCA. I then figured out that its server version supports this functionality (see here). Today, I found that this multithreading capability is finally available in PC SAS v9.22.

The figure above indicates that all 4 threads on my PC are utilized. FYI, my PC uses an Intel CPU with 2 cores and 4 threads. This multithreading capability directly helps any work relying on SVD, because of the direct relationship between SVD and PCA (see here).
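To make the SVD/PCA relationship concrete, here is a short derivation sketch. The notation (X, n, p, U, Sigma, V, S) is introduced here for illustration and is not from the original post, and X is assumed to be column-centered.

```latex
% Assumption: X is an n x p data matrix whose columns have been centered.
\begin{align}
X &= U \Sigma V^{\top}
  && \text{(thin SVD of the centered data)} \\
S &= \frac{1}{n-1} X^{\top} X
   = V \, \frac{\Sigma^{2}}{n-1} \, V^{\top}
  && \text{(eigendecomposition of the sample covariance)} \\
\lambda_i &= \frac{\sigma_i^{2}}{n-1}, \qquad X V = U \Sigma
  && \text{(eigenvalues and principal component scores)}
\end{align}
```

So the columns of V are the principal axes, and the singular values give the component variances. PROC PRINCOMP analyzes the correlation matrix by default, in which case the same argument applies with the columns of X also scaled to unit variance.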

Notice that in order to observe the effect of multithreading by comparing real time with CPU time in the log, I/O should not be a bottleneck. That is why, in the code below, all output, whether to the screen or to data sets, is suppressed.

PS: It turns out that the multithreading capability is only available while SAS is building the SSCP/USSCP matrix in PROC PRINCOMP.
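A short sketch of why building the SSCP/USSCP matrix lends itself to threading: it is a sum over observations, so it can be accumulated over row blocks independently. The symbols (X, x_i, T, X_t, the mean vector) are my own notation, and the row-block split is an assumption for illustration; the post does not say how SAS actually partitions the work.

```latex
% Assumption: X is n x p with rows x_i^T, split by rows into T blocks X_1, ..., X_T.
\begin{align}
\mathrm{USSCP} &= X^{\top} X = \sum_{i=1}^{n} x_i x_i^{\top}
               = \sum_{t=1}^{T} X_t^{\top} X_t
  && \text{(each block can be summed separately)} \\
\mathrm{SSCP}  &= X^{\top} X - n \, \bar{x} \bar{x}^{\top},
  \qquad \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i
  && \text{(corrected for the mean)}
\end{align}
```

With 800 variables and 5,000 observations, as in the code below, each outer-product term is an 800 x 800 update, so this accumulation is a substantial share of the arithmetic.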



options fullstimer;
data _junky;
   length id x: 8;
   array x{800};
   do id=1 to 5E3;
      do j=1 to dim(x);
         x[j]=ranuni(0);
      end;
      output;
   end;
   drop j;
run;

proc princomp data=_junky noprint;
   var x:;
run;

3 comments:

CHARLIE HUANG said...

Nice finding! Great article -- it seems that SAS allocated the jobs equally to 4 cores.

How about you do a benchmark, like:
options nothreads ;
proc princomp data=_junky noprint;
var x:;
run;

options threads cpucount=2;
proc princomp data=_junky noprint;
var x:;
run;

options threads cpucount=3;
proc princomp data=_junky noprint;
var x:;
run;

options threads cpucount=4;
proc princomp data=_junky noprint;
var x:;
run;

Amit said...

Hi,
I am trying to find an implementation of statistical data depth in SAS. Do you have any SAS code to compute it? (I don't have IML.)

The seminal paper for this is "Multivariate Analysis by Data Depth: Descriptive Statistics, Graphics and Inference" by Regina Y. Liu, Jesse M. Parelius and Kesar Singh, The Annals of Statistics 1999, Vol. 27, No. 3, 783-858.

Liang Xie said...

Amit:
I am not aware of anyone implementing that algorithm, but when I have time, I can give it a try. I have been super busy recently.
Liang