But when we have a large number of such variables and want a quick way to find out whether they collectively show predictive power, we may use the Partial Least Squares (PLS) method.
In SAS, we can compute the PLS scores and score new data this way:
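%macro PLSSCORE(dsn, XCenScale, XWeights, outdsn, prefix=_xscr_);
  /* dsn       : data set to be scored                            */
  /* XCenScale : centering/scaling table, one row per predictor   */
  /* XWeights  : X-weights table, one row per PLS factor          */
  /* outdsn    : output data set holding the scores               */
  /* prefix    : prefix for the names of the score variables      */
  %local covars;

  /* Fall back to the default prefix if a blank one was passed. */
  %if %length(&prefix) = 0 %then %let prefix=_xscr_;

  /* Collect the predictor names from the centering/scaling table. */
  proc sql noprint;
    select variable into :covars separated by ' '
    from &XCenScale;
  quit;

  /* One row per statistic, one column per predictor. */
  proc transpose data=&XCenScale out=&XCenScale._t;
    id variable;
  run;

  /* Label the rows for PROC SCORE; this assumes the mean column */
  /* comes before the standard deviation column in &XCenScale.   */
  data &XCenScale._t;
    if _n_=1 then _TYPE_='MEAN';
    else _TYPE_='STD';
    set &XCenScale._t;
  run;

  /* Mark each weight row as scoring coefficients and name its score. */
  /* Note that this step overwrites &XWeights in place.               */
  data &XWeights;
    length _TYPE_ $ 5;
    _TYPE_='PARMS';
    set &XWeights;
    _NAME_=cats("&prefix", _n_);
  run;

  /* Stack the MEAN, STD and PARMS rows into one scoring data set.  */
  /* The explicit _NAME_ length keeps long score names from being   */
  /* truncated to the length used in the transposed table.          */
  data Xscore/view=Xscore;
    length _TYPE_ $ 5 _NAME_ $ 32;
    set &XCenScale._t &XWeights;
  run;

  /* Standardize the predictors and apply the weights. */
  proc score data=&dsn score=Xscore type=PARMS out=&outdsn;
    var &covars;
  run;
%mend;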
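To make the arithmetic behind the macro concrete, here is a minimal sketch in plain Python (not SAS) of what the final scoring step amounts to, as I understand it: each predictor is centered and scaled with the stored mean and standard deviation, and each score is the weighted sum of the standardized values. The function name pls_scores and all the numbers are made up for illustration; they do not come from any real PLS fit.

```python
# Illustrative sketch only: standardize-then-weight scoring in plain Python.
# The function name and every number below are hypothetical.

def pls_scores(rows, means, stds, weights):
    """Standardize each row, then take its dot product with each weight vector."""
    scored = []
    for row in rows:
        # Center and scale every predictor with the stored mean and std.
        z = [(x - m) / s for x, m, s in zip(row, means, stds)]
        # One score per weight vector (one per PLS factor).
        scored.append([sum(zi * wi for zi, wi in zip(z, w)) for w in weights])
    return scored


means = [10.0, 2.0]                   # plays the role of the MEAN row
stds = [2.0, 0.5]                     # plays the role of the STD row
weights = [[0.6, 0.8], [0.8, -0.6]]   # plays the role of the PARMS rows
new_data = [[12.0, 2.5], [10.0, 2.0]]

scores = pls_scores(new_data, means, stds, weights)
for row in scores:
    print(row)
```

An observation sitting exactly at the means gets a score of zero on every factor, which is a handy sanity check when scoring new data.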
A lightweight yet detailed explanation of PLS and its application to data mining projects can be found in:
Pharmaceutical Statistics Using SAS: A Practical Guide, by Alex Dmitrienko, Christy Chuang-Stein, and Ralph B. D'Agostino, SAS Publishing, 2007.