Because linear discriminant analysis is equivalent to a two-step process (regress first, then run discriminant analysis on the fitted values), it is easy to implement the so-called generalized discriminant analysis described in Ch. 12.4-12.6 of The Elements of Statistical Learning. The basic idea is to fit a regularized regression, for example ridge regression via PROC REG and its RIDGE= option, and then feed the predictions from that L2-regularized regression into the discriminant analysis in the next step. With PROC GLMSELECT, the L2 regularization can be replaced by an L1 (LASSO) regularization.
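In matrix terms, the regression step can be sketched as below. The notation is assumed here for illustration: Y is the N-by-K matrix of 0/1 class indicators, X is the centered covariate matrix, and \lambda is the ridge constant.

```latex
\hat{B}_\lambda = (X^\top X + \lambda I)^{-1} X^\top Y,
\qquad
\hat{Y}_\lambda = X \hat{B}_\lambda
```

The discriminant analysis in the second step is then run on the columns of \hat{Y}_\lambda in place of X, and \lambda = 0 gives the ordinary least-squares fit. PROC REG may apply its ridge constant on a standardized scale, so \lambda in this sketch need not match the RIDGE= value numerically.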
A piece of prototype code looks like this:
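/* Step 0: build 0/1 indicator columns for the class variable.          */
/* X stands for any numeric variable already in the data set; GLMMOD    */
/* only needs it to complete the MODEL statement.                       */
PROC GLMMOD DATA=&yourdata OUTDESIGN=&design;
  CLASS &dep_var;
  MODEL X = &dep_var / NOINT;
RUN;
DATA &yourdata;
  MERGE &yourdata &design;      /* one-to-one merge, no BY statement */
  RENAME Col1-Col&k = Y1-Y&k;
RUN;
%LET deps = Y1-Y&k;  /* &k = number of classes, e.g. 5 for a 5-class problem */
/* Step 1: ridge regression of the indicators on the covariates.        */
/* The ridge estimates are written to OUTEST=beta; check that the       */
/* OUTPUT predictions reflect the ridge fit in your SAS release, and    */
/* score from beta with PROC SCORE if they do not.                      */
PROC REG DATA=&yourdata RIDGE=&minridge TO &maxridge BY 0.1 OUTEST=beta;
  MODEL &deps = &covars;
  OUTPUT OUT=predicted P=Yhat1-Yhat&k;   /* one prediction per indicator */
RUN;
QUIT;
/* Step 2: discriminant analysis on the fitted values. */
PROC DISCRIM DATA=predicted &options;
  CLASS &dep_var;
  VAR Yhat1-Yhat&k;
RUN;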
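For the L1 variant mentioned above, a minimal sketch with PROC GLMSELECT might look like the following. It assumes the indicator columns Y1-Y&k have already been created as in the prototype, and that GLMSELECT is run once per indicator because its MODEL statement takes a single response. The macro name lasso_scores, the work data sets _lasso1, _lasso2, ... and lasso_pred, the Yhat names, and the 5-fold cross-validation settings are all illustrative choices, not part of the original code.

```sas
/* Hypothetical helper: fit one LASSO regression per indicator column   */
/* Y1..Y&k and accumulate the fitted values Yhat1..Yhat&k in lasso_pred */
%macro lasso_scores(data=, k=, covars=);
  %local j in;
  %let in = &data;
  %do j = 1 %to &k;
    PROC GLMSELECT DATA=&in;
      MODEL Y&j = &covars
            / SELECTION=LASSO(CHOOSE=CV STOP=NONE) CVMETHOD=RANDOM(5);
      OUTPUT OUT=_lasso&j PREDICTED=Yhat&j;  /* keeps all input columns */
    RUN;
    %let in = _lasso&j;   /* next fit reads the data set just written */
  %end;
  DATA lasso_pred;
    SET &in;
  RUN;
%mend lasso_scores;

%lasso_scores(data=&yourdata, k=&k, covars=&covars);

/* Second step: discriminant analysis on the LASSO fitted values */
PROC DISCRIM DATA=lasso_pred &options;
  CLASS &dep_var;
  VAR Yhat1-Yhat&k;
RUN;
```

Chaining the OUT= data sets avoids reading and writing the same data set within one procedure call; the tuning choice (CHOOSE=CV here) would normally be adapted to the problem at hand.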