Wednesday 23 February 2011

FCMP: BKf_BH Controlling the FDR


 Only once have I watched cutting-edge statistical theory make its way into a new SAS procedure while actually knowing the people involved. A long time ago, when fish had just started to explore the idea of dry-land living, I studied under Yoav Benjamini, who together with Yosi Hochberg suggested (and named) the FDR (false discovery rate) as an alternative to the FWE (familywise error rate). As with many revolutionary concepts, it was not accepted immediately. But once a few articles had been written, Yoav was contacted by SAS, who asked a few questions. It was some time ago, so I am a little hazy on the details, but if my memory serves me right, I saw the actual letter. Subsequently Wolfinger (I think) attended the MCP conference in Tel-Aviv and met Yoav, Yosi and Daniel Yekutieli.

 Sometime later, I was working at SAS/Austria (or was it already SAS/Denmark?) when a new procedure, PROC MULTTEST, was introduced. I was excited, as it was relevant to my research and also included measures suggested by people I knew: Yosi Hochberg and Yoav Benjamini. I later discovered that Yosi's measure may also be used in the MEANS statement of PROC ANOVA and PROC GLM (the GT2 option).

 One thing jarred with me at the time. The option in PROC MULTTEST that requests the step-up FDR-controlling procedure was called FDR. In my opinion it should have been BH, for the Benjamini-Hochberg procedure. Apart from giving them their due and being consistent with the naming of options such as Tukey, Dunnett etc., I knew of at least one other procedure at the time that controlled the FDR, and I expected more. Moreover, the name conflated the procedure with the measure it controls.

 Nowadays the FDR measure is mainstream, especially since its relevance to bioinformatics was recognized. More powerful procedures to control the FDR have been proposed, and some are implemented in SAS 9.2. But I still like the elegance of the BH procedure.

 It was only natural that I chose to explore the FCMP functionality through the prism of the FDR (download).

To try it, run:

%Let n=3;
data test;
 * Array parameters to subroutine calls must be temporary arrays;
 array a(&n.) _temporary_;
 array b(&n.);
 array c(&n.) _temporary_;
 array d(&n.);
 input b1 b2 b3;
 do i=1 to &n.; a[i]=b[i]; end;
 call BKs_BH(a,c);
 do i=1 to &n.; d[i]=c[i]; end;
 * Drop the loop counter so that only b1-b3 and d1-d3 are printed;
 drop i;
 datalines;
 0.05 0.01 0.95
 5.00 0.10 0.01
 0.05 0.05 0.05
 0.03 0.02 0.01
 run;
proc print;run;

Output
Obs     b1      b2      b3       d1      d2      d3

1     0.05    0.01    0.95    0.075    0.03    0.95
2     5.00    0.10    0.01    1.000    0.15    0.03
3     0.05    0.05    0.05    0.050    0.05    0.05
4     0.03    0.02    0.01    0.030    0.03    0.03
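
The adjusted values d1-d3 above are consistent with the usual Benjamini-Hochberg adjusted p-values. I am assuming that is what BKs_BH computes; the subroutine source is in the download and is not reproduced here. The notation below is mine, with p_(1) <= ... <= p_(n) for the sorted raw p-values:

\[
\tilde{p}_{(i)} \;=\; \min\Bigl(1,\; \min_{j \ge i} \frac{n}{j}\, p_{(j)}\Bigr), \qquad i = 1, \dots, n .
\]

For the first row, n = 3 and the sorted values are 0.01, 0.05, 0.95:

\[
\tfrac{3}{1}\cdot 0.01 = 0.03, \qquad \tfrac{3}{2}\cdot 0.05 = 0.075, \qquad \tfrac{3}{3}\cdot 0.95 = 0.95 .
\]

These are already non-decreasing, so the running minimum changes nothing, and mapping back to the original order gives (0.075, 0.03, 0.95). For the fourth row the scaled values are 3(0.01) = 0.03, 1.5(0.02) = 0.03 and 1(0.03) = 0.03, so every adjusted value is 0.03. In the second row the invalid input 5.00 is simply capped at 1 by the outer minimum.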


Run using Base SAS 9.2.