Department of Computer Science

Click here to download the algorithm and the data used in the paper


  1. Extract the contents of the zip file into a folder named "PAPER_CODE".
  2. Run MATLAB.
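  3. Add "PAPER_CODE" and all its sub-directories to the MATLAB path, for example by running addpath(genpath('PAPER_CODE')) from the folder that contains "PAPER_CODE".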
  4. Run the script "Initialize.m" and specify the organism of interest: type "1" for Arabidopsis or "2" for Yeast. This reads in the microarray data, the GO tree structure and the gene annotations.
  5. To run the algorithm, run the script "Run_and_display.m". IMPORTANT: Please ensure that "Initialize.m" has been run before executing this script.
  6. When prompted, enter the GO identifier of the functional category for which experiments should be selected. For example, enter "51726" for the GO category "GO:0051726, Regulation of cell cycle".
  7. When prompted, enter the threshold for the t-test p-value, for example 0.05.
  8. When prompted, enter the number of experiments, K, to be used as the seed, for example K = 15.
  9. The index numbers of the selected experiments are printed on the screen and also written to a text file named "selectedExperiments.txt".
  10. The program also outputs ROC curves comparing the performance of the selected set with the performance obtained when all experiments in the collection are used. The ROC curves are saved as "*.png" files.
  11. The 1-AUC values from the two ROC curves are recorded in the text file "AUCreport.txt". The first column gives the selected GO identifier, the second column the index number of the selected set of experiments, the third column the 1-AUC when all experiments are used, and the fourth column the 1-AUC when only the selected experiments are used.
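
The report described in step 11 can be post-processed outside MATLAB. The following is a minimal Python sketch, not part of the distributed code. It assumes that each line of "AUCreport.txt" holds the four columns separated by whitespace, and that the second column is a single token; neither detail is stated above, so check them against an actual report. The names AucRow, format_go_id and parse_auc_report are hypothetical helpers introduced only for this illustration. The sketch converts each 1-AUC value back to an AUC and writes the GO number in the usual zero-padded form, so "51726" becomes "GO:0051726".

```python
from typing import List, NamedTuple


class AucRow(NamedTuple):
    go_id: str           # GO identifier in "GO:0051726" form
    selected_set: str    # column 2, kept as text (its exact format is an assumption)
    auc_all: float       # AUC when all experiments are used
    auc_selected: float  # AUC when only the selected experiments are used


def format_go_id(number: str) -> str:
    """Zero-pad a GO number such as "51726" to the 7-digit form "GO:0051726"."""
    digits = number.strip()
    if digits.upper().startswith("GO:"):
        digits = digits[3:]
    return "GO:" + digits.zfill(7)


def parse_auc_report(text: str) -> List[AucRow]:
    """Parse report text, assuming four whitespace-separated columns per line."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank lines and lines that do not match the assumed layout
        go, selected, one_minus_all, one_minus_selected = fields
        rows.append(
            AucRow(
                go_id=format_go_id(go),
                selected_set=selected,
                auc_all=1.0 - float(one_minus_all),        # the report stores 1-AUC
                auc_selected=1.0 - float(one_minus_selected),
            )
        )
    return rows
```

To use it on a real report, read the file and pass its contents in, for example parse_auc_report(open("AUCreport.txt").read()). A row whose auc_selected is larger than its auc_all indicates that the selected experiments performed better than the full collection for that GO category.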