Click here to download the algorithm and the data used in the paper.
INSTRUCTIONS FOR RUNNING THE CODE
- Extract the contents of the zip file into a folder named "PAPER_CODE".
- Run MATLAB.
- Include "PAPER_CODE" and all its sub-directories in the path.
- Run the script "Initialize.m" and specify the organism of interest: type "1" for Arabidopsis or "2" for Yeast. The script reads in the microarray data, the GO tree structure and the gene annotations.
- To run the algorithm, run the script "Run_and_display.m". IMPORTANT: Please ensure that "Initialize.m" has been run before executing this script.
- When prompted, enter the GO identifier of the functional category for which experiments should be selected. For example, enter "51726" for the GO category "GO:0051726, Regulation of Cell cycle".
- When prompted, enter the threshold for the t-test p-value, e.g. 0.05.
- When prompted, enter the number of experiments K to be used as the seed, e.g. K = 15.
- The index numbers of the selected experiments will be printed to the screen and also written to a text file named "selectedExperiments.txt".
- The program will also output ROC curves comparing the performance of the selected set of experiments with the performance obtained when all experiments in the collection are used. The ROC curves are saved as "*.png" files.
- The 1-AUC values from the two ROC curves are recorded in the text file "AUCreport.txt". The first column gives the selected GO identifier, the second column gives the index number of the selected set of experiments, the third column gives the 1-AUC when all experiments are used, and the fourth column gives the 1-AUC when the selected experiments are used.
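
The steps above can be sketched as a single MATLAB session. This is only an illustration, not part of the distributed code: it assumes that "PAPER_CODE" sits in the current working directory and that the two output files are written to the current folder, neither of which is stated in the instructions. The variable name codeDir is introduced here for convenience.

```matlab
% Assumption: the zip file was extracted to a folder named PAPER_CODE
% inside the current working directory. Adjust codeDir otherwise.
codeDir = fullfile(pwd, 'PAPER_CODE');

% Add PAPER_CODE and all of its sub-directories to the MATLAB path.
addpath(genpath(codeDir));

% Read in the microarray data, GO tree structure and gene annotations.
% The script prompts for the organism: 1 = Arabidopsis, 2 = Yeast.
Initialize

% Run the algorithm. The script prompts for the GO identifier
% (e.g. 51726), the t-test p-value threshold (e.g. 0.05) and the
% number of seed experiments (e.g. 15).
Run_and_display

% Print the raw contents of the two output files.
% Assumption: both files are written to the current folder.
type('selectedExperiments.txt')
type('AUCreport.txt')
```

Using type to print the files avoids making any assumption about how the columns of "AUCreport.txt" are delimited.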