Datasets


GenePattern Tutorial
Datasets and files used in the GenePattern Tutorial
gp_tutorial_files.tar.gz Tutorial files (gzipped tar archive)
ALL/AML
Molecular Classification of Cancer: Class Discovery and Class Prediction by Gene Expression Monitoring
Golub and Slonim et al., 1999
all_aml.tar.gz All files (gzipped tar archive)
   all_aml_train.gct Training data set (GCT format)
   all_aml_train.res Training data set (RES format)
   all_aml_train.cls Training class vector
   all_aml_test.gct Test data set (GCT format)
   all_aml_test.res Test data set (RES format)
   all_aml_test.cls Test class vector
   Golub_et_al_1999.R Methodology (R script)
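
The .gct and .cls files listed above are plain text, so they can be inspected without GenePattern. The following is a minimal sketch in Python, assuming the GCT 1.2 layout (a `#1.2` version line, a dimensions line, a tab-separated header beginning with Name and Description, then one row per feature) and the categorical CLS layout (a counts line, a `#`-prefixed line of class names, then one label per sample). The function names `read_gct` and `read_cls` and the tiny inline sample data are hypothetical, chosen only for illustration; they are not part of GenePattern or of the datasets.

```python
import io


def read_cls(handle):
    """Parse a categorical CLS class vector from an open text file."""
    counts = handle.readline().split()
    n_samples, n_classes = int(counts[0]), int(counts[1])
    # The second line is '#' followed by whitespace-separated class names.
    class_names = handle.readline().strip().lstrip("#").split()
    # The third line holds one label per sample, in sample order.
    labels = handle.readline().split()
    if len(class_names) != n_classes:
        raise ValueError("class name count does not match CLS header")
    if len(labels) != n_samples:
        raise ValueError("label count does not match CLS header")
    return class_names, labels


def read_gct(handle):
    """Parse a GCT 1.2 expression table from an open text file."""
    version = handle.readline().strip()
    if version != "#1.2":
        raise ValueError("unsupported GCT version: %r" % version)
    dims = handle.readline().split()
    n_rows, n_cols = int(dims[0]), int(dims[1])
    # Header columns: Name, Description, then one column per sample.
    samples = handle.readline().rstrip("\r\n").split("\t")[2:]
    features = []
    for line in handle:
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or truncated trailing lines
        values = [float(v) for v in fields[2:]]
        features.append((fields[0], fields[1], values))
    if len(samples) != n_cols or len(features) != n_rows:
        raise ValueError("table shape does not match GCT header")
    return samples, features


# Made-up miniature inputs (not taken from the real data sets).
demo_cls = io.StringIO("3 2 1\n# ALL AML\n0 0 1\n")
demo_gct = io.StringIO(
    "#1.2\n2\t3\n"
    "Name\tDescription\ts1\ts2\ts3\n"
    "geneA\tfirst\t1\t2\t3\n"
    "geneB\tsecond\t4.5\t5\t6\n"
)
class_names, labels = read_cls(demo_cls)
samples, features = read_gct(demo_gct)
```

With a real file, the same functions would be called on an opened handle, for example `with open("all_aml_train.cls") as fh: read_cls(fh)`. The .res files use a different layout and are not handled by this sketch.
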
Lymphoma Outcome Prediction
Diffuse Large B-Cell Lymphoma Outcome Prediction by Gene Expression Profiling and Supervised Machine Learning
Shipp et al., 2002
dlbcl.tar.gz All files (gzipped tar archive)
   dlbcl_vs_fscc.res DLBCL vs. FL morphology data set
   dlbcl_vs_fscc.cls DLBCL vs. FL morphology class vector
   dlbcl_outcome.res DLBCL outcome data set
   dlbcl_outcome.cls DLBCL outcome class vector
Global Cancer Map
Multi-Class Cancer Diagnosis Using Tumor Gene Expression Signatures
Ramaswamy et al., 2001
GCM.tar.gz All files (gzipped tar archive)
   GCM_Total.res Complete data set
   GCM_Total.cls Class vector for complete set
   GCM_Normal.res Normal samples
GISTIC
Assessing the significance of chromosomal aberrations in cancer: Methodology and application to glioma
Beroukhim et al., 2007
GISTIC_Hind_subset.zip Affymetrix 50K Hind chip files (12 of 187 samples, 166 MB)
sample_info_subset.txt Sample information file
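
The archives on this page come in two container formats (.tar.gz and .zip), and it can be useful to check their contents before unpacking, especially for the larger GISTIC subset. The following is a minimal sketch using only the Python standard library, assuming the archive has already been downloaded to a local path. The helper name `list_archive_members` is hypothetical.

```python
import tarfile
import zipfile


def list_archive_members(path):
    """Return the names of the regular files stored in a .zip or tar archive."""
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as zf:
            # Directory entries in a zip end with '/'; keep files only.
            return [n for n in zf.namelist() if not n.endswith("/")]
    if tarfile.is_tarfile(path):
        # "r:*" lets tarfile detect the compression (gzip here) on its own.
        with tarfile.open(path, "r:*") as tf:
            return [m.name for m in tf.getmembers() if m.isfile()]
    raise ValueError("not a zip or tar archive: %s" % path)
```

For example, `list_archive_members("all_aml.tar.gz")` would be expected to return the file names shown in the ALL/AML listing, though the exact paths inside the archive are not stated on this page.
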