Performs batch correction on a dataset containing multiple batches. Not intended for use with single-cell RNA-seq data.
Author: W. Evan Johnson (Boston University), Marc-Danie Nazaire (Broad Institute)
Contact:
Algorithm Version: 2.0
ComBat runs the "Combatting batch effects when combining batches of microarray data" R script, which uses an Empirical Bayes method to adjust for potential batch effects. Practical considerations limit the number of samples that can be run at a given time, so replicate samples are generated in ways that introduce non-biological differences, known as systematic "batch effects". For example, batch effects occur when combining replicates from different labs, array types, or platforms. In some cases, different lots of amplification reagent or the time of day of the assay have been shown to cause batch effects. ComBat's Empirical Bayes approach assumes that the phenomena causing batch effects affect many genes in similar ways, and it adjusts for these systematic batch biases that are common across genes.
Note: this module is not intended for use with single-cell RNA-seq data.
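To make the idea of a per-batch adjustment concrete, here is a minimal sketch of a plain location-and-scale correction for a single gene. It is deliberately simplified: it applies no Empirical Bayes shrinkage across genes and ignores covariates, so it is not the ComBat algorithm itself, only an illustration of the kind of batch bias ComBat targets. The function name `location_scale_adjust` and the example values are hypothetical.

```python
from statistics import mean, pstdev


def location_scale_adjust(values, batches):
    """Remove per-batch shifts in mean and spread for ONE gene.

    Simplified illustration only: no Empirical Bayes shrinkage and no
    covariates, so this is NOT the ComBat method.
    """
    grand_mean = mean(values)
    batch_stats = {}
    residuals = []
    for batch in set(batches):
        batch_vals = [v for v, b in zip(values, batches) if b == batch]
        b_mean = mean(batch_vals)
        # Fall back to 1.0 when a batch has no spread, to avoid dividing by zero.
        batch_stats[batch] = (b_mean, pstdev(batch_vals) or 1.0)
        residuals.extend(v - b_mean for v in batch_vals)
    # Spread of the data once the batch means have been removed.
    pooled_sd = pstdev(residuals)
    return [
        (v - batch_stats[b][0]) / batch_stats[b][1] * pooled_sd + grand_mean
        for v, b in zip(values, batches)
    ]


# Hypothetical gene measured in two batches; batch "B" is shifted upward.
gene = [1.0, 2.0, 3.0, 11.0, 12.0, 13.0]
labels = ["A", "A", "A", "B", "B", "B"]
adjusted = location_scale_adjust(gene, labels)
```

After the adjustment both batches share the same mean for this gene. ComBat goes further by pooling information across genes to stabilize the per-batch estimates, which matters most when batches are small.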
Johnson WE, Rabinovic A, and Li C. Adjusting batch effects in microarray expression data using Empirical Bayes methods. Biostatistics. 2007;8(1):118-127. doi:10.1093/biostatistics/kxj037.
Luo J, Schumacher M, Scherer A, et al. A comparison of batch effect removal methods for enhancement of prediction performance using MAQC-II microarray gene expression data. Pharmacogenomics J. 2010;10(4):278-291. doi:10.1038/tpj.2010.57.
Chen C, Grennan K, Badner J, et al. Removing batch effects in analysis of expression microarray data: An evaluation of six batch adjustment methods. PLoS One. 2011;6(2):e17238. doi:10.1371/journal.pone.0017238.
Name | Description |
---|---|
input file * | |
sample info file * | TXT plain text file matching batch and covariate information to sample identifiers. The first three column labels in the first row must be exactly "Array", "Sample", and "Batch", without spaces. |
covariate columns * | Subset of covariate columns to use in the analysis. Set this to "all", "none", or a list specifying one or more covariate columns from the sample info file, e.g. (4, 5, 7). |
absent calls filter | Filter applied to genes in a RES file: a gene is removed if it has absent calls in at least 1 - (absent calls filter) of the samples. Use a value between 0 and 1, or leave blank. For example, 0.8 removes a feature if at least 20% of samples have absent calls. |
create prior plots | Whether to generate prior probability distribution plots. Select "yes" with the parametric method or "no" with the non-parametric method. |
prior method | Method used to estimate the Empirical Bayes prior distributions, either parametric or non-parametric. |
output files * | |
* - required
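The arithmetic behind the absent calls filter can be sketched as follows. This is an illustration under stated assumptions, not the module's own code: it assumes RES-style call flags in which "A" marks an absent call, that a blank filter means no filtering, and that the boundary is inclusive ("at least"), as the parameter description states. The function name `is_removed` is hypothetical.

```python
from fractions import Fraction


def is_removed(calls, absent_calls_filter):
    """Return True if a feature would be dropped by the filter.

    calls: one call flag per sample; "A" is assumed to mean absent.
    absent_calls_filter: a value between 0 and 1, or None for blank.
    """
    if absent_calls_filter is None:
        return False  # assumption: a blank filter disables filtering
    # Exact fractions avoid floating-point surprises at the boundary
    # (for example, 1 - 0.7 is not exactly 0.3 in binary floating point).
    threshold = 1 - Fraction(str(absent_calls_filter))
    absent_fraction = Fraction(sum(1 for c in calls if c == "A"), len(calls))
    return absent_fraction >= threshold


# With a filter of 0.8, the threshold is 20% of samples.
two_of_ten_absent = ["A", "A"] + ["P"] * 8
one_of_ten_absent = ["A"] + ["P"] * 9
print(is_removed(two_of_ten_absent, 0.8))
print(is_removed(one_of_ten_absent, 0.8))
```

Under these assumptions the first feature (20% absent) is removed and the second (10% absent) is kept.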
Sample info file
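As an illustration of the layout described in the parameter table, the sketch below checks the three required column labels and resolves a covariate columns setting into column numbers. Several details are assumptions rather than documented behavior: the file is treated as tab-delimited, column numbers are taken to be 1-based so that covariates start at column 4, and the array names, sample names, and covariate labels ("Treatment", "Sex") are invented for the example. The helper names are hypothetical.

```python
import csv
import io

REQUIRED_LABELS = ["Array", "Sample", "Batch"]

# Hypothetical sample info file content (tab-delimited is an assumption).
EXAMPLE = (
    "Array\tSample\tBatch\tTreatment\tSex\n"
    "chip1.CEL\tS1\t1\tcontrol\tF\n"
    "chip2.CEL\tS2\t1\ttreated\tM\n"
    "chip3.CEL\tS3\t2\tcontrol\tM\n"
    "chip4.CEL\tS4\t2\ttreated\tF\n"
)


def read_sample_info(text):
    """Parse the file and verify the first three column labels."""
    rows = list(csv.reader(io.StringIO(text), delimiter="\t"))
    header, records = rows[0], rows[1:]
    if header[:3] != REQUIRED_LABELS:
        raise ValueError(
            'first three column labels must be exactly "Array", "Sample", "Batch"'
        )
    return header, records


def covariate_indices(selection, header):
    """Turn "all", "none", or a list such as "(4, 5, 7)" into column numbers.

    Column numbers are assumed to be 1-based, with covariates from column 4 on.
    """
    choice = selection.strip().lower()
    if choice == "none":
        return []
    if choice == "all":
        return list(range(4, len(header) + 1))
    columns = [int(token) for token in selection.strip("() ").split(",")]
    invalid = [c for c in columns if c < 4 or c > len(header)]
    if invalid:
        raise ValueError(f"not a covariate column: {invalid}")
    return columns


header, records = read_sample_info(EXAMPLE)
selected = covariate_indices("(4, 5)", header)
print([header[c - 1] for c in selected])
```

Validating the header before running the module is a cheap way to catch the most common formatting mistake, such as a label with a trailing space.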
Task Type: Preprocess & Utilities
CPU Type: any
Operating System: any
Language: R (v. 2.5.0)
Version | Release Date | Description |
---|---|---|
3.0 | 2014-06-03 | Updated documentation to HTML |
2.0 | 2014-03-26 | Updated to run on any OS |
1.0 | 2008-08-18 | Windows only version |