Posted on Saturday, September 15, 2012 at 12:29PM by The GenePattern Team
In cancer genomics, copy number change is one of the hallmarks of the genetic instability common to most human cancers, and loss of heterozygosity (LOH) of tumor suppressor genes is a crucial step in the development of sporadic and hereditary cancer (Monti, 2005). Using modules available in GenePattern, you can compute SNP copy number and LOH from Affymetrix SNP chip data for paired target/normal samples and then view the results in the Integrative Genomics Viewer (IGV). The computation uses the following modules, with IGV as the final step for viewing the results:
SNPFileCreator converts the .CEL files from an Affymetrix array into a GenePattern .SNP file. Raw data for the probes in each SNP probe set are converted to a single intensity value per SNP using one of four modeling algorithms: Average Difference, PM/MM Difference Model (dChip, the default), Median Probe, or Trimmed Mean. Note that processing time for this module is often 30 minutes or more, depending on the speed of the server, the size of the dataset, and available memory. At least 2 GB of memory is needed to run most SNPFileCreator jobs.
For more information about SNPFileCreator please see the SNPFileCreator Documentation
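As a rough illustration of what "a single intensity value per SNP" means, the sketch below summarizes one probe set with a median and with a trimmed mean. It is not the SNPFileCreator implementation: the function names, the flat list of probe intensities, and the 10% trim fraction are all assumptions made for the example.

```python
from statistics import mean, median


def median_probe(intensities):
    """Summarize a probe set by its median intensity."""
    return median(intensities)


def trimmed_mean(intensities, trim_fraction=0.1):
    """Summarize a probe set by the mean after dropping the lowest and
    highest `trim_fraction` of values (hypothetical default of 10%)."""
    if not 0 <= trim_fraction < 0.5:
        raise ValueError("trim_fraction must be in [0, 0.5)")
    ordered = sorted(intensities)
    k = int(len(ordered) * trim_fraction)  # values dropped at each end
    return mean(ordered[k:len(ordered) - k])


# Made-up intensities for one SNP probe set; 900.0 plays the role of an outlier.
probe_set = [100.0, 120.0, 110.0, 115.0, 900.0, 105.0, 95.0, 108.0, 112.0, 90.0]
print(median_probe(probe_set))
print(trimmed_mean(probe_set))
```

Both summaries are far less sensitive to the single outlying probe than a plain mean would be, which is the usual motivation for offering them.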
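If your dataset includes samples from male donors, run the XChromosomeCorrect module on the output of SNPFileCreator to correct intensity values for SNPs on the X chromosome. For each sample from a male donor, the module doubles the intensity value for SNPs on the X chromosome. Because males carry a single X chromosome, doubling puts those values on the same scale as SNPs that are present in two copies.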
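The correction itself is simple arithmetic. A minimal sketch, assuming intensities are held in nested dictionaries (the data layout, function name, and sample names are hypothetical and not the module's own code):

```python
def x_chromosome_correct(intensities, snp_chromosome, sample_gender):
    """Return a copy of `intensities` with X-chromosome SNPs doubled for males.

    intensities:    {sample: {snp_id: intensity}}
    snp_chromosome: {snp_id: chromosome name, e.g. "1" or "X"}
    sample_gender:  {sample: "M" or "F"}
    """
    corrected = {}
    for sample, by_snp in intensities.items():
        is_male = sample_gender[sample] == "M"
        corrected[sample] = {
            snp: value * 2 if is_male and snp_chromosome[snp] == "X" else value
            for snp, value in by_snp.items()
        }
    return corrected


# Made-up example: one male and one female sample, one autosomal and one X SNP.
intensities = {
    "tumor_1": {"snp_a": 400.0, "snp_x": 150.0},
    "tumor_2": {"snp_a": 380.0, "snp_x": 310.0},
}
snp_chromosome = {"snp_a": "1", "snp_x": "X"}
sample_gender = {"tumor_1": "M", "tumor_2": "F"}

corrected = x_chromosome_correct(intensities, snp_chromosome, sample_gender)
print(corrected["tumor_1"]["snp_x"])  # male X SNP, doubled
print(corrected["tumor_2"]["snp_x"])  # female X SNP, unchanged
```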
The sample information file describes the SNP arrays. It must be tab-delimited, include a column labeled Gender that contains a value of M or F for each sample, and include target/normal paired samples for copy number and LOH determination. (More information on file formats can be found here.)
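Before submitting a job, it can help to check the sample information file locally. The sketch below verifies only the two requirements stated above that are easy to test mechanically: tab-delimited columns and a Gender column holding M or F. The function name and the other column names in the example (Array, Sample) are assumptions for illustration; consult the file format documentation for the full set of required columns.

```python
import csv
import io

VALID_GENDERS = {"M", "F"}


def read_sample_info(handle):
    """Parse a tab-delimited sample information file and validate Gender."""
    reader = csv.DictReader(handle, delimiter="\t")
    if not reader.fieldnames or "Gender" not in reader.fieldnames:
        raise ValueError("sample information file needs a 'Gender' column")
    rows = []
    # Data rows start on line 2; line 1 is the header.
    for line_number, row in enumerate(reader, start=2):
        gender = (row["Gender"] or "").strip()
        if gender not in VALID_GENDERS:
            raise ValueError(
                f"line {line_number}: Gender must be M or F, got {gender!r}"
            )
        rows.append(row)
    return rows


# Hypothetical file contents; only the Gender column comes from the text above.
example = io.StringIO(
    "Array\tSample\tGender\n"
    "array_01\tpatient1_tumor\tM\n"
    "array_02\tpatient1_normal\tM\n"
)
for row in read_sample_info(example):
    print(row["Array"], row["Gender"])
```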
For more information about XChromosomeCorrect please see the XChromosomeCorrect Documentation
CopyNumberDivideByNormals computes the raw copy number of each target SNP by dividing its intensity value by the mean intensity value of all normal SNPs. This calculation is referred to as copy number normalization or normalization with respect to normals.
For more information about CopyNumberDivideByNormals please see the CopyNumberDivideByNormals Documentation
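The normalization can be sketched in a few lines. This example assumes the mean is taken per SNP across the normal samples, and it returns the plain target/normal ratio with no further scaling; both choices are assumptions, so check the module documentation for the exact formula. The function name and data layout are hypothetical.

```python
from statistics import mean


def divide_by_normals(target, normals):
    """Divide each target SNP intensity by the mean normal intensity.

    target:  {snp_id: intensity} for one target sample
    normals: list of {snp_id: intensity}, one dict per normal sample
    """
    ratios = {}
    for snp, value in target.items():
        normal_mean = mean(sample[snp] for sample in normals)
        ratios[snp] = value / normal_mean
    return ratios


# Made-up intensities for two SNPs, one target and two normal samples.
target = {"snp_1": 300.0, "snp_2": 100.0}
normals = [
    {"snp_1": 200.0, "snp_2": 200.0},
    {"snp_1": 100.0, "snp_2": 200.0},
]
ratios = divide_by_normals(target, normals)
print(ratios)  # ratio above 1 suggests gain, below 1 suggests loss
```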
The LOHPaired module detects loss of heterozygosity (LOH). It takes as input a GenePattern .SNP file that contains paired normal-target samples with genotype calls. (LOHPaired accepts only non-allele-specific .SNP files, that is, .SNP files that contain one intensity value per probe.) It returns as output a GenePattern .LOH file that contains, for each probe, the LOH calls for each array pair.
LOH call values are as follows.
Call | Value |
---|---|
L | LOH: AB in normal and A or B in tumor |
R | Retention: AB in both normal and tumor or No Call in normal and AB in tumor |
C | Conflict: A or B in normal and AB in tumor |
N | Non-informative call: A or B in normal. No call: No Call in normal or tumor |
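The table can be read as a small decision function over the normal and tumor genotype calls. The sketch below is one interpretation, not the LOHPaired source: the genotype strings ("A", "B", "AB", "NoCall") are a hypothetical encoding, and where two rows overlap (for example, No Call in normal with AB in tumor) the more specific row is assumed to win.

```python
NO_CALL = "NoCall"          # hypothetical encoding of a missing genotype call
HOMOZYGOUS = {"A", "B"}
VALID = HOMOZYGOUS | {"AB", NO_CALL}


def loh_call(normal, tumor):
    """Map a normal/tumor genotype pair to an LOH call: L, R, C, or N."""
    if normal not in VALID or tumor not in VALID:
        raise ValueError(f"unexpected genotype pair: {normal!r}, {tumor!r}")
    if normal == "AB":
        if tumor in HOMOZYGOUS:
            return "L"      # heterozygous in normal, one allele lost in tumor
        if tumor == "AB":
            return "R"      # heterozygosity retained
        return "N"          # tumor has no call
    if normal == NO_CALL:
        return "R" if tumor == "AB" else "N"
    # Normal is homozygous (A or B).
    return "C" if tumor == "AB" else "N"


for pair in [("AB", "A"), ("AB", "AB"), ("A", "AB"), ("B", "B"), (NO_CALL, "AB")]:
    print(pair, loh_call(*pair))
```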
The Integrative Genomics Viewer (IGV) is a high-performance visualization tool for interactive exploration of large, integrated datasets. It supports a wide variety of data types and provides easy access to genomes and datasets hosted by the Broad Institute.
The following table lists the track line specifiers and how IGV handles each one:
Specifier | Value | Description |
---|---|---|
name | track label | Track name (ignored when used in the IGV file format) |
description | center label | Currently ignored |
visibility | full \| dense \| hide | Currently ignored |
color | RRR,GGG,BBB | Color for positive values in all tracks |
altColor | RRR,GGG,BBB | Color for negative values in all tracks |
priority | N | Currently ignored |
autoScale | on \| off | Currently ignored; all tracks autoscale unless an explicit data range is defined (e.g., by including the viewLimits specifier). |
gridDefault | on \| off | Currently ignored |
maxHeightPixels | max:default:min | Default and min are supported; max is currently ignored |
graphType | bar \| points \| heatmap | Bar chart, scatter plot, or heatmap. IGV only: The heatmap value is an IGV addition to the WIG specification. |
midRange | x:y | Defines the neutral range for a three-color heatmap. Values in this range are rendered with the midColor value, which is white by default. Example: midRange=20:80. IGV only: This specifier is an IGV addition to the WIG specification. |
midColor | RRR,GGG,BBB | Color to use in the "mid range" of a heatmap. Example: midColor=0,0,150. IGV only: This specifier is an IGV addition to the WIG specification. |
viewLimits | lower:upper | Defines the data range |
yLineMark | real-value | Currently ignored |
yLineOnOff | on \| off | Currently ignored |
windowingFunction | maximum \| minimum \| mean | Function that summarizes the values in a window of data represented by one pixel |
smoothingWindow | off \| [MATKC:2-16] | Currently ignored |
coords | 0 \| 1 | Indicates whether the file uses 0-based or 1-based coordinates. The UCSC specification uses 1-based coordinates for WIG files and 0-based coordinates for BED files. If data looks off by one, check for a possible 0-based vs. 1-based coordinate issue. IGV only: This specifier is an IGV addition to the WIG specification. |
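To show how these specifiers fit together, here is a small helper that assembles a track line from key=value pairs. The helper name and the `prefix` parameter are hypothetical, and the sketch assumes the usual space-separated `key=value` layout with quotes around values that contain spaces; check the IGV User Guide for the exact prefix your file format expects.

```python
def format_track_line(prefix="track", **specifiers):
    """Build a track line such as: track name="My Track" graphType=heatmap

    `prefix` is the leading token of the line (an assumption; adjust it to
    match the file format you are writing).
    """
    parts = [prefix]
    for key, value in specifiers.items():
        text = str(value)
        if " " in text:
            text = f'"{text}"'  # quote values that contain spaces
        parts.append(f"{key}={text}")
    return " ".join(parts)


# A three-color heatmap whose neutral range 20-80 is drawn in dark blue.
line = format_track_line(
    name="Copy Number",
    graphType="heatmap",
    midRange="20:80",
    midColor="0,0,150",
)
print(line)
```

Keyword arguments keep their order in the output, so the line reads in the same order the specifiers were given.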
To launch IGV and view your Copy Number and/or LOH data:
For more information on navigating or displaying data in IGV please see the IGV User Guide.