GenePattern Team Blog

RNA-seq QC in GenePattern

Posted on Tuesday, March 05, 2013 at 11:28AM by The GenePattern Team

Overview

After aligning and/or assembling your RNA-seq data, take a closer look at the content of the result files before continuing with further analysis, in part because what you find may indicate how best to analyze your data.

GenePattern provides modules that calculate Quality Control (QC) metrics such as depth of coverage, continuity of coverage, duplication rate, expression rates, strand specificity, and GC content, among others.

These metrics can help you prevent, or better understand, common RNA-seq errors stemming from sources such as read length, data quality, sample preparation, or the number of reads in the data.
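As a rough illustration of two of the metrics named above, here is a minimal Python sketch of GC content and duplication rate. It is not how the GenePattern modules compute them: the function names are hypothetical, reads are assumed to be plain sequence strings, and a duplicate is assumed to be any read that shares a (reference, position, strand) tuple with an earlier read. Production tools take more information into account.

```python
from collections import Counter


def gc_content(sequences):
    """Return the fraction of G/C bases among all A/C/G/T bases in the reads."""
    gc = 0
    total = 0
    for seq in sequences:
        s = seq.upper()
        gc += s.count("G") + s.count("C")
        # Ambiguous bases such as N are left out of the denominator.
        total += sum(s.count(base) for base in "ACGT")
    return gc / total if total else 0.0


def duplication_rate(alignments):
    """Return the fraction of reads that repeat an earlier alignment.

    Each alignment is a hashable (reference, position, strand) tuple.
    This simplified definition is an assumption made for illustration.
    """
    counts = Counter(alignments)
    total = sum(counts.values())
    # Within each group of identical alignments, all but one count as duplicates.
    duplicates = sum(n - 1 for n in counts.values())
    return duplicates / total if total else 0.0
```

For example, `gc_content(["GGCC", "AATT"])` evaluates to 0.5, and four alignments of which two share the same tuple give a duplication rate of 0.25.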

Modules in GenePattern

  • Picard.AddOrReplaceReadGroups
  • SortSam
  • Picard.CreateSequenceDictionary
  • Picard.ReorderSam
  • Picard.MarkDuplicates
  • SAMtools.FastaIndex
  • RNAseqMetrics

The following decision diagram illustrates a suggested workflow. This workflow is discussed in further detail...



Creating a GenePattern Module

Posted on Saturday, December 15, 2012 at 11:36AM by The GenePattern Team

The following tutorial shows you how to create a new GenePattern module (in GenePattern 3.4 and up). Only the GenePattern team can create or install modules on the GenePattern public server. Therefore, to create a module, you need to have a local GenePattern server installed (see the download and installation page). You may also be interested in the video tutorial: Create a module in GenePattern.

In this tutorial, you will create a module named log_transform. The module invokes a Perl script, log_transform.pl, which log-transforms all positive values in a data set and sets all negative or zero values to zero. Before you begin, download the Perl script and its documentation.
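For orientation, the transformation described for log_transform.pl can be sketched in Python as follows. This is not the Perl script itself, and the logarithm base is not stated here; base 2 is an assumption made for the example, as is the function name.

```python
import math


def log_transform(values):
    """Log-transform positive values; map zero and negative values to zero.

    Base 2 is assumed here for illustration only.
    """
    return [math.log2(v) if v > 0 else 0.0 for v in values]


def log_transform_rows(rows):
    """Apply the same rule to every row of a numeric matrix (list of lists)."""
    return [log_transform(row) for row in rows]
```

With these assumptions, `log_transform([8, 1, 0, -4])` yields `[3.0, 0.0, 0.0, 0.0]`. Note that positive values below 1 still produce negative results, since only the inputs, not the outputs, are clamped.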

In GenePattern, to create the log_transform module:

  1. Click Modules & Pipelines>New Module. GenePattern opens the Module Integrator window.
    • Give it a Name: LogTransform
  2. Enter the following information in the Details fields:
    • Description: Log transform a data (gct)...



Importing Data from caArray to GenePattern

Posted on Tuesday, October 30, 2012 at 12:36PM by The GenePattern Team

Overview

caArray is an open-source, web and programmatically accessible array data management system. caArray guides the annotation and exchange of array data using a federated model of local installations whose results are shareable across the cancer Biomedical Informatics Grid (caBIG®). caArray furthers translational cancer research through acquisition, dissemination and aggregation of semantically interoperable array data to support subsequent analysis by tools and services on and off the Grid.

To make it easier to import data from caArray repositories into GenePattern, a module named caArray2.3.0Importer is provided.

caArray2.3.0Importer

The CaArray2.3.0Importer connects to a caArray 2.x repository and retrieves all files with a given extension for a named experiment. The retrieved files are then collected into a single ZIP archive, which is returned as the module's output.
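The collection step described above can be pictured with a short Python sketch. It does not contact caArray or use the module's code: it assumes the experiment's files are already in a local directory, and the function name, its parameters, and the case-insensitive extension match are all hypothetical choices made for illustration.

```python
import zipfile
from pathlib import Path


def archive_by_extension(source_dir, extension, zip_path):
    """Bundle every file in source_dir with the given extension into one ZIP.

    Returns the names of the archived files, sorted. Matching ignores case,
    which is an assumption of this sketch.
    """
    source = Path(source_dir)
    suffix = "." + extension.lstrip(".").lower()
    matches = sorted(
        p for p in source.iterdir()
        if p.is_file() and p.suffix.lower() == suffix
    )
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in matches:
            # Store files flat, without their directory prefix.
            archive.write(path, arcname=path.name)
    return [p.name for p in matches]
```

A call such as `archive_by_extension("downloaded", "cel", "result.zip")` would write one archive containing only the matching files and leave everything else in the directory untouched.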

(Note: 2.x is used...


