Tissue 1 (anterior gills) - FR | Tissue 1 (anterior gills) - RF | Tissue 2 (posterior gills) - FR | Tissue 2 (posterior gills) - RF |
Tissue 3 (female + male gonads) - FR | Tissue 3 (female + male gonads) - RF | Tissue 4 (eye stalk + muscle) - FR | Tissue 4 (eye stalk + muscle) - RF |
Tissue 5 (1st + 2nd Zoea stage) - FR | Tissue 5 (1st + 2nd Zoea stage) - RF | Tissue 6 (3rd Zoea stage) - FR | Tissue 6 (3rd Zoea stage) - RF |
Trinity is a platform for de novo transcriptome assembly from RNA-seq data in the absence of a reference genome. It partitions RNA-seq data into many independent de Bruijn graphs (ideally one graph per expressed gene) and uses parallel computing to reconstruct full-length transcripts, including alternatively spliced isoforms, from these graphs.
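For readers unfamiliar with de Bruijn graphs, the idea can be sketched in a few lines of Python. This is a toy illustration only, not Trinity's implementation: the function name build_de_bruijn, the reads, and the k-mer size are all made up for the example. Each read is cut into overlapping k-mers, and every k-mer becomes an edge from its first k-1 bases to its last k-1 bases.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer to the list of (k-1)-mers that follow it in the reads."""
    graph = defaultdict(list)
    for read in reads:
        # Slide a window of width k along the read.
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # The k-mer is an edge from its prefix to its suffix.
            graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)


# Two hypothetical overlapping reads (not taken from the dataset).
reads = ["AATCTT", "TCTTGG"]
graph = build_de_bruijn(reads, k=4)

for node, successors in graph.items():
    print(node, "->", successors)
```

Walking the edges from AAT through to TGG spells out AATCTTGG, the sequence that both reads overlap to form. An assembler works on the same principle, at far larger scale and with error correction and isoform handling that this sketch leaves out.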
The Trinity assembly pipeline combines three independent software modules that run in sequence: Inchworm, Chrysalis and Butterfly.
Because Trinity can take advantage of strand-specific Illumina paired-end libraries, it proved an excellent platform for assembling our data.
We have generated two FASTA files for each tissue: one in the reverse/forward (RF) direction and the other in the forward/reverse (FR) direction. All 12 files are available for download. A FASTA entry for a single transcript in the Trinity output file is formatted as follows:
>c115_g5_i1 len=247 path=[31015:0-148 23018:149-246]
AATCTTTTTTGGTATTGGCAGTACTGTGCTCTGGGTAGTGATTAGGGCAAAAGAAGACAC
ACAATAAAGAACCAGGTGTTAGACGTCAGCAAGTCAAGGCCTTGGTTCTCAGCAGACAGA
AGACAGCCCTTCTCAATCCTCATCCCTTCCCTGAACAGACATGTCTTCTGCAAGCTTCTC
CAAGTCAGTTGTTCACAGGAACATCATCAGAATAAATTTGAAATTATGATTAGTATCTGATAAAGCA
The accession encodes the Trinity gene and isoform information. In the example above, the accession c115_g5_i1 indicates Trinity read cluster c115, gene g5, and isoform i1. Because a given Trinity run involves many clusters of reads, each of which is assembled separately, and because gene numbers are unique only within a given read cluster, the gene identifier should be taken as the combination of the read cluster and the gene number, which in this case is c115_g5.
So, in summary, the above example corresponds to gene id: c115_g5 encoding isoform id: c115_g5_i1.
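If you need to extract the gene and isoform identifiers programmatically, a minimal Python sketch follows. It assumes every header has the same shape as the example above (accession, len=, path=[...]). The function name parse_trinity_header and the dictionary keys are our own choices for illustration, not part of Trinity. The path field is kept as a raw string, since its structure is not described here.

```python
import re

# Pattern modelled on the example header; other Trinity versions may differ.
HEADER_RE = re.compile(
    r"^>?(c\d+)_(g\d+)_(i\d+)\s+len=(\d+)\s+path=\[(.*)\]$"
)


def parse_trinity_header(header):
    """Split a header line into cluster, gene id, isoform id, length and path."""
    match = HEADER_RE.match(header.strip())
    if match is None:
        raise ValueError(f"Unrecognised header format: {header!r}")
    cluster, gene, isoform, length, path = match.groups()
    return {
        "cluster": cluster,
        # The gene id is the read cluster plus the gene number.
        "gene_id": f"{cluster}_{gene}",
        # The isoform id is the full accession.
        "isoform_id": f"{cluster}_{gene}_{isoform}",
        "length": int(length),
        "path": path,  # left unparsed
    }


record = parse_trinity_header(
    ">c115_g5_i1 len=247 path=[31015:0-148 23018:149-246]"
)
print(record["gene_id"], record["isoform_id"], record["length"])
# prints: c115_g5 c115_g5_i1 247
```

Grouping records by gene_id collects all isoforms of the same gene, which is useful when counting isoforms per gene or picking one representative transcript.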
For basic Trinity statistics, click here.
For more detailed information about Trinity, please visit http://trinityrnaseq.sourceforge.net/