Next-generation sequencing has become an important technology in modern biology. GeneCore provides 'Massively Parallel Sequencing' (MPS) as a core part of its service portfolio, in line with EMBL's mission to provide state-of-the-art infrastructure to the scientific community.
The fast-growing suite of instruments currently consists of:
- 3x Illumina HiSeq 2000 (high content)
- 2x Illumina HiSeq 2500 (high speed)
- 1x Illumina MiSeq (long read)
- 1x Illumina NextSeq 500
- 4x Illumina cBot
- Covaris S2, Hydroshear
- Lab on Chip: Agilent BioAnalyzer, AAT Fragment Analyzer
|Sequencer||Lanes||Regime*||Read Length [bases]||Reads/Lane||Run time**|
|HiSeq 2000||8||SE, PE||50, 100||150 - 200 Mio||50 SE: 3 days; 100 PE: 13 days|
|HiSeq 2500||2||SE, PE||50, 100||100 - 120 Mio||50 SE: 1 day; 100 PE: 4 days|
|HiSeq 2500 rapid run mode||2||PE||250||> 300 Mio||250 PE: 3 days|
|MiSeq||1||PE||36, 75, 150, 250||15 - 20 Mio||36 PE: 8 hours|
|NextSeq 500 HI||1||PE||HI: 75, 150, 300||75 PE:|
|NextSeq 500 MID||1||PE||MID: 75, 150, 300||75 PE:|
*) SE = Single End, PE = Paired End
**) without data processing, clustering and library preparation
GeneCore can prepare sequencing libraries for the following applications:
- RNA-Seq (strand-specific or strand non-specific)
- gDNA-Seq (de novo, resequencing)
insert size: 150-600 bp or 2.5 - 5.0 kbp (mate pair)
including genome capture
GeneCore can only partially support its users with processing of samples for these applications:
- CLIP-Seq (CrossLinking-and-ImmunoPrecipitation, this method and its variants are used for identification of RNA-Protein binding sites)
- GRO-Seq (Global-Run-On method, used for identification of nascent transcripts)
- CAGE-Seq (Cap-Analysis-of-Gene-Expression, used for analysis of the 5' ends of mRNA transcripts)
- Hi-C (a variant of the chromosome conformation capture method, enabling identification of points of contact between distant chromosomal domains)
- ChIP-exo (includes an immunoprecipitation step with a specific antibody) and ATAC-Seq (assay for transposase-accessible chromatin using sequencing)
- Agilent Sure Select Human Exome Version 4 and higher
- Customized target capture
Multiplexing / Barcoding
- Indexing is possible with Illumina-type barcodes/indices (dedicated read) or with barcodes integrated into the sequencing reads (inline barcodes). During registration of your samples, please indicate the correct length of the barcode you used.
- Dual barcoding.
- Please note that the pool of barcoded samples registered for sequencing must contain at least 4 libraries with balanced base composition. You can check whether your barcodes have an optimal combination of bases with this tool: http://www.ebi.ac.uk/~markus/basedist (paste individual barcodes in one lane each and click analyse; you will receive a colour-coded graph and a numerical value showing the contribution of each position of the barcode).
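The kind of check described in the last point can also be approximated offline. The following is a minimal sketch, not the tool linked above: it computes the fraction of each base at every barcode position across a pool and lists positions where a base is absent. The function names and the example barcodes are illustrative assumptions, and "no base missing" is only a rough stand-in for whatever balance criterion GeneCore actually applies.

```python
from collections import Counter


def base_composition(barcodes):
    """For each barcode position, return the fraction of A, C, G and T across the pool."""
    if len({len(b) for b in barcodes}) != 1:
        raise ValueError("all barcodes must have the same length")
    composition = []
    # zip(*barcodes) yields one tuple of bases per barcode position.
    for column in zip(*barcodes):
        counts = Counter(base.upper() for base in column)
        composition.append({b: counts.get(b, 0) / len(barcodes) for b in "ACGT"})
    return composition


def positions_missing_a_base(barcodes):
    """Return 1-based positions at which at least one of A, C, G, T never occurs."""
    return [
        pos
        for pos, fractions in enumerate(base_composition(barcodes), start=1)
        if any(fraction == 0 for fraction in fractions.values())
    ]


# Illustrative pool of four 6-base barcodes (hypothetical example, not a recommended set).
pool = ["ATCACG", "CGATGT", "TTAGGC", "GACCAA"]

for pos, fractions in enumerate(base_composition(pool), start=1):
    print(pos, fractions)
print("positions missing a base:", positions_missing_a_base(pool))
```

With only four barcodes, every position can hold each base at most once if all four are to be present, so the output is best read as a hint about where the pool is skewed.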
User made libraries
- Users can submit libraries they prepared themselves, which have to meet GeneCore's QC criteria before sequencing.
Which type of sequencing to use?
"If you don't get the coverage at the start you'll regret it." (Jonathon Blake, Bioinformation)
Points to consider:
- Experimental design and coverage are the key for meaningful results
- Required coverage can be inferred from the size of the assayed space (genome, transcriptome, etc.) and the abundance of the recorded event. Generally, the more detailed the picture should be, the deeper you need to sequence. Any de novo analysis usually requires deeper sequencing.
- Don't underestimate the importance of biological replicates
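As a rough illustration of how coverage follows from the size of the assayed space, the sketch below uses the standard estimate coverage = sequenced bases / target size. It assumes that every sequenced fragment yields one read (SE) or two reads (PE) of the given length and ignores duplicates, unmappable reads and uneven coverage, so real coverage will be lower. The function names and the 3.1 Gb genome size are illustrative assumptions.

```python
import math


def expected_coverage(n_fragments, read_length, target_size, paired=False):
    """Average depth: total sequenced bases divided by the size of the assayed space."""
    reads_per_fragment = 2 if paired else 1
    return n_fragments * read_length * reads_per_fragment / target_size


def fragments_needed(coverage, read_length, target_size, paired=False):
    """Number of sequenced fragments required to reach the requested average depth."""
    reads_per_fragment = 2 if paired else 1
    return math.ceil(coverage * target_size / (read_length * reads_per_fragment))


# Assumed example: a 3.1 Gb genome sequenced 100 PE with 200 million fragments.
genome_size = 3_100_000_000
print(round(expected_coverage(200_000_000, 100, genome_size, paired=True), 1))
print(fragments_needed(30, 100, genome_size, paired=True))
```

Running the calculation before registering samples shows quickly whether one lane is enough or whether the design needs several lanes.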
|Application||Recommended Reading Length and Mode|
|gDNA-Seq (WGS)||100 PE, regardless if genome assembly is based on a reference or carried out de novo|
|Exome Sequencing (WES)||100 PE|
|Methylome Sequencing (BS-Seq)||100 PE|
|ChIP-Seq, FAIRE-Seq, DNaseI-Seq, MN-Seq, ...||50 SE; paired-end sequencing or longer reads normally do not improve results. The required sequencing depth is lower for analysis of transcription factor binding sites (a relatively infrequent event) and higher for mapping histone modifications (a very frequent event).|
|RNA-Seq, mRNA-Seq||50 SE, for detection of expressed genes in eukaryotic samples|
|RNA-Seq, Splice Variant detection||50 PE and deeper coverage|
|miRNA-Seq||50 SE. Please confirm that your samples contain the full complement of short RNAs (columns!)|
What do we need?
Please contact Vladimir Benes if you would like to use GeneCore for MPS.
Please use our registration system to place your order. Only samples that have been registered electronically can be processed. In-house users may drop off their samples with us at V106; please attach the corresponding barcode label, available from the barcode printer in GeneCore. External users should send their samples clearly marked with the ID assigned during registration (this information is also given in the confirmation email they receive upon completing registration), so that we can attach the proper label for processing and storage as soon as the samples arrive.
Container: Please use only 1.5 ml low binding tubes (e.g. Eppendorf). Everything else causes substantial additional work and costs for all of us.
Quality of the source material for library generation by GeneCore:
|High quality total RNA (BioAnalyzer RIN > 7), absorbance ratio 260/280 ~2|
|Purified IP DNA with majority of fragments within size range 200 - 500 bp|
|High quality DNA, single band on agarose gel, absorbance ratio 260/280 ~2|
|Majority of fragments within size range 200 - 400 bp|
User made libraries (all types)
- The samples should be accompanied by a picture showing a narrow fragment size distribution, with the majority of fragments no longer than 700 bp and without any primer-dimer peak.
- If you are in doubt contact GeneCore.
|Type of Library||Amount||Concentration [ng/µl]|
|RNA-Seq (mRNA, strand-specific RNA)||1 µg||100|
|Mate-pair DNA libraries||5 µg||100|
|Methyl-Seq (BS-Seq)||5 µg||100|
|Exome Capture||3 µg||100|
|PCR amplicons||20 ng||---|
|User made/ready-to-run libraries||20 ng (10 nM solution)||2|
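For ready-to-run libraries the table gives both a mass concentration (2 ng/µl) and a molar concentration (10 nM). The two are linked through the mean fragment length. The sketch below uses the common approximation of 660 g/mol per base pair of double-stranded DNA; the function names are illustrative, and the 300 bp fragment length is an assumption chosen only to show that the two figures agree for a library of about that size.

```python
AVG_BP_MASS = 660.0  # g/mol per base pair, usual average for double-stranded DNA


def ng_per_ul_to_nM(conc_ng_per_ul, mean_fragment_bp):
    """Convert a dsDNA mass concentration (ng/µl) into a molar concentration (nM)."""
    if mean_fragment_bp <= 0:
        raise ValueError("mean fragment length must be positive")
    # 1 ng/µl = 1e-3 g/l; dividing by g/mol gives mol/l, and 1e9 converts to nM.
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS * mean_fragment_bp)


def volume_for_amount(amount_ng, conc_ng_per_ul):
    """Volume in µl that contains the requested amount of DNA."""
    return amount_ng / conc_ng_per_ul


# Assumed example: a library at 2 ng/µl with a mean fragment length of 300 bp.
print(round(ng_per_ul_to_nM(2, 300), 1))   # molarity in nM
print(volume_for_amount(20, 2))            # µl needed to supply 20 ng
```

Libraries with longer fragments need a proportionally higher mass concentration to reach the same molarity, so it helps to know the mean fragment size before diluting.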
Important Points to consider during registration
- if you register several samples, each of them needs to be registered individually
- before submitting your request, check or fill in all fields in the form; the Mate Pair library box can be left out unless you are registering samples for this application
- your session times out after 15 minutes (to avoid locking up); if you think you cannot finish in time, divide your registration into several batches
- your registration is truly completed only when you click “Complete Order” and you receive an email confirming your registration
- review your entries before clicking “Submit Request” so that you do not have to return to the form repeatedly
- in case of an error, send a screenshot showing the error message to zimmerma@embl.de
What can you expect?
Number of reads
One HiSeq 2000 flow-cell can sequence seven ‘samples’ in individual lanes and requires a control in a separate lane. These ‘samples’ can be also pools of barcoded samples. Under optimal conditions (cluster density, etc.) it is feasible to obtain from this instrument up to 200 million sequencing reads per lane. Depending upon the application, usually ~75% can be mapped to the reference genome. Please take into account that these numbers are after quality filtering.
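When a lane holds a pool of barcoded samples, the per-sample yield can be estimated from the figures above. This is a minimal sketch that assumes perfectly even pooling and takes the 200 million reads per lane and the ~75% mapping rate quoted in the text as inputs; the function name is illustrative, and real pools are rarely perfectly balanced.

```python
def reads_per_sample(reads_per_lane, n_samples, mapped_fraction=0.75):
    """Estimate total and mappable reads per sample for an evenly pooled lane."""
    if n_samples < 1:
        raise ValueError("a lane must contain at least one sample")
    total = reads_per_lane // n_samples
    mapped = int(total * mapped_fraction)
    return total, mapped


# Example: 8 barcoded samples sharing one lane under optimal conditions.
total, mapped = reads_per_sample(200_000_000, 8)
print(f"{total:,} reads per sample, about {mapped:,} mappable")
```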
The result package contains sequences in FASTQ format and, if the alignment option was selected during registration, also alignments (BAM file, http://samtools.sourceforge.net/) as a tar archive. You will be notified by email when the files are ready for download. Primary and intermediate results will be stored as long as necessary to generate the result files. The result package will be available for 30 days after you receive the message informing you that the data are ready. Longer storage is not possible due to the amount of data produced and the respective storage costs. Data are archived and can be retrieved for a fee if need be. Please note that we generally do not manipulate data in any way, for example by trimming; however, we can do so upon request.
Processing time is heavily dependent upon the sequencing regime. Running time of the sequencer is ~3 days for 50 single-end bases (50SE) and longer for 100 paired-end bases (100PE, each fragment is sequenced from both ends), plus the data processing. The sequencers run 24 hours a day, 7 days a week. Interruptions are made only for maintenance. The general processing time per sample is around 6 weeks; exceptions are possible.
The files are available as fastq file *.txt.gz and bam files and the names have the following notation:
Results are placed on our dedicated result distribution server. The files can be downloaded using the fasp protocol (by Aspera), which increases the download speed by up to a factor of 60 (compared to conventional methods) and provides enhanced error correction as well as on-the-fly encryption. For further information, see File Download with Aspera.pdf.
Sample storage policy
Your physical sample and generated sequencing library will be stored at GeneCore for a period of 30 days after completion of your request, if no explicit agreement with GeneCore has been made.
Data retention policy
As soon as the result files for your samples are produced (seq and, if applicable, also bam files), you will receive a notification email. We will keep your data online for download for 4 weeks; after that period the data files will be deleted.
MPS on Illumina HiSeq consists of two parts: library preparation and sequencing. Each of them has its own price, which depends upon the type of library and its sequencing mode.