Introduction to High-Throughput Sequencing Technologies and Principles.ppt


Introduction to High-Throughput Sequencing Technologies and Principles
Tong Yigang (童贻刚)
Institute of Microbiology and Epidemiology, Academy of Military Medical Sciences

Sequencing platforms compared

Company | System | Read length (bp) | Advantages | Disadvantages
Roche/454 | FLX System | 200-700 | Longest reads; high throughput | Homopolymer errors; expensive instrument and reagents
Illumina | HiSeq 2000/miSeq | 2 x 150 | Very high throughput | Expensive; complex downstream analysis
ABI/SOLiD | 5500 xl SOLiD | 25-35 | High throughput; low reagent consumption | Reads too short
Helicos | HeliScope | 25-30 | High throughput | Reads too short

Platform | Read length | Earlier read lengths | Output | Run time
SOLiD | 30bp | 15bp | 30G | 10 days
Solexa Hiseq2000 | 150bp x 2 | 30bp, 50bp, 75bp, 100bp | 600G | 14 days
454 | 750bp | 100bp, 400bp | 0.7G | 7 hours
3730 | 1000bp x 96 | 300bp, 600bp | 0.0001G | 2 hours

Illumina workflow
- Sample preparation: shearing, ligate adapter
- Cluster generation: bridge PCR
- Sequencing on Genome Analyzer IIx: RTA (Real Time Analysis)
- Analysis pipeline: offline analysis, alignment, SNP calling, read counting; visualize the data, report the results

Sequencing process
1. Fragment DNA
2. Repair ends / add A overhang
3. Ligate adapters
4. Select ligated DNA
5. Hybridize to flow cell
6. Extend hybridized oligos
7. Perform bridge amplification
8. Perform sequencing on forward strand
9. Re-generate reverse strand
10. Perform sequencing on reverse strand

CONFIDENTIAL DO NOT DISTRIBUTE

Run timeline
1. Library prep (6 hrs)
2. Automated cluster generation (5 hrs), 1-8 samples
3. Sequencing (46 to 120 hrs), 1-8 samples

Sample Prep - Resequencing
Library fragment layout: surface bound adapter 1, sequencing primer binding site, surface bound adapter 2.

Flow cell: key to the simplified workflow
- 8 channels.
- Surface of the flow cell is coated with a lawn of oligo pairs.
- Clonal clusters are generated in a contained environment (no clean rooms needed).
- Sequencing is also performed in the flow cell, on the generated clusters.

Cluster generation: hybridize fragment & extend
50 M single molecules hybridize, through their adapter sequence, to the lawn of primers. Bound molecules are then extended by polymerases (3' extension).

Cluster generation: denature double-stranded DNA
The double-stranded molecule is denatured. The original template is washed away and discarded. The newly synthesized strand stays covalently attached to the flow cell surface.

Cluster generation: covalently bound, spatially separated single molecules
Single molecules are bound to the flow cell in a random pattern.

Cluster generation: bridge amplification
1. The single strand flips over to hybridize to an adjacent primer, forming a bridge. The hybridized primer is extended by polymerase.
2. A double-stranded bridge is formed.
3. The double-stranded bridge is denatured. Result: two copies of covalently bound single-stranded templates.
4. The single strands flip over to hybridize to adjacent primers and form bridges. The hybridized primers are extended by polymerase.
5. The bridge amplification cycle is repeated until multiple bridges are formed.

Cluster generation: finishing the cluster
- dsDNA bridges are denatured.
- Reverse strands are cleaved and washed away, leaving a cluster with forward strands only.
- Free 3' ends are blocked to prevent unwanted DNA priming.

Sequencing
1. The sequencing primer is hybridized to the adapter sequence.
2. Add 4 Fl-NTPs + polymerase.
3. The incorporated Fl-NTP is imaged.
4. Terminator and fluorescent dye are cleaved from the Fl-NTP.
Steps 2 to 4 are repeated x 36 cycles.

Flow cell imaging
Total Internal Reflection Fluorescence. Components shown: fluidics ports, flow cell, prism.

Paired end sequencing
1. The sequenced strand is stripped off; 3' ends are unblocked.
2. Bridge formation and 3' extension.
3. Double-stranded DNA is denatured.
4. 3' ends are blocked; the original forward strand is cleaved.

Sequencing reverse strand
1. Hybridize sequencing primer.
2. Add 4 Fl-NTPs + polymerase.
3. The incorporated Fl-NTP is imaged.
4. Terminator and fluorescent dye are cleaved from the Fl-NTP.
Steps 2 to 4 are repeated x 36-50 cycles.

Solexa
Flow cell in GAIIx

Image re-analysis pipeline
Image analysis -> Base calling -> Sequence analysis

GA Analysis Pipeline
Instrument PC -> data transfer -> Analysis PC/cluster
- Images (.tif): Lane 1..8, Cycle 1..36; Tile_Cycle_Image_a, Tile_Cycle_Image_c, Tile_Cycle_Image_g, Tile_Cycle_Image_t; .params file
- Image analysis, for each tile: cluster intensities, cluster noise
- Base calling, for each tile: corrected cluster intensities, cluster sequence, cluster probabilities
- Sequence analysis, for all data: quality filtering, sequence alignment, run statistics, visualization

Bustard
The base with the highest corrected intensity is called (example: ACGTC).

Gerald
GEneration of Recursive Analyses Linked by Dependency.
Filtering removes low quality base calls.
Chastity: C = IA / (IA + IB), where IA and IB are the highest and second-highest intensities; default value 0.6.
Other filters include purity, similarity, neighbor and neighborhood.

Bustard output: *_qseq.txt
Fields, in column order: Machine name, Run number, Lane number, Tile number, X coord, Y coord, Index, Read number, Sequence, Quality, Passed filter

EAS1 89 1 59 111 525 AACCTT 2 TGACCAGCGTCAACCAGTACTACGTCTTTGTCGATAG aaaaa_V_OYOZZYUPJZRX 1
EAS1 89 1 59 111 726 AACCTT 2 TCTGGATGAAGAACGATCCGCTGCAGAGGTGCTGGCA _FNXXZWFZ_YYTYMUVBBBBBBBBBBB 0
EAS1 89 1 59 111 860 AACCTT 2 TATCGCGTAGTGTAGCACGGCCTTTTTTTCGTCCACC aaaXFUWQUHVN_ZRWZZXFWYFTX 1
EAS1 89 1 59 112 377 AACCTT 2 TTTTCTTCTCCTTCGCCATCAGCGACAAAATCAAGCA abbbabbbbbbaaaTaaaaaY_YNaZZ 1
EAS1 89 1 59 112 538 AACCTT 2 TGTGAATTAACAGTATTGGCGTAGTTACAGGCAGTGT aa_aabbaaa_aSYZYUBBBBB 1
EAS1 89 1 59 112 576 AACCTT 2 TCTCCTTCGTCTTCTTCCATCAGTTGTTCGACCGGCT GJRNGBBBBBBBBBBBBBBBBBBBBBBBBBBBBB 0
EAS1 89 1 59 112 607 AACCTT 2 TCCACCATCAACTGGTTGCCAGTGCGCGGGCAGTTAA aabaaaaaaX_YTTHTTZQYTX 1
EAS1 89 1 59 112 255 AACCTT 2 TGATGCTGATAAGCAGCGTGCTCACAACCCAGATTTG aaba_abaabbbbabababbbbb_aba_Zabbb 1

Fastq format

GERALD sequence

Summary.html
(PF: pass filter)

FastQC
http://www.bioinformatics.babraham.ac.uk/projects/fastqc/

Third-party software
Brief Bioinform. 2011 Jan 18

NGS technology forums
- SEQwiki: http:/

Paired-end vs. Mate-Pair

SOLiD System Mate-Paired Library Preparation
1. Genomic DNA
2. Shearing & end repair -> sheared DNA
3. Methylation -> sheared, methylated DNA
4. Ligation -> EcoP15I CAP linkers ligated on to sheared, methylated DNA
5. Circularization -> circularized DNA with biotinylated internal adaptors
6. Digestion -> biotinylated internal adaptors with 25-27bp tags from genomic DNA
7. Ligation -> FDV-RDV ligated library molecules

Solexa Mate Pair Library I
Solexa Mate Pair Library II
454 Long Span Paired-End I
454 Long Span Paired-End II

Four key enzymes
1. Digestion of non-circular DNA: degrades linear DNA (ATP-dependent Plasmid-Safe DNase)
2. Nick translation (DNA polymerase I)
3. T7 exonuclease digestion: further degrades linear DNA in the 5'->3' direction, leaving 3' single-stranded overhangs
4. S1 nuclease digestion: removes the 3' single-stranded overhangs

Thank you!
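
Supplementary sketch: the *_qseq.txt records in the Bustard output slide and the Fastq format slide describe the same reads in two layouts, so a short conversion may help make the field list concrete. The Python below is a minimal sketch under stated assumptions, not the pipeline's own converter: it assumes the file is tab-delimited with 11 columns in the order seen in the example records, that qualities are Phred+64 encoded, and that a "." in the sequence marks a no-call. The read-name layout and the names QseqRecord, parse_qseq_line and qseq_to_fastq are hypothetical, chosen only for illustration.

```python
from typing import NamedTuple


class QseqRecord(NamedTuple):
    """One line of a *_qseq.txt file (assumed column order)."""
    machine: str
    run: str
    lane: str
    tile: str
    x: str
    y: str
    index: str
    read: str
    sequence: str
    quality: str
    passed_filter: bool


def parse_qseq_line(line: str) -> QseqRecord:
    """Split one tab-delimited qseq line into named fields."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 11:
        raise ValueError(f"expected 11 fields, got {len(fields)}")
    *head, sequence, quality, flt = fields
    return QseqRecord(*head, sequence, quality, flt == "1")


def qseq_to_fastq(rec: QseqRecord) -> str:
    """Render a record as a four-line FASTQ entry with Phred+33 qualities."""
    # Hypothetical read-name convention built from the coordinate fields.
    name = (f"{rec.machine}_{rec.run}:{rec.lane}:{rec.tile}:"
            f"{rec.x}:{rec.y}#{rec.index}/{rec.read}")
    # Assumption: '.' is a no-call in qseq; FASTQ conventionally uses 'N'.
    sequence = rec.sequence.replace(".", "N")
    # Assumption: Phred+64 input. Shift by 31 to Phred+33, clamped at '!'.
    quality = "".join(chr(max(33, ord(c) - 31)) for c in rec.quality)
    return f"@{name}\n{sequence}\n+\n{quality}\n"


if __name__ == "__main__":
    # Synthetic record, not taken from the slides.
    line = "EAS1\t89\t1\t59\t111\t525\tAACCTT\t2\tACGT.\thhhhB\t1\n"
    record = parse_qseq_line(line)
    if record.passed_filter:  # keep only PF (pass filter) reads
        print(qseq_to_fastq(record), end="")
```

Keeping only records whose last column is 1 mirrors the PF (pass filter) counts reported in Summary.html; whether failed reads should be dropped or merely flagged depends on the downstream tool.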
