Department of Biochemistry and Molecular Biology

Genomics

Genome
A genome is the complete set of genetic information carried by a cell (or virus); it represents all the genetic information of an organism.
The eukaryotic genome is the full sequence of one complete haploid set of chromosomal DNA plus the mitochondrial or chloroplast DNA. It contains coding sequences as well as abundant noncoding sequences.
The bacterial genome comprises the DNA sequences of the nucleoid and of plasmids.
Viral genomes consist of DNA in some viruses (DNA viruses) and of RNA in others (RNA viruses).

Genomics
Genomics is a discipline in genetics concerned with the study of the genomes of organisms. The field includes intensive efforts to determine the entire DNA sequence of organisms and fine-scale genetic mapping efforts. It also includes studies of intragenomic phenomena such as heterosis, epistasis, pleiotropy, and other interactions between loci and alleles within the genome.
In short, genomics studies the structure and function of genomes and the interactions between genes.

Research areas
Structural genomics: genetic maps, physical maps, sequence maps, and transcript maps, together with large-scale DNA sequencing.
Functional genomics: analysis and identification of genome function.
Comparative genomics: comparison between genomes to study evolution and predict new genes.

Structural genomics
Structural genomics focuses on the structure of the genome and builds high-resolution genetic, physical, transcript, and sequence maps, revealing the entire sequence of the genome and its composition.
Main tasks: genome mapping and large-scale sequencing.

The Human Genome Project (HGP)

The Outline of HGP
The history and milestones of the HGP
The HGP sequencing strategy: map first, sequence later (the genetic map, physical map, and expression map all contain markers or signposts that allow DNA sequences to be pieced together in the proper order along chromosomes).
*Setting up a hierarchical genome library.
*Sequencing the hierarchical genome segments.
Assembly and annotation of the human genome
The key methods applied to the HGP:
Establishment of the hierarchical BAC library of the human genome.
Improvements in DNA sequencing (CyDye-labeled dNTPs, capillary electrophoresis, laser detectors, pyrosequencing).
Establishment of bioinformatics to annotate the sequenced DNA segments (assembly of the DNA sequence in order, definition of functional elements), and annotation of the human genome sequence through assays.
Achievements of the HGP (working draft, Chr22, Chr21, the final draft)
Applications of the HGP (its significance for other scientific fields)

15 Feb 2001, Nature

Introduction
to Human Genome Project
In 1990, American geneticists embarked on an ambitious quest: to map and ultimately sequence the entire human genome. The effort quickly became an international program, the International Human Genome Sequencing Consortium (IHGSC): USA 40%, Germany 15%, France 15%, UK 20%, Japan 7%, and China 1%.
The human genome is huge: more than 3 billion bp to sequence. To get an idea of the magnitude of the task, if all 3 billion bp were written down, it would take about 500,000 pages of the journal Nature to contain all the information. If you could stand the boredom, it would take you about 60 years, working 8 h/day, every day, at 5 bases a second, to read it all.
The original plan for the HGP was systematic and conservative (map, then sequence). First, geneticists would prepare genetic and physical maps of the genome, with markers or signposts that allow all DNA sequences to be pieced together in the proper order. The bulk of the sequencing would be done only after the mapping was complete and clones representing all points on the map were in hand (systematically stored in freezers around the world). Completion was scheduled for 2005.
In 1998, Craig Venter announced that Celera, a private company he had created, would complete a rough draft of the human genome by the end of 2000, insisting that the genes be protected by patent.
Spurred by Celera's announcement, Francis Collins, director of the HGP, promised to produce a rough draft by the end of 2000 as well.
On June 26, 2000, Venter and Collins appeared with President Clinton at a ceremony in the East Room of the White House to announce the completion of a rough draft of the human genome (Celera 99% vs HGP 85%; published in February 2001, the HGP draft in Nature and the Celera draft in Science).
Weaver (2001) Molecular Biology, 2nd Ed., p786

In 2010, Nature published results from the 1000 Genomes Project, which investigates human genome variation.

Human karyotype, mouse karyotype, zebrafish karyotype, pig karyotype
Nucleus Karyotypes Determine
Species in Biology
Cell nuclear karyotype: chromosome number and structure

HGP Sequencing Strategy
In the map-then-sequence strategy (IHGSC):
(1) mapping of the human and mouse genomes, to allow the study of inherited disease and to provide a crucial scaffold for genome assembly;
(2) sequencing of organisms with smaller, simpler genomes, to serve as a testbed for method development and to assist in interpreting the human genome;
(3) sequencing of the human genome and annotation of the sequenced segments.
In the shotgun-sequencing strategy (Celera): omit the map, create a BAC clone collection of the genome, and sequence the BAC clones.
Q: Why does the human genome need to be mapped?
A: To place human gene sequences along the chromosomes, a set of landmarks must first be located along the chromosomes.

Maps of HGP
Genetic Map
Physical Map
Expression Map
Sequence Map

Genetic Map (Linkage Map)
A genetic map locates genetic traits along chromosomes. Historically, the markers originally used were detectable phenotypes (enzyme production, eye color) derived from coding DNA sequences; eventually, confirmed or assumed noncoding DNA sequences such as microsatellites, or those generating restriction fragment length polymorphisms (RFLPs), came into use.
Genetic mapping determines the order of linked genetic marker loci along a chromosome and the relative genetic distances between them, expressed in centimorgans (cM). When the recombination frequency between two genetic markers is 1%, the map distance is 1 cM.

DNA markers
Restriction fragment length polymorphism (RFLP)
Variable number of tandem repeats (VNTRs)
Single nucleotide polymorphism (SNP)

Restriction Fragment Length Polymorphisms (RFLPs)
Because each person differs genetically from every other, their DNA sequences differ somewhat, and so does the pattern of cutting by restriction enzymes.
Take the HindIII site (AAGCTT) as an example. One person has 3 HindIII sites separated by 4 kb and 2 kb in a given region of a chromosome (cutting with HindIII produces a 4 kb and a 2 kb fragment); another has only 2 HindIII sites, lacking the middle HindIII
site (cutting produces a single 6 kb fragment). This phenomenon, caused by differences in restriction enzyme sites among members of a population, is called RFLP.
Among the individuals of a population, a variation at a given DNA locus may alter a site-specific restriction endonuclease recognition site, either eliminating an existing site or creating a new one. When the DNA of different individuals is digested with a restriction endonuclease, the fragment lengths therefore differ between individuals; this difference is called restriction fragment length polymorphism.

Amplified Fragment Length Polymorphism (AFLP)
AFLP is a PCR-based tool used in genetics research, DNA fingerprinting, and the practice of genetic engineering. Developed in the early 1990s by Keygene, AFLP uses restriction enzymes to digest genomic DNA, followed by ligation of adaptors to the sticky ends of the restriction fragments. A subset of the restriction fragments is then selected for amplification. This selection is achieved by using primers complementary to the adaptor sequence, the restriction site sequence, and a few nucleotides inside the restriction fragments.

Variable Number of Tandem Repeats (VNTRs)
A VNTR is a location in a genome where a short nucleotide sequence is organized as a tandem repeat. VNTRs are found on many chromosomes and often vary in length between individuals. Each variant acts as an inherited allele, allowing VNTRs to be used for personal or parental identification.
There are two principal families of VNTRs: microsatellites and minisatellites. The former are repeats of sequences less than about 6 bp in length (2-6 bp), while the latter involve longer blocks (10-100 bp).
The basic principle of VNTR analysis is much the same as for RFLP: restriction endonuclease digestion and/or PCR can detect many microsatellite loci at once, yielding an individual-specific DNA fingerprint.

SNP
A SNP is a DNA sequence variation that occurs when a single nucleotide in the genome differs between members of a biological species or between paired chromosomes in an individual.
Variations in the DNA sequences of humans can affect how humans develop diseases and respond to pathogens, chemicals, drugs, vaccines, and other agents. SNPs are also thought to be key enablers in realizing the concept of personalized medicine. However, their greatest
importance in biomedical research is for comparing regions of the genome between cohorts (such as matched cohorts with and without a disease) in genome-wide association studies.
SNP detection does not rely on differences in length; it uses the sequence variation itself as the marker. A SNP is a DNA sequence polymorphism at the genome level caused by variation in a single nucleotide. SNPs are the most common type of heritable variation in humans and also the most stable variation in the genome. Because SNPs capture the genetic differences between individuals to the greatest extent, they have become important genetic markers for studying polygenic diseases, pharmacogenetics, and human evolution.

Physical Map
Physical mapping builds a more detailed map of the human genome on the basis of genetic mapping. It includes:
Fluorescent in situ hybridization map (FISH map): fluorescently labeled probes are hybridized to chromosomes to determine the positions of molecular markers. The probes are usually large fragments of known genes.
Restriction map: marks the relative positions of restriction sites on a DNA molecule.
Clone contig map: DNA is broken into large fragments with rare-cutting restriction endonucleases or by high-frequency sonication; large DNA fragments containing known sequence-tagged sites (STSs) are then obtained by constructing yeast artificial chromosomes (YACs) or bacterial artificial chromosomes (BACs).

Sequence-Tagged Sites
Sequence-tagged sites (STSs) are short (200 to 500 base pair) DNA sequences that occur only once in the genome and whose location and base sequence are known. STS mapping was completed by the HGP in 1998.
STSs can be easily detected by PCR using specific primers. For this reason they are useful for constructing genetic and physical maps from sequence data reported by many different laboratories. They serve as landmarks on the developing physical map of a genome.
*An STS is a single-copy sequence with a well-defined chromosomal location that can be amplified by PCR; there is one such marker about every 100 kb. Building contiguous sets of large-insert DNA clones that cover each chromosome on the basis of STSs yields a fine physical map, which prepares the way for large-scale DNA sequencing.

The expression map is based on cDNA libraries and ESTs (expressed sequence tags), which are used to locate genes along chromosomes.
An EST is a short sub-sequence of a transcribed cDNA sequence. ESTs may be used to identify gene transcripts and are instrumental in gene discovery and gene sequence determination. As of 18 June 2010, approximately 65.9 million ESTs were available in public databases (e.g. GenBank, all species).
An EST is produced by one-shot sequencing of a cloned mRNA (i.e. sequencing several hundred base pairs from an end of a cDNA clone taken from a cDNA
library).
Expression Map (transcriptional map)

Sequence Map
The goal!

Vectors for HGP
No matter which sequencing strategy is used, one must first clone fragments of the genome in appropriate vectors, and suitably large fragments are particularly valuable.
Vector features: an origin of replication, a selectable marker, a multicloning site.
Vector types (insert size):
YAC (0.2-2 Mb)
BAC (100-300 kb)
Cosmid (50 kb)
Phage (15 kb)
Plasmid (10 kb)
Yeast Artificial Chromosome (YAC): a linear double-stranded DNA with telomeres at both ends and with the replication origin and centromere closer to the left telomere. It accepts foreign DNA inserts of over 500 kb at a site between the centromere and the right telomere. However, YACs are unstable, inefficient, and hard to isolate.
Bacterial Artificial Chromosome (BAC): unlike the linear YAC, a BAC is circular and supercoiled, which makes it resistant to breakage. It carries foreign DNA inserts averaging 150 kb (100-300 kb). An example is pBAC108L, which carries a chloramphenicol resistance gene (CmR) and is based on the F plasmid, a natural plasmid that inhabits E. coli cells.
Plasmid: a backbone of around 6 kb that accepts foreign DNA inserts of up to 10 kb; circular, supercoiled, and maintained in E. coli.

HGP Sequencing Strategy
Map-then-Sequence Strategy (Clone Contig Map): top-down, by IHGSC.
Whole-Genome Shotgun Sequencing Strategy: bottom-up, by Celera.

IHGSC Strategy: Hierarchical-Genome Shotgun Sequencing (HGSS)
Genome source: blood cells and sperm cells
Isolation and digestion of chromosomes
Establishment of a hierarchical BAC library along every chromosome, with BAC clone fingerprints and STCs
*Hierarchical BAC library of every single chromosome of the human genome
Hierarchical plasmid library of every single BAC clone, for sequencing
*DNA sequencing: clone by clone (plasmids)
Assembly of DNA sequences by merging the STCs of BAC clones
*Annotation of DNA sequences as functional elements
Notes on the steps:
Blood cells are isolated from human peripheral blood, and sperm cells are isolated.
The cells are immortalized by viral transformation.
Establishment of a hierarchical
plasmid library from every single BAC clone, for sequencing.
Sanger DNA sequencing (chain termination), CyDye-labeled dNTPs, laser detector, capillary electrophoresis, pyrosequencing.
Bioinformatics analysis (ORF, CpG island, cDNA sequence, Gene Ontology, sequence homology, etc.).
Function confirmation assay.
BAC clone fingerprints and STCs, genetic markers, sequence markers, and expression tags are used to place the clones in order along each chromosome.
STCs: sequence-tagged connectors, obtained by sequencing about 500 bp from each end of the genomic insert in every BAC clone.

Hierarchical BAC Library of Human Genome
IHGSC (2001) Nature p863
Human genomic DNA
Hierarchical BAC library (170 kb) of shotgun chromosomes
Organized BAC clone contigs along the chromosome, via overlapping sequence and BAC fingerprints
Plasmid libraries (4-6 kb) of shotgun inserts of BAC clones
DNA sequencing of plasmid clones
Assembly of sequenced DNA segments end to end, via physical maps and BAC STCs
Contig: a set of segments that overlap one another.

Workflow of Human Genome Sequencing
Wikipedia

Celera Strategy: Whole-Genome Shotgun Sequencing (WGSS)
Weaver (2001) Molecular Biology
Chromosomes are cloned at random into a BAC vector, yielding a collection of 300,000 BAC clones.
A seed BAC is selected for sequencing.
The seed BAC is subcloned into a plasmid vector, yielding a plasmid library.
Three thousand of the plasmid clones are sequenced, and the sequences are ordered by their overlaps, producing the sequence of the whole 150-kb BAC.
The BACs (about 30) with overlapping STCs are found and then compared by fingerprinting to identify those with minimal overlap, which are sequenced next. This process, known as BAC walking, can in principle create a contig covering the whole chromosome.
Because the starting point is a random collection, there is a great risk of long-range misassembly.

Celera Flow Diagram for Genome Sequencing
Venter (2001) Science

Distribution of Molecular Functions of 26,383 Human Genes by Celera
Venter (2001) Science p1335

Celera Anatomy of Whole Genome
Assembly
Venter (2001) Science

IHGSC Hierarchical Shotgun vs Celera Whole-Genome Shotgun
Waterston (2002) PNAS

Hierarchical Genome BAC Library
Hierarchical organization:
Genome to chromosomes
Chromosome to BAC clones
BAC clone to plasmid clones

DNA Sequencing
Chain termination (Sanger)
Pyrosequencing
Reaction mechanism

DNA Synthesis: Formation of Phosphodiester Bonds
Synthesis reaction: a new phosphodiester bond is formed and PPi is released. Direction: 5'→3'.
Required components:
dNTPs, NTPs
Primer (RNA or DNA oligo, 10-60 nt) with a free 3'-OH
Template DNA
DNA polymerase: catalyzes formation of the new phosphodiester bond between the 3'-OH of the primer and the alpha-phosphate of the incoming nucleotide

The Mechanism of DNA Polymerase for DNA Synthesis
Two Mg2+ ions coordinate to the triphosphate of the incoming dNTP and to three Asp residues in the enzyme active site.
The upper Mg2+ ion facilitates attack by the 3'-OH at the 3' end of the growing DNA on the alpha-phosphate of the incoming dNTP.
The lower Mg2+ ion facilitates displacement of pyrophosphate as the new phosphodiester bond forms.
Both
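
The slides list chain termination (Sanger) as a sequencing method and describe synthesis as template-directed, in the 5'→3' direction. The Python sketch below is one rough way to picture how the two fit together. It is a heavily idealized illustration, not the HGP pipeline: it assumes every position is terminated in at least one molecule, it measures fragment lengths from the first nucleotide added after the primer, and the function names (`synthesize`, `chain_termination_fragments`, `read_sequence`) are made up for this example.

```python
# Watson-Crick base pairing used by the polymerase to choose each nucleotide.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def synthesize(template_3to5):
    """Return the new strand, written 5'->3', built on a template read 3'->5'."""
    return "".join(COMPLEMENT[base] for base in template_3to5)


def chain_termination_fragments(new_strand):
    """For each of the four ddNTP reactions, list the terminated fragment lengths.

    A fragment of length n ends in the ddNTP matching base n of the new strand,
    so the reaction for that base contains a fragment of length n.
    """
    fragments = {base: [] for base in "ACGT"}
    for length, base in enumerate(new_strand, start=1):
        fragments[base].append(length)
    return fragments


def read_sequence(fragments):
    """Read the sequence by ordering fragments from shortest to longest,
    as electrophoresis would, and noting which reaction each came from."""
    base_at_length = {}
    for base, lengths in fragments.items():
        for length in lengths:
            base_at_length[length] = base
    return "".join(base_at_length[n] for n in sorted(base_at_length))


if __name__ == "__main__":
    template = "TACGGATC"  # hypothetical template, written 3'->5'
    new_strand = synthesize(template)
    lanes = chain_termination_fragments(new_strand)
    print("new strand (5'->3'):", new_strand)
    print("fragment lengths per ddNTP reaction:", lanes)
    print("sequence read from fragment sizes:", read_sequence(lanes))
```

Sorting by fragment length stands in for the size separation done by capillary electrophoresis, and the reaction each fragment came from stands in for the dye signal seen by the laser detector.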