Insights into hominid evolution from the gorilla genome sequence

Aylwyn Scally¹, Julien Y. Dutheil²†, LaDeana W. Hillier³, Gregory E. Jordan⁴, Ian Goodhead¹†, Javier Herrero⁴, Asger Hobolth², Tuuli Lappalainen⁵, Thomas Mailund², Tomas Marques-Bonet³,⁶,⁷, Shane McCarthy¹, Stephen H. Montgomery⁸, Petra C. Schwalie⁴, Y. Amy Tang¹, Michelle C. Ward⁹,¹⁰, Yali Xue¹, Bryndis Yngvadottir¹†, Can Alkan³,¹¹, Lars N. Andersen², Qasim Ayub¹, Edward V. Ball¹², Kathryn Beal⁴, Brenda J. Bradley⁸,¹³, Yuan Chen¹, Chris M. Clee¹, Stephen Fitzgerald⁴, Tina A. Graves¹⁴, Yong Gu¹, Paul Heath¹, Andreas Heger¹⁵, Emre Karakoc³, Anja Kolb-Kokocinski¹, Gavin K. Laird¹, Gerton Lunter¹⁶, Stephen Meader¹⁵, Matthew Mort¹², James C. Mullikin¹⁷, Kasper Munch², Timothy D. O'Connor⁸, Andrew D. Phillips¹², Javier Prado-Martinez⁶, Anthony S. Rogers¹†, Saba Sajjadian³, Dominic Schmidt⁹,¹⁰, Katy Shaw¹², Jared T. Simpson¹, Peter D. Stenson¹², Daniel J. Turner¹†, Linda Vigilant¹⁸, Albert J. Vilella⁴, Weldon Whitener¹, Baoli Zhu¹⁹†, David N. Cooper¹², Pieter de Jong¹⁹, Emmanouil T. Dermitzakis⁵, Evan E. Eichler³,¹¹, Paul Flicek⁴, Nick Goldman⁴, Nicholas I. Mundy⁸, Zemin Ning¹, Duncan T. Odom¹,⁹,¹⁰, Chris P. Ponting¹⁵, Michael A. Quail¹, Oliver A. Ryder²⁰, Stephen M. Searle¹, Wesley C. Warren¹⁴, Richard K. Wilson¹⁴, Mikkel H. Schierup², Jane Rogers¹†, Chris Tyler-Smith¹ & Richard Durbin¹

Gorillas are humans' closest living relatives after chimpanzees, and are of comparable importance for the study of human origins and evolution. Here we present the assembly and analysis of a genome sequence for the western lowland gorilla, and compare the whole genomes of all extant great ape genera.
We propose a synthesis of genetic and fossil evidence consistent with placing the human–chimpanzee and human–chimpanzee–gorilla speciation events at approximately 6 and 10 million years ago. In 30% of the genome, gorilla is closer to human or chimpanzee than the latter are to each other; this is rarer around coding genes, indicating pervasive selection throughout great ape evolution, and has functional consequences in gene expression. A comparison of protein coding genes reveals approximately 500 genes showing accelerated evolution on each of the gorilla, human and chimpanzee lineages, and evidence for parallel acceleration, particularly of genes involved in hearing. We also compare the western and eastern gorilla species, estimating an average sequence divergence time 1.75 million years ago, but with evidence for more recent genetic exchange and a population bottleneck in the eastern species. The use of the genome sequence in these and future analyses will promote a deeper understanding of great ape biology and evolution.

Humans share many elements of their anatomy and physiology with both gorillas and chimpanzees, and our similarity to these species was emphasized by Darwin and Huxley in the first evolutionary accounts of human origins¹. Molecular studies confirmed that we are closer to the African apes than to orang-utans, and on average closer to chimpanzees than gorillas² (Fig. 1a).
Subsequent analyses have explored functional differences between the great apes and their relevance to human evolution, assisted recently by reference genome sequences for chimpanzee³ and orang-utan⁴. Here we provide a reference assembly and initial analysis of the gorilla genome sequence, establishing a foundation for the further study of great ape evolution and genetics.

Recent technological developments have substantially reduced the costs of sequencing, but the assembly of a whole vertebrate genome remains a challenging computational problem. We generated a reference assembly from a single female western lowland gorilla (Gorilla gorilla gorilla) named Kamilah, using $5.4 \times 10^{9}$ base pairs (5.4 Gbp) of capillary sequence combined with 166.8 Gbp of Illumina read pairs (Methods Summary). Genes, transcripts and predictions of gene orthologues and paralogues were annotated by Ensembl, and additional analysis found evidence for 498 functional long (>200-bp) intergenic RNA transcripts. Table 1 summarizes the assembly and annotation properties. An assessment of assembly quality using finished fosmid sequences found that typical (N50; see Table 1 for definition) stretches of error-free sequence are 7.2 kbp in length, with errors tending to be clustered in repetitive regions. Outside repeat-masked regions and away from contig ends, the total rate of single-base and indel errors is 0.13 per kbp. See Supplementary Information for further details.
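The N50 statistic quoted above can be illustrated with a short sketch. The paper's own definition sits in Table 1, which is not reproduced here, so the code assumes the commonly used definition: the largest length L such that stretches of length at least L together cover half or more of the total. The function name `n50` and the toy lengths are illustrative assumptions, not data from the assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of stretch lengths.

    Assumed definition: the largest length L such that stretches of
    length >= L together cover at least half of the summed length.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("n50() needs at least one length")


# Toy lengths (made up, in kbp). The total is 30, and the two longest
# stretches (10 + 6 = 16) already cover half of it.
toy_lengths = [2, 3, 4, 5, 6, 10]
print(n50(toy_lengths))
```

Because the statistic is weighted by length, a handful of long stretches dominates it, which is why it is usually preferred to a plain mean when many short fragments are present.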
¹Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, UK. ²Bioinformatics Research Center, Aarhus University, C.F. Møllers Allé 8, 8000 Aarhus C, Denmark. ³Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA. ⁴European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SD, UK. ⁵Department of Genetic Medicine and Development, University of Geneva Medical School, Rue Michel-Servet 1, 1211 Geneva 4, Switzerland. ⁶Institut de Biologia Evolutiva (UPF-CSIC), 08003 Barcelona, Catalonia, Spain. ⁷Institució Catalana de Recerca i Estudis Avançats, ICREA, 08010 Barcelona, Spain. ⁸Department of Zoology, University of Cambridge, Downing Street, Cambridge CB2 3EJ, UK. ⁹University of Cambridge, Department of Oncology, Hutchison/MRC Research Centre, Hills Road, Cambridge CB2 0XZ, UK. ¹⁰Cancer Research UK, Cambridge Research Institute, Li Ka Shing Centre, Robinson Way, Cambridge CB2 0RE, UK. ¹¹Howard Hughes Medical Institute, University of Washington, Seattle, Washington 20815-6789, USA. ¹²Institute of Medical Genetics, Cardiff University, Heath Park, Cardiff CF14 4XN, UK. ¹³Department of Anthropology, Yale University, 10 Sachem Street, New Haven, Connecticut 06511, USA. ¹⁴The Genome Institute at Washington University, Washington University School of Medicine, Saint Louis, Missouri 63108, USA. ¹⁵MRC Functional Genomics Unit, University of Oxford, Department of Physiology, Anatomy and Genetics, South Parks Road, Oxford OX1 3QX, UK. ¹⁶Wellcome Trust Centre for Human Genetics, Roosevelt Drive, Oxford OX3 7BN, UK.
¹⁷Comparative Genomics Unit, Genome Technology Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland 20892-2152, USA. ¹⁸Max Planck Institute for Evolutionary Anthropology, Primatology Department, Deutscher Platz 6, Leipzig 04103, Germany. ¹⁹Children's Hospital Oakland Research Institute, Oakland, California 94609, USA. ²⁰San Diego Zoo's Institute for Conservation Research, Escondido, California 92027, USA. †Present addresses: Institut des Sciences de l'Évolution – Montpellier (I.S.E.-M.), Université de Montpellier II – CC 064, 34095 Montpellier Cedex 05, France (J.Y.D.); Centre for Genomic Research, Institute of Integrative Biology, University of Liverpool, Crown Street, Liverpool L69 7ZB, UK (I.G.); Division of Biological Anthropology, University of Cambridge, Fitzwilliam Street, Cambridge CB2 1QH, UK (B.Y.); EASIH, University of Cambridge, Addenbrooke's Hospital, Cambridge CB2 0QQ, UK (A.S.R.); Oxford Nanopore Technologies, Edmund Cartwright House, 4 Robert Robinson Avenue, Oxford OX4 4GA, UK (D.J.T.); Institute of Microbiology, Chinese Academy of Sciences, Datun Road, Chaoyang District, Beijing 100101, China (B.Z.); The Genome Analysis Centre, Norwich Research Park, Norwich NR4 7UH, UK (J.R.).

We also collected less extensive sequence data for three other gorillas, to enable a comparison of species within the Gorilla genus. Gorillas survive today only within several isolated and endangered populations whose evolutionary relationships are uncertain. In addition to Kamilah, our analysis included two western lowland gorillas, Kwanza (male) and EB(JC) (female), and one eastern lowland gorilla, Mukisi (male).

Figure 1 | Speciation of the great apes.
a, Phylogeny of the great ape family, showing the speciation of human (H), chimpanzee (C), gorilla (G) and orang-utan (O). Horizontal lines indicate speciation times within the hominine subfamily and the sequence divergence time between human and orang-utan. Interior grey lines illustrate an example of incomplete lineage sorting at a particular genetic locus: in this case (((C, G), H), O) rather than (((H, C), G), O). Below are mean nucleotide divergences between human and the other great apes from the EPO alignment. b, Great ape speciation and divergence times. Upper panel, solid lines show how times for the HC and HCG speciation events estimated by CoalHMM vary with average mutation rate; dashed lines show the corresponding average sequence divergence times, as well as the HO sequence divergence. Blue blocks represent hominid fossil species (key at top right): each has a vertical extent spanning the range of dates estimated for it in the literature⁹,⁵⁰, and a horizontal position at the maximum mutation rate consistent both with its proposed phylogenetic position and the CoalHMM estimates (including some allowance for ancestral polymorphism in the case of Sivapithecus). The grey shaded region shows that an increase in mutation rate going back in time can accommodate present-day estimates, fossil hypotheses, and a middle Miocene speciation for orang-utan. Lower panel, estimates of the average mutation rate in present-day humans¹¹⁻¹³; grey bars show 95% confidence intervals, with black lines at the means. Estimates were made by the 1000 Genomes Project for trios of European (CEU) and Yoruban African (YRI) ancestry.
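The alternative groupings in panel a can be made concrete with a simple site-pattern count. The sketch below is not CoalHMM and is not the method used in the paper: it only tallies alignment columns in which exactly two of human, chimpanzee and gorilla share a base that differs from the other ingroup species and from orang-utan, which is treated as carrying the ancestral state. The function name and the four short sequences are made up for illustration.

```python
from collections import Counter


def count_topology_support(human, chimp, gorilla, orang):
    """Count informative columns supporting each ingroup pairing.

    A column supports "HC" when human and chimpanzee share one base
    while gorilla matches the orang-utan outgroup, and likewise for
    "HG" and "CG". Columns with gaps or ambiguous bases are skipped.
    """
    counts = Counter({"HC": 0, "HG": 0, "CG": 0})
    for h, c, g, o in zip(human, chimp, gorilla, orang):
        if any(base not in "ACGT" for base in (h, c, g, o)):
            continue  # gap or ambiguity code: not usable
        if h == c and h != g and g == o:
            counts["HC"] += 1  # consistent with (((H, C), G), O)
        elif h == g and h != c and c == o:
            counts["HG"] += 1  # consistent with (((H, G), C), O)
        elif c == g and c != h and h == o:
            counts["CG"] += 1  # consistent with (((C, G), H), O)
    return dict(counts)


# Made-up aligned sequences, one character per alignment column.
human = "ATGCAT-C"
chimp = "ATGTGTAC"
gorilla = "ACACGTAG"
orang = "ACATACAG"
print(count_topology_support(human, chimp, gorilla, orang))
```

Counts like these ignore recurrent mutation and say nothing about timing, so they are only a rough indication of how often a minority topology appears; a coalescent model is needed to turn such patterns into speciation times and population sizes.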

Speciation of the great apes

We included the Kamilah assembly with human, chimpanzee (Pan troglodytes), orang-utan (Pongo abelii) and macaque (Macaca mulatta) in a five-way whole-genome alignment using the Ensembl EPO pipeline⁶ (Supplementary Table 3.2). Filtering out low-quality regions of the chimpanzee assembly and regions with many alignment gaps, we obtained 2.01 Gbp of 1:1:1:1 great ape orthologous alignment blocks, to which we then applied a coalescent inference model, CoalHMM, to estimate the timescales and population sizes involved in the speciation of the hominines (African great apes; see Supplementary Table 1.1 for terminology), with orang-utan as an outgroup (Supplementary Information).

Two issues need to be addressed in interpreting the results from CoalHMM (Supplementary Table 4.2). First, the results themselves are obtained in units of sequence divergence rather than years, and so need to be scaled by an appropriate yearly mutation rate.
Second, as with any model, CoalHMM makes several simplifying assumptions whose consequences we need to understand in the context of realistic demography. We discuss these issues in turn.

Using a rate of $10^{-9}$ mutations per bp per year, derived from fossil calibration of the human–macaque sequence divergence and as used in previous calculations, CoalHMM's results would correspond to speciation time estimates $T_{\mathrm{HC}}$ (for human–chimpanzee) and $T_{\mathrm{HCG}}$ (for human–chimpanzee–gorilla) of 3.7 and 5.95 Myr ago, respectively (Fig. 1b). These dates are consistent with other recent molecular estimates⁷,⁸, but are at variance with certain aspects of the fossil record, including several fossils which have been proposed (though not universally accepted⁹) to be hominins, and therefore to postdate the human–chimpanzee split (Fig. 1b). Indeed, the relationship between molecular and fossil evidence has remained difficult to resolve despite the accumulation of genetic data¹⁰. Direct estimates of the per-generation mutation rate in modern human populations, based on the incidence of disease-causing mutations¹¹ or sequencing of familial trios¹²,¹³, indicate that a lower value of $(0.5\text{–}0.6) \times 10^{-9}\ \mathrm{bp}^{-1}\,\mathrm{yr}^{-1}$ is plausible (based on average hominine generation times of 20–25 yr). This would give substantially older estimates of approximately 6 and 10 Myr ago for $T_{\mathrm{HC}}$ and $T_{\mathrm{HCG}}$, potentially in better agreement with the fossil record.
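The rescaling described in this paragraph is simple arithmetic and can be checked with a minimal sketch. It assumes only that a time estimate scales inversely with the yearly mutation rate used to convert divergence into years, so a date obtained at one rate can be converted to another by multiplying by the ratio of the rates. The function name `rescale_time` is illustrative; the rates and dates are the ones quoted in the text.

```python
def rescale_time(time_myr, old_rate, new_rate):
    """Convert a date computed at old_rate into the date implied by new_rate.

    Assumes time = divergence / rate, so the divergence cancels and
    only the ratio of the two yearly mutation rates matters.
    """
    return time_myr * old_rate / new_rate


OLD_RATE = 1e-9             # per bp per year, fossil-calibrated rate in the text
T_HC, T_HCG = 3.7, 5.95     # Myr ago, estimates at OLD_RATE

for new_rate in (0.5e-9, 0.6e-9):  # range from direct estimates in the text
    t_hc = rescale_time(T_HC, OLD_RATE, new_rate)
    t_hcg = rescale_time(T_HCG, OLD_RATE, new_rate)
    print(f"rate {new_rate:.1e}: T_HC = {t_hc:.1f} Myr, T_HCG = {t_hcg:.1f} Myr")
```

At the upper end of the range, $0.6 \times 10^{-9}$, this gives about 6.2 and 9.9 Myr, in line with the approximate 6 and 10 Myr quoted above; the lower end of the range pushes both dates further back, to about 7.4 and 11.9 Myr.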