Dynamics in the assembly of the 30S ribosomal subunit investigated by coarse-grained simulations

Xin Liu1 and Zhiyong Zhang1,2

1 School of Data Science, University of Science and Technology of China, Hefei 230027, China;
2 Department of Physics, University of Science and Technology of China, Hefei 230026, China

Correspondence: Zhiyong Zhang
© 2023 The Author(s). This is an open access article under the CC BY-NC-ND 4.0 license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Cite This: JUSTC, 2023, 53(9): 0906 (8pp)

Abstract: The ribosome is a large biomolecular complex responsible for protein synthesis. In Escherichia coli (E. coli), a complete ribosome is composed of a 30S small subunit and a 50S large subunit. For approximately half a century, the 30S subunit has been a key model system for studying the in vitro assembly of the ribosome, and an assembly map has been proposed. However, structural details in the assembly of this protein-RNA complex remain elusive. In this paper, we conducted a series of coarse-grained simulations following the order of the assembly map to investigate conformational dynamics during the assembly process of the 30S subunit. We found that the tertiary structure of naked 16S rRNA is very unstable, and that it remains unstable after binding of the early-assembly proteins. The mid-assembly proteins significantly restrict the mobility of the 16S rRNA and bring it close to the native structure. The final binding of the late-assembly proteins fully establishes the collective motion of the 16S rRNA. In particular, proteins S9 and S3 may contribute more to the assembly of the 30S subunit than other S proteins. Our strategy of coarse-grained simulations can be used generally to study the assembly dynamics of large biomolecular complexes as long as an assembly map is available.

Keywords: biomolecular complexes; assembly; ribosome; 30S subunit; coarse-grained simulation; principal component analysis
CLC number: Q617
Document code: A

1 Introduction

The ribosome is a large biomolecular complex that includes proteins and RNAs and is responsible for catalyzing protein synthesis [1]. A complete ribosome contains a large subunit and a small subunit. In bacteria and archaea, the 50S large subunit is composed of a 23S rRNA, a 5S rRNA, and 33 proteins, whereas the 30S small subunit consists of a 16S rRNA and 21 proteins [2].

The evolution of a large biomolecular complex can be seen as a long period of assembly, so the assembly can reflect the pathway of complex evolution [3]. The assembly process of a complex is dynamic [4] and depends on the order in which identical or different components come together [3,5,6]. The assembly of the ribosome in vivo requires hundreds of assembly factors to work together under certain conditions, but assembly in vitro can proceed without any assembly factors [7] and has been studied for a long time. As early as the 1960s, the ability of the 30S subunit of the E. coli ribosome to assemble spontaneously in vitro was confirmed by the Nomura laboratory [8], and subsequently, the assembly of the 50S subunit was confirmed by the Nierhaus laboratory [9]. Later, ribosomes from other bacteria were also shown to assemble into catalytically active structures from separated natural RNA and proteins [10–15].

The 16S rRNA in the 30S ribosomal subunit has four independent domains: the 5′ domain, the central domain, the 3′ major domain, and the 3′ minor domain. Each of the first three main domains can be assembled independently with the corresponding S proteins in vitro [16–18]. Naked 16S rRNA is highly dynamic and requires the assembly of S proteins to form a stable ordered structure. The 30S subunit can be reconstituted in vitro by starting from 16S rRNA and adding the necessary S proteins, yielding particles that exhibit the relevant biochemical characteristics [19]. Under reconstitution conditions, experiments that added proteins to the 16S rRNA in various combinations allowed the S proteins to be divided into three classes, forming an assembly map [20]. Primary binding proteins bind directly and independently to the 16S rRNA and are thought to initiate the folding of each of the three main domains. Secondary binding proteins require at least one primary binding protein to be present before they bind to the 16S rRNA, and tertiary binding proteins require at least one protein from the first two classes to be assembled. Through in vitro assembly kinetics experiments, a kinetic assembly map was formed, and the S proteins were divided into early-, mid-, mid-late-, and late-assembly proteins. The kinetic assembly data reflect various aspects of domain assembly and binding trends, indicating that assembly proceeds from the 5′ domain to the 3′ domain, which is consistent with cotranscriptional assembly [21].
Received: April 18, 2023; Accepted: June 08, 2023
DOI: 10.52396/JUSTC-2023-0064

In recent years, various experimental techniques have been used to study the assembly of the 30S subunit. Electrospray ionization mass spectrometry (MS), along with the latest cryo-EM techniques, can visualize intermediates at near-atomic resolution throughout the pathway of subunit construction and is therefore widely used to study the assembly pathway of complexes [5]. While such data have proven to be a very useful starting point, they do not provide detailed information about dynamics in the assembly process. That is, only structures can be obtained, and there is not enough information on how the complex arrives at them.

Molecular dynamics (MD) simulations have long been used to study the dynamic processes of large biomolecules. They are generally atomistic and based on classical mechanics, and molecular mechanical force fields are constantly improving [22]. However, all-atom MD is very expensive for large biomolecular complexes such as the ribosome, and it would be difficult to describe the complete kinetic process over long time scales [23]. In recent years, coarse-grained (CG) models have been used to reveal the physical mechanisms of a series of protein/nucleic acid molecules [24–28] because their computational efficiency is significantly higher than that of all-atom MD. The force fields and methods of CG simulations have been continuously optimized and calibrated because they can capture processes such as the assembly of large biomolecular complexes on long time scales [29]. Many CG models are available, such as the structure-based model [30], the multiscale coarse-grained (MS-CG) method [31], the MARTINI model [32,33], and the SAFT model [34].

Therefore, in this work, based on the kinetic assembly map, we study structural dynamics in the assembly process of the 30S subunit through CG simulations. The simulation data are compared with existing experimental data on in vitro assembly, which can reveal the role of specific S proteins in the assembly process.

2 Materials and methods

2.1 Simulated systems

The 30S ribosomal subunit of E. coli is composed of 16S rRNA and 21 ribosomal proteins (S proteins, named S1–S21 in decreasing order of molecular weight). An atomic model of the 30S subunit was taken from Ref. [35]; it contains a 16S rRNA of 1536 nucleotides and 20 S proteins (S2–S21) (Fig. 1). The three main domains (5′, central, and 3′ major) of the 16S rRNA make up the body, platform, and head, respectively. The assembly of the S proteins is divided into four stages according to the kinetic map (Table 1). The early-assembly proteins include S4, S6, S11, S15, S16, S17, S18, and S20 (Fig. 1, magenta), the mid-assembly proteins include S7, S8, S9, S13, and S19 (Fig. 1, orange), the mid-late-assembly proteins include S5 and S12 (Fig. 1, blue), and the late-assembly proteins include S2, S3, S10, S14, and S21 (Fig. 1, red).

We built the following systems to run CG simulations. (i) The naked 16S rRNA is the starting state of the assembly. (ii) The 30S subunit is the end state of the assembly. (iii) Between the starting and end states, we added the S proteins one by one, following the order shown in Table 1.

Fig. 1. Structure of the 30S ribosomal subunit. (a) The front side (the interface with the 50S subunit) and (b) the back. The 16S rRNA is colored black, and the S proteins are colored differently according to the kinetic assembly map (Table 1). The body, platform, head, and beak are labeled.

2.2 The CG models

In the off-lattice Gō model of a protein, each amino acid residue is represented by a CG particle, most often at the position of its Cα atom. The potential energy function is:

$$
\begin{aligned}
V(R \mid R_0) ={} & \sum_i k_b \left(r_{i,i+1} - r_{i,i+1,0}\right)^2 + \sum_i k_\theta \left(\theta_i - \theta_{i,0}\right)^2 \\
& + \sum_i \left\{ k_\phi^{1} \left[1 - \cos\left(\phi_i - \phi_{i,0}\right)\right] + k_\phi^{3} \left[1 - \cos 3\left(\phi_i - \phi_{i,0}\right)\right] \right\} \\
& + \sum_{i<j-3}^{\text{native}} \varepsilon_{go} \left[ 5\left(\frac{r_{ij,0}}{r_{ij}}\right)^{12} - 6\left(\frac{r_{ij,0}}{r_{ij}}\right)^{10} \right]
+ \sum_{i<j-3}^{\text{non-native}} \varepsilon_{ev} \left(\frac{d}{r_{ij}}\right)^{12}.
\end{aligned}
\tag{1}
$$

The first three terms represent potentials for virtual bond lengths, bond angles, and dihedral angles, respectively. The fourth term is the native contact potential between nonlocal residue pairs, where the summation is restricted to pairs that are in contact in the reference structure. The last term represents an excluded volume effect that acts as a penalty for non-native contacts.
A Gō-like protein model called AICG2+ [36] was used:

$$
V_{\mathrm{AICG2+}}(R \mid R_0) = V_{\mathrm{bond}} + V_{\mathrm{loc}}^{\mathrm{flp}} + V_{\mathrm{loc}}^{\mathrm{nat}} + V_{\mathrm{nloc}}^{\mathrm{nat}} + V_{\mathrm{exv}}.
\tag{2}
$$

$V_{\mathrm{bond}}$ is the first term of Eq. (1). $V_{\mathrm{loc}}^{\mathrm{flp}}$ is a generic flexible local potential for the virtual bond angles and dihedral angles. $V_{\mathrm{loc}}^{\mathrm{nat}}$ is the structure-based local potential. $V_{\mathrm{nloc}}^{\mathrm{nat}}$ is the structure-based nonlocal contact potential, which is similar to the fourth term of Eq. (1) except that the $\varepsilon_{go}$ are not uniform but determined from atomic interactions between specific residue pairs. $V_{\mathrm{exv}}$ is the last term of Eq. (1). Additionally, electrostatic forces were computed using the Debye–Hückel equation with a cutoff of 20 Å and an ionic strength of 0.15 mol/L. The relative dielectric constant was set to 78.0. The off-lattice Gō model [37] and its derivatives represent quasiharmonic fluctuations near native structures while achieving a perfect funnel energy landscape.

In the 16S rRNA, each nucleotide is represented by three CG particles: phosphate (P), sugar (S), and base (B). This is a reasonable representation of the chemical composition of nucleotides. The potential consists of the following terms: the local potential, the nonlocal contact interaction, the excluded volume term, and the electrostatic interaction. Since the 16S rRNA is largely double-stranded, a PAIR_RNA potential needs to be included to maintain the stability of base pairs. This interaction has a form similar to that of the general contact interaction but uses different coefficients that depend on the number of base-pairing hydrogen bonds.

For interactions between the 16S rRNA and the S proteins, we also used the Gō model with the addition of repulsion and electrostatic interactions.

2.3 CG simulations

All the CG simulations were conducted with CafeMol version 3.2.1 [30] using Langevin dynamics, with a time step of 0.3 in CafeMol time units. Each CG simulation was run for 10^8 steps, and the temperature was set to 300 K. The overall translation and rotation of the entire molecule were allowed.

go_unit is an overall scaling factor of the pairwise interaction strength. For each native contact (intramolecular or intermolecular), the default value of go_unit is 1.0. We needed to determine a value of go_unit that allows significant conformational dynamics of the 30S subunit to be observed while preserving its tertiary structure. Therefore, CG simulations of the 30S subunit were conducted with different values of go_unit, and three independent simulations were run for each value. A go_unit of 0.8 was found to be appropriate. We then fixed go_unit at 0.8 for both intra- and intermolecular contacts in the following CG simulations.

2.4 Principal component analysis

Principal component analysis (PCA) on a simulation trajectory of a large biomolecule is used to extract the large-scale collective motion of the biomolecule from its small and random internal motion [38]. The method comprises the following steps. (i) All conformations in the trajectory are fitted to a reference structure to eliminate the overall translation and rotation of the biomolecule. (ii) A covariance matrix of positional fluctuations, $\sigma_{ij} = \langle (r_i - \langle r_i \rangle)(r_j - \langle r_j \rangle) \rangle$, is constructed, where $r_i$ and $r_j$ represent Cartesian coordinates of the particles selected for PCA, and $\langle \cdot \rangle$ denotes the average over all conformations in the trajectory. (iii) The covariance matrix is diagonalized to obtain eigenvectors (PCA modes) and the corresponding eigenvalues. The PCA modes with the largest eigenvalues generally describe the collective motion of the biomolecule [39]. For each system, the CG trajectory contained 1000 conformations, and the P particles were used to construct the covariance matrix. To see the motion along certain PCA modes clearly, one can project the conformations in the trajectory onto these modes individually.

Table 1. Kinetic assembly map of the S proteins in the 30S ribosomal subunit.

Group           Protein
I (early)       S4, S6, S11, S15, S16, S17, S18, S20
II (mid)        S7, S8, S9, S13, S19
II (mid-late)   S5, S12
III (late)      S2, S3, S10, S14, S21

The root mean square inner product (RMSIP) between two sets of PCA modes measures their dynamic similarity [40]:

$$
\mathrm{RMSIP} = \left[ \frac{1}{n} \sum_{i=1}^{n} \sum_{j=1}^{n} \left( v_i^{A} \cdot v_j^{B} \right)^2 \right]^{1/2},
\tag{3}
$$

where $v_i^{A}$ and $v_j^{B}$ represent the $i$th and $j$th PCA modes from two different sets of PCA modes (A and B), and $n$ is the number of PCA modes used to compute the RMSIP. Since the PCA modes are normalized, an RMSIP of 1 means that the two sets of PCA modes span identical subspaces, and an RMSIP of 0 indicates that they are orthogonal.
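The PCA steps and the RMSIP of Eq. (3) can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions, not the analysis code used in this work: the trajectory is synthetic random data standing in for the P-particle coordinates, the least-squares fit uses the Kabsch algorithm (the paper does not name its fitting method), and the function names `kabsch_fit`, `pca`, and `rmsip` are hypothetical.

```python
import numpy as np

def kabsch_fit(coords, ref):
    """Step (i): superimpose one conformation (N, 3) onto ref (N, 3).

    Both structures are centered at the origin; the fitted coordinates
    are returned in the centered reference frame.
    """
    x = coords - coords.mean(axis=0)
    y = ref - ref.mean(axis=0)
    u, _, vt = np.linalg.svd(x.T @ y)
    d = np.sign(np.linalg.det(u @ vt))      # avoid improper rotations
    rot = u @ np.diag([1.0, 1.0, d]) @ vt
    return x @ rot

def pca(traj, ref):
    """Steps (ii)-(iii): covariance matrix and its eigen-decomposition.

    traj : (T, N, 3) array of conformations.
    Returns eigenvalues (descending), PCA modes as columns, and the
    mean-removed coordinates with shape (T, 3N).
    """
    fitted = np.array([kabsch_fit(frame, ref) for frame in traj])
    flat = fitted.reshape(len(traj), -1)
    dev = flat - flat.mean(axis=0)
    cov = dev.T @ dev / len(traj)
    evals, evecs = np.linalg.eigh(cov)      # ascending order
    order = np.argsort(evals)[::-1]
    return evals[order], evecs[:, order], dev

def rmsip(modes_a, modes_b, n=10):
    """Eq. (3): root mean square inner product of the first n modes."""
    overlap = modes_a[:, :n].T @ modes_b[:, :n]
    return np.sqrt((overlap ** 2).sum() / n)

# Synthetic stand-in for a trajectory: 50 frames of 20 particles.
rng = np.random.default_rng(0)
ref = rng.normal(size=(20, 3))
traj = ref + 0.1 * rng.normal(size=(50, 20, 3))

evals, modes, dev = pca(traj, ref)
projection = dev @ modes[:, :2]             # coordinates along PC1 and PC2
self_similarity = rmsip(modes, modes, n=10)
```

Comparing a set of modes with itself gives an RMSIP of 1 because the eigenvectors returned by `np.linalg.eigh` are orthonormal; comparing two disjoint groups of modes from the same decomposition gives 0.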
3 Results and discussion

3.1 Structural dynamics in the four-stage assembly of the 30S subunit

To measure conformational changes of the 16S rRNA in the CG simulations, the root mean square deviations (RMSD) of all the P particles were calculated using the native structure of the 16S rRNA in the 30S subunit as the reference.

In the CG simulation of the naked 16S rRNA, the system is very unstable, with an RMSD greater than 30.0 Å (Fig. 2, black). The tertiary structure of the 16S rRNA is unfolded. Both the platform and the head are expanded, and the junction between the head and the body is extremely mobile (Fig. 3a). The results indicate that, without the S proteins, the 16S rRNA cannot maintain its native structure by itself.

We next investigated how the overall stability of the 16S rRNA changes as the S proteins of each stage are added sequentially. When only the early-assembly proteins are assembled on the 16S rRNA (denoted as 16S+I), the tertiary structure is still not stable, with RMSD values essentially between 20 and 30 Å (Fig. 2, magenta). These early-assembly proteins bind to the 5′ and central domains (Fig. 3b, magenta), which make up the body and platform, respectively. Therefore, the body is stabilized, and the platform moves close to the body at the early stage compared to the naked 16S rRNA (Fig. 3a). However, no early-assembly protein is assembled on the 3′ major domain that makes up the head, so the head is nearly as mobile as in the naked 16S rRNA. This may explain why the RMSD values of the 16S+I system are still large.

After the mid-assembly proteins are assembled (denoted as 16S+I-II), the stability of the system is greatly increased. The RMSD values fluctuate around approximately 10.0 Å (Fig. 2, orange). According to the positions of the five mid-assembly proteins on the 16S rRNA (Fig. 3c, orange), four of them are located on the head and join the two halves of the 3′ major domain together; therefore, the mobility of the head is significantly decreased. Only one mid-assembly protein (S8) is located at the back of the body. The two mid-late-assembly proteins are close to the central junction between the body and the head (Fig. 3d, blue). Once they are assembled (denoted as 16S+I-II-II), the stability increases slightly (Fig. 2, blue) compared to the 16S+I-II system (Fig. 2, orange). The 16S+I-II-II system is also called the 21S reconstitution intermediate [20]; it lacks most of the head-body contacts, and the beak region in the head is flexible. The late-assembly proteins make additional connections between the head and the body and restrict the motion of the beak (Fig. 3e, red). Therefore, in the final 30S subunit, the 16S rRNA is well folded, with an average RMSD of 3.7 Å (Fig. 2, red).

The assembly order of the S proteins runs basically from 5′ to 3′ of the 16S rRNA [41], but this does not mean that the early-assembly proteins contribute most to the stability of the system. We found that the mid- and late-assembly proteins play the most important roles in the assembly process because of their locations on the 16S rRNA.

We conducted PCA on the CG trajectory of the 30S subunit, and only P particles were used to construct the covariance matrix. Conformations in the trajectory were then projected onto a 2D subspace defined by the first two PCA modes (denoted as PC1 and PC2). The sampled region of the 30S subunit is centered at approximately (0, 0) (Fig. 4a, red). For each PCA mode, we took conformations with the most negative and the