AI Industry: ChatGPT Technical Analysis (PDF, 83 pages, 2.50 MB)

ChatGPT Technical Analysis
LIU Qun (刘群), Huawei Noah's Ark Lab
An online lecture, 2023-02-16

Contents
- ChatGPT overview
- ChatGPT's impressive performance
- The key technologies behind ChatGPT
- ChatGPT's shortcomings
- Future directions for ChatGPT

The ChatGPT sensation
- User numbers: 1 million in 5 days, 100 million within 2 months
- Everyone has started talking about ChatGPT; it has spread about as fast as the coronavirus
- Google sounded an internal "code red" alarm
- Google hastily released Bard, but an error during the launch demo wiped 8% off the company's stock price
- Microsoft invested a further 10 billion US dollars in OpenAI
- Microsoft quickly launched the ChatGPT-powered New Bing, and plans to connect ChatGPT to the Office suite
- Major companies in China and abroad are following suit rapidly

The official ChatGPT blog: introduction

  ChatGPT: Optimizing Language Models for Dialogue (November 30, 2022)

  We've trained a model called ChatGPT which interacts in a conversational way. The dialogue format makes it possible for ChatGPT to answer follow-up questions, admit its mistakes, challenge incorrect premises, and reject inappropriate requests. ChatGPT is a sibling model to InstructGPT, which is trained to follow an instruction in a prompt and provide a detailed response.

  We are excited to introduce ChatGPT to get users' feedback and learn about its strengths and weaknesses. During the research preview, usage of ChatGPT is free. Try it now at .

  Sample: In the following sample, ChatGPT asks clarifying questions to debug code.

  USER: this code is not working like i expect, how do i fix it?

      resultWorkerErr := make(chan error)
      defer close(resultWorkerErr)
      go func() {
          defer cancel()
          resultWorkerErr <- b.resultWorker(ctx)
      }()

Defining language models
- A language model assigns a probability P_LM(s) to a sentence s, with 0 <= P_LM(s) <= 1.
- Language Modeling is the task of predicting what word comes next: "the students opened their ___" (exams? minds? laptops? books?)
- More formally: given a sequence of words x(1), ..., x(t), compute the probability distribution of the next word x(t+1): P(x(t+1) | x(t), ..., x(1)), where x(t+1) can be any word in the vocabulary.
- A system that does this is called a Language Model.
(Christopher Manning, Natural Language Processing with Deep Learning, Stanford CS224n)

The development of language models
- n-gram language models
- Neural network language models
- Recurrent neural network language models
- Transformer language models
- Pre-trained Language Models (PLMs)
  - BERT: a bidirectional masked language model
  - GPT: a decoder-only language model
- Large generative pre-trained language models (Large Language Models, LLMs)
  - GPT-3
  - ChatGPT

Pre-trained Language Models (PLMs)
- Typical representatives: ELMo, BERT, GPT
- The pre-training-then-fine-tuning paradigm: language representations learned in the pre-training stage are transferred to downstream tasks

The Transformer model
(Lilian Weng, Generalized Language Models: ULMFiT & OpenAI GPT, blog)

Self-attention (Vaswani et al., 2017)
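The self-attention mechanism named above can be sketched in a few lines of NumPy. This is an illustrative single-head scaled dot-product attention in the style of Vaswani et al. (2017), not code from the lecture; the projection matrices `Wq`, `Wk`, `Wv` are random stand-ins for learned parameters.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention.

    X: (seq_len, d_model) token embeddings.
    Each output row is a dynamic weighted average of all value vectors;
    the weights depend on the input itself, which is the point made on
    the slide: they change whenever the input changes.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # (seq_len, seq_len) similarities
    weights = softmax(scores, axis=-1)   # each row sums to 1
    return weights @ V                   # (seq_len, d_k) contextualized vectors

# Toy example: 4 tokens, d_model = 8, head dimension 8.
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (4, 8)
```

A full Transformer runs several such heads in parallel and concatenates their outputs, but the dynamic-weighting behavior is already visible in this single head.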

Self-attention (continued)
- Each token's representation is obtained by dynamically weighting all the words in the input
- The dynamic weights change as the input changes
(BertViz tool, Vig et al., 2019)

The key technologies behind ChatGPT
- Pre-trained Language Models (PLMs)
- Large generative pre-trained language models (Large Language Models, LLMs)
- Reinforcement learning from human feedback (RLHF)

Large generative pre-trained language models (LLMs)

|                         | Pre-trained Language Models (PLMs) | Large Language Models (LLMs) |
|-------------------------|------------------------------------|------------------------------|
| Typical models          | ELMo, BERT, GPT-2                  | GPT-3                        |
| Architecture            | BiLSTM, Transformer                | Transformer                  |
| Attention               | bidirectional, unidirectional      | unidirectional               |
| Training method         | Mask & Predict                     | autoregressive generation    |
| Typical tasks           | understanding                      | generation                   |
| Model scale             | 0.1B-1B parameters                 | 1B-100B+ parameters          |
| Downstream usage        | fine-tuning                        | fine-tuning & prompting      |
| Emergent abilities      | domain transfer from small data    | zero/few-shot learning, in-context learning, chain-of-thought |
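The "autoregressive generation" training and decoding regime in the table above can be made concrete with a toy sketch: at each step the model turns the context into a distribution over the next token, samples one, and appends it. The bigram table here is a made-up stand-in for a real network's next-token softmax, chosen only to keep the example self-contained.

```python
import random

# Hypothetical next-token distributions P(next | previous), standing in
# for a neural language model's softmax output over the vocabulary.
BIGRAM = {
    "<s>":       {"the": 0.6, "a": 0.4},
    "the":       {"students": 0.5, "model": 0.5},
    "a":         {"model": 1.0},
    "students":  {"opened": 1.0},
    "model":     {"generates": 1.0},
    "opened":    {"</s>": 1.0},
    "generates": {"</s>": 1.0},
}

def generate(max_len=10, seed=0):
    """Autoregressive decoding: each step conditions on what has been
    generated so far (here only the last token, since the toy model is
    a bigram model) and samples the next token from P(next | context)."""
    rng = random.Random(seed)
    tokens = ["<s>"]
    while tokens[-1] != "</s>" and len(tokens) < max_len:
        dist = BIGRAM[tokens[-1]]
        words, probs = zip(*dist.items())
        tokens.append(rng.choices(words, weights=probs)[0])
    return tokens[1:-1]  # strip the sentence-boundary markers

print(generate())
```

A masked ("Mask & Predict") model like BERT instead sees the whole sentence with some positions hidden and fills them in, which is why PLMs in the left column lean toward understanding tasks while autoregressive LLMs lean toward generation.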

GPT-3 overview
- GPT-3 (Generative Pre-trained Transformer 3) is an autoregressive language model whose purpose is to use deep learning to generate natural language that humans can understand.
- GPT-3 was trained and developed by OpenAI, an AI company based in San Francisco; its design is based on the Transformer language model developed at Google.
- GPT-3's neural network has 175 billion parameters, the most of any neural network model at the time of its release.
- OpenAI published the GPT-3 paper in May 2020 and released a beta of the API to a small number of companies and developer teams the following month.
- On September 22, 2020, Microsoft announced that it had obtained an exclusive license to GPT-3.

The GPT-3 model family
- ELMo: 93M params, 2-layer biLSTM
- BERT-base: 110M params, 12-layer Transformer
- BERT-large: 340M params, 24-layer Transformer
- The language model "scaling wars"!
(Mohit Iyyer, slides for CS685 Fall 2020, University of Massachusetts Amherst)

GPT-3 data sources

| Dataset   | Tokens (billion) | Assumptions    | Tokens per byte (tokens/bytes) | Ratio  | Size (GB) |
|-----------|------------------|----------------|--------------------------------|--------|-----------|
| Web data  | 410B             |                | 0.71                           | 1:1.9  | 570       |
| WebText2  | 19B              | 25% > WebText  | 0.38                           | 1:2.6  | 50        |
| Books1    | 12B              | Gutenberg      | 0.57                           | 1:1.75 | 21        |
| Books2    | 55B              | Bibliotik      | 0.54                           | 1:1.84 | 101       |
| Wikipedia | 3B               | See RoBERTa    | 0.26                           | 1:3.8  | 11.4      |
| Total     | 499B             |                |                                |        | 753.4     |

Table: GPT-3 datasets. (In the original: disclosed values in bold, determined values in italics.)
(Alan D. Thompson, GPT-3.5 + ChatGPT: An illustrated overview, https://lifearchitect.ai/chatgpt/)

GPT-3 data sources (continued)
- Comparison of data sources with other large-scale language models

GPT-3 training data volume
Token counts used to train large language models, per the latest available figures:
- GPT-3 (May 2020): 500B (500 billion) tokens
- Google's PaLM (April 2022): 780B
- DeepMind's Chinchilla: 1400B
- PanGu-α disclosed its training token count, about 40B, less than one tenth of GPT-3's
- Other large Chinese models have not disclosed their training token counts

For comparison, earlier models:
- ELMo: 1B training tokens
- BERT: 3.3B training tokens
- RoBERTa: 30B training tokens
The language model "scaling wars"!
(Mohit Iyyer, slides for CS685 Fall 2020, University of Massachusetts Amherst)

GPT-3 compute consumption
- The language model "scaling wars", plotted on a log scale!
(Mohit Iyyer, slides for CS685 Fall 2020, University of Massachusetts Amherst)

Few-shot and zero-shot learning (in-context learning)
(Brown et al., Language Models are Few-Shot Learners, arXiv:2005.14165, 2020)

Chain-of-thought
Preprint: https://arxiv.org/pdf/2201.11903.pdf
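In-context learning as described by Brown et al. amounts to putting demonstrations in the prompt rather than updating any weights. A minimal sketch of building zero-shot and few-shot prompts follows; the translation task, demonstrations, and `Input:`/`Output:` format are invented for illustration, not taken from the GPT-3 paper.

```python
def build_prompt(task_instruction, demonstrations, query):
    """Few-shot (in-context) prompt: demonstrations appear in the context
    and the model is expected to continue the pattern. No parameters are
    updated; with an empty demonstration list this is a zero-shot prompt."""
    parts = [task_instruction]
    for x, y in demonstrations:
        parts.append(f"Input: {x}\nOutput: {y}")
    parts.append(f"Input: {query}\nOutput:")  # the model completes this line
    return "\n\n".join(parts)

demos = [("cheese", "fromage"), ("book", "livre")]  # hypothetical examples
prompt = build_prompt("Translate English to French.", demos, "house")
print(prompt)
```

The resulting string is sent to the model as-is; "learning" the task happens entirely within the forward pass over this context, which is why model scale matters so much for this ability.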

Magic word: "Let's think step by step"

(a) Few-shot
  Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. Each can has 3 tennis balls. How many tennis balls does he have now?
  A: The answer is 11.
  Q: A juggler can juggle 16 balls. Half of the balls are golf balls, and half of the golf balls are blue. How many blue golf balls are there?
  A: (Output) The answer is 8. ✗

(b) Few-shot-CoT
  Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. Each can has 3 tennis balls. How many tennis balls does he have now?
  A: Roger started with 5 balls. 2 cans of 3 tennis balls each is 6 tennis balls. 5 + 6 = 11. The answer is 11.
  Q: A juggler can juggle 16 balls. Half of the balls are golf balls, and half of the golf balls are blue. How many blue golf balls are there?
  A: (Output) The juggler can juggle 16 balls. Half of the balls are golf balls. So there are 16 / 2 = 8 golf balls. Half of the golf balls are blue. So there are 8 / 2 = 4 blue golf balls. The answer is 4. ✓

(c) Zero-shot
  Q: A juggler can juggle 16 balls. Half of the balls are golf balls, and half of the golf balls are blue. How many blue golf balls are there?
  A: The answer (arabic numerals) is
  (Output) 8 ✗

(d) Zero-shot-CoT (Ours)
  Q: A juggler can juggle 16 balls. Half of the balls are golf balls, and half of the golf balls are blue. How many blue golf balls are there?
  A: Let's think step by step.
  (Output) There are 16 balls in total. Half of the balls are golf balls. That means that there are 8 golf balls. Half of the golf balls are blue. That means that there are 4 blue golf balls. ✓

Figure 1: Example inputs and outputs of GPT-3 with (a) standard Few-shot (Brown et al., 2020), (b) Few-shot-CoT (Wei et al., 2022), (c) standard Zero-shot, and (d) ours (Zero-shot-CoT). Similar to Few-shot-CoT, Zero-shot-CoT facilitates multi-step reasoning (blue text) and reaches the correct answer where standard prompting fails. Unlike Few-shot-CoT, which uses step-by-step reasoning examples per task, ours does not need any examples and just uses the same prompt "Let's think step by step" across all tasks (arithmetic, symbolic, commonsense, and other logical reasoning tasks).

In contrast to the excellent performance of LLMs in intuitive and single-step system-1 [Stanovich and West, 2000] tasks with task-specific few-shot or zero-shot prompting [Liu et al., 2021b], even language models at the scale of 100B or more parameters had struggled on system-2 tasks requiring slow and multi-step reasoning [Rae et al., 2021]. To address this shortcoming, [Wei et al., 2022; Wang et al., 2022] have proposed chain of thought prompting (CoT), which feeds LLMs with step-by-step reasoning examples rather than standard question and answer examples (see Fig. 1-a). Such chain of thought demonstrations facilitate models to generate a reasoning path that decomposes the complex reasoning into multiple easier steps. Notably with CoT, the reasoning performance then satisfies the scaling laws better and jumps up with the size of the language models. For example, when combined with the 540B parameter PaLM model [Chowdhery et al., 2022], chain of thought prompting significantly increases the performance over standard few-shot prompting across several benchmark reasoning tasks, e.g., GSM8K (17.9% -> 58.1%).

While the successes of CoT prompting [Wei et al., 2022], along those of many other task-specific prompting work [Gao et al., 2021; Schick and Schütze, 2021; Liu et al., 2021b], are often attributed to LLMs' ability for few-shot learning [Brown et al., 2020], we show that LLMs are decent zero-shot reasoners by adding a simple prompt, "Let's think step by step", to facilitate step-by-step thinking before answering each question (see Figure 1). Despite the simplicity, our Zero-shot-CoT successfully generates a plausible reasoning path in a zero-shot manner and reaches the correct answer in a problem where the standard zero-shot approach fails. Importantly, our Zero-shot-CoT is versatile and task-agnostic, unlike most prior task-specific prompt engineering in the forms of examples (few-shot) or templates (zero-shot) [Liu et al., 2021b]: it can facilitate step-by-step answers across various reasoning tasks, including arithmetic (MultiArith [Roy and Roth, 2015], GSM8K [Cobbe et al., 2021], AQUA-RAT [Ling et al., 2017], and SVAMP [Patel et al., 2021]), symbolic (Last Letter and Coin Flip), commonsense reasoning (CommonSenseQA [Talmor et al., 2019] and StrategyQA [Geva et al., 2021]), and other logical reasoning tasks (Date Understanding and Tracking Shuffled Objects from BIG-bench [big, 2021]) without modifying the prompt per task.

We empirically evaluate Zero-shot-CoT against other prompting baselines in Figure 1. While our Zero-shot-CoT underperforms Few-shot-CoT with carefully-crafted and task-specific step-by-step examples, Zero-shot-CoT achieves enormous score gains compared to the zero-shot baseline, e.g. from 17.7% to 78.7% on MultiArith and from 10.4% to 40.7% on GSM8K with a 175B-parameter model.

Preprint: http://arxiv.org/abs/2205.11916

Emergence and homogenization
(Bommasani et al., On the Opportunities and Risks of Foundation Models, arXiv:2108.07258 [cs.LG])

The scale matters: the emergence of abilities
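Kojima et al.'s Zero-shot-CoT, quoted above, is a two-stage prompting recipe: first append "Let's think step by step." to elicit a reasoning path, then feed that path back with an answer-extraction prompt. A sketch of the prompt plumbing follows; `call_model` is a hypothetical stand-in for a real LLM API call, and the canned `fake_model` exists only so the example runs on its own.

```python
def zero_shot_cot(question, call_model):
    """Two-stage Zero-shot-CoT prompting (after Kojima et al., 2022).

    call_model: hypothetical function str -> str standing in for an LLM call.
    Stage 1 elicits the reasoning path; stage 2 extracts the final answer.
    """
    reasoning_prompt = f"Q: {question}\nA: Let's think step by step."
    reasoning = call_model(reasoning_prompt)

    extraction_prompt = (
        f"{reasoning_prompt} {reasoning}\n"
        "Therefore, the answer (arabic numerals) is"
    )
    return call_model(extraction_prompt).strip()

# Demo with a canned fake model instead of a real API.
def fake_model(prompt):
    if "Therefore" in prompt:
        return " 4"
    return "There are 16 balls; 8 are golf balls; half of those are blue, so 4."

print(zero_shot_cot("A juggler can juggle 16 balls. Half of the balls are golf "
                    "balls, and half of the golf balls are blue. How many blue "
                    "golf balls are there?", fake_model))  # prints "4"
```

Note that the same trigger phrase is reused across tasks; only the question changes, which is exactly the task-agnostic property the paper emphasizes.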

35、101810201022102401020304050Accuracy(%)(A)Mod.arithmetic101810201022102401020304050BLEU(%)(B)IPA transliterate101810201022102401020304050Exact match(%)(C)Word unscrambleLaMDAGPT-3GopherChinchillaPaLMRandom101810201022102401020304050Exact match(%)(D)Figure of speech102010221024010203040506070Accuracy(

36、%)(E)TruthfulQA102010221024010203040506070Model scale(training FLOPs)Accuracy(%)(F)Grounded mappings102010221024010203040506070Accuracy(%)(G)Multi-task NLU102010221024010203040506070Accuracy(%)(H)Word in contextFigure 2:Eight examples of emergence in the few-shot prompting setting.Each point is a se

37、parate model.Theability to perform a task via few-shot prompting is emergent when a language model achieves random performanceuntil a certain scale,after which performance significantly increases to well-above random.Note that modelsthat used more training compute also typically have more parameters

38、hence,we show an analogous figure withnumber of model parameters instead of training FLOPs as the x-axis in Figure 7.AD:BIG-Bench(2022),2-shot.E:Lin et al.(2021)and Rae et al.(2021).F:Patel and Pavlick(2022).G:Hendrycks et al.(2021),Rae et al.(2021),and Hoffmann et al.(2022).H:Brown et al.(2020),Hof

39、fmann et al.(2022),and Chowdhery et al.(2022)on theWiC benchmark(Pilehvar and Camacho-Collados,2019).The ability to perform a task via few-shot prompt-ing is emergent when a model has random per-formance until a certain scale,after which perfor-mance increases to well-above random.Figure 2shows eigh

40、t such emergent abilities spanning fivelanguage model families from various work.BIG-Bench.Figure 2AD depicts four emergentfew-shot promptedtasks from BIG-Bench,acrowd-sourced suite of over 200 benchmarks for languagemodel evaluation(BIG-Bench,2022).Figure 2Ashows an arithmetic benchmark that tests

41、3-digitaddition and subtraction,as well as 2-digit multi-plication.GPT-3 and LaMDA(Thoppilan et al.,2022)have close-to-zero performance for severalorders of magnitude of training compute,beforeperformance jumps to sharply above random at2 1022training FLOPs(13B parameters)for GPT-3,and1023training F

42、LOPs(68B parameters)forLaMDA.Similar emergent behavior also occurs ataround the same model scale for other tasks,suchas transliterating from the International PhoneticAlphabet(Figure 2B),recovering a word from itsscrambled letters(Figure 2C),and detecting fig-ures of speech(Figure 2D).Even more emer

43、gentabilities from BIG-Bench are given in Table 1.TruthfulQA.Figure 2E shows few-shot promptedperformance on the TruthfulQA benchmark,whichmeasures the ability to answer questions truthfully(Lin et al.,2021).This benchmark is adversari-ally curated against GPT-3 models,which do notperform above rand

44、om,even when scaled to thelargest model size.Small Gopher models also donot perform above random until scaled up to thelargest model of5 1023training FLOPs(280Bparameters),for which performance jumps to morethan 20%above random(Rae et al.,2021).Grounded conceptual mappings.Figure 2Fshows the task of

45、 grounded conceptual mappings,where language models must learn to map a con-ceptual domain,such as a cardinal direction,rep-resented in a textual grid world(Patel and Pavlick,2022).Again,performance only jumps to aboverandom using the largest GPT-3 model.Multi-task language understanding.Figure 2Gsh

46、ows the Massive Multi-task Language Under-standing(MMLU)benchmark,which aggregates57 tests covering a range of topics including math,history,law,and more(Hendrycks et al.,2021).ForGPT-3,Gopher,and Chinchilla,models of1022training FLOPs(10B parameters)or smaller donotperformbetterthanguessingonaverag

47、eoverallthe topics,scaling up to35 1023training FLOPs(70B280B parameters)enables performance tosubstantially surpass random.This result is strik-ing because it could imply that the ability to solveknowledge-based questions spanning a large col-lection of topics might require scaling up past thisthre

48、shold(for dense language models without re-trieval or access to external memory).Word in Context.Finally,Figure 2H shows theWord in Context(WiC)benchmark(Pilehvar andCamacho-Collados,2019),which is a semantic un-derstanding benchmark.Notably,GPT-3 and Chin-chilla fail to achieve one-shot performance

49、 of bet-ter than random,even when scaled to their largestmodel size of5 1023FLOPs.Although these re-sults so far may suggest that scaling alone may notenable models to solve WiC,above-random perfor-mance eventually emerged when PaLM was scaledto2.51024FLOPs(540B parameters),which wasmuch larger than

50、 GPT-3 and Chinchilla.4Augmented Prompting StrategiesAlthough few-shot prompting is perhaps currentlythe most common way of interacting with largelanguage models,recent work has proposed severalother prompting and finetuning strategies to furtheraugment the abilities of language models.If a tech-niq
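The working definition in the excerpt, random performance until a certain scale followed by a jump to well above random, can be operationalized as a simple check over (scale, accuracy) points. The tolerance and jump margins below are arbitrary illustrative choices, and the toy curves are invented numbers shaped like Figure 2, not data from Wei et al.

```python
def is_emergent(curve, random_baseline, tol=2.0, jump=10.0):
    """Flag an ability as 'emergent' per the excerpt's working definition:
    near-random accuracy at small scales, then well-above-random later.

    curve: list of (training_flops, accuracy_percent), sorted by scale.
    tol/jump: illustrative margins in percentage points (assumptions,
    not values from the paper).
    """
    near_random = [acc <= random_baseline + tol for _, acc in curve]
    well_above = [acc >= random_baseline + jump for _, acc in curve]
    # Emergent if every point below some scale is near random and at
    # least one larger-scale point is well above random.
    for i in range(1, len(curve)):
        if all(near_random[:i]) and any(well_above[i:]):
            return True
    return False

# Toy curves shaped like Figure 2 (numbers invented for illustration).
modular_arith = [(1e18, 1.0), (1e20, 1.2), (1e22, 2.0), (1e24, 35.0)]
linear_task   = [(1e18, 5.0), (1e20, 15.0), (1e22, 25.0), (1e24, 35.0)]
print(is_emergent(modular_arith, random_baseline=1.0))  # True
print(is_emergent(linear_task,  random_baseline=1.0))   # False
```

The second curve improves smoothly with scale and is therefore not flagged, matching the distinction the excerpt draws between emergent abilities and ordinary scaling gains.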
