Introduction to Python for Econometrics, Statistics and Data Analysis

Kevin Sheppard
University of Oxford
Saturday 12th October, 2013

© 2012, 2013 Kevin Sheppard

Notes to the 2nd Edition

This edition includes the following changes from the first edition (March 2012):

- The preferred installation method is now Continuum Analytics' Anaconda. Anaconda is a complete scientific stack and is available for all major platforms.
- New chapter on pandas. pandas provides a simple but powerful tool to manage data and perform basic analysis. It also greatly simplifies importing and exporting data.
- New chapter on advanced selection of elements from an array.
- Numba provides just-in-time compilation for numeric Python code which often produces large performance gains when pure NumPy solutions are not available (e.g. looping code).
- Dictionary, set and tuple comprehensions
- Numerous typos
- All code has been verified working against Anaconda 1.7.0.

Contents

1 Introduction
    1.1 Background
    1.2 Conventions
    1.3 Important Components of the Python Scientific Stack
    1.4 Setup
    1.5 Testing the Environment
    1.6 Python Programming
    1.7 Exercises
    1.A register_python.py
2 Python 2.7 vs. 3 (and the rest)
    2.1 Python 2.7 vs. 3
    2.2 Intel Math Kernel Library and AMD Core Math Library
    2.3 Other Variants
    2.A Relevant Differences between Python 2.7 and 3
3 Built-in Data Types
    3.1 Variable Names
    3.2 Core Native Data Types
    3.3 Python and Memory Management
    3.4 Exercises
4 Arrays and Matrices
    4.1 Array
    4.2 Matrix
    4.3 1-dimensional Arrays
    4.4 2-dimensional Arrays
    4.5 Multidimensional Arrays
    4.6 Concatenation
    4.7 Accessing Elements of an Array
    4.8 Slicing and Memory Management
    4.9 import and Modules
    4.10 Calling Functions
    4.11 Exercises
5 Basic Math
    5.1 Operators
    5.2 Broadcasting
    5.3 Array and Matrix Addition (+) and Subtraction (-)
    5.4 Array Multiplication (*)
    5.5 Matrix Multiplication (*)
    5.6 Array and Matrix Division (/)
    5.7 Array Exponentiation (**)
    5.8 Matrix Exponentiation (**)
    5.9 Parentheses
    5.10 Transpose
    5.11 Operator Precedence
    5.12 Exercises
6 Basic Functions and Numerical Indexing
    6.1 Generating Arrays and Matrices
    6.2 Rounding
    6.3 Mathematics
    6.4 Complex Values
    6.5 Set Functions
    6.6 Sorting and Extreme Values
    6.7 Nan Functions
    6.8 Functions and Methods/Properties
    6.9 Exercises
7 Special Arrays
    7.1 Exercises
8 Array and Matrix Functions
    8.1 Views
    8.2 Shape Information and Transformation
    8.3 Linear Algebra Functions
    8.4 Exercises
9 Importing and Exporting Data
    9.1 Importing Data using pandas
    9.2 Importing Data without pandas
    9.3 Saving or Exporting Data using pandas
    9.4 Saving or Exporting Data without pandas
    9.5 Exercises
10 Inf, NaN and Numeric Limits
    10.1 inf and NaN
    10.2 Floating point precision
    10.3 Exercises
11 Logical Operators and Find
    11.1 [...]

[...] this indicates that
the command is running an interactive IPython session. Output will often appear after the console command, and will not be preceded by a command indicator.

    >>> x = 1.0
    >>> x + 2
    3.0

If the code block does not contain the console session indicator, the code contained in the block is intended to be executed in a standalone Python file.

    # Use the Python 3 print function under Python 2.7
    from __future__ import print_function
    import numpy as np

    x = np.array([1, 2, 3, 4])  # 1-dimensional array
    y = np.sum(x)               # sum of all elements
    print(x)
    print(y)

1.3 Important Components of the Python Scientific Stack

1.3.1 Python

Python 2.7.5 (or later, but in the Python 2.7.x family) is required. This provides the core Python interpreter.

1.3.2 NumPy

NumPy provides a set of array and matrix data types which are essential for statistics, econometrics and data analysis.

1.3.3 SciPy

SciPy contains a large number of routines needed for analysis of data. The most important include a wide range of random number generators, linear algebra routines and optimizers. SciPy depends on NumPy.

1.3.4 IPython

IPython provides an interactive Python environment which enhances productivity when developing code or performing interactive data analysis.

1.3.5 matplotlib

matplotlib provides a plotting environment for 2D plots, with limited support for 3D plotting.

1.3.6 pandas

pandas provides high-performance data structures.

1.3.7 Performance Modules

A number of modules are available to help with performance. These include Cython and Numba. Cython is a Python module which facilitates using a simple Python-derived creole to write functions that can be compiled to native (C code) Python extensions. Numba uses a method of just-in-time compilation to translate a subset of Python to native code using Low-Level Virtual Machine (LLVM).

1.4 Setup

The recommended method to install
the Python scientific stack is to use Continuum Analytics' Anaconda. Instructions are also provided for directly installing Python and the required modules if it isn't possible to install Anaconda.

1.4.1 Continuum Analytics Anaconda

Anaconda, a free product of Continuum Analytics (www.continuum.io), is a virtually complete scientific stack for Python. It includes both the core Python interpreter and standard libraries as well as most modules required for data analysis. Anaconda is free to use and modules for accelerating the performance of linear algebra on Intel processors using the Math Kernel Library (MKL) are available (free to academic users and for a small cost to non-academic users). Continuum Analytics also provides other high-performance modules for reading large data files or using the GPU to further accelerate performance for an additional, modest charge. Most importantly, installation is extraordinarily easy on Windows, Linux and OSX. Anaconda is also simple to update to the latest version using

    conda update conda
    conda update anaconda

Windows

Installation on Windows requires downloading the installer and running it. These instructions use ANACONDA to indicate the Anaconda installation directory (e.g. the default is C:\Anaconda). Once the setup has completed, open a command prompt (cmd.exe) and run

    cd ANACONDA
    conda update conda
    conda update anaconda
    conda create -n econometrics cython distribute ipython-notebook ipython-qtconsole jinja2 lxml matplotlib nose numba numexpr numpy pandas pip pygments pytables pywin32 scipy statsmodels xlrd xlwt

which will first ensure that Anaconda is up-to-date and then create a virtual environment named econometrics. The virtual environment provides a set of components which will not change even if Anaconda is updated. Using a virtual environment is a best practice and is important since component updates can lead to errors in otherwise working programs due to backward incompatible changes in a module. The long list of modules in the conda create command includes all of those that will be used in these notes. It is also possible to install all available packages using the command conda create -n econometrics anaconda.

Figure 1.1: IPython running in the standard Windows console (cmd.exe).

The econometrics environment must be activated before use. This is accomplished by running

    ANACONDA\Scripts\activate.bat econometrics

from the command prompt, which prepends the environment name (econometrics) to the prompt as an indication that the virtual environment is active. Activate the econometrics environment and then run

    pip install openpyxl

which installs two packages not directly available in Anaconda. The
final step is to create launchers for both the virtual environment and the IPython interactive Python console. First, open a text editor, enter

    cmd /k ANACONDA\Scripts\activate econometrics

and save the file as ANACONDA\envs\econometrics\python-econometrics.bat. The batch file will open a command prompt in the econometrics virtual environment. Right click on the batch file and select Send To, Desktop (Create Shortcut) which will place a shortcut on the desktop. Next, create a launcher to run IPython in the standard Windows cmd.exe console. Open a text editor, enter

    cmd /c ANACONDA\Scripts\activate econometrics & start ipython.exe --pylab

and save the file as ANACONDA\envs\econometrics\ipython-plain.bat. Finally, right click on ipython-plain.bat and select Send To, Desktop (Create Shortcut). The icon of the shortcut will be generic, and if you want a more meaningful icon, select the properties of the shortcut, and then Change Icon, and navigate to c:\Anaconda\envs\econometrics\Menu and select IPython.ico. Opening the batch file should create a window similar to that in figure 1.1.

The Windows command interpreter (cmd.exe) is very limited compared to the consoles available on other platforms. Fortunately, cmd.exe can be replaced with an upgraded version known as Console2. To use Console2, extract the contents of the zip file Console-2.00b148-Beta_64bit.zip (for example, to ANACONDA\Console2). Launch Console.exe, and select Edit, Settings, Tabs. Click on Add, and input the following:

    Title        IPython (Pylab)
    Icon         Navigate to ANACONDA\envs\econometrics\Menu and select IPython.ico.
    Shell        cmd /k ANACONDA\Scripts\activate.bat econometrics & python ANACONDA\envs\econometrics\Scripts\ip
    Startup dir  ANACONDA\envs\econometrics

Figure 1.2: IPython running in a QtConsole session.

This environment can be accessed by setting IPython (Pylab) as the default tab in Console2, or by explicitly opening a new tab with this environment.

A third option, known as the QtConsole, is provided by IPython. The QtConsole offers additional features such
as running multiple sessions simultaneously or having figures appear inline with code. Begin by entering the following command in a text editor,

    cmd /c cd ANACONDA\Scripts & activate econometrics & start pythonw ANACONDA\envs\econometrics\Scripts\ipython-script.py qtconsole --pylab=qt4 --colors=linux --ConsoleWidget.font_size=11 --ConsoleWidget.font_family="Bitstream Vera Sans Mono"

and then save the file as ANACONDA\envs\econometrics\ipython-qtconsole.bat. Create a shortcut for this batch file, and change the icon if desired. The trailing options, such as --colors=linux, affect the visual appearance of the QtConsole. The options listed here are my preferred setup, and assume that the free font Bitstream Vera Sans Mono has been installed. Opening the batch file should create a window similar to that in figure 1.2.

Linux and OSX

Installation on Linux requires executing

    bash Anaconda-x.y.z-Linux-ISA.sh

where x.y.z will depend on the version being installed and ISA will be either x86 or more likely x86_64. The OSX installer is available either as a GUI installer (pkg format) or as a bash installer which is installed in an analogous manner to the Linux installation. After installation completes, change to the folder where Anaconda installed (written here as ANACONDA, default ~/anaconda) and execute

    cd ANACONDA
    cd bin
    ./conda update conda
    ./conda update anaconda
    ./conda create -n econometrics cython distribute ipython-notebook ipython-qtconsole jinja2 lxml matplotlib nose numba numexpr numpy pandas pip pygments pytables scipy statsmodels xlrd xlwt

which will first ensure that Anaconda is up-to-date and then create a virtual environment named econometrics with the required packages. To activate the newly created environment,
run

    source ANACONDA/bin/activate econometrics

and then run the command

    pip install openpyxl

to install two packages not included in Anaconda. The standard IPython environment can be launched in the system console using

    ipython --pylab

or the IPython-provided QtConsole can be launched using

    ipython qtconsole --pylab

Further options can be passed to IPython to improve the appearance of the QtConsole. For example,

    ipython qtconsole --pylab=qt4 --colors=linux --ConsoleWidget.font_size=11 --ConsoleWidget.font_family="Bitstream Vera Sans Mono"

1.4.2 Installation without Anaconda

Anaconda greatly simplifies installing the scientific Python stack. However, there may be situations where installing Anaconda is not possible, and so (substantially more complicated) instructions are included for both Windows and Linux.

Windows

The Windows packages required for these notes, along with the version and Windows installation file of each, are:

    Package      Version   File name
    Python       2.7.5     python-2.7.5.amd64
    Setuptools   1.1.5     setuptools-1.1.5.win-amd64-py2.7
    Pip          1.4.1     pip-1.4.1.win-amd64-py2.7
    Virtualenv   1.10.1    virtualenv-1.10.1.win-amd64-py2.7
    Jinja2       2.7.1     Jinja2-2.7.1.win-amd64-py2.7.exe
    Tornado      3.1.1     tornado-3.1.1.win-amd64-py2.7.exe
    PyCairo      1.10.0    pycairo-1.10.0.win-amd64-py2.7
    PyZMQ        13.1.0    pyzmq-13.1.0.win-amd64-py2.7
    PyQt         4.9.6-1   PyQt-Py2.7-x64-gpl-4.9.6-1
    NumPy        1.7.1     numpy-MKL-1.7.1.win-amd64-py2.7
    SciPy        0.12.0    scipy-0.12.0.win-amd64-py2.7
    MatplotLib   1.3.0     matplotlib-1.3.0.win-amd64-py2.7
    pandas       0.12.0    pandas-0.12.0.win-amd64-py2.7
    IPython      1.1.0     ipython-1.1.0.win-amd64-py2.7

These remaining packages are optional and are only discussed in the final chapters related to performance.

    Package      Version   File name
    Performance
    Cython       0.19.1    Cython-0.19.1.win-amd64-py2.7
    LLVMPy       0.12.0    llvmpy-0.12.0.win-amd64-py2.7
    LLVMMath     0.1.1     llvmmath-0.1.1.win-amd64-py2.7
    Meta         0.1.0     meta-0.1.0dev.win-amd64-py2.7
    Numba        0.10.2    numba-0.10.2.win-amd64-py2.7
    pandas (Optional)
    Bottleneck   0.7.0     Bottleneck-0.7.0.win-amd64-py2.7
    NumExpr      2.2.1     numexpr-2.2.1.win-amd64-py2.7
    Patsy        0.2.1     patsy-0.2.1.win-amd64-py2.7
    Statsmodels  0.5.0     statsmodels-0.5.0.win-amd64-py2.7
    PyTables     3.0.0     tables-3.0.0.win-amd64-py2.7

Begin by installing Python, setuptools, pip and virtualenv. After these four packages are installed, open an elevated command prompt (cmd.exe with administrator privileges) and initialize the virtual environment using the command:

    cd C:\Dropbox
    virtualenv econometrics

I prefer to use my Dropbox as the location for virtual environments and have named the virtual environment econometrics. The virtual environment can be located anywhere (although best practice is to use a path without spaces) and can have a different name. Throughout the remainder of this section, VIRTUALENV will refer to the complete directory containing the virtual environment (e.g. C:\Dropbox\econometrics).

Once the virtual environment setup is complete, run

    cd VIRTUALENV\Scripts
    activate.bat
    pip install xlrd xlwt openpyxl pyreadline python-dateutil pytz==2013d pygments pyparsing

which activates the virtual environment and installs some additional required packages. Finally, before installing the remaining packages, it is necessary to register the virtual environment as the de
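
The package table in this section pins a specific version for each component, so it can be useful to confirm what an environment actually contains. The following is a minimal sketch of such a check, not part of the original notes. It assumes Python 3.8 or later (for the standard-library importlib.metadata module) rather than the Python 2.7 interpreter the notes target. The names REQUIRED, installed_version and check_stack are hypothetical helpers introduced only for illustration, and only a subset of the table is listed.

```python
from importlib import metadata  # standard library from Python 3.8 (assumption)

# Subset of the versions listed in the package table (illustrative only).
REQUIRED = {
    "numpy": "1.7.1",
    "scipy": "0.12.0",
    "matplotlib": "1.3.0",
    "pandas": "0.12.0",
    "ipython": "1.1.0",
}


def installed_version(name):
    """Return the installed version string of a distribution, or None."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def check_stack(required, lookup=installed_version):
    """Map each package name to a (required, found, matches) tuple."""
    report = {}
    for name, wanted in required.items():
        found = lookup(name)
        report[name] = (wanted, found, found == wanted)
    return report


if __name__ == "__main__":
    # Print one line per package, flagging any version that differs.
    for name, (wanted, found, ok) in sorted(check_stack(REQUIRED).items()):
        status = "ok" if ok else "differs"
        print("{0:<12} required {1:<8} found {2} ({3})".format(
            name, wanted, found, status))
```

The lookup argument is there so the comparison logic can be exercised without the packages installed, for example by passing the get method of a plain dictionary.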

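
Both installation routes described in these notes rely on the econometrics environment being active before packages are installed or IPython is launched. A rough way to confirm this from inside Python is sketched below. It is an illustration under stated assumptions, not the author's method: it assumes that conda's activation sets the CONDA_DEFAULT_ENV environment variable (true of current conda releases), and that a virtualenv shows up as a difference between sys.prefix and sys.base_prefix (or sys.real_prefix for older virtualenv releases). The function name active_environment is hypothetical.

```python
import os
import sys


def active_environment(environ=None, prefix=None, base_prefix=None):
    """Describe the active environment as a (kind, value) tuple.

    Returns ("conda", name), ("virtualenv", path) or ("system", None).
    The arguments exist only so the logic can be tested with fake values.
    """
    environ = os.environ if environ is None else environ
    prefix = sys.prefix if prefix is None else prefix
    if base_prefix is None:
        # Older virtualenv sets sys.real_prefix; venv sets sys.base_prefix.
        base_prefix = getattr(
            sys, "real_prefix", getattr(sys, "base_prefix", prefix))

    # Assumption: conda activation exports CONDA_DEFAULT_ENV.
    conda_name = environ.get("CONDA_DEFAULT_ENV")
    if conda_name:
        return ("conda", conda_name)
    # A virtual environment has a prefix different from the base interpreter.
    if prefix != base_prefix:
        return ("virtualenv", prefix)
    return ("system", None)


if __name__ == "__main__":
    print(active_environment())
```

If the result is ("system", None) after running the activation commands, the environment was most likely not activated in the current console.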