Lecture Notes: Multicollinearity, Heteroskedasticity, and Autocorrelation in SAS


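1. How to output results

    /* OLS regression */
    proc reg data=new sse outest=outest1;
      model Y = x2 x3 x4 x5 x6 x7 / dw;
      output out=out1 r=e p=ey;
      title 'ols regression';
    run;

(1) The OUTEST= data set outest1 holds the goodness-of-fit measures _RSQ_ and _RMSE_, the model's residual sum of squares _SSE_, and the estimated coefficient of each regressor.
(2) To write the residual series and the fitted-value series to a single data set, use the statement
    output out=out1 r=e p=ey;
(3) For the Chow statistic in structural tests, which requires several separate regressions, write the _SSE_ of each regression to its own data set, then merge the data sets and compute the statistic.

2. Testing for multicollinearity

The variance inflation factor (VIF), tolerance (TOL), and condition number methods were covered in Lecture Notes 2 and are simple, so they are not repeated here. This section covers the principal components method.

    proc princomp data=new out=result outstat=stat;
      /* regressors only; without VAR, every numeric variable (including Y) is analyzed */
      var x2 x3 x4 x5 x6 x7;
    run;
    proc print data=stat;
      title 'Multicollinearity test for princomp';
    run;
    proc reg data=new pcomit=1 outest=outestmc;
      model Y = x2 x3 x4 x5 x6 x7;
      output out=result1 p=yyy r=rrr;
    run;

(1) outstat=stat creates an output data set containing the means, standard deviations, number of observations, correlation or covariance matrix, eigenvalues, and eigenvectors.
(2) In principal components regression, pcomit=k sets k, the number of principal components left out of the analysis. The MODEL statement still lists the dependent variable and all original regressors. The mechanism: the retained principal components are linear combinations of the original regressors, so the procedure first regresses the dependent variable on those components and then expresses the result as a relationship between the dependent variable and the original regressors. These coefficients are written to the OUTEST= data set (outestmc here).

3. Testing for heteroskedasticity

(1) When the model is heteroskedastic, let the covariance matrix of the disturbances be $\sigma^2\Omega$, so that the covariance matrix of the OLS estimator $b$ is $\sigma^2 (X'X)^{-1} X'\Omega X (X'X)^{-1}$. This requires an estimate of $\sigma^2 X'\Omega X$. White (1980) showed that $S_0 = \frac{1}{n}\sum_i e_i^2 x_i x_i'$ is a consistent estimator of $\frac{1}{n}\sigma^2 X'\Omega X$, which gives $\widehat{\mathrm{Var}}(b) = n (X'X)^{-1} S_0 (X'X)^{-1}$. The F test used under homoskedasticity is no longer valid; a Wald statistic built on this estimated covariance matrix is used instead. The key step is computing the White estimator:

    proc reg data=new sse outest=outestacov;
      model Y = x2 x3 x4 x5 x6 x7 / acov;
      output out=out1 r=e p=ey;
      title 'ols regression';
    run;

The covariance matrix printed by the ACOV option is this estimate of $\mathrm{Var}(b)$; the Wald statistic can be computed from it.

(2) White test and modified Breusch-Pagan (Koenker-Bassett) test

    /* White test and Breusch-Pagan test */
    proc model data=new;
      parms const bata2 bata3 bata4 bata5 bata6 bata7;
      Y = const + bata2*x2 + bata3*x3 + bata4*x4 + bata5*x5 + bata6*x6 + bata7*x7;
      fit Y / white breusch=(1 x2 x3 x4 x5 x6 x7);
    run;

(3) Group method (Goldfeld-Quandt test)

H0: homoskedasticity. H1: heteroskedasticity. Split the sample into two groups, group I and group II, and run the regression separately on each, giving one residual sum of squares per group. The program is:

    data goldqut; set out1; run;
    proc sort data=goldqut; by x2; run;
    proc print data=goldqut; run;
    data m1 m2;
      set goldqut;
      if _n_ <= 9 then output m1;          /* group I: the 9 smallest values of x2 */
      if 21 <= _n_ <= 29 then output m2;   /* group II: the 9 largest values of x2 */
    run;
    proc print data=m1; run;
    proc print data=m2; run;

After sorting and splitting the data, the test statistic can be computed step by step from its definition.

(4) Glejser test

(a) Assumption. Based on experience, heteroskedasticity in economic data is taken to have one of three possible forms: linear, quadratic, or exponential.
(b) Steps. Regress Y on x2, x3, x4, x5, x6, x7 by ordinary least squares to obtain the residual series. Take the regressors of the original model, x2, x3, x4, x5, x6, x7, as the explanatory variables and run the regression for each of the three forms to obtain the estimate a, then compute the Wald statistic. The key step is again estimating the variance of a:

    data a; set out1; eee = e*e; run;      /* squared OLS residuals */
    proc print data=a; run;
    proc reg data=a outest=outest3(keep=intercept x2 x3 x4 x5 x6 x7);
      model eee = x2 x3 x4 x5 x6 x7;
      output out=out3 p=p3 r=e3;
    run;
    /* move the intercept to the last column of the coefficient row */
    data t1; set outest3; inter = intercept; drop intercept; run;
    proc print data=t1; run;
    /* residual-weighted regressors, one row e3*x per observation */
    data b; set out3;
      newx21 = e3*x2; newx31 = e3*x3; newx41 = e3*x4; newx51 = e3*x5;
      newx61 = e3*x6; newx71 = e3*x7; newe31 = e3;
    run;
    proc print data=b; run;
    proc iml;
      use b; read all into matrix1;
      /* column positions depend on the variable order in data set b */
      x = 2:8; y = 14:20; z = 1:29;
      nxx1 = matrix1[z, x];
      ns1 = matrix1[z, y];
      use t1; read all into matrix2;
      white1 = inv(nxx1`*nxx1)*(ns1`*ns1)*inv(nxx1`*nxx1);
      wald1 = matrix2*inv(white1)*(matrix2`);
      print white1 wald1;
    quit;

(5) GLS. The square-root transformation can be programmed directly from its definition; it is not complicated.

4. Testing for autocorrelation

(1) DW test

    proc reg data=new sse outest=outest;
      model Y = x2 x3 x4 x5 x6 x7 / dw;
      output out=out1 r=e p=ey;
      title 'ols regression';
    run;

The statistic can also be programmed from its definition.

(2) B-G test (Breusch-Godfrey test)

Hypotheses: H0 (no autocorrelation) against H1 (autocorrelation). Steps:
(1) Regress Y on X by ordinary least squares to obtain the residual series $e_t$.
(2) Regress $e_t$ by ordinary least squares on X and the lagged residuals $e_{t-1}, \dots, e_{t-p}$ to obtain $R^2$.
(3) Compute the LM statistic, commonly $nR^2$, where $n$ is the sample size.
Choose the lag order p from the data, following the method Professor Jin described in class. Time series are often affected by seasonal factors, in which case p=4 is used. Because the test is simple, its programming is not covered in detail.

(3) Box-Pierce test

Hypotheses: H0 (no autocorrelation) against H1 (autocorrelation). Steps:
(1) Regress Y on X by ordinary least squares to obtain the residual series.
(2) Compute the sample autocorrelation coefficients $r_1, r_2, \dots, r_p$.
(3) Compute the test statistic.
To sum the components of a vector, use a program such as:

    proc means data=lagdata;
      var elage11 elage2 elage3;
      output out=sume sum=selage11 selage2 selage3;
    run;

(4) Handling autocorrelation when it is present

    proc autoreg data=m1;
      model Y = x2 x3 x4 x5 / nlag=10 backstep method=YW;
      title 'autoreg for YW';
    run;

method= selects the estimation method. The options include ML (maximum likelihood), YW (Yule-Walker), and ITYW (iterated Yule-Walker). These methods belong to later material on models with lagged variables; for now it is enough to know how to apply them.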
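The Goldfeld-Quandt program in section 3 (3) stops once the two groups m1 and m2 exist. The remaining computation might look like the sketch below. It assumes the error variance rises with x2, so group II goes in the numerator. The data set names est1, est2, gq and the variables sse1, sse2, df1, df2, F, p are hypothetical names introduced here, not part of the handout.

```sas
/* Sketch only: est1, est2, gq and the renamed variables are illustrative names. */

/* Fit the same model on each group; SSE and EDF add _SSE_ and _EDF_ to OUTEST= */
proc reg data=m1 sse edf outest=est1 noprint;
  model Y = x2 x3 x4 x5 x6 x7;
run;
proc reg data=m2 sse edf outest=est2 noprint;
  model Y = x2 x3 x4 x5 x6 x7;
run;
quit;

/* One-to-one merge of the two single-row data sets, then the F ratio */
data gq;
  merge est1(keep=_SSE_ _EDF_ rename=(_SSE_=sse1 _EDF_=df1))
        est2(keep=_SSE_ _EDF_ rename=(_SSE_=sse2 _EDF_=df2));
  F = (sse2/df2) / (sse1/df1);      /* assumes group II has the larger variance */
  p = 1 - probf(F, df2, df1);       /* upper-tail p-value of F(df2, df1) */
run;

proc print data=gq;
  title 'Goldfeld-Quandt F statistic (sketch)';
run;
```

With 9 observations per group and 7 estimated coefficients, each regression has only 2 error degrees of freedom, so the test has little power on these data. If the variance is instead expected to fall with x2, the ratio and its degrees of freedom should be swapped.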
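The Chow test in section 1 (3) and the Box-Pierce test in section 4 (3) both end with "compute the statistic from its definition". For reference, the usual textbook forms are sketched below. The notation is an assumption, not taken from the handout: $SSE_R$ is the residual sum of squares of the pooled regression, $SSE_1$ and $SSE_2$ are those of the two subsample regressions with $n_1$ and $n_2$ observations, $k$ is the number of estimated coefficients including the intercept, $e_t$ are the OLS residuals, $n$ is the sample size, and $p$ is the number of lags.

```latex
\begin{align*}
F_{\text{Chow}} &= \frac{(SSE_R - SSE_1 - SSE_2)/k}{(SSE_1 + SSE_2)/(n_1 + n_2 - 2k)}
  \;\sim\; F(k,\; n_1 + n_2 - 2k) \quad \text{under } H_0, \\[4pt]
r_j &= \frac{\sum_{t=j+1}^{n} e_t\, e_{t-j}}{\sum_{t=1}^{n} e_t^{2}}, \qquad j = 1, \dots, p, \\[4pt]
Q &= n \sum_{j=1}^{p} r_j^{2} \;\overset{a}{\sim}\; \chi^2(p) \quad \text{under } H_0 .
\end{align*}
```

For the model used throughout (an intercept plus x2 to x7), $k = 7$. The sums in the numerator and denominator of $r_j$ are sums of this kind that the PROC MEANS step in section 4 (3) produces.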
