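*CH-1 Basics
* Import a dataset that is not in .dta format (import delimited is the newer equivalent in Stata 13+)
insheet using class.csv, clear
* Rename the variables
rename (v1-v13) (xh xm pylb xy xs nj zy bj xq xkcgbz jxbh xnd xq2)
* Attach a label to a variable
label variable xh "student ID"
* Recode all missing values to 999
mvencode _all, mv(999)
* Drop observations (run this before xs itself is dropped)
drop if xs==999
* Drop a variable
drop xs
* Import an Excel file ("path" is a placeholder for the file location)
import excel "path"
* Compute the growth rate (assumes the data are sorted by time)
gen pop_gr=(pop-pop[_n-1])/pop[_n-1]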
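* A possible alternative, sketched under the assumption that the data contain a time variable (hypothetically named year here; it does not appear in the original notes).
* After tsset, the time-series operators L. and D. respect gaps in the series, which the [_n-1] subscript does not.
tsset year
gen pop_gr2 = D.pop / L.pop // (pop_t - pop_{t-1}) / pop_{t-1}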
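*CH-2, 3 Simple and multiple linear regression
regress // run a regression
predict e , resid // store the residuals in e
predict y1, xb // store the fitted values of y in y1
predict yhat /* same as predict y1, xb */
* Other options of predict:
rstandard // standardized residuals
stdr // standard error of the residual
sum educ, detail // detail reports additional statistics (percentiles, skewness, kurtosis)
count if female==1 // count the observations with female==1
sum educ if female==1 // summary statistics of educ for women
reg wage educ if married==1 // reg combined with a sample restriction (if qualifier)
reg lwage educ,noconstant // regression through the origin
_b[variable] // coefficient on variable
scalar price_0=_b[bdrms]+140*_b[sqrft] // define a scalar
corr // correlation coefficients
test // hypothesis tests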
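*CH-4 Hypothesis testing
test x1 // significance of the coefficient on x1
test x1 x2 x3 // joint significance of the coefficients on x1, x2 and x3
test x1=x2 // H0: the coefficients on x1 and x2 are equal
test x1=2 // H0: the coefficient on x1 equals 2
test _b[x1]=2 // the same test, written with _b[]
lincom // point estimate, standard error, test and inference for a linear combination of coefficients
lincom x1+ x2 // t test of β1+β2=0
lincom x1+ x2 - 1 // t test of β1+β2=1
testnl // tests of nonlinear hypotheses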
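* A minimal sketch of testnl syntax, assuming a regression with regressors x1 and x2 has just been run.
* The two hypotheses are purely illustrative and are not taken from the original notes.
testnl _b[x1]*_b[x2] = 1 // H0: β1*β2 = 1
testnl _b[x1]/_b[x2] = _b[x2] // H0: β1/β2 = β2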
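* Method 1: compare the t statistic with the critical value
// The alternative hypothesis is beta1 != 0
* With alpha = 0.05 and a two-sided test, the right tail holds probability 2.5%, so the critical value is the 97.5th percentile of the t distribution
* The t distribution has n-k degrees of freedom, here 173-4=169
* Because the t distribution is symmetric, it suffices to look at the right tail; the left tail is equivalent
scalar crit1 = invttail(e(df_r),0.025) // two-sided critical value at the 5% significance level
scalar t1 = _b[expendA]/_se[expendA]
display "t critical value 97.5 percentile = " crit1
display "t-statistic for H0: beta1 = 0 =" t1
* Method 2: compute the p-value
// With alternative beta1 != 0, the p-value is P(|T|>|t|) = 2P(T>|t|)
* The t statistic must be computed first
scalar p1 = 2*ttail(e(df_r), abs(t1))
scalar list t1 p1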
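* A further sketch along the same lines: a 95% confidence interval built from the critical value crit1 computed above.
* The scalar names ci_lo and ci_hi are introduced here for illustration only.
scalar ci_lo = _b[expendA] - crit1*_se[expendA]
scalar ci_hi = _b[expendA] + crit1*_se[expendA]
display "95% CI for beta1: [" ci_lo ", " ci_hi "]"
* H0: beta1 = 0 is rejected at the 5% level exactly when 0 lies outside this interval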
*CH-6 Further topics
reg price nox crime rooms dist stratio, beta // report standardized (beta) coefficients
* Computing various point effects
* (Turning point ...)
display -1*_b[rooms]/(2*_b[rooms2]) // turning point of the quadratic in rooms
display 100*(_b[rooms]+2*_b[rooms2]*5) // approximate % change in price when rooms rises from 5 to 6, assuming the dependent variable is log price
display 100*(_b[rooms]+2*_b[rooms2]*6) // approximate % change in price when rooms rises from 6 to 7, same assumption
* (At the mean ...)
sum lnox
scalar mean = r(mean) // keep the mean value of lnox
ereturn list // list the results stored after estimation, e.g. e(df_r), e(rss), e(mss)
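*CH-7 Dummy variables and the Chow test
*1. Basics of Dummy Independent Variables
gen male = (!female) // generate a dummy indicating male
* Two equivalent ways to write the model; beware of the dummy variable trap
reg wage female educ exper tenure // with constant
reg wage female educ exper tenure male, noconstant // without constant
display exp(_b[female]*1)-1 // exact proportional wage difference between women and men (meaningful when the dependent variable is log wage)
*- 2. Multiple Categories
* Generate dummies for multiple categories; note that each coefficient is interpreted relative to the base group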
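* A minimal sketch of one way to do this, assuming a categorical variable that is hypothetically named region (it is not part of the data used in these notes).
tabulate region, generate(region_) // creates one dummy per category: region_1, region_2, ...
reg lwage i.region educ exper tenure // factor-variable notation (Stata 11+) omits the base category automatically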
*- 3. Interaction Involving Dummies
* Pay attention to point effects and turning points
*- 4. Chow Test
* Method I
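* The regression that follows needs the female interaction terms, which these notes never create.
* A sketch of how they might be built; the definitions are assumptions inferred from the variable names in the test command.
gen femed = female*educ
gen femex = female*exper
gen femeq = female*expersq
gen femte = female*tenure
gen femtq = female*tenursq
* Note that fem* then expands to female plus these five interactions, provided no other variable name starts with fem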
qui reg lwage educ exper expersq tenure tenursq fem*
test female = femed = femex = femeq = femte = femtq = 0
* Method II (basic form of the Chow test)
* Step 1:
qui reg lwage educ exper expersq tenure tenursq if female==0
scalar rss1 = e(rss)
qui reg lwage educ exper expersq tenure tenursq if female==1
scalar rss2 = e(rss)
* Step 2:
qui reg lwage educ exper expersq tenure tenursq
scalar k = e(df_m)
scalar N = e(N)
scalar rss = e(rss)
* Step 3:
scalar d1 = k + 1
scalar d2 = N - 2*(k + 1)
scalar F = ((rss - (rss1 + rss2))/d1) / ((rss1 + rss2)/d2) // F statistics
scalar P = Ftail(d1,d2,F) // p-value (upper-tail probability of the F distribution)
display "F statistics = " F
display "p-value = " P
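* If only some of the coefficients are allowed to differ across groups, adjust the degrees of freedom of the F statistic accordingly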
*- 5. Effects of Education Rankings on Wage
use wage1.dta, clear
gen educ6_8 = (educ>=6 & educ<9)
gen educ9_11 = (educ>=9 & educ<12)
gen educ12 = (educ>=12)
gen educ_rank = 1 + educ6_8*1 + educ9_11*2 + educ12*3 // ranking variable for education, taking values 1 to 4 (educ<6, 6-8, 9-11, 12+)
reg lwage female educ_rank exper expersq tenure tenursq // one-unit increase in education rankings has a constant effect on lwage
reg lwage female educ6_8 educ9_11 educ12 exper expersq tenure tenursq
display exp(_b[educ12]*1)-1 // exact proportional wage difference between those with educ>=12 and the base group (educ<6)
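* CH-8 Heteroskedasticity
ssc install bpagan, replace // install command bpagan
ssc install whitetst, replace // install command whitetst
reg narr86 qemp86 inc86 black hispan, robust // OLS with heteroskedasticity-robust standard errors; the usual tests can then be run
* Heteroskedasticity-Robust LM Statistic
* H0: the coefficients on avgsen and avgsensq are both zero
gen avgsensq = avgsen^2 // used below but not created before this point
* Step 1: residuals from the restricted model
qui reg narr86 pcnv ptime86 qemp86 inc86 black hispan
predict u1, resid
* Step 2: residuals from regressing each excluded variable on the included regressors
qui reg avgsen pcnv ptime86 qemp86 inc86 black hispan
predict r1, resid
qui reg avgsensq pcnv ptime86 qemp86 inc86 black hispan
predict r2, resid
* Step 3: regress 1 on the products without a constant; LM = N - SSR
gen ur1 = u1*r1
gen ur2 = u1*r2
gen i = 1
qui reg i ur1 ur2, nocons
scalar lm1 = e(N)-e(rss)
scalar lpl = chi2tail(2,lm1)
display "Robust LM statistics = " lm1
display "Robust LM p-value = " lpl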
* Tests for heteroskedasticity
*The Breusch-Pagan Test for Heteroskedasticity
* Method I
qui reg narr86 avgsen avgsensq pcnv ptime86 qemp86 inc86 black hispan
estat hettest avgsen avgsensq pcnv ptime86 qemp86 inc86 black hispan
* Method II
qui reg narr86 avgsen avgsensq pcnv ptime86 qemp86 inc86 black hispan
bpagan avgsen avgsensq pcnv ptime86 qemp86 inc86 black hispan
*The White Test for Heteroskedasticity
use crime1.dta, clear
gen avgsensq = avgsen*avgsen
qui reg narr86 avgsen avgsensq pcnv ptime86 qemp86 inc86 black hispan
predict yhat, xb
predict u3, resid
* Method I (Original Form)
estat imtest, white
* Method II (Original Form)
whitetst // the same as above
* Method III (Alternate Form)
gen yhat2 = yhat^2
gen u3s = u3^2
qui reg u3s yhat yhat2
test yhat yhat2
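* The F test above has an LM counterpart. A sketch, reusing u3s, yhat and yhat2 from Method III:
* LM = N * R-squared from the auxiliary regression, compared with a chi-squared distribution with 2 degrees of freedom.
* The scalar names lm_w and p_w are introduced here for illustration only.
qui reg u3s yhat yhat2
scalar lm_w = e(N)*e(r2)
scalar p_w = chi2tail(2, lm_w)
display "White (special form) LM = " lm_w ", p-value = " p_w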
* WLS and FGLS (replace hi with your own weighting variable)
* Method I (WLS)
reg cigs lincome lcigpric educ age agesq restaurn [aw = 1/hi]
* Method II (FGLS)
qui reg cigs lincome lcigpric educ age agesq restaurn
predict ub, resid
gen lubar = log(ub*ub) // log of the squared residuals
qui reg lubar lincome lcigpric educ age agesq restaurn
predict cigsh, xb
gen cigse = exp(cigsh) // estimated variance function
reg cigs lincome lcigpric educ age agesq restaurn [aw = 1/cigse] // weight by the estimated variance rather than the placeholder hi
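*CH10-12 Time series
* Declare the data to be time series
tsset t
* Include seasonal dummies in the OLS regression
reg lchnimp lchempi lgas lrtwex befile6 affile6 afdec6 spr sum fall
* Joint significance test of the seasonal dummies
test (spr) (sum) (fall)
* The test fails to reject the null hypothesis, so no seasonal adjustment is needed.
* Durbin-Watson test
reg lchnimp lchempi lgas lrtwex befile6 affile6 afdec6
estat dwatson
* The test indicates serial correlation. The OLS coefficient estimates remain unbiased and consistent, but the usual standard errors are invalid, so inference based on regression (1) is no longer reliable.
* Breusch-Godfrey test
estat bgodfrey, lags(1/3)
* The test indicates serial correlation in the errors, tested at lag orders up to 3.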
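* A complementary check, sketched under the assumption of strictly exogenous regressors: regress the OLS residuals on their first lag and inspect the t statistic on the lag.
* The variable name uhat is introduced here for illustration only; predict still refers to the OLS regression estimated before the tests.
predict uhat, resid
reg uhat L.uhat // a significant coefficient on L.uhat points to AR(1) serial correlation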
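* Cochrane-Orcutt iteration
prais lchnimp lchempi lgas lrtwex befile6 affile6 afdec6, corc
* Prais-Winsten iteration
prais lchnimp lchempi lgas lrtwex befile6 affile6 afdec6
* Without iteration: prais var var var ..., twostep
* Newey-West standard errors (coefficient estimates are the same as OLS, only the standard errors differ; for use when both heteroskedasticity and serial correlation are present)
newey lchnimp lchempi lgas lrtwex befile6 affile6 afdec6, lag(3)
* Generate the first lag of the dependent variable
gen lchnimp1=L1.lchnimp
* OLS including the lagged dependent variable (the two commands are equivalent)
reg lchnimp lchnimp1 lchempi lgas lrtwex befile6 affile6 afdec6
reg lchnimp L.lchnimp lchempi lgas lrtwex befile6 affile6 afdec6
* Breusch-Godfrey test
estat bgodfrey, lags(1/4)
* The test indicates that serial correlation is still present
* Generate higher-order lags of y
gen lchnimp2=L2.lchnimp
gen lchnimp3=L3.lchnimp
* OLS including higher-order lags of y
reg lchnimp lchnimp1 lchnimp2 lchnimp3 lchempi lgas lrtwex befile6 affile6 afdec6
* Breusch-Godfrey test
estat bgodfrey, lags(1/4)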
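*CH-15 Instrumental variables
* 2SLS estimation with ivreg
ivreg lwage (educ=fatheduc motheduc) exper expersq
* Endogeneity (Durbin-Wu-Hausman) test (ivendog is a user-written command: ssc install ivendog)
ivendog
* Overidentifying restrictions test (Sargan test) (user-written: ssc install overid)
overid
* The test fails to reject the null hypothesis of the Sargan test, which is consistent with the instruments being exogenous.
// Use ivhettest to test for heteroskedasticity in the 2SLS results (user-written: ssc install ivhettest)
* Heteroskedasticity test for 2SLS
ivhettest
* The test indicates heteroskedasticity
* Generate the residuals
predict u, resid
* Generate the squared residuals
gen u2=u^2
* Auxiliary regression to estimate the form of the heteroskedasticity
reg u2 motheduc fatheduc exper expersq
* Generate the weights used to correct for heteroskedasticity (the fitted values must be positive to serve as weights)
predict weight,xb
* Weighted 2SLS
ivreg lwage (educ=fatheduc motheduc) exper expersq [aw=1/weight]
* Return to education, in percent
scalar jyhb=_b[educ]*100
* Display the result
display jyhb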
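* For reference, a sketch of the same unweighted 2SLS workflow using the official ivregress command (Stata 10 or later) and its postestimation tools, assuming the same variables as above.
* This is an alternative route, not the method used in these notes.
ivregress 2sls lwage exper expersq (educ = fatheduc motheduc)
estat endogenous // Durbin and Wu-Hausman tests of endogeneity
estat overid // Sargan and Basmann tests of overidentifying restrictions
estat firststage // first-stage statistics, useful for judging instrument relevance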
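// For comparison: Breusch-Pagan test for heteroskedasticity after OLS
* OLS
reg lwage educ exper expersq
* B-P test
bpagan educ exper expersq
* Generate the residuals
predict u1, resid
* Generate the squared residuals
gen u12=u1^2
* Auxiliary regression to estimate the form of the heteroskedasticity
reg u12 exper expersq educ
* Generate the weights used to correct for heteroskedasticity (the fitted values must be positive to serve as weights)
predict weight1,xb
* WLS
reg lwage educ exper expersq [aw=1/weight1]
* Return to education, in percent
scalar jyhb1=_b[educ]*100
* Display the result
display jyhb1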