Stata: Linear Fixed- and Random-Effects Models

Digital Currency Exchange · 2022-09-27 08:29 · 150 · Connor

Linear fixed- and random-effects models

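Stata fits fixed-effects and random-effects models to both balanced and unbalanced panel data. We write the model as

y[i,t] = X[i,t]*b + u[i] + v[i,t]

where u[i] is the fixed or random effect and v[i,t] is the pure residual.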
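To see what the fixed-effects estimator does with u[i], it may help to write out the within transformation in the notation above. This is the standard textbook derivation, not a description of Stata's internal computations; the barred symbols are introduced here for the person-level means over time.

```latex
\begin{aligned}
y_{it} &= X_{it}\, b + u_i + v_{it} \\
\bar{y}_i &= \bar{X}_i\, b + u_i + \bar{v}_i \\
y_{it} - \bar{y}_i &= (X_{it} - \bar{X}_i)\, b + (v_{it} - \bar{v}_i)
\end{aligned}
```

Subtracting the second line from the first removes u_i. Any regressor that is constant within a person has X_{it} - \bar{X}_i = 0 for every t, so its coefficient cannot be estimated from within-person variation.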

xtreg is Stata's command for fitting fixed- and random-effects models.

xtreg, fe estimates the parameters of the fixed-effects model.

For example:

. use "C:\Users\Metrics\Desktop\nlswork.dta"
(National Longitudinal Survey of Young Women, 14-24 years old in 1968)

. xtset
Panel variable: idcode (unbalanced)
 Time variable: year, 68 to 88, but with gaps
         Delta: 1 unit

. xtreg ln_w grade age c.age#c.age ttl_exp c.ttl_exp#c.ttl_exp tenure ///
>     c.tenure#c.tenure 2.race not_smsa south, fe

note: grade omitted because of collinearity.
note: 2.race omitted because of collinearity.

Fixed-effects (within) regression               Number of obs     =     28,091
Group variable: idcode                          Number of groups  =      4,697

R-squared:                                      Obs per group:
     Within  = 0.1727                                         min =          1
     Between = 0.3505                                         avg =        6.0
     Overall = 0.2625                                         max =         15

                                                F(8,23386)        =     610.12
corr(u_i, Xb) = 0.1936                          Prob > F          =     0.0000

---------------------------------------------------------------------------------------
             ln_wage | Coefficient  Std. err.        t    P>|t|    [95% conf. interval]
---------------------+-----------------------------------------------------------------
               grade |           0  (omitted)
                 age |    .0359987   .0033864    10.63   0.000     .0293611    .0426362
         c.age#c.age |    -.000723   .0000533   -13.58   0.000    -.0008274   -.0006186
             ttl_exp |    .0334668   .0029653    11.29   0.000     .0276545     .039279
 c.ttl_exp#c.ttl_exp |    .0002163   .0001277     1.69   0.090    -.0000341    .0004666
              tenure |    .0357539   .0018487    19.34   0.000     .0321303    .0393775
   c.tenure#c.tenure |   -.0019701    .000125   -15.76   0.000    -.0022151   -.0017251
                race |
               Black |           0  (omitted)
            not_smsa |   -.0890108   .0095316    -9.34   0.000    -.1076933   -.0703282
               south |   -.0606309   .0109319    -5.55   0.000    -.0820582   -.0392036
               _cons |     1.03732   .0485546    21.36   0.000     .9421496     1.13249
---------------------+-----------------------------------------------------------------
             sigma_u |   .35562203
             sigma_e |   .29068923
                 rho |   .59946283   (fraction of variance due to u_i)
---------------------------------------------------------------------------------------
F test that all u_i=0: F(4696, 23386) = 6.65                  Prob > F = 0.0000

In the example above, we used factor variables. The terms c.age#c.age, c.ttl_exp#c.ttl_exp, and c.tenure#c.tenure are the squares of age, total work experience, and job tenure, respectively.
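As a rough check on the factor-variable notation, the sketch below fits a reduced model twice: once with c.age#c.age and once with a squared term generated by hand. It assumes the dataset is the same one Stata distributes through webuse nlswork (the article loads a local copy instead); the variable name age2 and the scalar names are made up for this illustration.

```stata
* Assumption: -webuse nlswork- loads the same data as the local nlswork.dta
webuse nlswork, clear
xtset idcode year

* Squared term written with factor-variable notation
quietly xtreg ln_wage age c.age#c.age, fe
scalar b_fv = _b[c.age#c.age]

* Squared term generated by hand (age2 is a hypothetical name)
generate double age2 = age^2
quietly xtreg ln_wage age age2, fe
scalar b_manual = _b[age2]

* The two coefficients should agree up to numerical precision
display "factor-variable: " scalar(b_fv) "   by hand: " scalar(b_manual)
```

The factor-variable form has the practical advantage that postestimation commands such as margins know the two terms involve the same underlying variable.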

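The syntax of all estimation commands is the same: the name of the dependent variable followed by the names of the independent variables.

Here the dependent variable ln_w (log of wage) is modeled as a function of several explanatory variables. Note that grade and black were omitted from the model because they do not vary within person.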
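One way to see why a regressor is dropped is to look at its within-person variation before fitting the model. The following is a minimal sketch, again assuming the webuse nlswork copy of the data; sd_grade is a hypothetical helper variable.

```stata
* Assumption: -webuse nlswork- loads the same data as the local nlswork.dta
webuse nlswork, clear
xtset idcode year

* A within standard deviation of zero means no within-person variation
xtsum grade race

* Alternative check: count person-years belonging to persons whose grade changes
by idcode, sort: egen double sd_grade = sd(grade)
count if sd_grade > 0 & !missing(sd_grade)
```

If the count is zero, grade is constant within every person who is observed more than once, and the fixed-effects estimator has nothing to identify its coefficient from.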

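Our dataset contains 28,091 "observations" on 4,697 people, each observed, on average, in 6.0 different years. An observation in our data is one person in a given year.

The dataset contains the variable idcode, which identifies the persons, that is, the i index in X[i,t]. Before fitting the model, we typed xtset to show that we had previously told Stata which variable is the panel variable.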
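If the panel structure has not been declared yet, it can be set explicitly. The sketch below assumes the webuse nlswork copy of the data and uses the idcode and year variables shown in the xtset output above.

```stata
* Assumption: -webuse nlswork- loads the same data as the local nlswork.dta
webuse nlswork, clear

* Declare the panel variable first, then the time variable
xtset idcode year

* Summarize the participation pattern of the unbalanced panel
xtdescribe
```

Typing xtset with no arguments afterwards simply redisplays the current settings, which is what the session above does.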

To fit the corresponding random-effects model, we use the same command but change the fe option to re.

. xtreg ln_w grade age c.age#c.age ttl_exp c.ttl_exp#c.ttl_exp tenure ///
>     c.tenure#c.tenure 2.race not_smsa south, re

Random-effects GLS regression                   Number of obs     =     28,091
Group variable: idcode                          Number of groups  =      4,697

R-squared:                                      Obs per group:
     Within  = 0.1715                                         min =          1
     Between = 0.4784                                         avg =        6.0
     Overall = 0.3708                                         max =         15

                                                Wald chi2(10)     =    9244.74
corr(u_i, X) = 0 (assumed)                      Prob > chi2       =     0.0000

---------------------------------------------------------------------------------------
             ln_wage | Coefficient  Std. err.        z    P>|z|    [95% conf. interval]
---------------------+-----------------------------------------------------------------
               grade |    .0646499   .0017812    36.30   0.000     .0611589    .0681409
                 age |    .0368059   .0031195    11.80   0.000     .0306918    .0429201
         c.age#c.age |   -.0007133     .00005   -14.27   0.000    -.0008113   -.0006153
             ttl_exp |    .0290208    .002422    11.98   0.000     .0242739    .0337678
 c.ttl_exp#c.ttl_exp |    .0003049   .0001162     2.62   0.009      .000077    .0005327
              tenure |    .0392519   .0017554    22.36   0.000     .0358113    .0426925
   c.tenure#c.tenure |   -.0020035   .0001193   -16.80   0.000    -.0022373   -.0017697
                race |
               Black |    -.053053   .0099926    -5.31   0.000    -.0726381   -.0334679
            not_smsa |   -.1308252   .0071751   -18.23   0.000    -.1448881   -.1167622
               south |   -.0868922   .0073032   -11.90   0.000    -.1012062   -.0725781
               _cons |    .2387207    .049469     4.83   0.000     .1417633    .3356781
---------------------+-----------------------------------------------------------------
             sigma_u |   .25790526
             sigma_e |   .29068923
                 rho |   .44045273   (fraction of variance due to u_i)
---------------------------------------------------------------------------------------
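The rho reported at the bottom of each xtreg table is the share of the total error variance attributed to u_i. As a quick check, it can be recomputed from the sigma_u and sigma_e values printed in the fixed-effects and random-effects output (rounded to four decimals here; the subscripts fe and re are labels added for this illustration):

```latex
\rho = \frac{\sigma_u^2}{\sigma_u^2 + \sigma_e^2}
\qquad
\rho_{\text{fe}} \approx \frac{0.3556^2}{0.3556^2 + 0.2907^2} \approx 0.599
\qquad
\rho_{\text{re}} \approx \frac{0.2579^2}{0.2579^2 + 0.2907^2} \approx 0.440
```

Both values agree with the reported .59946283 and .44045273 to three decimals.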

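We can also perform a Hausman specification test, which compares the consistent fixed-effects model with the efficient random-effects model. To do so, we must first store the results from the random-effects model, then refit the fixed-effects model so that those results become the current ones, and finally perform the test.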

. estimates store random_effects

. quietly xtreg ln_w grade age c.age#c.age ttl_exp c.ttl_exp#c.ttl_exp tenure ///
>     c.tenure#c.tenure 2.race not_smsa south, fe

. hausman . random_effects

                     |               ---- Coefficients ----
                     |     (b)           (B)          (b-B)     sqrt(diag(V_b-V_B))
                     |      .        random_eff~s   Difference       Std. err.
---------------------+------------------------------------------------------
                 age |    .0359987      .0368059     -.0008073      .0013177
         c.age#c.age |    -.000723     -.0007133     -9.68e-06      .0000184
             ttl_exp |    .0334668      .0290208      .0044459       .001711
 c.ttl_exp#c.ttl_exp |    .0002163      .0003049     -.0000886       .000053
              tenure |    .0357539      .0392519      -.003498      .0005797
   c.tenure#c.tenure |   -.0019701     -.0020035      .0000334      .0000373
            not_smsa |   -.0890108     -.1308252      .0418144      .0062745
               south |   -.0606309     -.0868922      .0262613      .0081345
----------------------------------------------------------------------------
                          b = Consistent under H0 and Ha; obtained from xtreg.
           B = Inconsistent under Ha, efficient under H0; obtained from xtreg.

Test of H0: Difference in coefficients not systematic

    chi2(8) = (b-B)'[(V_b-V_B)^(-1)](b-B)
            = 149.43
Prob > chi2 = 0.0000
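An equivalent workflow stores both sets of estimates under explicit names, so the test does not depend on which model happens to be current. This is a sketch under the assumption that webuse nlswork matches the local file used above; the local macro rhs and the name fixed_effects are introduced here for illustration, and the block is meant to be run as a whole do-file so that the local macro stays defined.

```stata
* Assumption: -webuse nlswork- loads the same data as the local nlswork.dta
webuse nlswork, clear
xtset idcode year

* Shared right-hand side (rhs is a hypothetical macro name)
local rhs grade age c.age#c.age ttl_exp c.ttl_exp#c.ttl_exp tenure ///
    c.tenure#c.tenure 2.race not_smsa south

* Consistent estimator first, stored by name
quietly xtreg ln_wage `rhs', fe
estimates store fixed_effects

* Efficient estimator second, stored by name
quietly xtreg ln_wage `rhs', re
estimates store random_effects

* hausman takes the always-consistent estimator first
hausman fixed_effects random_effects
```

A small p-value, as in the output above, is evidence against the random-effects assumption that u_i is uncorrelated with the regressors.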

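In addition, Stata can perform the Breusch and Pagan Lagrange multiplier (LM) test for random effects and can compute various predictions from the estimates, including predictions of the random effect itself.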
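A minimal sketch of both steps follows, assuming the webuse nlswork copy of the data and a shortened regressor list to keep the example brief; u_hat and xb_hat are hypothetical variable names.

```stata
* Assumption: -webuse nlswork- loads the same data as the local nlswork.dta
webuse nlswork, clear
xtset idcode year

* xttest0 must follow a random-effects fit
quietly xtreg ln_wage age c.age#c.age tenure not_smsa south, re

* Breusch and Pagan LM test of Var(u) = 0
xttest0

* Predicted random effect and linear prediction
predict double u_hat, u
predict double xb_hat, xb
summarize u_hat xb_hat
```

Rejecting Var(u) = 0 suggests that pooled OLS, which ignores the panel structure, is not appropriate.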

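Just as important as Stata's ability to fit statistical models to cross-sectional time-series data is its ability to provide meaningful summary statistics.

xtsum reports means and standard deviations in a way that makes sense for panel data: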

. xtsum hours

Variable         |      Mean   Std. dev.       Min        Max |    Observations
-----------------+--------------------------------------------+----------------
hours    overall |  36.55956   9.869623          1        168 |     N =   28467
         between |             7.846585          1       83.5 |     n =    4710
         within  |             7.520712  -2.154726   130.0596 | T-bar = 6.04395

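The negative minimum in the within row for hours is not a mistake; the within row shows the variation of hours within person around the global mean of 36.55956.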
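The within row can be approximated by hand, which also shows where the negative minimum comes from: each observation is expressed as a deviation from the person's own mean and then recentered on the global mean. This is a sketch assuming the webuse nlswork copy of the data; person_mean, hours_within, and grand_mean are hypothetical names.

```stata
* Assumption: -webuse nlswork- loads the same data as the local nlswork.dta
webuse nlswork, clear
xtset idcode year

* Global mean of hours over all person-years
quietly summarize hours
scalar grand_mean = r(mean)

* Person-specific mean of hours
by idcode, sort: egen double person_mean = mean(hours)

* Deviation from the person mean, recentered on the global mean
generate double hours_within = hours - person_mean + scalar(grand_mean)
summarize hours_within
```

A woman whose hours in one year fall far below her own average can therefore have a transformed value below zero even though no one works negative hours.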

xttab does the same for one-way tabulations:

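. xttab msp

                  Overall             Between            Within
      msp |    Freq.  Percent      Freq.  Percent        Percent
----------+-----------------------------------------------------
        0 |   11324     39.71      3113     66.08          62.69
        1 |   17194     60.29      3643     77.33          75.75
----------+-----------------------------------------------------
    Total |   28518    100.00      6756    143.41          69.73
                               (n = 4711)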

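msp is a variable that takes the value 1 if the surveyed woman is married and her spouse is present in the household, and 0 otherwise. Overall, about 60% of the person-year observations are msp. Taking women individually, 66% of the women are not msp at some point and 77% are msp at some point; the two percentages sum to more than 100 because some women are msp in some years and not in others. Taking one woman at a time, a woman who is ever msp has about 76% of her observations as msp, and a woman who is ever not msp has about 63% of her observations as not msp. (If marital status never varied in our data, the within percentages would all be 100.)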

xttrans reports the transition matrix:

. xttrans msp

1 if       |
married,   | 1 if married, spouse
spouse     |        present
present    |         0          1 |     Total
-----------+----------------------+----------
         0 |     80.49      19.51 |    100.00
         1 |      7.96      92.04 |    100.00
-----------+----------------------+----------
     Total |     37.11      62.89 |    100.00
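Each row of the matrix gives the percentage of women in a given state in one period who are found in each state in the next period. To see the underlying counts as well, the freq option can be added; the sketch assumes the webuse nlswork copy of the data.

```stata
* Assumption: -webuse nlswork- loads the same data as the local nlswork.dta
webuse nlswork, clear
xtset idcode year

* Transition frequencies alongside the row percentages
xttrans msp, freq
```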
