Stata Code for Chapter 5

Chapter 5: Foundations of Statistical Inference

Sample Data

distributions.csv: Simulated data drawn from different distributions.

wage.csv: Data on the wages of male workers in the United States in 1976.

temperature_aug.csv: Temperature data for Tokyo in August 2014.

5.1 The Logic of Statistical Hypothesis Testing

5.1.3 Coin-Tossing Example

clear
display binomial(100, 60, 0.5) - binomial(100, 40, 0.5)
.95395593
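
In Stata, `binomial(n, k, p)` returns the cumulative probability \(P(X \le k)\), so the difference above is \(P(41 \le X \le 60)\) for the number of heads \(X\) in 100 tosses of a fair coin. The sketch below is a cross-check and not part of the original listing: it recomputes the same probability by summing point probabilities with `binomialp()`, then shows the complementary tail probability and a continuity-corrected normal approximation. The scalar name `p_mid` is an illustrative choice.

```stata
* Sum P(X = k) for k = 41, ..., 60 with X ~ Binomial(100, 0.5)
scalar p_mid = 0
forvalues k = 41/60 {
    scalar p_mid = p_mid + binomialp(100, `k', 0.5)
}
display p_mid

* Probability of 40 or fewer heads, or 61 or more heads (the complement)
display binomial(100, 40, 0.5) + binomialtail(100, 61, 0.5)

* Normal approximation: mean 50, standard deviation 5, with continuity correction
display normal((60.5 - 50) / 5) - normal((40.5 - 50) / 5)
```

The first value should agree with the cumulative-probability difference, and the second should equal one minus that value.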

5.2 平均値の検定

5.2.5 Worked Example in R

import delimited "distributions.csv", clear
summarize dista, detail
                            distA
-------------------------------------------------------------
      Percentiles      Smallest
 1%     1.005143        1.00268
 5%     1.071364       1.007606
10%     1.242875       1.027831       Obs                 100
25%     1.436728       1.061649       Sum of wgt.         100

50%     2.139604                      Mean           2.042323
                        Largest       Std. dev.      .6105609
75%     2.566881       2.906963
90%     2.839471       2.909304       Variance       .3727846
95%     2.889779       2.940179       Skewness      -.1250327
99%     2.947699        2.95522       Kurtosis       1.643932
display sqrt(100) / sqrt(r(Var)) * r(mean)
33.449949
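
The value above is the \(t\) statistic \(t = \sqrt{n}\,(\bar{x} - \mu_0)/s\) with \(n = 100\) and, implicitly, \(\mu_0 = 0\). As a cross-check, the sketch below rebuilds it from the stored results of `summarize` and then runs Stata's one-sample `ttest`. Treating 0 as the hypothesized mean is an assumption read off the calculation above, not something the listing states.

```stata
import delimited "distributions.csv", clear
quietly summarize dista
* t = sqrt(n) * (sample mean - 0) / sample standard deviation
display sqrt(r(N)) * r(mean) / r(sd)
* One-sample t test of H0: mean of dista equals 0 (assumed null value)
ttest dista == 0
```

The `t` reported by `ttest` should match the displayed value.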

5.2.6 \(p\)-Value

display 1 - normal(1.250113) + normal(-1.250113)
.21125827
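
The expression above is the two-sided \(p\)-value under the standard normal distribution, \(p = 1 - \Phi(t) + \Phi(-t) = 2\,(1 - \Phi(|t|))\), evaluated at \(t = 1.250113\). The following minimal sketch stores the statistic in a scalar (the name `tstat` is illustrative) and computes the same quantity in a form that also works for negative statistics, together with the one-sided version.

```stata
scalar tstat = 1.250113
* Two-sided p-value: probability of a standard normal draw at least as extreme as |tstat|
display 2 * (1 - normal(abs(tstat)))
* One-sided (upper-tail) p-value
display 1 - normal(tstat)
```

By symmetry of the normal distribution, the one-sided value is half the two-sided one.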

5.3 回帰係数の検定

5.3.2 Simulating the Distribution of \(\hat{\beta}_1\)

clear
set seed 2022

postfile sim beta1 using betahats1000, replace

forvalues i = 1/10000 {
    quietly capture drop X Y
    quietly set obs 1000
    quietly generate X = rnormal(0, 1)
    quietly generate Y = 1 + 5 * X + rnormal(0, 1)
    quietly regress Y X
    post sim (e(b)[1, 1])
}

postclose sim
use betahats1000, clear
summarize beta1, detail
                            beta1
-------------------------------------------------------------
      Percentiles      Smallest
 1%     4.924835       4.876307
 5%      4.94757        4.88193
10%     4.959229       4.882976       Obs              10,000
25%     4.978507       4.887352       Sum of wgt.      10,000

50%     5.000195                      Mean           4.999877
                        Largest       Std. dev.      .0320337
75%     5.021528       5.115664
90%     5.040558       5.115983       Variance       .0010262
95%     5.051876       5.117807       Skewness      -.0005134
99%     5.074158       5.122896       Kurtosis       3.105771
histogram beta1, frequency bin(12)
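
Two informal checks on the simulated estimates may be useful here. Both are sketches under the design above rather than part of the original listing. With an error variance of 1 and \(\mathrm{Var}(X) = 1\), the variance of \(\hat{\beta}_1\) is approximately \(1/n\), where \(n\) is the sample size of each replication, so the reciprocal of the simulated variance should be close to \(n\). The reported variance of .0010262 gives roughly 974. The `normal` option of `histogram` overlays a normal density with the same mean and standard deviation as the estimates.

```stata
use betahats1000, clear
quietly summarize beta1
* Var(beta1-hat) is roughly 1/n under this design, so 1/variance approximates
* the per-replication sample size
display 1 / r(Var)
* Same histogram as above, with a fitted normal density overlaid
histogram beta1, frequency bin(12) normal
```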

5.3.6 Analysis Example in R

import delimited "wage.csv", clear
generate lwage = log(wage)

regress lwage educ exper
      Source |       SS           df       MS      Number of obs   =     3,010
-------------+----------------------------------   F(2, 3007)      =    333.00
       Model |  107.459149         2  53.7295746   Prob > F        =    0.0000
    Residual |  485.182496     3,007  .161351013   R-squared       =    0.1813
-------------+----------------------------------   Adj R-squared   =    0.1808
       Total |  592.641645     3,009  .196956346   Root MSE        =    .40169

------------------------------------------------------------------------------
       lwage | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
        educ |    .093168   .0036118    25.80   0.000     .0860863    .1002498
       exper |   .0406574   .0023344    17.42   0.000     .0360802    .0452346
       _cons |   4.666035     .06379    73.15   0.000     4.540958    4.791111
------------------------------------------------------------------------------
display (0.093168 / 0.0036118)
25.795448
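
The ratio above is the \(t\) statistic for \(H_0: \beta_{educ} = 0\), that is, the coefficient divided by its standard error. As a sketch, the same quantity and its two-sided \(p\)-value can be obtained from the results that `regress` stores (`_b[]`, `_se[]`, and the residual degrees of freedom `e(df_r)`), which avoids retyping rounded numbers. The 5% significance level used for the critical value is an assumption for illustration.

```stata
import delimited "wage.csv", clear
generate lwage = log(wage)
quietly regress lwage educ exper
* t statistic for educ from the stored coefficient and standard error
display _b[educ] / _se[educ]
* Two-sided p-value from the t distribution with e(df_r) residual degrees of freedom
display 2 * ttail(e(df_r), abs(_b[educ] / _se[educ]))
* Two-sided critical value at the 5% level (assumed level)
display invttail(e(df_r), 0.025)
* Equivalent Wald test; its F statistic is the square of the t statistic
test educ = 0
```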

5.4 Confidence Intervals

5.4.2 Analysis Example in R

regress lwage educ exper
      Source |       SS           df       MS      Number of obs   =     3,010
-------------+----------------------------------   F(2, 3007)      =    333.00
       Model |  107.459149         2  53.7295746   Prob > F        =    0.0000
    Residual |  485.182496     3,007  .161351013   R-squared       =    0.1813
-------------+----------------------------------   Adj R-squared   =    0.1808
       Total |  592.641645     3,009  .196956346   Root MSE        =    .40169

------------------------------------------------------------------------------
       lwage | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
        educ |    .093168   .0036118    25.80   0.000     .0860863    .1002498
       exper |   .0406574   .0023344    17.42   0.000     .0360802    .0452346
       _cons |   4.666035     .06379    73.15   0.000     4.540958    4.791111
------------------------------------------------------------------------------
regress lwage educ exper, level(99)
      Source |       SS           df       MS      Number of obs   =     3,010
-------------+----------------------------------   F(2, 3007)      =    333.00
       Model |  107.459149         2  53.7295746   Prob > F        =    0.0000
    Residual |  485.182496     3,007  .161351013   R-squared       =    0.1813
-------------+----------------------------------   Adj R-squared   =    0.1808
       Total |  592.641645     3,009  .196956346   Root MSE        =    .40169

------------------------------------------------------------------------------
       lwage | Coefficient  Std. err.      t    P>|t|     [99% conf. interval]
-------------+----------------------------------------------------------------
        educ |    .093168   .0036118    25.80   0.000     .0838589    .1024772
       exper |   .0406574   .0023344    17.42   0.000     .0346405    .0466742
       _cons |   4.666035     .06379    73.15   0.000     4.501618    4.830451
------------------------------------------------------------------------------
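
The interval endpoints in the tables above are \(\hat{\beta} \pm t_{\alpha/2,\,3007} \times \mathrm{SE}(\hat{\beta})\), where 3007 is the residual degrees of freedom. The sketch below reconstructs the 95% and 99% intervals for `educ` by hand. It assumes the wage data and the generated variable `lwage` are still in memory, and the scalar names `crit95` and `crit99` are illustrative.

```stata
quietly regress lwage educ exper
* 95% interval: estimate +/- t critical value * standard error
scalar crit95 = invttail(e(df_r), 0.025)
display _b[educ] - crit95 * _se[educ]
display _b[educ] + crit95 * _se[educ]
* 99% interval uses the 0.005 upper-tail critical value
scalar crit99 = invttail(e(df_r), 0.005)
display _b[educ] - crit99 * _se[educ]
display _b[educ] + crit99 * _se[educ]
```

The displayed endpoints should reproduce the `educ` rows of the two tables. The interval widens at the 99% level because the critical value is larger.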