Stata Code for Chapter 2

2.1 Population and Sample

clear
set seed 2022
set obs 100
generate x = rnormal(50, 10) 
list x
     +----------+
     |        x |
     |----------|
  1. | 54.72184 |
  2. | 57.83728 |
  3. | 54.85993 |
  4. | 64.43429 |
  5. |   28.463 |
     |----------|
  6. | 46.86906 |
  7. | 54.04034 |
  8. | 43.09037 |
  9. | 43.55874 |
 10. | 67.44601 |
     |----------|
 11. | 65.16725 |
 12. | 45.85949 |
 13. | 69.22049 |
 14. | 52.78434 |
 15. | 51.74491 |
     |----------|
 16. | 51.55506 |
 17. | 59.95332 |
 18. |  39.3822 |
 19. | 45.14891 |
 20. | 41.70154 |
     |----------|
 21. | 55.55453 |
 22. | 36.13799 |
 23. | 51.67796 |
 24. |  54.6027 |
 25. | 39.50498 |
     |----------|
 26. | 42.91825 |
 27. | 49.60077 |
 28. | 60.91685 |
 29. | 53.24129 |
 30. | 59.11747 |
     |----------|
 31. | 63.41263 |
 32. | 65.53846 |
 33. | 54.42587 |
 34. | 54.49843 |
 35. | 31.52787 |
     |----------|
 36. | 51.61845 |
 37. | 46.18401 |
 38. |  54.8383 |
 39. | 52.29193 |
 40. | 43.75826 |
     |----------|
 41. | 57.47549 |
 42. | 59.45118 |
 43. | 48.10275 |
 44. |  54.0606 |
 45. |  41.1449 |
     |----------|
 46. | 44.57216 |
 47. | 40.63018 |
 48. | 54.38108 |
 49. | 54.39209 |
 50. | 66.16985 |
     |----------|
 51. | 49.45791 |
 52. | 71.02126 |
 53. |  50.3375 |
 54. | 52.88632 |
 55. | 45.25692 |
     |----------|
 56. | 36.40867 |
 57. | 43.52808 |
 58. | 48.95045 |
 59. | 46.06606 |
 60. | 50.77587 |
     |----------|
 61. |  36.0733 |
 62. | 56.27862 |
 63. | 45.95923 |
 64. | 48.35371 |
 65. | 47.45552 |
     |----------|
 66. | 64.44137 |
 67. | 51.39878 |
 68. | 46.63072 |
 69. | 41.59076 |
 70. | 49.48847 |
     |----------|
 71. | 46.62038 |
 72. | 38.88836 |
 73. | 33.08361 |
 74. | 49.27411 |
 75. | 45.50976 |
     |----------|
 76. | 52.08639 |
 77. | 34.54663 |
 78. | 53.93042 |
 79. | 37.40854 |
 80. | 49.52537 |
     |----------|
 81. | 13.65156 |
 82. | 62.04644 |
 83. | 55.87931 |
 84. | 70.52183 |
 85. | 55.12461 |
     |----------|
 86. | 52.93826 |
 87. | 55.92964 |
 88. | 63.00453 |
 89. | 59.47496 |
 90. | 52.97406 |
     |----------|
 91. | 74.39547 |
 92. | 67.05323 |
 93. | 34.03576 |
 94. | 50.66547 |
 95. | 44.32327 |
     |----------|
 96. | 45.39636 |
 97. | 64.69812 |
 98. | 60.00776 |
 99. | 53.95464 |
100. | 41.24168 |
     +----------+

clear
set seed 2022
set obs 1000
generate x = (_n - 1) / 10
generate y = normalden(x, 50, 10) 
line y x

clear
scalar z = (60 - 50) / 10
scalar x = normal(z)
display x
.84134475
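
* A minimal sketch, not part of the original listing: two checks on the
* probability computed above, assuming the same N(50, 10^2) population.
* normal() is the standard normal CDF and invnormal() is its inverse.
* Upper-tail probability P(X > 60), the complement of the value above:
display 1 - normal((60 - 50) / 10)
* Map the probability back to the x scale; this should recover 60 up to
* rounding:
display 50 + 10 * invnormal(normal((60 - 50) / 10))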

2.2 Random Sampling

clear
set seed 2022
set obs 100
generate Z = rnormal(50, 10) 
histogram Z

list Z in 1/10, separator(0)
     +----------+
     |        Z |
     |----------|
  1. | 54.72184 |
  2. | 57.83728 |
  3. | 54.85993 |
  4. | 64.43429 |
  5. |   28.463 |
  6. | 46.86906 |
  7. | 54.04034 |
  8. | 43.09037 |
  9. | 43.55874 |
 10. | 67.44601 |
     +----------+
list Z if inlist(_n, 5, 10, 100)
     +----------+
     |        Z |
     |----------|
  5. |   28.463 |
 10. | 67.44601 |
100. | 41.24168 |
     +----------+
quietly summarize Z

display r(max)
74.39547
list Z if Z == r(max)
     +----------+
     |        Z |
     |----------|
 91. | 74.39547 |
     +----------+
display r(min)
13.651556
list Z if Z == r(min)
     +----------+
     |        Z |
     |----------|
 81. | 13.65156 |
     +----------+
display r(mean)
50.801357
summarize Z, detail
                              Z
-------------------------------------------------------------
      Percentiles      Smallest
 1%     21.05728       13.65156
 5%     34.29119         28.463
10%     38.14845       31.52787       Obs                 100
25%     44.86053       33.08361       Sum of wgt.         100

50%     51.58675                      Mean           50.80136
                        Largest       Std. dev.      10.16443
75%     55.90447       69.22049
90%     64.56974       70.52183       Variance       103.3157
95%     67.24962       71.02126       Skewness      -.3428058
99%     72.70837       74.39547       Kurtosis       3.903721
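
* A minimal sketch, not part of the original listing: after
* -summarize, detail- the statistics printed above are also available as
* stored results, so they can be reused without retyping the numbers.
quietly summarize Z, detail
* Median (the 50% row of the table):
display r(p50)
* Interquartile range, the 75th minus the 25th percentile:
display r(p75) - r(p25)
* Skewness as reported in the table:
display r(skewness)
* Show every stored result left behind by -summarize-:
return list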

2.3 The Mean and the Law of Large Numbers

clear
set seed 2022

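* Open a results file with one variable, means1000, under the handle sim
postfile sim means1000 using means1000, replace

* Draw 1,000 samples of size 100 from N(50, 10^2) and post each sample mean
forvalues i = 1/1000 {
    quietly capture drop x
    quietly set obs 100
    quietly generate x = rnormal(50, 10)
    quietly summarize x
    post sim (r(mean))
}

* Close the results file so that it can be loaded with -use-
postclose sim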

use means1000, clear
summarize means1000
    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
   means1000 |      1,000    49.98582    .9900411   46.95685    53.6977
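
* A minimal sketch, not part of the original listing: the same experiment
* written with Stata's -simulate- prefix instead of a -postfile- loop.
* The program name onemean and the result name mean are hypothetical,
* chosen only for this illustration; the numbers it produces are not
* claimed to match the table above.
clear
capture program drop onemean
program define onemean, rclass
    * One replication: a fresh sample of 100 draws from N(50, 10^2)
    drop _all
    set obs 100
    generate x = rnormal(50, 10)
    summarize x
    return scalar mean = r(mean)
end
* Run 1,000 replications; the data in memory become the simulated means
simulate mean = r(mean), reps(1000) nodots seed(2022): onemean
summarize mean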

clear
set seed 2022

postfile sim means1000 using means1000, replace

forvalues i = 1/1000 {
    quietly capture drop x
    quietly set obs 10000
    quietly generate x = rnormal(50, 10)
    quietly summarize x
    post sim (r(mean)) 
}

postclose sim

use means1000, clear
summarize means1000
    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
   means1000 |      1,000    49.99832    .1024485   49.68475   50.34666
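
* A minimal sketch, not part of the original listing: for independent draws
* the standard deviation of a sample mean is sigma / sqrt(n). With
* sigma = 10, the two lines below give the theoretical values to compare
* with the Std. dev. column of the simulations with n = 100 and n = 10000.
display 10 / sqrt(100)
display 10 / sqrt(10000)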

clear
set seed 2022
set obs 10
generate x = ceil(runiform(0, 6))
quietly summarize x
display r(mean)
2.8

clear
set seed 2022

postfile sim means1000 using means1000, replace

forvalues i = 1/1000 {
    quietly capture drop x
    quietly set obs 1000
    quietly generate x = ceil(runiform(0, 6))
    quietly summarize x
    post sim (r(mean)) 
}

postclose sim

use means1000, clear
summarize means1000
    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
   means1000 |      1,000    3.499558    .0546235      3.329      3.681

clear
set seed 2022

postfile sim means1000 using means1000, replace

forvalues i = 1/1000 {
    quietly capture drop x
    quietly set obs 10000
    quietly generate x = ceil(runiform(0, 6))
    quietly summarize x
    post sim (r(mean)) 
}

postclose sim

use means1000, clear
summarize means1000
    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
   means1000 |      1,000    3.499559    .0166011     3.4576     3.5529
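
* A minimal sketch, not part of the original listing: a fair six-sided die
* has mean (1 + 6) / 2 and variance (6^2 - 1) / 12 = 35/12, so the mean of
* n independent rolls has standard deviation sqrt(35/12) / sqrt(n). The
* lines below give the theoretical values to compare with the Mean and
* Std. dev. columns of the two dice simulations (n = 1000 and n = 10000).
display (1 + 6) / 2
display sqrt(35 / 12) / sqrt(1000)
display sqrt(35 / 12) / sqrt(10000)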

2.4 Variance and Standard Deviation

clear
set seed 2022
set obs 1000
generate x = rnormal(50, 10)
quietly summarize x
display r(Var)
101.28133
display r(sd)
10.063862

clear
set seed 2022

postfile sim sds1000 using sds1000, replace

forvalues i = 1/1000 {
    quietly capture drop x
    quietly set obs 1000
    quietly generate x = rnormal(50, 10)
    quietly summarize x
    post sim (r(sd)) 
}

postclose sim

use sds1000, clear
summarize sds1000
    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
     sds1000 |      1,000    10.00502    .2271609   9.258702     10.732
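
* A minimal sketch, not part of the original listing: for normally
* distributed data and large n, the sample standard deviation varies from
* sample to sample with a standard deviation of roughly
* sigma / sqrt(2 * (n - 1)). This is a large-sample approximation, shown
* here only as a rough check on the Std. dev. column above
* (sigma = 10, n = 1000).
display 10 / sqrt(2 * (1000 - 1))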

2.5 Correlation Coefficient and Covariance

clear
set seed 2022
set obs 100
generate x = rnormal(50, 10)
generate y = rnormal(50, 10)
twoway (scatter y x)

cor x y
(obs=100)

             |        x        y
-------------+------------------
           x |   1.0000
           y |  -0.1192   1.0000
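
* A minimal sketch, not part of the original listing: the covariance option
* of -correlate- prints the variance-covariance matrix of x and y instead
* of the correlation matrix.
correlate x y, covariance
* The correlation coefficient is the covariance divided by the product of
* the two standard deviations; this should agree with the coefficient
* reported by -cor x y- above.
display r(cov_12) / sqrt(r(Var_1) * r(Var_2))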
cor x z
(obs=100)

             |        x        z
-------------+------------------
           x |   1.0000
           z |   0.7088   1.0000
twoway (scatter y z)