Medical Statistics Homework
Contents
I. Statistical Charts
II. Bivariate Regression and Correlation
III. Analysis of Variance for Multifactor Experimental Data
IV. Analysis of Variance for Repeated-Measures Designs
V. Analysis of Covariance
VI. Multiple Linear Regression Analysis
VII. Logistic Regression Analysis
VIII. Survival Analysis
I. Statistical Charts
Example 2-8: normal distribution plot
[Figure: normal distribution plot]
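II. Bivariate Regression and Correlation
Example 9-1: An endemic disease research institute measured the urinary creatinine content (mmol/24h) of 8 healthy children, as shown in Table 9-1 of the textbook. Estimate the linear regression equation of urinary creatinine content (Y) on age (X).
1.
2. Example 9-2: Test whether the linear regression equation obtained from the data of Example 9-1 holds.
Answer: State the hypotheses and set the significance level.
H0: β = 0, there is no linear relationship between urinary creatinine content and age
H1: β ≠ 0, there is a linear relationship between urinary creatinine content and age
α = 0.05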
ANOVA (a)
Model | Sum of Squares | df | Mean Square | F | Sig.
1 Regression | .813 | 1 | .813 | 20.968 | .004 (b)
1 Residual | .233 | 6 | .039 | |
1 Total | 1.046 | 7 | | |
a. Dependent Variable: urinary creatinine content
b. Predictors: (Constant), age
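P = 0.004. At the α = 0.05 level, H0 is rejected and H1 is accepted: there is a linear relationship between urinary creatinine content and age, so the linear regression equation holds.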
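The F statistic in the ANOVA table is the ratio of the regression mean square to the residual mean square. The following is a minimal Python sketch that recomputes it from the rounded sums of squares printed in the table; because the inputs are rounded, the result is close to, but not exactly, the 20.968 reported by SPSS. The variable names are illustrative only.

```python
# Rounded values copied from the ANOVA table (assumption: no raw data available here).
ss_regression, df_regression = 0.813, 1
ss_residual, df_residual = 0.233, 6

ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# F = MS_regression / MS_residual
f_stat = ms_regression / ms_residual

# Coefficient of determination: share of total variation explained by age.
r_squared = ss_regression / (ss_regression + ss_residual)

print(f"F = {f_stat:.2f}, R^2 = {r_squared:.3f}")
```

The small gap between the recomputed F and the SPSS value comes only from rounding in the printed table.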
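III. Analysis of Variance for Multifactor Experimental Data
Example 11-1: Twenty rabbits were randomly divided into 4 equal groups of 5 for a suture experiment after nerve injury. The treatments are combinations of two factors, A and B. Factor A is the suture method, with two levels: epineurial suture (a1) and perineurial suture (a2). Factor B is the time after suture, also with two levels: 1 month after suture (b1) and 2 months after suture (b2). The outcome is the axon passage rate (%) after nerve suture (note: a measured index, treated as quantitative data); see Table 11-1 of the textbook. A factorial analysis is used to compare the effects of suture method and time after suture on the axon passage rate.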
Tests of Between-Subjects Effects
Dependent Variable: axon passage rate
Source | Type III Sum of Squares | df | Mean Square | F | Sig. | Partial Eta Squared | Noncent. Parameter | Observed Power (b)
Corrected Model | 2620.000 (a) | 3 | 873.333 | 2.911 | .067 | .353 | 8.733 | .580
Intercept | 27380.000 | 1 | 27380.000 | 91.267 | .000 | .851 | 91.267 | 1.000
Time after suture | 2420.000 | 1 | 2420.000 | 8.067 | .012 | .335 | 8.067 | .760
Suture method | 180.000 | 1 | 180.000 | .600 | .450 | .036 | .600 | .113
Time after suture * Suture method | 20.000 | 1 | 20.000 | .067 | .800 | .004 | .067 | .057
Error | 4800.000 | 16 | 300.000 | | | | |
Total | 34800.000 | 20 | | | | | |
Corrected Total | 7420.000 | 19 | | | | | |
a. R Squared = .353 (Adjusted R Squared = .232)
b. Computed using alpha = .05
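The hypotheses for the main effect of factor A are H0: main effect of A = 0, H1: main effect of A ≠ 0, α = 0.05;
The hypotheses for the main effect of factor B are H0: main effect of B = 0, H1: main effect of B ≠ 0, α = 0.05;
The hypotheses for the AB interaction are H0: AB interaction = 0, H1: AB interaction ≠ 0, α = 0.05.
The critical value for the ANOVA F tests is F0.05(1,16) = 4.49.
Statistical conclusion: for the Corrected Model, F = 2.911, P = 0.067 > 0.05, so the overall model is not statistically significant;
A (suture method): F = 0.600, P = 0.450 > 0.05, not statistically significant; at the α = 0.05 level, H0 is not rejected;
B (time after suture): F = 8.067, P = 0.012 < 0.05, statistically significant; at the α = 0.05 level, H0 is rejected and H1 is accepted;
AB (time after suture * suture method): F = 0.067, P = 0.800 > 0.05, not statistically significant; at the α = 0.05 level, H0 is not rejected.
Subject-matter conclusion: there is not yet sufficient evidence that the two suture methods differ in their effect on the axon passage rate; it can be concluded that the axon passage rate is higher at 2 months after suture than at 1 month after suture.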
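Each F value in the factorial ANOVA table is the effect mean square divided by the error mean square, and partial eta squared is SS_effect / (SS_effect + SS_error). The sketch below recomputes both from the sums of squares reported by SPSS; the dictionary keys and variable names are illustrative, not part of the original analysis.

```python
# Type III sums of squares and df copied from the SPSS table.
ss_error, df_error = 4800.0, 16
effects = {
    "time after suture (B)": (2420.0, 1),
    "suture method (A)": (180.0, 1),
    "A x B interaction": (20.0, 1),
}

ms_error = ss_error / df_error  # error mean square

results = {}
for name, (ss, df) in effects.items():
    f_stat = (ss / df) / ms_error            # F = MS_effect / MS_error
    partial_eta_sq = ss / (ss + ss_error)    # partial eta squared
    results[name] = (f_stat, partial_eta_sq)
    print(f"{name}: F = {f_stat:.3f}, partial eta^2 = {partial_eta_sq:.3f}")
```

Only the time effect yields an F above the critical value for 1 and 16 degrees of freedom, which matches the conclusions stated in the text.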
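IV. Analysis of Variance for Repeated-Measures Designs
Example 12-3: Fifteen patients with essentially the same surgical requirements were randomly divided into 3 groups, which received anesthesia induction methods A, B, and C respectively during surgery. Systolic blood pressure was measured at five time points: T0 (before induction), T1, T2, T3, and T4. The data are recorded in Table 12-17. Perform an analysis of variance.
1. State the hypotheses and set the significance level
H0: the mean systolic blood pressure measured at the five time points does not differ among the three anesthesia induction methods
H1: the mean systolic blood pressure measured at the five time points differs among the three anesthesia induction methods
α = 0.05
2. Tests of normality and homogeneity of variance: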
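Tests of Normality
Time | group | Kolmogorov-Smirnov (a) Statistic | df | Sig. | Shapiro-Wilk Statistic | df | Sig.
t0 | Method A | .300 | 5 | .161 | .836 | 5 | .154
t0 | Method B | .227 | 5 | .200* | .916 | 5 | .503
t0 | Method C | .229 | 5 | .200* | .867 | 5 | .254
t1 | Method A | .331 | 5 | .077 | .834 | 5 | .148
t1 | Method B | .220 | 5 | .200* | .913 | 5 | .485
t1 | Method C | .184 | 5 | .200* | .978 | 5 | .921
t2 | Method A | .258 | 5 | .200* | .940 | 5 | .666
t2 | Method B | .227 | 5 | .200* | .969 | 5 | .869
t2 | Method C | .221 | 5 | .200* | .953 | 5 | .758
t3 | Method A | .283 | 5 | .200* | .937 | 5 | .647
t3 | Method B | .241 | 5 | .200* | .902 | 5 | .421
t3 | Method C | .251 | 5 | .200* | .941 | 5 | .672
t4 | Method A | .186 | 5 | .200* | .943 | 5 | .687
t4 | Method B | .292 | 5 | .189 | .892 | 5 | .367
t4 | Method C | .164 | 5 | .200* | .984 | 5 | .955
*. This is a lower bound of the true significance.
a. Lilliefors Significance Correction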
All P values are greater than 0.05, so the data are consistent with a normal distribution.
Test of Homogeneity of Variance
Time | Basis | Levene Statistic | df1 | df2 | Sig.
t0 | Based on Mean | .145 | 2 | 12 | .866
t0 | Based on Median | .127 | 2 | 12 | .882
t0 | Based on Median and with adjusted df | .127 | 2 | 11.675 | .882
t0 | Based on trimmed mean | .144 | 2 | 12 | .868
t1 | Based on Mean | .440 | 2 | 12 | .654
t1 | Based on Median | .385 | 2 | 12 | .689
t1 | Based on Median and with adjusted df | .385 | 2 | 10.205 | .690
t1 | Based on trimmed mean | .438 | 2 | 12 | .655
t2 | Based on Mean | .950 | 2 | 12 | .414
t2 | Based on Median | .908 | 2 | 12 | .429
t2 | Based on Median and with adjusted df | .908 | 2 | 8.995 | .437
t2 | Based on trimmed mean | .976 | 2 | 12 | .405
t3 | Based on Mean | .200 | 2 | 12 | .821
t3 | Based on Median | .141 | 2 | 12 | .870
t3 | Based on Median and with adjusted df | .141 | 2 | 11.781 | .870
t3 | Based on trimmed mean | .207 | 2 | 12 | .816
t4 | Based on Mean | .172 | 2 | 12 | .844
t4 | Based on Median | .024 | 2 | 12 | .976
t4 | Based on Median and with adjusted df | .024 | 2 | 9.538 | .976
t4 | Based on trimmed mean | .157 | 2 | 12 | .857
All P values are greater than 0.05, so the variances are homogeneous.
3. ANOVA
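Descriptive Statistics
Time | group | Mean | Std. Deviation | N
t0 | Method A | 121.00 | 3.536 | 5
t0 | Method B | 121.20 | 4.324 | 5
t0 | Method C | 126.20 | 3.633 | 5
t0 | Total | 122.80 | 4.346 | 15
t1 | Method A | 112.40 | 5.128 | 5
t1 | Method B | 119.80 | 5.975 | 5
t1 | Method C | 123.00 | 3.391 | 5
t1 | Total | 118.40 | 6.490 | 15
t2 | Method A | 118.40 | 5.639 | 5
t2 | Method B | 118.00 | 5.431 | 5
t2 | Method C | 118.60 | 1.949 | 5
t2 | Total | 118.33 | 4.320 | 15
t3 | Method A | 125.80 | 4.712 | 5
t3 | Method B | 128.20 | 5.215 | 5
t3 | Method C | 142.60 | 4.827 | 5
t3 | Total | 132.20 | 8.930 | 15
t4 | Method A | 120.80 | 3.701 | 5
t4 | Method B | 135.20 | 4.382 | 5
t4 | Method C | 130.60 | 3.715 | 5
t4 | Total | 128.87 | 7.210 | 15
Descriptive statistics results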
Multivariate Tests (a)
Effect | Statistic | Value | F | Hypothesis df | Error df | Sig.
factor1 | Pillai's Trace | .983 | 126.659 (b) | 4.000 | 9.000 | .000
factor1 | Wilks' Lambda | .017 | 126.659 (b) | 4.000 | 9.000 | .000
factor1 | Hotelling's Trace | 56.293 | 126.659 (b) | 4.000 | 9.000 | .000
factor1 | Roy's Largest Root | 56.293 | 126.659 (b) | 4.000 | 9.000 | .000
factor1 * group | Pillai's Trace | 1.809 | 23.656 | 8.000 | 20.000 | .000
factor1 * group | Wilks' Lambda | .008 | 22.215 (b) | 8.000 | 18.000 | .000
factor1 * group | Hotelling's Trace | 20.600 | 20.600 | 8.000 | 16.000 | .000
factor1 * group | Roy's Largest Root | 13.376 | 33.440 (c) | 4.000 | 10.000 | .000
a. Design: Intercept + group; Within Subjects Design: factor1
b. Exact statistic
c. The statistic is an upper bound on F that yields a lower bound on the significance level.
Sphericity test results (the Mauchly's test table itself is not preserved in the source)
Tests of Between-Subjects Effects
Measure: MEASURE_1
Transformed Variable: Average
Source | Type III Sum of Squares | df | Mean Square | F | Sig.
Intercept | 1155433.080 | 1 | 1155433.080 | 14649.223 | .000
group | 912.240 | 2 | 456.120 | 5.783 | .017
Error | 946.480 | 12 | 78.873 | |
Between-group effects test results
Pairwise Comparisons
Measure: MEASURE_1
(I) group | (J) group | Mean Difference (I-J) | Std. Error | Sig. (b) | 95% CI for Difference (b) Lower Bound | Upper Bound
Method A | Method B | -4.800 | 2.512 | .080 | -10.273 | .673
Method A | Method C | -8.520* | 2.512 | .005 | -13.993 | -3.047
Method B | Method A | 4.800 | 2.512 | .080 | -.673 | 10.273
Method B | Method C | -3.720 | 2.512 | .164 | -9.193 | 1.753
Method C | Method A | 8.520* | 2.512 | .005 | 3.047 | 13.993
Method C | Method B | 3.720 | 2.512 | .164 | -1.753 | 9.193
Based on estimated marginal means
*. The mean difference is significant at the .05 level.
b. Adjustment for multiple comparisons: Least Significant Difference (equivalent to no adjustments).
Between-group multiple comparison results
Pairwise Comparisons
Measure: MEASURE_1
(I) factor1 | (J) factor1 | Mean Difference (I-J) | Std. Error | Sig. (b) | 95% CI for Difference (b) Lower Bound | Upper Bound
1 | 2 | 4.400* | .860 | .000 | 2.526 | 6.274
1 | 3 | 4.467* | .764 | .000 | 2.801 | 6.132
1 | 4 | -9.400* | 1.188 | .000 | -11.988 | -6.812
1 | 5 | -6.067* | .972 | .000 | -8.184 | -3.949
2 | 1 | -4.400* | .860 | .000 | -6.274 | -2.526
2 | 3 | .067 | .527 | .901 | -1.082 | 1.215
2 | 4 | -13.800* | .613 | .000 | -15.135 | -12.465
2 | 5 | -10.467* | .881 | .000 | -12.385 | -8.548
3 | 1 | -4.467* | .764 | .000 | -6.132 | -2.801
3 | 2 | -.067 | .527 | .901 | -1.215 | 1.082
3 | 4 | -13.867* | .843 | .000 | -15.704 | -12.029
3 | 5 | -10.533* | .775 | .000 | -12.221 | -8.846
4 | 1 | 9.400* | 1.188 | .000 | 6.812 | 11.988
4 | 2 | 13.800* | .613 | .000 | 12.465 | 15.135
4 | 3 | 13.867* | .843 | .000 | 12.029 | 15.704
4 | 5 | 3.333* | .943 | .004 | 1.279 | 5.388
5 | 1 | 6.067* | .972 | .000 | 3.949 | 8.184
5 | 2 | 10.467* | .881 | .000 | 8.548 | 12.385
5 | 3 | 10.533* | .775 | .000 | 8.846 | 12.221
5 | 4 | -3.333* | .943 | .004 | -5.388 | -1.279
Based on estimated marginal means
*. The mean difference is significant at the .05 level.
b. Adjustment for multiple comparisons: Least Significant Difference (equivalent to no adjustments).
Within-group multiple comparison results across time points
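Statistical conclusion: Mauchly's test of sphericity gives W = 0.293, P = 0.178 > 0.05. The test of between-group effects gives F = 5.783, P = 0.017 < 0.05, so the difference among the three anesthesia induction methods is statistically significant. In the subsequent LSD multiple comparisons, the difference between method A and method C is statistically significant (P = 0.005), while the differences between method B and methods A and C are not (P = 0.080 and 0.164 respectively). Among the time points, the difference between T1 and T2 (levels 2 and 3 of factor1) is not statistically significant (P = 0.901); all other pairs of time points differ significantly. At the α = 0.05 level, H0 is rejected and H1 is accepted: the mean systolic blood pressure measured at the five time points differs among the three anesthesia induction methods.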
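The between-group F value and the LSD confidence intervals can be checked by hand from the SPSS output. The sketch below is a minimal illustration under two assumptions: it uses the rounded table values, and it hard-codes the two-sided 0.05 critical value of the t distribution with 12 degrees of freedom (about 2.1788, taken from a standard t table). The function name `lsd_interval` is hypothetical.

```python
# Between-subjects effect of group: F = MS_group / MS_error.
ms_group = 912.240 / 2
ms_error = 946.480 / 12
f_group = ms_group / ms_error

# Assumption: two-sided 0.05 critical value of t with 12 df, from a t table.
T_CRIT_DF12 = 2.1788


def lsd_interval(mean_diff, std_error, t_crit=T_CRIT_DF12):
    """95% LSD confidence interval: difference +/- t * standard error."""
    margin = t_crit * std_error
    return mean_diff - margin, mean_diff + margin


# Method A versus method C, values from the pairwise comparison table.
lower, upper = lsd_interval(-8.520, 2.512)
print(f"F(group) = {f_group:.3f}")
print(f"A - C: 95% CI = ({lower:.3f}, {upper:.3f})")
```

An interval that excludes zero, as here for A versus C, corresponds to a significant LSD comparison at the 0.05 level.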
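V. Analysis of Covariance
Example 13-1: To study the efficacy of a hypoglycemic drug and of its combination with metformin tablets, 90 patients with type 2 diabetes were enrolled in a randomized controlled trial and assigned to three treatment groups: group 1 received the hypoglycemic drug, group 2 received metformin tablets, and group 3 received the hypoglycemic drug plus metformin tablets, with 30 patients per group. Treatment lasted 3 months, and the primary efficacy measure was glycated hemoglobin. The glycated hemoglobin content (%) of each patient before enrollment (X) and after 3 months (Y) is given in the upper part of Table 13-3 of the textbook. Analyze whether the three treatments differ in their effect on lowering glycated hemoglobin.
1. State the hypotheses and set the significance level
H0: the population adjusted means of the groups are equal
H1: the population adjusted means of the groups are not all equal
α = 0.05
2. Check whether the conditions for analysis of covariance are met:
In all three groups there is a clear linear trend between glycated hemoglobin before enrollment (X) and after 3 months (Y), and the linear trends of the three groups are similar. The data therefore meet the conditions for analysis of covariance.
3. Test whether the population slopes of the groups are equal: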
Tests of Between-Subjects Effects
Dependent Variable: after 3 months
Source | Type I Sum of Squares | df | Mean Square | F | Sig.
Corrected Model | 48.973 (a) | 5 | 9.795 | 56.618 | .000
Intercept | 6230.016 | 1 | 6230.016 | 36013.278 | .000
c | 18.725 | 2 | 9.362 | 54.120 | .000
x | 30.183 | 1 | 30.183 | 174.476 | .000
c * x | .065 | 2 | .033 | .188 | .829
Error | 14.531 | 84 | .173 | |
Total | 6293.520 | 90 | | |
Corrected Total | 63.504 | 89 | | |
a. R Squared = .771 (Adjusted R Squared = .758)
Results of the Type I sums-of-squares ANOVA model
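The interaction term in the model is not statistically significant (P = 0.829), indicating that the slope of 3-month glycated hemoglobin on baseline glycated hemoglobin is the same in the three groups, so analysis of covariance can be applied to the data.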
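The homogeneity-of-slopes check rests on the F ratio for the group-by-covariate interaction (c * x). As a minimal sketch, assuming only the rounded sums of squares from the Type I table, the ratio can be recomputed as follows; the variable names are illustrative.

```python
# Interaction (c * x) and error rows from the Type I table.
ss_interaction, df_interaction = 0.065, 2
ss_error, df_error = 14.531, 84

ms_interaction = ss_interaction / df_interaction
ms_error = ss_error / df_error

# A small F means the group-specific slopes do not differ detectably.
f_interaction = ms_interaction / ms_error
print(f"F(c * x) = {f_interaction:.3f}")
```

An F value this far below 1 is consistent with the reported P = 0.829 and with treating the three regression lines as parallel.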
4. Compare the adjusted means
Tests of Between-Subjects Effects
Dependent Variable: after 3 months
Source | Type III Sum of Squares | df | Mean Square | F | Sig.
Corrected Model | 48.908 (a) | 3 | 16.303 | 96.053 | .000
Intercept | .363 | 1 | .363 | 2.140 | .147
c | 19.851 | 2 | 9.925 | 58.480 | .000
x | 30.183 | 1 | 30.183 | 177.835 | .000
Error | 14.596 | 86 | .170 | |
Total | 6293.520 | 90 | | |
Corrected Total | 63.504 | 89 | | |
a. R Squared = .770 (Adjusted R Squared = .762)
ANOVA results for the comparison of adjusted means
For x, F = 177.835, P < 0.01, indicating that glycated hemoglobin before enrollment affects the glycated hemoglobin content after 3 months.
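Estimates
Dependent Variable: after 3 months
Group | Mean | Std. Error | 95% Confidence Interval Lower Bound | Upper Bound
Trial drug | 8.356 (a) | .076 | 8.206 | 8.506
Metformin hydrochloride | 8.877 (a) | .075 | 8.727 | 9.027
Trial drug + metformin hydrochloride | 7.728 (a) | .075 | 7.578 | 7.877
a. Covariates appearing in the model are evaluated at the following values: before enrollment = 9.9700.
Adjusted means and their confidence intervals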
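The adjusted glycated hemoglobin level of the trial drug + metformin group is lower than that of the other two groups. The footnote indicates that the adjusted means are evaluated at the mean baseline glycated hemoglobin, x = 9.97%.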
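Pairwise Comparisons
Dependent Variable: after 3 months
(I) Group | (J) Group | Mean Difference (I-J) | Std. Error | Sig. (b) | 95% CI for Difference (b) Lower Bound | Upper Bound
Trial drug | Metformin hydrochloride | -.521* | .107 | .000 | -.734 | -.308
Trial drug | Trial drug + metformin hydrochloride | .628* | .107 | .000 | .416 | .840
Metformin hydrochloride | Trial drug | .521* | .107 | .000 | .308 | .734
Metformin hydrochloride | Trial drug + metformin hydrochloride | 1.149* | .106 | .000 | .938 | 1.361
Trial drug + metformin hydrochloride | Trial drug | -.628* | .107 | .000 | -.840 | -.416
Trial drug + metformin hydrochloride | Metformin hydrochloride | -1.149* | .106 | .000 | -1.361 | -.938
Based on estimated marginal means
*. The mean difference is significant at the .05 level.
b. Adjustment for multiple comparisons: Least Significant Difference (equivalent to no adjustments).
Hypothesis test results for equality of the adjusted group means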
Univariate Tests
Dependent Variable: after 3 months
Source | Sum of Squares | df | Mean Square | F | Sig.
Contrast | 19.851 | 2 | 9.925 | 58.480 | .000
Error | 14.596 | 86 | .170 | |
The F tests the effect of group. This test is based on the linearly independent pairwise comparisons among the estimated marginal means.
Test of the adjusted means by the ANOVA method
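The test of the adjusted means by the ANOVA method leads to the same conclusion as the ANOVA results for the comparison of adjusted means.
The differences between the population adjusted means of every pair of groups are statistically significant (all P < 0.05). At the α = 0.05 level, H0 is rejected and H1 is accepted: the population adjusted means of the groups are not all equal. After removing the influence of baseline glycated hemoglobin, the average reduction is largest in group 3, intermediate in group 1, and smallest in group 2.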
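The pairwise differences and the ranking of the three treatments follow directly from the adjusted means. The sketch below recomputes them from the values in the Estimates table; the group labels are shortened for illustration, and a lower adjusted 3-month value is read as a larger reduction because all means are evaluated at the same baseline.

```python
from itertools import combinations

# Adjusted 3-month means, evaluated at baseline = 9.97 (from the Estimates table).
adjusted_means = {
    "trial drug": 8.356,
    "metformin": 8.877,
    "trial drug + metformin": 7.728,
}

# Pairwise differences (I - J), rounded to the precision of the SPSS output.
differences = {
    (a, b): round(adjusted_means[a] - adjusted_means[b], 3)
    for a, b in combinations(adjusted_means, 2)
}

# Lowest adjusted mean first, i.e. largest reduction first.
ranking = sorted(adjusted_means, key=adjusted_means.get)

for pair, diff in differences.items():
    print(pair, diff)
print("ranking:", ranking)
```

The differences agree with the Mean Difference column of the pairwise comparison table.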
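VI. Multiple Linear Regression Analysis
PPT Example 2: Some researchers believe that changes in blood glucose in diabetic patients may be influenced by several biochemical indicators, including insulin, glycated hemoglobin, serum total cholesterol, and triglycerides. These indicators were measured in 27 diabetic patients; the data are shown in the table below. Analyze these data.
1. State the hypotheses
H0: β1 = β2 = β3 = β4 = 0
H1: βj (j = 1, 2, …, m) are not all 0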
Variables Entered/Removed (a)
Model | Variables Entered | Variables Removed | Method
1 | x4, x2, x3, x1 (b) | . | Enter
a. Dependent Variable: y
b. All requested variables entered.
Model Summary
Model | R | R Square | Adjusted R Square | Std. Error of the Estimate
1 | .775 (a) | .601 | .528 | 2.0095
a. Predictors: (Constant), x4, x2, x3, x1
Regression model summary
ANOVA (a)
Model | Sum of Squares | df | Mean Square | F | Sig.
1 Regression | 133.711 | 4 | 33.428 | 8.278 | .000 (b)
1 Residual | 88.841 | 22 | 4.038 | |
1 Total | 222.552 | 26 | | |
a. Dependent Variable: y
b. Predictors: (Constant), x4, x2, x3, x1
ANOVA results for the regression equation
Coefficients (a)
Model | Unstandardized B | Std. Error | Standardized Beta | t | Sig.
1 (Constant) | 5.943 | 2.829 | | 2.101 | .047
1 x1 | .142 | .366 | .078 | .390 | .701
1 x2 | .351 | .204 | .309 | 1.721 | .099
1 x3 | -.271 | .121 | -.339 | -2.229 | .036
1 x4 | .638 | .243 | .398 | 2.623 | .016
a. Dependent Variable: y
Parameter estimates of the multiple regression equation
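The ANOVA of the regression equation gives F = 8.278, P < 0.001. H0: β1 = β2 = β3 = β4 = 0 is rejected, so the fitted regression equation is statistically significant.
The estimates b1, b2, b3, b4 of β1, β2, β3, β4 are 0.142, 0.351, -0.271, and 0.638 respectively, which gives the multiple regression equation:
Ŷ = 5.943 + 0.142X1 + 0.351X2 - 0.271X3 + 0.638X4
Blood glucose has a linear regression relationship with insulin (X3, P = 0.036) and glycated hemoglobin (X4, P = 0.016). Triglycerides (X2, P = 0.099) reach significance only at the α = 0.10 level, and total cholesterol (X1, P = 0.701) is not statistically significant. Insulin is negatively associated with blood glucose. Judging by the standardized regression coefficients, glycated hemoglobin (Beta = 0.398) has the greatest influence on fasting blood glucose.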
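The model summary statistics can be recovered from the ANOVA table, and the fitted equation can be applied to new values. The sketch below is illustrative only: the function name `predict_glucose` and the input values passed to it are hypothetical, and the coefficients are the rounded estimates from the Coefficients table.

```python
# Rounded estimates from the Coefficients table.
INTERCEPT = 5.943
SLOPES = {"x1": 0.142, "x2": 0.351, "x3": -0.271, "x4": 0.638}


def predict_glucose(x1, x2, x3, x4):
    """Fitted value Y-hat = b0 + b1*x1 + b2*x2 + b3*x3 + b4*x4."""
    values = {"x1": x1, "x2": x2, "x3": x3, "x4": x4}
    return INTERCEPT + sum(SLOPES[name] * values[name] for name in SLOPES)


# Model summary recomputed from the ANOVA table.
ss_regression, ss_residual = 133.711, 88.841
n, p = 27, 4  # number of patients, number of predictors

r_squared = ss_regression / (ss_regression + ss_residual)
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - p - 1)
f_stat = (ss_regression / p) / (ss_residual / (n - p - 1))

print(f"R^2 = {r_squared:.3f}, adjusted R^2 = {adj_r_squared:.3f}, F = {f_stat:.3f}")
# Hypothetical inputs, chosen only to show the call.
print(f"prediction = {predict_glucose(1.0, 1.0, 1.0, 1.0):.3f}")
```

The recomputed values agree with the Model Summary and ANOVA tables to three decimals.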
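VII. Logistic Regression Analysis
PPT Example 16-2: To explore risk factors for coronary heart disease, a case-control study was conducted on 26 coronary heart disease patients and 28 controls. The factors and data are described in Tables 16-2 and 16-3. Use stepwise logistic regression to screen the risk factors (α for entry = 0.10, α for removal = 0.15).
Table 16-2: Eight possible risk factors for coronary heart disease and their coding
Table 16-3: Case-control survey data on risk factors for coronary heart disease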
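Variables not in the Equation
Step | Variable | Score | df | Sig.
Step 0 | x1 | 5.789 | 1 | .016
Step 0 | x2 | 5.968 | 1 | .015
Step 0 | x3 | 4.747 | 1 | .029
Step 0 | x4 | 4.311 | 1 | .038
Step 0 | x5 | 7.460 | 1 | .006
Step 0 | x6 | 10.117 | 1 | .001
Step 0 | x7 | 5.244 | 1 | .022
Step 0 | x8 | 6.818 | 1 | .009
Step 0 | Overall Statistics | 25.418 | 8 | .001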
Variables in the Equation
Step | Variable | B | S.E. | Wald | df | Sig. | Exp(B)
Step 1 (a) | x6 | 2.826 | 1.095 | 6.657 | 1 | .010 | 16.875
Step 1 (a) | Constant | -.523 | .315 | 2.751 | 1 | .097 | .593
Step 2 (b) | x5 | 1.828 | .680 | 7.227 | 1 | .007 | 6.219
Step 2 (b) | x6 | 3.059 | 1.144 | 7.143 | 1 | .008 | 21.303
Step 2 (b) | Constant | -1.281 | .461 | 7.715 | 1 | .005 | .278
Step 3 (c) | x5 | 1.722 | .714 | 5.814 | 1 | .016 | 5.597
Step 3 (c) | x6 | 3.028 | 1.176 | 6.627 | 1 | .010 | 20.656
Step 3 (c) | x8 | 1.663 | .785 | 4.493 | 1 | .034 | 5.277
Step 3 (c) | Constant | -2.359 | .770 | 9.378 | 1 | .002 | .095
Step 4 (d) | x1 | .924 | .477 | 3.758 | 1 | .053 | 2.519
Step 4 (d) | x5 | 1.496 | .744 | 4.044 | 1 | .044 | 4.464
Step 4 (d) | x6 | 3.135 | 1.249 | 6.303 | 1 | .012 | 23.000
Step 4 (d) | x8 | 1.947 | .847 | 5.289 | 1 | .021 | 7.008
Step 4 (d) | Constant | -4.705 | 1.543 | 9.295 | 1 | .002 | .009
a. Variable(s) entered on step 1: x6.
b. Variable(s) entered on step 2: x5.
c. Variable(s) entered on step 3: x8.
d. Variable(s) entered on step 4: x1.
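logit P = -4.705 + 0.924X1 + 1.496X5 + 3.135X6 + 1.947X8
Four risk factors entered the final model: age (X1), OR = 2.519; history of hyperlipidemia (X5), OR = 4.464; animal fat intake (X6), OR = 23.000; and type A personality (X8), OR = 7.008. X5, X6, and X8 have P < 0.05, while X1 has P = 0.053.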
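Each odds ratio is the exponential of the corresponding coefficient, OR = exp(B), and the fitted logit can be converted to a probability with the logistic function. The sketch below uses the rounded step 4 coefficients, so the odds ratio for X6 comes out slightly below the 23.000 that SPSS computes from the unrounded estimate. The function name `predicted_probability` is hypothetical, and because Table 16-2 is not reproduced here, the coding of the inputs is an assumption.

```python
import math

# Rounded step 4 estimates from the Variables in the Equation table.
INTERCEPT = -4.705
COEFFICIENTS = {"x1": 0.924, "x5": 1.496, "x6": 3.135, "x8": 1.947}

# Odds ratio for a one-unit increase in each factor.
odds_ratios = {name: math.exp(b) for name, b in COEFFICIENTS.items()}


def predicted_probability(factors):
    """P = 1 / (1 + exp(-logit)); factors not listed are treated as 0."""
    logit = INTERCEPT + sum(
        COEFFICIENTS[name] * factors.get(name, 0) for name in COEFFICIENTS
    )
    return 1 / (1 + math.exp(-logit))


for name, value in odds_ratios.items():
    print(f"{name}: OR = {value:.3f}")

# Assumed coding: every factor at its reference level 0.
baseline = predicted_probability({})
print(f"P(all factors = 0) = {baseline:.4f}")
```

In a case-control design the intercept does not estimate the population baseline risk, so the probability printed here is illustrative of the arithmetic only.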
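VIII. Survival Analysis
Example 17-4: Based on the data of Examples 17-1 and 17-2, is there a difference in survival rate between patients after surgery type A and patients after surgery type B?
1. State the hypotheses and set the significance level
H0: S1(t) = S2(t), the survival rates of patients under the two surgical methods are the same
H1: S1(t) ≠ S2(t), the survival rates of patients under the two surgical methods differ
α = 0.05
2. Use the Kaplan-Meier method with the log-rank test to compare the survival rates of the two groups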
Case Processing Summary
group | Total N | N of Events | Censored N | Censored Percent
Surgery type A | 18 | 11 | 7 | 38.9%
Surgery type B | 14 | 14 | 0 | 0.0%
Overall | 32 | 25 | 7 | 21.9%
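Survival Table
group | No. | Time | Status | Cumulative Proportion Surviving at the Time: Estimate | Std. Error | N of Cumulative Events | N of Remaining Cases
Surgery type A | 1 | 1.000 | 1 | .944 | .054 | 1 | 17
Surgery type A | 2 | 3.000 | 1 | .889 | .074 | 2 | 16
Surgery type A | 3 | 5.000 | 1 | .833 | .088 | 3 | 15
Surgery type A | 4 | 6.000 | 1 | .778 | .098 | 4 | 14
Surgery type A | 5 | 7.000 | 1 | .722 | .106 | 5 | 13
Surgery type A | 6 | 8.000 | 1 | .667 | .111 | 6 | 12
Surgery type A | 7 | 10.000 | 1 | .611 | .115 | 7 | 11
Surgery type A | 8 | 14.000 | 0 | . | . | 7 | 10
Surgery type A | 9 | 17.000 | 1 | .550 | .119 | 8 | 9
Surgery type A | 10 | 19.000 | 0 | . | . | 8 | 8
Surgery type A | 11 | 20.000 | 0 | . | . | 8 | 7
Surgery type A | 12 | 22.000 | 0 | . | . | 8 | 6
Surgery type A | 13 | 26.000 | 0 | . | . | 8 | 5
Surgery type A | 14 | 31.000 | 0 | . | . | 8 | 4
Surgery type A | 15 | 34.000 | 1 | .413 | .149 | 9 | 3
Surgery type A | 16 | 34.000 | 0 | . | . | 9 | 2
Surgery type A | 17 | 44.000 | 1 | .206 | .164 | 10 | 1
Surgery type A | 18 | 59.000 | 1 | .000 | .000 | 11 | 0
Surgery type B | 1 | 1.000 | 1 | .929 | .069 | 1 | 13
Surgery type B | 2 | 2.000 | 1 | .857 | .094 | 2 | 12
Surgery type B | 3 | 3.000 | 1 | .786 | .110 | 3 | 11
Surgery type B | 4 | 4.000 | 1 | .714 | .121 | 4 | 10
Surgery type B | 5 | 6.000 | 1 | .643 | .128 | 5 | 9
Surgery type B | 6 | 8.000 | 1 | .571 | .132 | 6 | 8
Surgery type B | 7 | 9.000 | 1 | .500 | .134 | 7 | 7
Surgery type B | 8 | 10.000 | 1 | .429 | .132 | 8 | 6
Surgery type B | 9 | 11.000 | 1 | .357 | .128 | 9 | 5
Surgery type B | 10 | 12.000 | 1 | .286 |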
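The Estimate column of the Survival Table is the Kaplan-Meier product-limit estimate: at each event time the running survival probability is multiplied by (1 - events / number at risk), and censored cases only shrink the risk set. The sketch below is a minimal stand-alone implementation, not the SPSS procedure itself; it is applied to the surgery type A times listed in the table, with status 1 for an event and 0 for a censored case. The function name `kaplan_meier` is illustrative.

```python
def kaplan_meier(observations):
    """Return [(time, survival)] at each event time.

    observations: iterable of (time, status); status 1 = event, 0 = censored.
    """
    ordered = sorted(observations)
    at_risk = len(ordered)
    survival = 1.0
    curve = []
    i = 0
    while i < len(ordered):
        time = ordered[i][0]
        # All subjects sharing this time leave the risk set together.
        tied = [status for t, status in ordered if t == time]
        events = sum(tied)
        if events:
            survival *= 1 - events / at_risk
            curve.append((time, survival))
        at_risk -= len(tied)
        i += len(tied)
    return curve


# Surgery type A: (time, status) pairs copied from the Survival Table.
group_a = [
    (1, 1), (3, 1), (5, 1), (6, 1), (7, 1), (8, 1), (10, 1), (14, 0),
    (17, 1), (19, 0), (20, 0), (22, 0), (26, 0), (31, 0), (34, 1),
    (34, 0), (44, 1), (59, 1),
]

group_a_curve = dict(kaplan_meier(group_a))
for time, estimate in group_a_curve.items():
    print(f"t = {time:>2}: S(t) = {estimate:.3f}")
```

At time 34 one case is an event and one is censored; both are counted as at risk at that time, which is the usual convention and reproduces the tabulated estimate.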