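Carefully Designing and Reporting Clinical Research: Interpreting CONSORT

Fang Jiqian (方积乾), School of Public Health, Sun Yat-sen University

Preface

A recommended article: The CONSORT Statement for Reporting Randomized Trials: Explanation and Elaboration
Authors: Altman, Schulz, Moher, Egger, Davidoff, Elbourne, Gøtzsche and Lang, for the CONSORT Group.
Journal: Annals of Internal Medicine, Volume 134, Number 8, page 663, 17 April 2001.

What is the CONSORT Statement?

Consolidated Standards of Reporting Trials: the CONSORT statement (JAMA, 1996).

DerSimonian and colleagues (N Engl J Med, 1982) suggested that editors could greatly improve the quality of clinical trial reports by giving authors a list of the items they expect to be carefully reported.

As early as the 1990s, two groups of journal editors, trialists, and methodologists independently published recommendations on the reporting of trials (JAMA, 1994). Rennie (JAMA, 1995) then urged the two groups to meet and develop a common set of recommendations; the outcome was the CONSORT statement.

Revision of the CONSORT statement began in 1999. In 2001, "The CONSORT statement: revised recommendations for improving the quality of reports of parallel group randomized trials" was published simultaneously in three journals, Ann Intern Med, JAMA, and Lancet, with the CONSORT Group as author.

The CONSORT statement (Consolidated Standards of Reporting Trials) comprises:
(1) a checklist of essential items for a trial report;
(2) a flow chart of patients through the trial.

Many journals, such as The Lancet, British Medical Journal, Journal of the American Medical Association, Annals of Internal Medicine, Journal of Hand Surgery, Journal of Pediatric Psychology …,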
and a growing number of biomedical editorial groups, such as the International Committee of Medical Journal Editors (Vancouver Group) and the Council of Science Editors, formally endorse CONSORT.

Bias is always possible whenever observations have to be made. A well-designed randomized controlled trial (RCT) provides the best evidence of an intervention's effect, but inadequate methodology can exaggerate treatment effects, and poorly designed and poorly reported trials can mislead health care decisions.

Table 2. Checklist of Items To Include When Reporting a Randomized Trial†

| Paper Section and Topic | Item Number | Descriptor |
|---|---|---|
| Title and abstract | 1 | How participants were allocated to interventions (e.g., "random allocation," "randomized," or "randomly assigned"). |
| Introduction | | |
| Background | 2 | Scientific background and explanation of rationale. |
| Methods | | |
| Participants | 3 | Eligibility criteria for participants and the settings and locations where the data were collected. |
| Interventions | 4 | Precise details of the interventions intended for each group and how and when they were actually administered. |
| Objectives | 5 | Specific objectives and hypotheses. |
| Outcomes | 6 | Clearly defined primary and secondary outcome measures and, when applicable, any methods used to enhance the quality of measurements (e.g., multiple observations, training of assessors). |
| Sample size | 7 | How sample size was determined and, when applicable, explanation of any interim analyses and stopping rules. |
| Randomization | | |
| Sequence generation | 8 | Method used to generate the random allocation sequence, including details of any restriction (e.g., blocking, stratification). |
| Allocation concealment | 9 | Method used to implement the random allocation sequence (e.g., numbered containers or central telephone), clarifying whether the sequence was concealed until interventions were assigned. |
| Implementation | 10 | Who generated the allocation sequence, who enrolled participants, and who assigned participants to their groups. |
| Blinding (masking) | 11 | Whether or not participants, those administering the interventions, and those assessing the outcomes were blinded to group assignment. If done, how the success of blinding was evaluated. |
| Statistical methods | 12 | Statistical methods used to compare groups for primary outcome(s); methods for additional analyses, such as subgroup analyses and adjusted analyses. |
| Results | | |
| Participant flow | 13 | Flow of participants through each stage (a diagram is strongly recommended). Specifically, for each group report the numbers of participants randomly assigned, receiving intended treatment, completing the study protocol, and analyzed for the primary outcome. Describe protocol deviations from study as planned, together with reasons. |
| Recruitment | 14 | Dates defining the periods of recruitment and follow-up. |
| Baseline data | 15 | Baseline demographic and clinical characteristics of each group. |
| Numbers analyzed | 16 | Number of participants (denominator) in each group included in each analysis and whether the analysis was by "intention to treat." State the results in absolute numbers when feasible (e.g., 10 of 20, not 50%). |
| Outcomes and estimation | 17 | For each primary and secondary outcome, a summary of results for each group and the estimated effect size and its precision (e.g., 95% confidence interval). |
| Ancillary analyses | 18 | Address multiplicity by reporting any other analyses performed, including subgroup analyses and adjusted analyses, indicating those prespecified and those exploratory. |
| Adverse events | 19 | All important adverse events or side effects in each intervention group. |
| Discussion | | |
| Interpretation | 20 | Interpretation of the results, taking into account study hypotheses, sources of potential bias or imprecision, and the dangers associated with multiplicity of analyses and outcomes. |
| Generalizability | 21 | Generalizability (external validity) of the trial findings. |
| Overall evidence | 22 | General interpretation of the results in the context of current evidence. |

† From references 56–58.

Figure 1. Revised template of the CONSORT (Consolidated Standards of Reporting Trials) diagram
showing the flow of participants through each stage of a randomized trial (56–58).

Methods

Item 3a. Eligibility criteria for participants.

Example
… all women requesting an IUCD [intrauterine contraceptive device] at the Family Welfare Centre, Kenyatta National Hospital, who were menstruating regularly and who were between 20 and 44 years of age, were candidates for inclusion in the study. They were not admitted to the study if any of the following criteria were present: (1) a history of ectopic pregnancy, (2) pregnancy within the past 42 days, (3) leiomyomata of the uterus, (4) active [pelvic inflammatory disease], (5) a cervical or endometrial malignancy, (6) a known hypersensitivity to tetracyclines, (7) use of any antibiotics within the past 14 days or long-acting injectable penicillin, (8) an impaired response to infection, or (9) residence outside the city of Nairobi, insufficient address for follow-up, or unwillingness to return for follow-up (74).

Explanation
Investigators usually use "eligibility criteria" to define a population, and then choose one or two centers within that population in which to run the trial.
Eligibility criteria: age, sex, clinical diagnosis, severity of disease, …
Exclusion criteria: usually set to protect patient safety.
- The protocol must describe the eligibility and exclusion criteria precisely, so that investigators can apply and report them, and readers can judge the range of patients to whom the results apply.

Item 3b. The settings and locations where the data were collected.

Example
Volunteers were recruited in London from four general practices and the ear, nose, and throat outpatient department of Northwick Park Hospital. The prescribers were familiar with homoeopathic principles but were not experienced in homoeopathic immunotherapy (79).

Explanation
- The report must state whether the trial was multicenter, and describe the settings and the number of health care staff involved. Settings and locations affect how far the trial's findings can be generalized.

Item 4. Precise details of the interventions intended for each group and how and
when they were actually administered.

Example
Patients with psoriatic arthritis were randomised to receive either placebo or etanercept (Enbrel) at a dose of 25 mg twice weekly by subcutaneous administration for 12 weeks . . . Etanercept was supplied as a sterile, lyophilised powder in vials containing 25 mg etanercept, 40 mg mannitol, 10 mg sucrose, and 1–2 mg tromethamine per vial. Placebo was identically supplied and formulated except that it contained no etanercept. Each vial was reconstituted with 1 mL bacteriostatic water for injection.

Explanation
The placebo, and how it was disguised, must also be reported.
"Usual care" and combined treatments in particular need a full description.
- Sometimes who delivers the intervention matters a great deal and is itself part of the intervention. For a surgical intervention, the number, training, and experience of the surgeons must be described in addition to the surgical procedure.

Item 5. Specific objectives and hypotheses.

Example
We tested the hypothesis that a policy of active management of nulliparous labour would:
1. reduce the rate of caesarean section,
2. reduce the rate of prolonged labour;
3. not influence maternal satisfaction with the birth experience.

Explanation
Objectives: the questions the trial is designed to answer.
Hypotheses: prespecified questions to be tested in order to help meet the objectives.
- Hypotheses are more specific than objectives and are addressed mainly through statistical tests.

Item 6a. Clearly defined primary and secondary outcome measures.

Example
The primary endpoint with respect to efficacy in psoriasis was the proportion of patients achieving a 75% improvement in psoriasis activity from baseline to 12 weeks as measured by the PASI [psoriasis area and severity index]. Additional analyses were done on the percentage change in PASI scores and improvement in target psoriasis lesions.

Explanation
The primary outcome measure is the prespecified outcome of greatest importance, and is usually the one on which the sample size calculation is based (item 7).
Some trials have more than one primary outcome, but this raises the problem of multiplicity of analyses (see items 18 and 20) and is not recommended.
Outcomes of interest other than the primary outcome are secondary outcomes; these may include unanticipated effects of the intervention (item 19).
If outcomes are measured at several time points after randomization, the time points must be prespecified. (This also helps determine who will make the measurements and how many assessors are needed.)
- All outcome measures must be prespecified in the protocol, with consistent and complete definitions and the planned method of analysis, so that readers of the report can see that what was done was not decided with hindsight.

Item 6b. When applicable, any methods used to enhance the quality of measurements (e.g., multiple observations, training of assessors).

Examples
The clinical end point committee . . . evaluated all clinical events in a blinded fashion and end points were determined by unanimous decision (89).

Blood pressure (diastolic phase 5) while the patient was sitting and had rested for at least five minutes was measured by a trained nurse with a Copal UA-251 or a Takeda UA-751 electronic auscultatory blood pressure reading machine …

Explanation
Details of how measurement quality was ensured must be reported.
Possible measures include multiple observations, blinding of assessors (item 11a), and training of assessors.

Item 7a. How sample size was determined.
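
The arithmetic behind a sample size statement of this kind can be sketched in a few lines. The sketch below is an illustration only, not the method of any cited trial: it assumes the usual normal-approximation formulas, a Fleiss-type continuity correction for proportions, and simple inflation for loss to follow-up, and the function names are hypothetical. With the inputs quoted in the two examples under this item (4% versus 1.5% at 0.9 power; a five point difference with standard deviation 20 at 85% power and 20% loss to follow-up), these assumptions give 976 and 360 per group, which matches the published figures.

```python
import math
from statistics import NormalDist


def n_two_proportions(p1, p2, alpha=0.05, power=0.90, continuity=True):
    """Per-group sample size for a two-sided comparison of two proportions.

    Normal approximation; the optional continuity correction is an
    assumption of this sketch, not something stated in the cited trial.
    """
    z_a = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided type I error
    z_b = NormalDist().inv_cdf(power)           # power = 1 - beta
    p_bar = (p1 + p2) / 2
    delta = abs(p1 - p2)
    n = (z_a * math.sqrt(2 * p_bar * (1 - p_bar))
         + z_b * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2 / delta ** 2
    if continuity:
        n = n / 4 * (1 + math.sqrt(1 + 4 / (n * delta))) ** 2
    return math.ceil(n)


def n_two_means(delta, sd, alpha=0.05, power=0.85, loss=0.0):
    """Per-group sample size for a two-sided comparison of two means,
    inflated so that the target number remains after a fraction `loss`
    of participants is lost to follow-up."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    n = 2 * ((z_a + z_b) * sd / delta) ** 2
    return math.ceil(n / (1 - loss))


if __name__ == "__main__":
    print(n_two_proportions(0.04, 0.015, alpha=0.05, power=0.90))  # 976
    print(n_two_means(5, 20, alpha=0.05, power=0.85, loss=0.20))   # 360
```

Changing any one of the four inputs (expected outcomes, type I error, power, standard deviation) changes the answer, which is why a report needs to state all of them.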
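Examples
- We believed that . . . the incidence of symptomatic deep venous thrombosis or pulmonary embolism or death would be 4% in the placebo group and 1.5% in the ardeparin sodium group. Based on 0.9 power to detect a significant difference (P < 0.05, two-sided), 976 patients were required for each study group. To compensate for nonevaluable patients, we planned to enroll 1000 patients per group (91).
- To have an 85% chance of detecting as significant (at the two sided 5% level) a five point difference between the two groups in the mean SF-36 [Short Form 36] general health perception scores, with an assumed standard deviation of 20 and a loss to follow up of 20%, 360 women (720 in total) in each group were required (92).

Explanation
Trials that are too small often reach the mistaken conclusion that the intervention groups do not differ.
For both scientific and ethical reasons, the sample size must be planned carefully, balancing clinical and statistical considerations.
The report must state how the sample size was determined.
A sample size calculation needs four elements:
1) the expected outcome in each group (which implies the clinically important target difference between the intervention groups);
2) the type I error level, α;
3) the type II error level, β, or equivalently the power (1 − β), the probability of detecting the target difference;
4) for outcomes that are measurements, the standard deviation of the measurements.
The report should also state how much allowance was made, for example for participants who cannot be evaluated.
Some studies need interim analyses to decide whether the trial should continue.
If the sample size actually achieved differs from the planned one, the reason must be explained (for example, poor recruitment or a revised target sample size).

Item 7b. When applicable, explanation of any interim analyses and stopping rules.

Examples
- The results of the study . . . were reviewed every six months to enable the study to be stopped early if, as indeed occurred, a clear result emerged (98).
- Two interim analyses were performed during the trial. The levels of significance maintained an overall P value of 0.05 and were calculated according to the O'Brien–Fleming stopping boundaries. This final analysis used a Z score of 1.985 with an associated P value of 0.0471 (99).

Explanation
If the intervention is working particularly well or particularly badly, the trial may have to end early for ethical reasons. Interim analyses are the way to check for this.
However, repeated statistical analyses of accumulating data without appropriate correction can lead to erroneous results and interpretations.
If each interim analysis is tested at the 5% level, the overall false-positive rate after five interim analyses can be as high as 19%.
Several group sequential methods are available. If one is to be used, it must be prespecified in the protocol and then reported: which method, how many analyses, and what stopping rule?

Item 8a. Method used to generate the random allocation sequence.

Example
Independent pharmacists dispensed either active or placebo inhalers according to a computer generated randomization list (62).

Explanation
Participants must be assigned to one of the comparison groups on the basis of an unpredictable random process (Table 1).
Authors must provide enough information for readers to assess the random allocation sequence and the likelihood of bias in group assignment.
Readers cannot judge whether randomization was adequate from the terms "random allocation," "randomization," or "random" alone.
The method used to generate the random allocation sequence must be stated:
(1) Was a random number table or a computer random number generator used?
(2) Was it simple randomization, that is, randomization based on a single sequence of random numbers (item 8a)? Or restricted randomization, that is, any procedure that controls the randomization to achieve balance between groups in size or baseline characteristics, for example blocking or stratification?

Table 3. Restricted Randomization

Blocking
For example, every 10 consecutive patients form a block, and within each block 5 are randomly assigned to group A and 5 to group B. Group sizes are easily kept balanced, but at the cost of reduced unpredictability.
If trial staff discover the block size, they can deduce what the next treatment will be.
Remedies: blinding (so that the interventions already given are not known), larger blocks, and randomly varying block sizes.

Stratification
By chance, and especially in small trials, study groups may not be well matched for baseline characteristics (age, stage of disease, and so on).
Stratified randomization divides participants into strata by important characteristics (age, smoking, severity of disease, and so on) and randomizes within each stratum. It can improve the power of the trial (by about 12%), especially when the stratification factor is an important prognostic factor.
In multicenter studies, center can also be treated as a stratum.
Stratified randomization is more effective when blocks are also used within each stratum.

Minimization
Minimization keeps several patient factors as balanced as possible between intervention groups throughout the trial.
The allocation sequence is not generated in advance. The first patient is allocated at random; each later patient is allocated, still with a random element, according to the principle of minimizing imbalance given the current state of the groups. The intervention that would minimize the imbalance is assigned with a higher probability (for example, 0.8).
Minimization is an acceptable refinement of randomization, and trials that use it are regarded as equivalent to randomized trials. Many consider it the best approach (108).

Note: the word "random" is often misused. Deterministic allocation methods, such as alternation, hospital number, or date of birth, cannot be called "random," nor can they be called "quasi-random."
There is evidence that such trials give biased results, because the allocation sequence cannot be adequately concealed (see item 9).

Item 8b. Details of any restriction of randomization.

Example
Women had an equal probability of assignment to the groups. The randomization code was developed using a computer random number generator to select random permuted blocks. The block lengths were 4, 8, and 10 varied randomly . . . (74)

Explanation
In large trials, simple randomization can produce two groups of similar size that are broadly comparable with respect to known (and unknown) variables.
If restricted randomization was used, the details must be reported.
For block randomization: how were the blocks generated, what was the block size, and did the block size vary?
For stratified randomization: which factors were used to define the strata? With many stratification factors, implementation becomes complicated.
For minimization: which variables were balanced?

Item 9. Method used to implement the random allocation sequence clarifying whether the sequence was concealed until interventions were assigned.

Example
Women were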
assigned on an individual basis to both vitamins C and E or to both placebo treatments. They remained on the same allocation throughout the pregnancy if they continued in the study. A computer generated randomisation list was drawn up by the statistician . . . and given to the pharmacy departments
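
A computer generated randomisation list like the one mentioned in this example can be produced in many ways. The following is a minimal sketch, not the procedure of any cited trial. It assumes two arms labelled A and B, borrows the randomly varying block lengths 4, 8, and 10 from the Item 8b example, and builds a separate list for each stratum as described in Table 3. The function names, strata labels, and seed are hypothetical.

```python
import random


def permuted_block_list(n, arms=("A", "B"), block_sizes=(4, 8, 10), rng=None):
    """Return n allocations built from permuted blocks of random length.

    Every complete block contains equal numbers of each arm, so group
    sizes can differ only because of the final, possibly truncated block.
    """
    if any(size % len(arms) for size in block_sizes):
        raise ValueError("each block size must be a multiple of the number of arms")
    rng = rng or random.Random()
    allocations = []
    while len(allocations) < n:
        size = rng.choice(block_sizes)              # block length varies at random
        block = list(arms) * (size // len(arms))    # equal numbers of each arm
        rng.shuffle(block)                          # random order within the block
        allocations.extend(block)
    return allocations[:n]


def stratified_lists(strata, n_per_stratum, seed=None, **kwargs):
    """Build one independent permuted-block list per stratum (e.g. per center)."""
    rng = random.Random(seed)
    return {s: permuted_block_list(n_per_stratum, rng=rng, **kwargs) for s in strata}


if __name__ == "__main__":
    lists = stratified_lists(["centre 1", "centre 2"], 20, seed=2001)
    for stratum, allocations in lists.items():
        print(stratum, "".join(allocations))
```

Generating the list is only half of the task. Item 9 concerns how the list is kept concealed from those enrolling participants, which no amount of code can guarantee on its own.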
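
Minimization, described in Table 3, has no list generated in advance, because each allocation depends on the participants already enrolled. A minimal sketch under stated assumptions: imbalance is measured as the sum, over the chosen factors, of the range of arm counts within the new participant's factor levels (one common choice, not specified in the text); the arm that minimizes this is chosen with probability 0.8, as in the text's example; and ties, including the first participant, are broken purely at random. The class, method, and parameter names are hypothetical.

```python
import random
from collections import defaultdict


class Minimizer:
    """Sequential allocation by minimization with a random element."""

    def __init__(self, factors, arms=("A", "B"), p_best=0.8, seed=None):
        self.factors = list(factors)
        self.arms = tuple(arms)
        self.p_best = p_best
        self.rng = random.Random(seed)
        # counts[(factor, level)][arm] = participants already allocated
        self.counts = defaultdict(lambda: dict.fromkeys(self.arms, 0))

    def _imbalance_if(self, patient, arm):
        """Total imbalance across factors if `patient` were given `arm`."""
        total = 0
        for factor in self.factors:
            trial_counts = dict(self.counts[(factor, patient[factor])])
            trial_counts[arm] += 1
            total += max(trial_counts.values()) - min(trial_counts.values())
        return total

    def assign(self, patient):
        scores = {arm: self._imbalance_if(patient, arm) for arm in self.arms}
        best = min(scores.values())
        preferred = [arm for arm in self.arms if scores[arm] == best]
        others = [arm for arm in self.arms if scores[arm] != best]
        if not others:
            # All arms tie (always true for the first patient): pure chance.
            arm = self.rng.choice(self.arms)
        elif self.rng.random() < self.p_best:
            arm = self.rng.choice(preferred)   # favour the balancing arm
        else:
            arm = self.rng.choice(others)      # keep allocation unpredictable
        for factor in self.factors:
            self.counts[(factor, patient[factor])][arm] += 1
        return arm


if __name__ == "__main__":
    m = Minimizer(["sex", "smoker"], seed=108)
    for patient in [{"sex": "F", "smoker": "no"}, {"sex": "F", "smoker": "yes"},
                    {"sex": "M", "smoker": "no"}]:
        print(patient, "->", m.assign(patient))
```

A report of a trial that used a scheme like this would still need to say which variables were balanced, as Item 8b asks.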






