Introduction
Various antipsychotic clinical trials have yielded findings that conflict directly with one another. For example, the Clinical Antipsychotic Trials of Intervention Effectiveness (CATIE) study (Lieberman et al. 2005) and the Cost Utility of the Latest Antipsychotic drugs in Schizophrenia Study (CUtLASS; Jones et al. 2006) were two landmark studies whose findings challenged those of previous industry-sponsored clinical trials (Leucht et al. 2009b). One explanation for the differences concerns variability in methodological robustness and quality of reporting of antipsychotic trials in general. Indeed, although Heres et al. (2006) highlighted industry sponsorship as a possible source of bias, they also noted that choice of dose and dosing regimen, selection of participants, handling of missing data and selective reporting of side-effects were further potential aspects for concern.
Whether or not the quality of reporting for trials has improved in recent years remains to be determined. Pre-licensing phase II and III trials on newer antipsychotics are yet to have the quality of their reporting formally evaluated. The US Food and Drug Administration (FDA) licensed new antipsychotics including iloperidone, asenapine, paliperidone (oral) and paliperidone palmitate long-acting injection (LAI) in 2009 and lurasidone and olanzapine LAI in 2010. More recently developed antipsychotics, undergoing trials, include pomaglumetad methionil (LY2140023) and bitopertin (RG1678).
Meta-analyses of antipsychotic pre-licensing trials have focused predominantly on summarizing key outcomes rather than on evaluating more extensively the quality and rigour of reporting of the trials under review (e.g. Johnson & Jørgensen, 2008; Leucht et al. 2009a, 2013). In contrast, our systematic review aimed to focus specifically on the quality of reporting for phase II and III (pre-licensing) trials of newer antipsychotics, published after the CATIE study. Particular emphasis was placed on design and methodology, assessed against the international Consolidated Standards of Reporting Trials (CONSORT) statements for superiority and non-inferiority/equivalence designs (Begg et al. 1996; Moher et al. 2001; Piaggio et al. 2006, 2012; Schulz et al. 2010).
Method
Search strategy
Electronic searches were conducted on 20 February 2012 in Medline (Ovid Medline and Medline In-Process and Other Non-Indexed Citations), EMBASE and Cochrane Central Register of Controlled Trials. Search terms that were also mapped to MeSH headings included: schizophrenia OR schizoaffective disorder (term 1); clinical trial phase 2 OR feasibility study OR pilot study/projects OR clinical trial phase 3 OR randomized controlled trial OR controlled clinical trial (term 2); iloperidone OR asenapine OR lurasidone OR paliperidone OR olanzapine OR olanzapine inject* OR LY2140023 OR RG1678 OR bitopertin (term 3). Terms 1 AND 2 AND 3 were combined. Limitations then applied were: human, English language, year 2006–current, adult. Additionally, ClinicalTrials.gov was searched on 7 March 2012 by combining the term ‘schizophrenia’ with each of the above-mentioned drugs. A preliminary search in ClinicalTrials.gov for cariprazine revealed that no studies had yet reported their findings; consequently, cariprazine was not further included.
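The Boolean structure of this search can be sketched as follows. This is an illustration only: the helper `or_group` is hypothetical, and the generic string syntax is an assumption that does not reproduce the database-specific query language or the MeSH mapping used in the actual searches.

```python
# Illustrative only: composes the three term groups described above into one
# Boolean string. The syntax is generic, not the real database query language.

def or_group(terms):
    """Join alternative terms with OR and wrap the group in parentheses."""
    return "(" + " OR ".join(terms) + ")"

term1 = ["schizophrenia", "schizoaffective disorder"]
term2 = [
    "clinical trial phase 2", "feasibility study", "pilot study/projects",
    "clinical trial phase 3", "randomized controlled trial",
    "controlled clinical trial",
]
term3 = [
    "iloperidone", "asenapine", "lurasidone", "paliperidone", "olanzapine",
    "olanzapine inject*", "LY2140023", "RG1678", "bitopertin",
]

# Terms 1 AND 2 AND 3 are combined, as stated in the text.
query = " AND ".join(or_group(group) for group in (term1, term2, term3))
print(query)
```

The limits (human, English language, year 2006–current, adult) would then be applied as filters on the combined result set.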
Criteria
Study design
Phase II and III clinical trials, feasibility and pilot studies with the design of a randomized controlled trial (RCT) were all included. Phase I and IV trials and all non-randomized trials were excluded. Only articles describing individual phase II and III trials (and thus of newer antipsychotics) published between January 2006 and February 2012 were included, as these were published after the CATIE study (2005) and the article by Heres et al. (2006).
Participants and sample size
Adult participants aged 18–65 years were included. Studies only on children or older adults were excluded. Trials involving participants of mixed ages (e.g. aged 18–75 years) were excluded if the mean age of the participants was ⩾45 years. Diagnoses of schizophrenia and schizoaffective disorder were included. Excluded diagnoses comprised but were not limited to: schizotypal, schizophreniform, psychotic depression and bipolar disorder. Studies with a minimum of 20 patients with schizophrenia or schizoaffective disorder per trial arm were included; consequently, studies with a total sample size of fewer than 40 participants were excluded.
Target interventions and comparators
Included antipsychotics reported on since January 2006 and which also subsequently had FDA approval were: lurasidone, asenapine, iloperidone, paliperidone (oral), paliperidone palmitate LAI and olanzapine LAI. Antipsychotics under investigation in pre-licensing clinical trials and thus included were LY2140023 (pomaglumetad methionil) and RG1678 (bitopertin). Both fixed dosing and flexible dosing were included. Excluded drugs (approved by the FDA prior to March 2001) comprised but were not limited to: olanzapine oral, amisulpride, aripiprazole, clozapine, risperidone, quetiapine and ziprasidone. Drugs administered by a method other than oral (tablet or capsule) or depot/LAI were excluded. Studies with an active antipsychotic comparator arm or placebo arm or both were included. If there was no comparison arm, the study was excluded.
Outcome measures
Completed studies with results on a primary efficacy outcome measure of the Positive and Negative Symptom Scale (PANSS; Kay et al. 1987) or the Brief Psychiatric Rating Scale (BPRS; Overall & Gorham, 1962) were included. Studies with ‘time to relapse’ as the primary efficacy measure and the PANSS or BPRS score as the secondary measure were included. Studies focusing primarily on change in cognitive symptoms were excluded. Studies with no results or with safety or tolerability measures as the primary outcome were excluded.
Data collection and quality analysis
Duplicates were identified. Where ClinicalTrials.gov identified a study and provided an official study title, this was used to further search for an associated publication for inclusion. Published articles were favoured over reports on ClinicalTrials.gov. All remaining abstracts were reviewed independently by two authors against the selection criteria based on a hierarchical order. Where agreement could not be reached, articles were examined in full and a third author mediated. Reference lists of articles selected for inclusion were hand-searched for further relevant articles. The quality analysis was based on the degree of appropriate detail provided by the studies on each methodological aspect outlined by the CONSORT statements for superiority and non-inferiority/equivalence designs (Begg et al. 1996; Moher et al. 2001; Piaggio et al. 2006, 2012; Schulz et al. 2010), along with key aspects from the results section. Statistical analyses were examined for the primary outcome, methods for handling missing data and any associated sensitivity analyses. Items that were not considered by CONSORT but that were pertinent for antipsychotic trials, such as permitted concomitant medication, were also evaluated. Similarly, reporting on key groups of commonly occurring side-effects for antipsychotics was examined in addition to the standard trial requirement for the reporting of deaths (Isaac & Koch, 2010).
Results
Search strategy
The electronic searches identified 791 articles, of which 762 were excluded in hierarchical order: duplication (n = 279), not in English (n = 1), not a target drug (n = 432), participant group not adult/human (n = 4), diagnosis not schizophrenia or schizoaffective disorder (n = 2), study design not an RCT/phase II (P2)/phase III (P3) (n = 7), no comparison arm (n = 5), primary outcome measure not an efficacy measure (n = 26), abstract only published (n = 1; Umbricht et al. 2011), only pooled analysis for multiple studies (n = 2; Meltzer et al. 2008; Kane et al. 2008), and the study was from ClinicalTrials.gov and had no available results and no associated publications (n = 3, one each for iloperidone, pomaglumetad methionil and olanzapine LAI). Two further articles (Weiden et al. 2008; Fleischhacker et al. 2012) were identified from the reference lists of included articles (Potkin et al. 2008; Pandina et al. 2011). Where no P2 studies for a drug were found, further specific searches both pre- and post-2006 were conducted. One article reporting an iloperidone P2 study (Borison et al. 1996) was identified but excluded as it was published before 2006. No P2 articles for paliperidone extended release (ER) and olanzapine LAI were found. As no full articles were found for bitopertin, this drug was not considered further in the analysis.
A final set of 31 articles was identified for inclusion that provided detail on 32 studies (three iloperidone studies were reported across two articles: Potkin et al. 2007; Weiden et al. 2008), which were published in 11 journals (see Table 1). Two included lurasidone trial reports were taken directly from ClinicalTrials.gov (2012a, b) as no publication was otherwise found. One asenapine article included two identical studies and these were considered as one in our evaluation (Buchanan et al. 2012). There were six P2 and 26 P3 parallel RCTs and the study duration range was 3–53 weeks. The majority of studies (69%) were conducted across ⩾3 continents but some were contained within a single country, mostly in the USA. There were more than 15 500 participants across all studies.
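The reported selection flow can be checked with simple arithmetic. The sketch below uses only the counts given in the text; the dictionary keys are shortened labels, and it assumes that the two articles found by hand-searching reference lists are added to those remaining after exclusion.

```python
# Exclusion counts as reported in the text, in hierarchical order.
exclusions = {
    "duplication": 279,
    "not in English": 1,
    "not a target drug": 432,
    "participant group not adult/human": 4,
    "diagnosis not schizophrenia/schizoaffective disorder": 2,
    "design not RCT/phase II/phase III": 7,
    "no comparison arm": 5,
    "primary outcome not an efficacy measure": 26,
    "abstract only published": 1,
    "pooled analysis only": 2,
    "ClinicalTrials.gov record without results or publication": 3,
}
identified = 791            # articles found by the electronic searches
from_reference_lists = 2    # articles added by hand-searching

total_excluded = sum(exclusions.values())
included = identified - total_excluded + from_reference_lists
print(total_excluded, included)  # 762 31
```

Under that assumption the counts are internally consistent: 762 exclusions leave 29 articles, and the two hand-searched articles give the final set of 31.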
Table 1. Summary of included phase II and III trials according to target drug (32 studies)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20160921045010842-0752:S0033291714001214:S0033291714001214_tab1.gif?pub-status=live)
FDA, Food and Drug Administration; ER, extended release; LAI, long-acting injection; NYA, not yet approved.
a Lack of phase II trials for iloperidone, paliperidone ER and olanzapine LAI is either a result of identified studies being outside the specified time period of the review or due to a lack of available results.
Quality analysis
Objectives, hypotheses and design
The primary objective of the study was stated by most (69%) but only 13% explicitly stated a primary hypothesis (see Table 2). Just under half (47%) explicitly reported the design as parallel and three P3 studies stated that the trial was a non-inferiority design (9%). Only 50% explicitly reported whether the study was a phase II or III RCT; the rest were verified using ClinicalTrials.gov.
Table 2. Number of studies per drug reporting sufficient information on methodology
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20160921045010842-0752:S0033291714001214:S0033291714001214_tab2.gif?pub-status=live)
ER, Extended release; LAI, long-acting injection.
a Design to be stated as parallel, and also non-inferiority if appropriate.
b Exclusion criteria required to state excluded diagnoses.
c Randomization required to include method [computer generated or use of an interactive voice response system (IVRS)] and ratio if not stated as ‘balanced’.
Eligibility criteria
Age
Almost all studies stated a lower age limit of 18 years (97%). The upper age limit for inclusion varied: 64 or 65 years (44%); 75 years (13%); explicit statement of no limit (3%); no statement (41%).
Diagnosis
Participants in all studies had a diagnosis of schizophrenia or schizoaffective disorder. Most studies (84%) used DSM-IV (APA, 1994) and the remainder did not specify the diagnostic classification method. Excluded diagnoses were reported by 78% of studies and these excluded diagnoses in part comprised: schizophreniform; other or concurrent DSM-IV psychiatric disorder (some specifying only Axis I diagnoses including substance dependence); and chronic organic central nervous system disease. Of the studies including only schizophrenia or only schizoaffective disorder, only 18% explicitly stated that the other diagnosis was excluded. Specific subcategories of schizophrenia were also stated as being excluded in a few studies.
Illness severity
Only one study did not report illness severity criteria (3%). The scales used were: the PANSS-Total (PANSS-T) with a minimum eligibility score of 60, 70 or 80; a BPRS minimum score of 30, 42 or 45; and a Clinical Global Impressions Scale – Severity of Illness (CGI-S; Guy, 1976a) minimum score of 4 or 5. Twelve studies (38%) also included various additional PANSS subscale or individual item minimum eligibility scores.
Pre-consent medication
Eleven studies (34%) did not report antipsychotic history criteria prior to consent. Aspects of the antipsychotic history resulting in exclusion were: previous exposure to target drug; intolerability to target drug or active comparator; history of treatment resistance/unresponsiveness for certain period; received depot LAI within a certain period prior to trial; and previous participation in target drug trial. Seven studies (22%) did not report on prohibited medications. The remaining studies reported the following medications as prohibited: mood stabilizers; antidepressants; benzodiazepines; use of a second antipsychotic; anticonvulsants; beta-blockers; and ‘psychotropic medications’ (described as ‘over-the-counter and nutritional medications’ and prior use of ‘experimental’ and ‘investigational’ medications).
Target interventions and comparators
Antipsychotic dose and regimen for the target interventions were fully described by all studies but the formulation was not always stated explicitly. Of the studies using a placebo comparator, approximately half specified the formulation (17/31, 55%), some described allocation concealment of an oral placebo as ‘over-encapsulated’ or ‘matched’ to the target drug (10/25, 40%), and the chemical composition was not described for any oral placebos. Of the studies using an active comparator, over half did not explicitly report the formulation (12/21, 57%), and only some stated that the oral comparator was ‘over-encapsulated’ or ‘matched’ to the target drug (5/20, 25%). For the 10 LAI studies, six stated whether reconstitution was required or whether a prefilled syringe was provided, and the nature of the placebo injection diluent was provided for only six studies.
Concomitant medication
Six studies (19%) did not report which concomitant medications, if any, were permitted during the trial. Concomitant medication allowed varied greatly between studies and included: benzodiazepines (lorazepam, temazepam, amobarbital sodium), hypnotics (zopiclone, zolpidem and zaleplon), antidepressants, anxiolytics (including beta-blockers), benztropine and biperiden.
Randomization and blinding
All studies reported that randomization occurred but only 44% adequately reported the method of randomization. Nine studies reported using a block design but only one gave block sizes and one out of 14 studies using stratified randomization did not specify the stratification variables, for example by study centre. Implementation of assignment to interventions by an interactive voice response system (IVRS) was reported by 10 studies. All studies reported the nature of the blinding as ‘double-blind’ except for one, which was single-blinded (rater), but only five of the double-blind studies described who was blinded (i.e. the participant, clinician, investigator or outcomes assessor).
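To make concrete what a full description of block and stratified randomization would need to specify (block size, allocation ratio and stratification variable), a minimal sketch of permuted-block randomization stratified by study centre is given below. The function name, arm labels, block size, centre names and seeds are all hypothetical and are not taken from any included trial.

```python
import random

def permuted_block_schedule(n_participants, arms=("drug", "placebo"),
                            block_size=4, seed=None):
    """Return an allocation list built from randomly permuted balanced blocks."""
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)
    per_arm = block_size // len(arms)
    schedule = []
    while len(schedule) < n_participants:
        block = list(arms) * per_arm   # balanced block, e.g. 2 drug + 2 placebo
        rng.shuffle(block)             # random order within the block
        schedule.extend(block)
    return schedule[:n_participants]

# One independent schedule per stratum (here: hypothetical study centres).
centres = ["centre_A", "centre_B"]
schedules = {c: permuted_block_schedule(8, seed=i) for i, c in enumerate(centres)}
for centre, allocation in schedules.items():
    print(centre, allocation)
```

Fixed seeds are used here only so that the sketch is reproducible; in a trial the schedule would be generated and held centrally (for example by an IVRS) so that allocation remains concealed.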
Outcome measures
The primary efficacy outcomes included: change from baseline in PANSS-T score (72%); change from baseline in BPRS score (13%); the 16-item Negative Symptom Assessment (NSA-16, 3%; Buchanan et al. 2012); and time to first relapse (13%). If the PANSS score was not used as the primary outcome, then it was used as a secondary outcome. With regard to tolerability, all but one study reported the percentage of participants experiencing ⩾1 treatment-emergent adverse event and all studies used rating scales for evaluation of extrapyramidal symptoms; these included the Barnes Akathisia Scale (Barnes, 1989), the Simpson–Angus Scale (Simpson & Angus, 1970), the Abnormal Involuntary Movement Scale (AIMS; Guy, 1976b) and the Extrapyramidal Symptom Rating Scale (ESRS; Chouinard et al. 1980). Weight gain was reported as mean change in kg (84%) and/or percentage of participants who had a weight increase of ⩾7% (88%); one study did not report weight gain and another only reported weight gain as a treatment-emergent adverse event. Twenty-seven studies (84%) reported on sedation/somnolence. Although most studies (91%) reported on prolactin levels, some as change from baseline in ng/ml or μm/l, none specified the reference range used for prolactin levels. Furthermore, eight studies (25%) did not explicitly state whether any deaths occurred during the trials.
Sample size and participant flow
Eleven studies (34%) failed to report a sample size calculation and eight studies (25%) gave only a partial description (see Table 3). A CONSORT participant flow diagram was provided either in the main article (by 72%) or in a supplementary section (by 13%). Where a diagram was lacking, the relevant information was provided as a table in the main article (3%) or supplementary section (3%) or not at all (9%). Attrition rates were 13–64%. The main reasons for discontinuation were: adverse events, unsatisfactory therapeutic effect, loss to follow-up, withdrawal of consent and protocol violation. Protocol violation was variously described as: an efficacy assessment deviation, an error in study drug administration and treatment deviation, and medication non-compliance.
Table 3. Number of studies reporting sufficient information on outcomes and analyses
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20160921045010842-0752:S0033291714001214:S0033291714001214_tab3.gif?pub-status=live)
CONSORT, Consolidated Standards of Reporting Trials; ER, extended release; LAI, long-acting injection; LOCF, last observation carried forward; MMRM, mixed-effect model repeated measure.
a Provided as a supplement to main article.
b Criteria for full sample size includes: significance level, power, difference and standard deviation (or effect size) and final sample size, along with attrition for primary per protocol analysis.
c Either LOCF or MMRM was used in the main analyses for missing data and where a sensitivity analysis was conducted, the alternative method was used.
Analysis and reporting of results
All studies reported conducting an intent-to-treat (ITT) analysis for the primary outcome, of which 28 studies (88%) used the ITT analysis as the main analysis, whereas 13% reported a per protocol (PP) analysis as the main analysis and one did not also provide the ITT results. All studies described the statistical method used for reporting the primary outcome but not all controlled for baseline measures. Three studies used a non-inferiority design and used change in PANSS-T score as the primary outcome but the margin for exclusion differed (two studies: >5; one study: >5.5). Methods for dealing with missing data were less well reported. The last observation carried forward (LOCF) method was used in 22 studies (69%) and six studies (19%) used the method of mixed-effect model repeated measure (MMRM), wherein the missing data are accounted for in the analysis. Of those undertaking MMRM, none reported the method for predicting missing outcome data-points or including predictive covariates in the analysis. Only 12 studies (38%) mentioned a sensitivity analysis for the method of dealing with missing data.
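The practical effect of the differing non-inferiority margins can be illustrated with a short sketch. The numbers are hypothetical, the function `non_inferior` is not taken from any included trial, and the sign convention (difference = target minus comparator in PANSS-T change, so that positive values mean less improvement on the target drug) and the normal approximation are assumptions made for illustration.

```python
from statistics import NormalDist

def non_inferior(diff, se, margin, alpha=0.05):
    """Return (conclusion, upper confidence bound) for a non-inferiority test.

    diff: mean PANSS-T change, target minus comparator (positive = target worse)
    se: standard error of that difference
    margin: pre-specified non-inferiority margin in PANSS-T points
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided 95% CI by default
    upper = diff + z * se
    return upper < margin, upper

# Hypothetical result: difference of 1.8 points with a standard error of 1.7.
for margin in (5.0, 5.5):
    ok, upper = non_inferior(1.8, 1.7, margin)
    print(margin, round(upper, 2), ok)
```

In this made-up case the upper confidence bound falls between the two margins, so the same data would support non-inferiority under a margin of 5.5 but not under a margin of 5, which is why the margin and its justification need to be reported.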
Discussion
Limitations
The searches conducted were not fully exhaustive and only publications in English were included because of resource restrictions. However, reference lists of included articles were searched, and it is likely that most, if not all, eligible multi-centred international pre-licensing clinical trials were found by electronic searching. The use of ClinicalTrials.gov as a resource yielded two incomplete ‘articles’ that may impact on the generalizability of the findings to completed peer-reviewed journal publications. However, these did not fully account for all aspects of suboptimal reporting. Since the search strategy was conducted, three further studies have been reported: two in abstracts including a P3 study for pomaglumetad methionil, which did not separate from placebo (Kinon et al. 2013), and a P3 study for cariprazine (Citrome et al. 2013), and one as a publication for one of the ClinicalTrials.gov studies on lurasidone (Loebel et al. 2013). Furthermore, the adequacy of reporting was agreed by two researchers based on the 2010 CONSORT statement and this differed from its earlier versions regarding need for study registration, protocol specification and funding details. The current review did not investigate reporting of cardiovascular side-effects or metabolic syndrome (other than weight gain), discussion points or study limitations.
Implications
This systematic review highlights areas of suboptimal reporting in phase II and III trials for more recently developed and/or licensed antipsychotics and could constitute an important step in enhancing standardization across all journals that publish antipsychotic trials. If an adequately justified and robust methodology is not used and reported for a trial, then signal detection of treatment effect is potentially compromised, which can result in a failed trial (Alphs et al. 2012; Agid et al. 2013); the risk of biased interpretation of the findings is increased and there can also be a subsequent negative impact on future research and meta-analyses. Indeed, bias can result from a variety of aspects within trial design (Cochrane Bias Methods Group, 2013) and thus all should be given due consideration. Consequently, it is essential that a high standard of reporting is maintained so that policymakers and clinicians can make informed decisions.
Design
Without an explicit statement describing the nature of the trial design (superiority or non-inferiority parallel design) and hypothesis, the reader's ability to assess the suitability of the choice of primary outcome, statistical method and reporting of the outcomes may be compromised.
Eligibility criteria
Lack of detail on method of diagnostic classification or operationalized definition of first-episode psychosis used, variability regarding illness chronicity and severity for inclusion (i.e. minimum scores on symptom scales), along with excluded diagnoses (and subcategories), can all lead to interpretation bias, with misleading conclusions being drawn. For example, if antipsychotic A is compared with placebo in a trial for patients who are severely unwell, the degree of symptom reduction is likely to be greater than that seen for a second trial evaluating antipsychotic B against placebo for patients who are less severely unwell. This may in part be due to regression to the mean (Agid et al. 2013). Thus, there is a risk of inferring that antipsychotic A is better than antipsychotic B because of the greater symptom reduction seen, and yet the reason for the difference in outcomes is likely to be accounted for, at least in part, by the difference in illness severity at trial onset. Additionally, variability in illness chronicity and severity inclusion criteria may impact on the number of patients who would be eligible for negative symptom antipsychotic trials (Rabinowitz et al. 2013) and may also, to some extent, account for the increase in placebo response seen in trials (Agid et al. 2013). Thus, we propose that standardization of illness chronicity and severity eligibility thresholds be considered for the key illness stages of schizophrenia. Furthermore, pre-consent medication both before and concomitant with a study may affect efficacy or tolerability outcomes and thus should be stated (Essock et al. 2006; Barnes et al. 2013), although we acknowledge that this is not a current specific requirement by CONSORT.
Intervention and comparison arms and concomitant medication
Ideally, the comparator should be described in as much detail as the target intervention (investigational product). For an active comparator, justification for the choice of both antipsychotic and dosage regimen enables the researcher to evaluate the appropriateness thereof (Patel et al. 2013). For placebo comparators, provision of the chemical nature (including injection diluents where appropriate) is advocated so that we can clearly evaluate whether or not the only difference with the intervention was the active ingredient. Alternatively, creation of a standardized chemical placebo would allow for both unambiguous reporting and valid comparisons across studies. Furthermore, permitted concomitant medication may, to some extent, account for the differences seen in outcomes between two arms of a trial, if participants in one arm use concomitant medication more frequently. Thus, for antipsychotic trials, information on permitted benzodiazepine use in particular is helpful.
Randomization, blinding and allocation concealment
If the method of randomization is suboptimal, this can lead to allocation bias and either an under- or overestimation of the treatment effect depending on the subjective opinion of the investigator regarding the intervention. Furthermore, if randomization is not reported in sufficient detail, it is difficult to be reassured that group allocation has resulted in a balance in all variables, including both known and unknown confounders. Detailed description of the blinding of study personnel and of allocation concealment methods (including but not limited to identical appearance and frequency regimen for comparator and intervention) is required to verify the minimization of performance bias (identical care for both groups except for the intervention) and detection bias in measurement and analysis.
Outcome measures
It is notable that the PANSS total score is currently the most commonly used tool for assessing the severity of schizophrenia symptoms. Thus, justification of the use of other scales for the primary outcome, particularly when they are newly developed, would help readers to assess the suitability of the alternative over and above the PANSS score and so minimize interpretation bias and possible overestimation of the clinical benefit. Similarly, it seems reasonable to expect routine reporting on side-effects that are associated with the target intervention ‘receptor profile’, in addition to standardized reporting of commonly occurring side-effects for all antipsychotics (Ioannidis et al. 2004), by using validated scales where possible. When under-reporting of side-effects occurs, for example for prolactin levels and sedation, or variation exists in how a side-effect is evaluated, such as change in weight (Pope et al. 2010), this can constitute reporting bias that may be misleading when deriving the overall risk–benefit for a new antipsychotic.
Sample size and participant flow
When reporting a sample size calculation, sufficient detail is required, including corrections for attrition, to allow judgement of whether the trial was adequately powered (Schulz et al. 2010), as an underpowered study increases the likelihood of Type I or Type II errors. Furthermore, the lack of reporting of sufficient sample size calculations leaves open to question the margins of clinically meaningful differences that a study was originally designed to detect. For non-inferiority/equivalence studies, the margin used in a sample size calculation thus should also be stated and justified. We would further endorse the development of a universally acceptable standardized margin of non-inferiority for the PANSS total score. Additionally, CONSORT (Schulz et al. 2010) states that the participant flow diagram should be provided within the main article of a clinical trial and this will facilitate detection of any attrition bias. However, we note that this diagram is sometimes relegated to an online supplementary information document or occasionally is entirely lacking and we postulate that this may be due to article length and space restrictions imposed by journals.
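The elements of a fully reported sample size calculation (significance level, power, difference, standard deviation and allowance for attrition) can be shown with a worked sketch. It uses the standard normal-approximation formula for comparing two means in a superiority design, n per arm = 2(z₁₋α/₂ + z₁₋β)²σ²/δ². The input values (a 6-point PANSS-T difference, a standard deviation of 20 and 25% attrition) and both function names are hypothetical and are not drawn from any included trial.

```python
from math import ceil
from statistics import NormalDist

def n_per_arm(delta, sd, alpha=0.05, power=0.80):
    """Participants per arm needed to detect a mean difference `delta`
    with a two-sided test, assuming a common standard deviation `sd`."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)

def inflate_for_attrition(n, attrition):
    """Number to randomize so that about `n` remain after the given
    proportion drops out."""
    return ceil(n / (1 - attrition))

completers = n_per_arm(delta=6, sd=20)                     # hypothetical inputs
randomized = inflate_for_attrition(completers, attrition=0.25)
print(completers, randomized)
```

Omitting any one of these inputs from a report makes it impossible to reproduce the calculation, which is the point of requiring all of them.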
Analysis and reporting of results
For all pre-licensing trials, an ITT analysis set is required. Whether or not it is acceptable to report on a per-protocol analysis, which can lead to overestimation of the effect, without the ITT analysis also being reported is currently subject to debate. In terms of method of analysis, use of ANCOVA is advantageous for testing whether an outcome measure differs between arms while controlling for pre-randomization baseline values and this is preferred over ‘change’ scores. Furthermore, the covariates in an ANCOVA model serve to reduce the variability of the outcome measure and hence increase the power of the statistical test (Vickers & Altman, 2001). For handling missing data, the LOCF method has been superseded by newer methods such as MMRM. The bias created is more evident for LOCF (than MMRM) and can under- or overestimate the treatment effect (Lane, 2008; Siddiqui et al. 2009). If the distribution of missing data is unequal between the trial arms, the bias created could be further enhanced. Sensitivity analyses are useful for determining the extent of the impact of missing data and the imputation method used and thus can be used to test for the robustness of the conclusions. Alternatively, a new composite approach has been recently proposed that uses a single statistical method of analysis to combine differences in efficacy with differences in drop-out rates between trial arms without the need for data imputation (Rabinowitz et al. 2014).
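To show what LOCF does to an individual participant's data, a minimal sketch follows. The function `locf` and the PANSS-T scores are hypothetical; the example only demonstrates the imputation step, not a full trial analysis.

```python
def locf(scores):
    """Replace each missing visit (None) with the last observed value."""
    filled, last = [], None
    for value in scores:
        if value is not None:
            last = value
        filled.append(last)
    return filled

# Hypothetical PANSS-T scores at baseline and four follow-up visits;
# the participant drops out after the second follow-up visit.
visits = [92, 85, 80, None, None]
completed = locf(visits)
print(completed)  # [92, 85, 80, 80, 80]

# Endpoint change from baseline is frozen at the last observed visit.
change_from_baseline = completed[-1] - completed[0]
print(change_from_baseline)  # -12
```

Because the score is held constant after drop-out, the endpoint estimate depends on when and why participants left, which is one way in which LOCF can under- or overestimate the treatment effect when attrition differs between arms.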
The role of the publishing journal
Updates to the CONSORT guidelines continue to be published and, if adopted by all journals, can help to raise standards. Currently, however, not all journals endorse the CONSORT statement, including some that published the articles included here, which instead have their own guidelines or sometimes rely solely on trial registration. An additional concern is that the article length and space restrictions imposed by journals on authors can, in turn and perhaps inadvertently, lead to suboptimal quality of reporting due to incompleteness. Furthermore, and in light of the similarity of some of our findings to those commented on by Heres et al. (2006), we recommend that an independent clinical trial methodology expert peer reviews submitted trial manuscripts in addition to current review processes. Together with the required registration of all trials on a registry site such as ClinicalTrials.gov, which enables the summary details of the protocol to be made available, we hope that this will continue to improve the quality of reporting for antipsychotic trials.
Ultimately, we believe it is imperative that a trial that is conducted properly should also be reported properly and if not, we should be asking why. That being said, we acknowledge that a trial that is not reported properly is not necessarily invalid in terms of the method and results. Therefore, we conclude that no study is perfect and, perhaps inevitably, no reporting of a study can be perfect but some can be improved.
Declaration of Interest
M.X.P. holds a Clinician Scientist Award supported by the National Institute for Health Research (NIHR), has received consultancy fees, lecturing honoraria and/or research funding from Janssen, Lilly, Endo, Lundbeck, Otsuka and Wyeth, and has previously or is currently working on clinical drug trials or research studies for Janssen, Amgen and Lundbeck. R.M.M. has received honoraria for speaking at meetings supported by Janssen, Astra-Zeneca, Lilly, Roche and BMS. The views expressed in this publication are those of the authors and not necessarily those of the National Health Service (NHS), the NIHR, or the Department of Health.