Immunological parameters in girls with Turner syndrome

Stenberg, Annika E; Sylvén, Lisskulla; Magnusson, Carl GM; Hultcrantz, Malou

doi:10.1186/1477-5751-3-6

The interpretation of negative results demands more caution than exhibited here

Andy Lynch, Centre for Applied Medical Statistics, Department of Public Health and Primary Care, University of Cambridge, UK

9 June 2005

The concept of a journal for publishing ‘negative’ results arising from rigorous studies is sound, and as a tool to counter publication bias it has clear merit (although whether it is uniformly of benefit in this matter is open to debate). However, there are a number of possible explanations for a negative result, and care must be taken not to over-interpret such a finding. The oft-used motto ‘absence of evidence is not evidence of absence’ [1] has perhaps never been more at home than it would be if adopted by the JNRBM.

This article by Stenberg et al. [2] causes me concern with regard to two issues. One is how the authors interpret their results, and the other is the level of information provided to allow others to interpret the results.

The authors state that they “do not find any major immunological deficiency … that could explain the increased incidence of otitis media”, which is true, but is not the same as saying that there is no major immunological deficiency. They also state that “Normal levels of most lymphocyte- and immunoglobulin subpopulations were registered”. One can see what the authors perhaps mean: for each individual the readings tend to be within the reference 95% confidence intervals. However, taken as a group, it would be hard to argue that there was no evidence of a difference in the mean levels for girls with TS versus controls. In particular I am thinking of IgG in figure 2 of their paper.

To conclude that “Therefore, treatment with immunotherapy is not an option…” seems rash based upon these results. A comparison of five otitis-prone girls with six otitis-free girls using the Mann-Whitney U test is not adequately powered for any but the most extreme magnitudes of difference. Even if the rankings split as drastically as (1,2,3,4,9) and (5,6,7,8,10,11), the authors would not find the results to be ‘significant’ at the conventional two-sided 5% level. Just one rogue result, perhaps one child whose otitis has a different provenance, would prevent any ‘positive’ finding.
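The claim about the rank split can be checked by brute force. The following is a minimal sketch, assuming no tied observations and a two-sided test; the function name is illustrative and not taken from the paper under discussion. It enumerates every way of assigning 5 of the 11 ranks to the otitis-prone group and counts the assignments at least as extreme as the one observed.

```python
from fractions import Fraction
from itertools import combinations


def exact_mann_whitney_p(ranks_a, n_total):
    """Two-sided exact p-value for the Mann-Whitney U test (no ties).

    ranks_a: the ranks (1..n_total) held by one of the two groups.
    """
    n_a = len(ranks_a)
    n_b = n_total - n_a
    min_rank_sum = n_a * (n_a + 1) // 2

    def u_stat(ranks):
        # U counts the (a, b) pairs in which the b observation ranks below a.
        return sum(ranks) - min_rank_sum

    u_obs = u_stat(ranks_a)
    # Fold onto the smaller tail so that the test is two-sided.
    u_obs = min(u_obs, n_a * n_b - u_obs)

    total = 0
    extreme = 0
    for subset in combinations(range(1, n_total + 1), n_a):
        total += 1
        u = u_stat(subset)
        if min(u, n_a * n_b - u) <= u_obs:
            extreme += 1
    return u_obs, Fraction(extreme, total)


# The split quoted in the text: one 'rogue' rank (9) in the first group.
u, p = exact_mann_whitney_p((1, 2, 3, 4, 9), 11)
print(u, p, float(p))
```

For the split quoted above this gives U = 4 and p = 4/77, a little over 0.05. Moving the rogue rank from 9 to 8 gives U = 3 and p = 1/33, which shows how close to complete separation the two groups must be before this test can reject.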

The strength of the authors’ interpretation would be less concerning if enough data were presented to allow us to make up our own minds. The very first paper published in this journal [3] ran with the title “Prominent medical journals often provide insufficient information to assess the validity of studies with negative results”, and stressed the importance of reporting confidence intervals for results and power calculations.

In fact, power calculations are often unhelpful for interpreting negative results, being based on values that prove to be inaccurate and tending to be superseded by the information provided by confidence intervals. However, it would be good practice to report sample-size calculations routinely, for a variety of reasons. Indeed this is item 7 of the Revised CONSORT statement (www.consort-statement.org) and item 10 of the current version (version 2) of the STROBE Statement (www.strobe-statement.org), guidelines for the reporting of randomized trials and observational studies respectively.

Confidence intervals are certainly of great value for the interpretation of ‘negative’ results [4]. Unfortunately the paper in discussion is one where calculating confidence intervals is not straightforward for the main results, since non-parametric methods are (possibly quite rightly) employed. The authors interpret the Mann-Whitney U test as a test for a difference in medians, and an approximate confidence interval for the difference in medians could be produced; with so few observations, however, this might not be an illuminating exercise.
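One way such an interval might be produced is sketched below. It is an illustration under stated assumptions rather than a recommended analysis: it assumes no ties and a pure location shift between the groups, so that the interval for the shift (taken from the ordered pairwise differences) can be read as an interval for the difference in medians. The function name and the readings are hypothetical and are not data from the paper.

```python
from itertools import combinations, product


def shift_confidence_interval(a, b, alpha=0.05):
    """Distribution-free CI for the location shift (a minus b), no ties assumed."""
    n_a, n_b = len(a), len(b)
    n_pairs = n_a * n_b

    # Exact null distribution of U: count the rank subsets giving each value.
    counts = [0] * (n_pairs + 1)
    offset = n_a * (n_a + 1) // 2
    for subset in combinations(range(1, n_a + n_b + 1), n_a):
        counts[sum(subset) - offset] += 1
    total = sum(counts)

    # Largest k with P(U <= k) <= alpha / 2; k = -1 means no such k exists.
    k, cumulative = -1, 0
    for u_value, count in enumerate(counts):
        if (cumulative + count) / total > alpha / 2:
            break
        cumulative += count
        k = u_value
    if k < 0:
        raise ValueError("samples too small for this confidence level")

    diffs = sorted(x - y for x, y in product(a, b))
    coverage = 1 - 2 * cumulative / total
    # From the (k+1)-th smallest to the (k+1)-th largest pairwise difference.
    return diffs[k], diffs[n_pairs - 1 - k], coverage


# Hypothetical readings in arbitrary units, for illustration only.
otitis_prone = [9.1, 10.4, 11.0, 12.3, 13.8]
otitis_free = [8.0, 8.9, 9.7, 10.1, 11.5, 12.9]
lower, upper, coverage = shift_confidence_interval(otitis_prone, otitis_free)
print(lower, upper, round(coverage, 4))
```

With groups of five and six the interval runs from the 4th smallest to the 4th largest of the 30 pairwise differences, so it will usually span most of the observed range. This is in keeping with the doubt expressed above about how illuminating the exercise would be.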

Thus we are left with no way of adequately interpreting these ‘negative’ results. So what could have been done?

A) Examination of this article along with those referenced within it ([5-8]) suggests that it would not be unreasonable to model the difference in means on the log scale (equivalently the log-ratio of means) as being normally distributed. As a pragmatic second step, the confidence interval based on this model could have been presented to aid interpretation of the results of the Mann-Whitney U test.

B) For the comparison of girls with TS to controls, it is possible to estimate confidence intervals from the previous papers [5-8] (only approximately in some cases, because information has to be extrapolated), and these could potentially be used to perform a ‘mini meta-analysis’. The authors of this article comment on the small size of some previous studies, but the combined numbers in these referenced studies far exceed those in the current one, so showing what the new findings add to previous studies would be of value.

C) Also, given the relatively small size of the data-set and the electronic nature of the medium, it would seem trivial to present the raw data that have led to these conclusions.
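Suggestions A and B could be sketched roughly as follows. This is a minimal illustration under assumptions that are mine rather than the authors': readings are taken to be log-normal with a common variance, so the interval is strictly for the ratio of geometric means; the earlier studies are pooled with a simple fixed-effect inverse-variance model; and every number shown is hypothetical, not extracted from [5-8] or from the paper. The t critical value is passed in by hand (2.262 is the 97.5% point of Student's t with 9 degrees of freedom) to keep the sketch free of third-party libraries.

```python
import math
from statistics import NormalDist, mean, variance


def log_ratio_ci(group_a, group_b, t_crit):
    """Pooled-variance t interval on the log scale, returned as a ratio CI."""
    log_a = [math.log(x) for x in group_a]
    log_b = [math.log(x) for x in group_b]
    n_a, n_b = len(log_a), len(log_b)
    diff = mean(log_a) - mean(log_b)
    pooled_var = (
        (n_a - 1) * variance(log_a) + (n_b - 1) * variance(log_b)
    ) / (n_a + n_b - 2)
    se = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    ci = (math.exp(diff - t_crit * se), math.exp(diff + t_crit * se))
    return diff, se, ci


def pool_fixed_effect(estimates, level=0.95):
    """Inverse-variance pooling of (log_ratio, standard_error) pairs."""
    weights = [1 / se ** 2 for _, se in estimates]
    pooled = sum(w * est for w, (est, _) in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))
    z = NormalDist().inv_cdf(0.5 + level / 2)
    ci = (math.exp(pooled - z * pooled_se), math.exp(pooled + z * pooled_se))
    return pooled, pooled_se, ci


# Hypothetical readings (arbitrary units); NOT data from any cited study.
group_1 = [8.2, 9.5, 10.1, 11.4, 12.0]
group_2 = [9.0, 10.3, 11.1, 11.8, 12.6, 13.9]
T_CRIT_DF9 = 2.262  # 97.5% point of Student's t with 5 + 6 - 2 = 9 df
diff, se, ratio_ci = log_ratio_ci(group_1, group_2, T_CRIT_DF9)
print("ratio of geometric means:", math.exp(diff), "95% CI:", ratio_ci)

# Hypothetical earlier studies, each summarised as (log ratio, standard error).
earlier_studies = [(-0.10, 0.08), (-0.05, 0.12)]
pooled, pooled_se, pooled_ci = pool_fixed_effect(earlier_studies + [(diff, se)])
print("pooled ratio:", math.exp(pooled), "95% CI:", pooled_ci)
```

Presenting the new study's interval beside the pooled interval would show directly how much, or how little, the new data add to what was already known.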

Caution should always be shown when interpreting ‘negative’ results. It is sometimes unavoidable that sample sizes are small, and sometimes tricky to calculate confidence intervals, but in these circumstances the tone of the conclusions should be more moderate still. Additionally, I would suggest that as much effort should go into providing the reader with the tools to interpret a result as goes into obtaining it.

1. Altman DG, Bland JM: Statistics Notes - Absence of Evidence Is Not Evidence of Absence. Br Med J 1995, 311(7003):485-485.

2. Stenberg A, Sylven L, Magnusson C, Hultcrantz M: Immunological parameters in girls with Turner syndrome. Journal of Negative Results in BioMedicine 2004, 3(1):6.

3. Hebert R, Wright S, Dittus R, Elasy T: Prominent medical journals often provide insufficient information to assess the validity of studies with negative results. Journal of Negative Results in BioMedicine 2002, 1(1):1.

4. Altman D, Bland JM: Confidence intervals illuminate absence of evidence. Br Med J 2004, 328(7446):1016-1017.

5. Cacciari E, Masi M, Fantini MP, Licastro F, Cicognani A, Pirazzoli P, Villa MP, Specchia F, Forabosco A, Franceschi C et al: Serum Immunoglobulins and Lymphocyte Sub-Populations Derangement in Turners Syndrome. Journal of Immunogenetics 1981, 8(5):337-344.

6. Jensen K, Petersen PH, Nielsen EL, Dahl G, Nielsen J: Serum Immunoglobulin-M, Immunoglobulin-G, and Immunoglobulin-A Concentration Levels in Turners Syndrome Compared with Normal Women and Men. Hum Genet 1976, 31(3):329-334.

7. Lorini R, Ugazio AG, Cammareri V, Larizza D, Castellazzi AM, Brugo MA, Severi F: Immunoglobulin Levels, T-Cell Markers, Mitogen Responsiveness and Thymic Hormone-Activity in Turners Syndrome. Thymus 1983, 5(2):61-66.

8. Rongenwesterlaken C, Rijkers GT, Scholtens EJ, Vanes A, Wit JM, Vandenbrande JL, Zegers BJM: Immunological Studies in Turner Syndrome before and During Treatment with Growth-Hormone. J Pediatr 1991, 119(2):268-272.

Competing interests

No competing interests
