Variable interpretation of ultrasonograms may contribute to variation in the reported incidence of white matter damage between newborn intensive care units in New Zealand
D L Harris (1,2), F H Bloomfield (2,3), R L Teele (4), J E Harding (2,3), on behalf of the Australian and New Zealand Neonatal Network

  1. Newborn Intensive Care Unit, Health Waikato, Private Bag 3200, Hamilton, New Zealand
  2. Liggins Institute, University of Auckland, Private Bag 92019, Auckland, New Zealand
  3. Newborn Services, National Women’s Health, Private Bag 92189, Auckland
  4. Department of Paediatric Radiology, Starship Children’s Health, Private Bag 92024, Auckland

Correspondence to: Professor Harding, Liggins Institute, Faculty of Medical and Health Sciences, University of Auckland, Private Bag 92019, Auckland, New Zealand; j.harding{at}auckland.ac.nz

Abstract

Background: The incidence of cerebral white matter damage reported to the Australian and New Zealand Neonatal Network (ANZNN) varies between neonatal intensive care units (NICUs).

Hypothesis: Differences in the capture, storage, and interpretation of the cerebral ultrasound scans could account for some of this variation.

Methods: A total of 255 infants of birth weight <1500 g and gestation <32 weeks born between 1997 and 2002 and drawn equally from each of the six NICUs in New Zealand were randomly selected from the ANZNN database. Half had early cerebral ultrasound scans previously reported to ANZNN as normal, and half had scans reported as abnormal. The original scans were copied, anonymised, and independently read by a panel of three experts using a standardised method of reviewing and reporting.

Results: There was considerable variation between NICUs in methods of image capture, quality, and completeness of the scans. There was only moderate agreement between the reviewers’ reports and the original reports to the ANZNN (κ 0.45–0.51) and between the reviewers (κ 0.54–0.64). The reviewers reported three to six times more white matter damage than had been reported to the ANZNN.

Conclusion: Some of the reported variation in white matter damage between NICUs may be due to differences in capture and interpretation of cerebral ultrasound scans.

  • ANZNN, Australian and New Zealand Neonatal Network
  • GM/IVH, germinal matrix/intraventricular haemorrhage
  • NICU, neonatal intensive care unit
  • PVL, periventricular leucomalacia
  • WMD, white matter damage
  • echoencephalography
  • central nervous system ultrasonography
  • premature
  • interobserver variability
  • periventricular leucomalacia
  • white matter

Cerebral white matter damage (WMD), reflected in periventricular leucomalacia (PVL) and ventriculomegaly, is a major cause of mortality and neurological morbidity, particularly spastic diplegia and intellectual defects,1,2 in extremely preterm babies. International neonatal databases have consistently reported wide variation in the incidence of clinical outcomes, including germinal matrix/intraventricular haemorrhage (GM/IVH) and PVL, between different neonatal intensive care units (NICUs).3–6 Although variations in clinical practice have been implicated, there are a number of other possible explanations, including case mix, unit size, and the performance and interpretation of the relevant diagnostic tests.

The Australian and New Zealand Neonatal Network (ANZNN) collates data from all 29 level III NICUs in Australia and New Zealand. A dataset of 70 variables is collected by each NICU, using agreed definitions, on all babies born before 32 weeks gestation or with a birth weight less than 1500 g, and on all babies needing major surgery or requiring assisted ventilation for over four hours. Data from the ANZNN have also reported variation in the incidence of WMD among NICUs, which ranged from 2% to 11% of babies <32 weeks gestation between the six NICUs in New Zealand (unpublished data from 2323 babies 1999–2002).

We tested the hypothesis that the reported variation in the incidence of WMD between NICUs in New Zealand may be attributable to variation in the collection and reporting of the cerebral ultrasound scans. We also assessed interobserver and intraobserver variability in the reporting of WMD. A parallel study investigating the variation in incidence of GM/IVH, based on early scans from the same cohort of babies, has been reported separately.7

METHODS

The study population comprised babies from the six level III NICUs in New Zealand born between 1997 and 2002 at <32 weeks gestation or weighing <1500 g. Forty four babies from each NICU were randomly selected from the ANZNN database using a computerised three digit random number table. Half had early cerebral ultrasound scans reported as normal, one quarter were reported to have grade 1 or 2 GM/IVH on early scans, and the final quarter grade 3 or 4 GM/IVH. The reviewers were unaware of this selection pattern. The NICUs report WMD to the ANZNN based on the scan performed closest to the time when the baby is 6 weeks old. As it was impossible to know which particular scan was originally reported, all scans between weeks 4 and 8 were included in this study.

In different hospitals the cerebral ultrasound scans were stored as digital images, on thermal paper, or on radiographic film. The hard copy scans were photographed using a digital camera, and identifying information was removed from all images.7

The three reviewers were a specialist paediatric radiologist, a neonatologist, and a neonatal nurse practitioner, all of whom performed and reported cerebral ultrasound scans as part of their clinical practice. The reviewers were not aware of the original reports, and the only clinical information available to them was the gestational age of the baby. They reviewed all the photographed scans in a dimmed room with a standard computer screen and reported only the worst scan for each baby using a data collection sheet.7

To assess intraobserver variability, all reviewers were asked to re-review scans from a randomly selected subset of 22 babies two months later.

Parenchymal change

PVL was defined as cystic transformation of the periventricular white matter. Germinal matrix cysts and paraventricular cysts within the anterior horn or lateral ventricles were excluded, as these cysts are variations of normal.1 Porencephalic cysts were defined as large cysts continuous with the ventricular lumen or ectatic ventricle.1,8,9

Cerebral cystic changes are reported to the ANZNN as porencephalic cysts or PVL. Porencephalic cysts are defined by the ANZNN as lesions corresponding to a grade 4 GM/IVH. PVL is defined as ischaemic brain injury, affecting the periventricular white matter in the boundary zones supplied by the terminal branches of the centripetal and centrifugal arteries.10 If the reviewers reported a single cyst in the parenchyma, this was considered comparable to a porencephalic cyst reported to the ANZNN. If the reviewers reported multiple cystic changes, this was considered comparable to PVL reported to the ANZNN.
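
The comparison rule described above can be written as a small lookup. This is only an illustrative sketch in Python, not code used in the study; the function name and the "none" label for scans without parenchymal cysts are assumptions introduced here.

```python
def comparable_anznn_cyst_category(parenchymal_cyst_count):
    """Map a reviewer's cyst count to the ANZNN category treated as comparable.

    Hypothetical helper: a single parenchymal cyst is compared with an ANZNN
    porencephalic cyst, and multiple cystic changes with ANZNN PVL.
    """
    if parenchymal_cyst_count < 0:
        raise ValueError("cyst count cannot be negative")
    if parenchymal_cyst_count == 0:
        return "none"  # assumed label; the text does not name this case
    if parenchymal_cyst_count == 1:
        return "porencephalic cyst"
    return "PVL"
```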

Ventricle size

The reviewers assessed dilatation of the ventricles subjectively as normal, mild, moderate, or severe by comparison with four simple drawings showing the shape of the lateral ventricle with increasing distension in both coronal and sagittal views.2 For reporting to the ANZNN, dilatation of the ventricles is assessed using the ventricular index, measured as the furthest lateral extent of each ventricle from the midline at the level of the foramen of Monro. Scans are reported as normal (ventricular index ⩽97th centile), dilatation (97th centile < ventricular index <97th centile +4 mm), or hydrocephalus (ventricular index >97th centile +4 mm, or hydrocephalus present requiring a shunt or any form of drainage).10 However, the original measurements were not available for most scans, and it was not possible for the reviewers to measure each copied scan because of variation in magnification. Hence, if the reviewers reported mild or moderate dilatation, we considered this to be comparable to dilatation reported to the ANZNN. If the reviewers reported severe dilatation, we considered this to be comparable to hydrocephalus.
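
As an illustration only, the ANZNN ventricular index thresholds and the mapping from the reviewers' subjective grades can be sketched in Python as follows. The function and variable names are hypothetical, the 97th centile value must be supplied by the caller for the baby's gestation, and the handling of a ventricular index exactly equal to the 97th centile + 4 mm (which the definition leaves open) is an assumption.

```python
def anznn_ventricle_category(ventricular_index_mm, centile_97_mm,
                             shunt_or_drainage=False):
    """Classify ventricle size using the ANZNN thresholds quoted in the text."""
    if shunt_or_drainage or ventricular_index_mm > centile_97_mm + 4:
        return "hydrocephalus"
    if ventricular_index_mm > centile_97_mm:
        # Assumption: exactly 97th centile + 4 mm is treated as dilatation.
        return "dilatation"
    return "normal"


# Mapping used in this study to compare subjective grades with ANZNN reports.
REVIEWER_GRADE_TO_ANZNN = {
    "normal": "normal",
    "mild": "dilatation",
    "moderate": "dilatation",
    "severe": "hydrocephalus",
}

# Hypothetical values in millimetres, for illustration only.
print(anznn_ventricle_category(13.0, 12.0))
print(REVIEWER_GRADE_TO_ANZNN["severe"])
```

The example makes the mismatch discussed later concrete: a ventricle graded "mild" by a reviewer maps to ANZNN dilatation even if its measured index would not have exceeded the 97th centile.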

A complete scan was defined as a scan that included the standard six coronal views and five sagittal views.11 The quality of the scan was defined as good if it allowed interpretation; otherwise the scan was defined as poor. Scans that were impossible for the reviewers to read were reported as such. Scans were documented as missing if the baby had died before the late scan could be performed, or the late scan was not performed, or it was not available for copying. Scans were reported to the ANZNN as unknown if the baby had been transferred back to the referral centre and the late cerebral ultrasound result had not been forwarded to the original NICU.

All data were analysed using Statistical Discovery Software JMP 5 (SAS Institute Inc, Cary, North Carolina, USA). Gestation and birth weight were compared between NICUs using factorial analysis of variance. The incidences of different abnormalities were compared between reviewers and NICUs using the χ² test. Interobserver and intraobserver variability were assessed using the kappa statistic (κ). κ<0.2 was considered to represent poor agreement, 0.21–0.4 fair, 0.41–0.6 moderate, 0.61–0.8 good, and >0.8 excellent agreement.12,13 Data are presented as number (%) or median (range).
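
For readers unfamiliar with the kappa statistic, the following Python sketch shows how an unweighted Cohen's κ for two raters can be computed and mapped to the agreement bands above. It is not the JMP procedure used in the study. The function names and example ratings are hypothetical, a κ of exactly 0.2 is assumed to fall in the "poor" band, and returning 1.0 when both raters use one identical category throughout is a convention chosen here.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two raters scoring the same scans."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Proportion of scans on which the two raters gave the same category.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    # Agreement expected by chance from each rater's category frequencies.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # convention assumed here: kappa is otherwise undefined
    return (observed - expected) / (1 - expected)


def agreement_band(kappa):
    """Map kappa to the descriptive bands used in this paper."""
    if kappa <= 0.2:
        return "poor"
    if kappa <= 0.4:
        return "fair"
    if kappa <= 0.6:
        return "moderate"
    if kappa <= 0.8:
        return "good"
    return "excellent"


# Hypothetical ratings for four scans, for illustration only.
reviewer = ["normal", "normal", "PVL", "PVL"]
original = ["normal", "PVL", "PVL", "PVL"]
kappa = cohen_kappa(reviewer, original)
print(round(kappa, 2), agreement_band(kappa))  # 0.5 moderate
```

In this example the raters agree on three of four scans (75%), but half of that agreement would be expected by chance, which is why κ is only 0.5.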

The chairperson of the Auckland ethics committee confirmed in writing that there was no need for ethical approval for this study as it constitutes a clinical audit. All identifying data were removed from the copied films, and there was no contact with infants or their families.

RESULTS

A total of 669 cerebral ultrasound scans from 158 babies were copied and reviewed by the three reviewers. The median gestational age of the babies was 28 (range 22–31) weeks, and birth weight 1025 (470–2140) g.

Missing scans

Scans were missing for 97 babies (38%). This includes 32 babies reported to the ANZNN as having never been scanned because the baby had died or had moved back to a level II NICU. The remainder were reported as having been scanned, but the scans were missing from the records. Some cerebral ultrasound scans (20–48%) were missing from each of the NICUs, including some not performed because babies had died (3% for NICU D to 33% for NICU A) and some that were missing (1% for NICU C to 48% for NICUs D and F).

Quality and completeness

The reviewers considered 33% of the available scans to be incomplete and 27% to be of poor quality (table 1). The rate of both incomplete and poor quality scans differed between NICUs (p<0.001). NICUs E and F had 73% of the incomplete and 84% of the poor scans. Overall, 5% of the scans were impossible to read for single cysts and PVL, and 3% for dilatation and hydrocephalus. The number of scans reported as impossible to read varied between NICUs (p<0.001), with NICU E having the greatest proportion.

Table 1

 Variation between neonatal intensive care units A–F in the reported number, quality, and completeness of the cerebral ultrasound scans

Comparison with the ANZNN and between NICUs

The reviewers reported 180 scans as abnormal. They reported 87 scans (20%) as showing cystic changes, whereas only 21 scans (9%) were reported to the ANZNN as showing cystic changes (χ² = 8.02, p = 0.15) (table 2).

Table 2

 Variation between the three reviewers and the Australian and New Zealand Neonatal Network in the reported rates of cystic changes and ventricular dilatation for the cerebral ultrasound scans in each neonatal intensive care unit (A–F)

There was no consistent pattern to the variation in the reporting of single cysts between the reviewers and the ANZNN. However, the reviewers reported five times more PVL than was reported to the ANZNN (p<0.05). If the reviewers are considered the gold standard, then PVL was under-reported to the ANZNN by every NICU in 6–20% of scans (table 2).

There was little consistency for reporting of cystic changes between each of the reviewers and each of the NICUs. The κ statistics were 0.17–0.22 for NICU A (poor agreement), indicating that all three reviewers rarely agreed with the original reports to the ANZNN. For NICU F, κ statistics ranged from 0.0 (no agreement other than that achieved by chance) to 1.0 (perfect agreement), indicating that the reviewers did not agree with each other or with reports to the ANZNN.

The reviewers reported that 104 scans (23%) showed ventricular dilatation, whereas only 27 scans (11%) were reported to the ANZNN as showing ventricular dilatation (χ² = 16.43, p = 0.08) (table 2). This difference was almost entirely due to mild-moderate dilatation rather than hydrocephalus, with the reviewers reporting twice as much mild-moderate dilatation as was reported to the ANZNN (p<0.05). If the reviewers are regarded as the gold standard, then dilatation of the ventricles was under-reported by every NICU, by 7–20%. However, the reviewers’ reports did not differ from the reports to the ANZNN in the frequency of hydrocephalus.

The level of agreement for the reporting of ventricular dilatation for each of the reviewers and each of the NICUs varied (overall κ 0.45–0.51). All three reviewers reported low κ values for NICU E (0.0, no agreement with reports to the ANZNN other than that which had occurred by chance) and NICU C (0.05–0.31, fair agreement).

Interobserver variability

The agreement between the three individual reviewers and the ANZNN for cystic changes was moderate (κ 0.45–0.48). Reviewer Y reported the same proportion of single cysts as was reported to the ANZNN, whereas reviewers X and Z reported one third and one quarter respectively (table 3). All reviewers reported 3–6 times more PVL than was reported to the ANZNN. Agreement between the three individual reviewers and the ANZNN for dilatation of the ventricles was also moderate (κ 0.45–0.51). All reviewers reported four times more mild-moderate dilatation than was reported to the ANZNN (p<0.05), but a similar frequency of hydrocephalus.

Table 3

 Variation between the three reviewers (X–Z) and the Australian and New Zealand Neonatal Network in the reported rates of cystic changes and ventricular dilatation

Intraobserver variability

On re-reading the subset of scans, the reviewers did not significantly change their reporting for the number of scans that were incomplete, poor quality, or impossible to read (table 4). They also did not significantly change their reporting of cystic changes and ventricular dilatation. The κ statistics ranged from 0.54 to 0.64 for cystic changes (moderate agreement), and from 0.76 to 1.0 for ventricular dilatation (excellent agreement). All of the 95% confidence intervals for the κ statistics overlapped, showing no significant differences in agreement between any sources of reports.

Table 4

 Variation in the reports from the subset of scans selected for re-reading (indicated by R column for each reviewer X–Z)

DISCUSSION

We sought to determine whether the reported variation in the incidence of WMD between the six NICUs in New Zealand was influenced by variation in the capture and interpretation of the cerebral ultrasound scans. We found only moderate agreement between the reviewers’ reports and the original reports to the ANZNN, and the reviewers consistently reported more PVL and ventricular dilatation than was reported to the ANZNN. There was only moderate interobserver agreement between reviewers for both cystic changes and ventricular dilatation. Intraobserver agreement was also only moderate for cystic changes, although excellent for ventricular dilatation. Thus it seems likely that variation in the interpretation of the cerebral ultrasound scans could account for some of the variation in the reported incidence of WMD in preterm babies.

Periventricular leucomalacia and single cysts

The agreement between the reviewers regarding cystic changes was only moderate, and in every NICU the reviewers reported two to four times more PVL than was reported to the ANZNN. The reasons for this variation may include probe size, image quality, number of images taken, and the skill of the clinician performing the scan. It has been recommended that a 7.5 MHz transducer is essential for the detection of small cystic lesions.14 We aimed to collect data on the ultrasound machine and probe sizes used for each scan, but in many cases this information was not printed on the original scans, or was lost when the identifying data were deleted. Thus we do not know to what degree these factors influenced the quality of the scans reviewed.

NICUs E and F, which stored images on paper, had the greatest proportion of incomplete and poor quality scans. As the paper fades over time, the images become more difficult to assess for cystic change. However, these NICUs did not differ from the others in levels of agreement with the reviewers, suggesting that this could not explain our overall findings.

The timing of the late scan may also have contributed to the variation in reporting. The ANZNN collects data based on the scan performed closest to the time the baby is 6 weeks old. Because we were unable to identify which particular scan had been reported, all scans between weeks 4 and 8 were copied. It is possible that some of the increased incidence in PVL reported by the reviewers may have been because the PVL was not evident on the 6 week scan reported to the ANZNN, but was captured on a later scan included in this study.

Dilatation of the ventricles and hydrocephalus

There was a high level of agreement between the reviewers’ reports and reports to the ANZNN for hydrocephalus. However, for ventricular dilatation, agreement was only moderate, and reviewers reported four times more dilatation of the ventricles than was reported to the ANZNN. These differences may have been influenced by the data collection sheet. ANZNN reports the size of the ventricles as no dilatation, dilatation, or hydrocephalus. However, the reviewers reported the size of the ventricle using drawings illustrating normal, mild, moderate, or severe dilatation.15 By providing the reviewers with a greater choice, we may have increased the likelihood of variation, or perhaps allowed the reviewers to report more accurately than the ANZNN currently allows. This is particularly likely for mild dilatation, which would be reported using the data collection sheet, but may not have reached the ANZNN criteria of a ventricular index >97th centile.

Intraobserver agreement

Intraobserver agreement was only slightly better than interobserver agreement for cystic changes, but considerably better for dilatation and hydrocephalus. Thus it appears that the reviewers were able to report more consistently for ventriculomegaly, perhaps because dilatation is easier to see on the ultrasound scan image, whereas cystic changes are difficult to identify consistently. To our knowledge, there have been no previous studies reporting intraobserver and interobserver variability for interpretation of late cerebral ultrasound scans in preterm babies. Our findings suggest that this warrants further attention.

Under-reporting to the ANZNN

Our finding that both PVL and dilatation of the ventricles are under-reported to the ANZNN is concerning, as WMD has been linked with a poor neurological outcome.1,8,15–19 The ANZNN has consistently reported fewer ultrasound abnormalities than other international neonatal databases.4,20 Our findings suggest that the ANZNN data may be an underestimate. However, until similar audits are undertaken in other databases, the validity of any such comparisons must be questioned. A clinical audit raised concerns about the accuracy of interpretation of cerebral ultrasound scans in England.21 Our findings suggest that interpretation is also very variable in clinical practice in New Zealand.

Furthermore, the high interobserver and intraobserver variability in reporting raises concerns about the usefulness of the late ultrasound scan in assisting with prediction of neurological outcome. Although a number of studies have assessed its utility in preterm babies,13,21–23 none have measured interobserver or intraobserver agreement.

For practical reasons, ultrasound scanning is the mainstay of neurological imaging in the preterm baby, although magnetic resonance imaging is superior for detection of subtle white matter changes.22–25 A recent study reported excellent agreement between two neuroradiologists reviewing magnetic resonance imaging scans of 48 term neonates (κ  =  0.88).26 However, magnetic resonance imaging scanning is expensive, time consuming, and may require transport and sedation. Thus the limitations of ultrasound scanning are likely to remain important considerations in clinical practice for the foreseeable future.

Problems with the scans

The quality and completeness of the cerebral ultrasound scans varied significantly between NICUs in New Zealand, with one third of scans reported to be of poor quality and a similar proportion incomplete. This may have contributed to the high levels of disagreement, as the reliability of early cerebral ultrasound interpretation is reported to be higher when the scan is of high quality.27,28 However, it is possible that the report to the ANZNN was generated by a clinician present when the scan was performed, using real time images that were not recorded. Thus the clinician may have had information from better quality and more complete images than those available to the reviewers.

All reports defined as impossible to read were excluded from our analysis. This may have improved the agreement between the reviewers and the ANZNN by excluding those scans about whose interpretation the reviewers were most uncertain. Therefore we may have found an even greater level of disagreement if these had been included.

Overall, more than one third of the scans could not be copied for various reasons. It is difficult to know how this may have influenced our findings. However, as the comparisons both between reviewers and with the ANZNN were made only on the scans that were available to be copied, it seems unlikely that either the wide variation in reporting or the difference in the reported incidence of PVL and dilatation that we found could be accounted for purely by selection bias in the scans reviewed.

CONCLUSION

We sought to determine whether the reported variation in the incidence of WMD between the six NICUs in New Zealand could be related to differences in capture and interpretation of the cerebral ultrasound scans. We found that there was only moderate agreement between the reviewers’ reports and the original reports to the ANZNN and between each of the reviewers. Intraobserver agreement was also only moderate, although slightly better for ventriculomegaly than for cystic changes. Furthermore, both PVL and ventriculomegaly may be under-reported to the ANZNN. These findings support the hypothesis that some of the reported variation in WMD between NICUs in New Zealand may be due to differences in the capture and interpretation of the cerebral ultrasound scans. Future studies comparing the incidence of WMD between different populations should carefully consider how consistency of reporting could be improved.

What is already known on this topic

  • The reported incidence of cerebral white matter damage in preterm babies varies between neonatal intensive care units

  • Possible explanations include variations in case mix, clinical practice, or the techniques and interpretation of the diagnostic cerebral ultrasonograms

What this study adds

  • Interobserver and intraobserver agreement was only moderate for diagnosis of white matter damage on cerebral ultrasonograms, with considerable variation between units

  • Some of the reported variation in white matter damage between neonatal intensive care units may be due to differences in techniques and interpretation of the ultrasonograms

Acknowledgments

The ANZNN Advisory Committee and Executive members were as follows: Australia: Centre for Perinatal Health Services Research (New South Wales): David Henderson-Smart and Deborah Donoghue; Flinders Medical Centre (South Australia): Peter Marshall; John Hunter Hospital (New South Wales): Chris Wake; King Edward Memorial and Princess Margaret Hospitals (Western Australia): Noel French, Ron Hagan, and Karen Simmer; Launceston General Hospital (Tasmania): Chris Bailey; Liverpool Health Service (New South Wales): Robert Guaran; Mater Mother’s Hospital (Queensland): David Tudehope; Mercy Hospital for Women (Victoria): Andrew Watkins; Monash Medical Centre (Victoria): Kaye Bawden, Andrew Ramsden, and Victor Yu; National Perinatal Statistics Unit (New South Wales): Paul Lancaster; Nepean Hospital (New South Wales): Lyn Downe; Newborn Emergency Transport Service (Victoria): Michael Stewart; New South Wales Newborn and Pediatric Emergency Transport Service: Andrew Berry; Perinatal Research Centre (Queensland): Paul Colditz; Royal Children’s Hospital (Victoria): Linda Johnstone and Peter McDougall; Royal Darwin Hospital (Northern Territory): Charles Kilburn; Royal Hobart Hospital (Tasmania): Peter Dargaville; Royal Hospital for Women (New South Wales): Kei Lui; Royal North Shore Hospital (New South Wales): Jennifer Bowen; Royal Prince Alfred Hospital (New South Wales): Nick Evans; Royal Women’s Hospital (Queensland): David Cartwright; Royal Women’s Hospital (Victoria): Lex Doyle, Colin Morley, and Neil Roy; Sydney Children’s Hospital (New South Wales): Barry Duffy; Canberra Hospital (Australian Capital Territory): Graham Reynolds; Children’s Hospital at Westmead (New South Wales): Robert Halliday; Townsville Hospital (Queensland): John Whitehall; Western Australia Neonatal Transport Service: Jenni Sokol; Westmead Hospital (New South Wales): William Tarnow-Mordi; Women’s and Children’s Hospital (South Australia): Ross Haslam; New Zealand: Christchurch Women’s Hospital: Nicola Austin; Christchurch School of Medicine: Brian Darlow; Dunedin Hospital: Roland Broadbent; Gisborne Hospital: Graeme Lear; Hastings Hospital: Jenny Corban; Hutt Hospital: Robyn Shaw; Middlemore Hospital: Lindsay Mildenhall; National Women’s Hospital: Carl Kushell; Nelson Hospital: Peter McIlroy; Palmerston North Hospital: Jeff Brown; Rotorua Hospital: Stephen Bradley; Southland Hospital: Paul Tomlinson; Taranaki Hospital: John Doran; Tauranga Hospital: Hugh Lees; Timaru Hospital: Philip Morrison; University of Auckland: Jane Harding; Waikato Hospital: David Bourchier; Wairau Hospital: Ken Dawson; Wanganui Hospital: John Goldsmith; Wellington Women’s Hospital: Vaughan Richardson; Whakatane Hospital: Chris Moyes; Whangarei Hospital: Peter Jankowitz.

Footnotes

  • Published online first 13 September 2005

  • This work was supported by the Maurice and Phyllis Paykel Trust, the Rebecca Roberts scholarship, and the Waikato Sick Babies Trust.

  • Competing interests: none declared
