Does variation in interpretation of ultrasonograms account for the variation in incidence of germinal matrix/intraventricular haemorrhage between newborn intensive care units in New Zealand?
D L Harris (1,2), R L Teele (3), F H Bloomfield (2,4), J E Harding (2,4), on behalf of the Australian and New Zealand Neonatal Network

  1. Newborn Intensive Care Unit, Health Waikato, Private Bag 3200, Hamilton, New Zealand
  2. Liggins Institute, University of Auckland, Private Bag 92019, Auckland, New Zealand
  3. Department of Paediatric Radiology, Starship Children’s Health, Private Bag 92024, Auckland, New Zealand
  4. Newborn Services, National Women’s Hospital, Private Bag 92189, Auckland, New Zealand

Correspondence to: Professor Harding, Liggins Institute, Faculty of Medicine and Health Science, University of Auckland, Private Bag 92019, Auckland, New Zealand; j.harding@auckland.ac.nz

Abstract

Background: The incidence of germinal matrix/intraventricular haemorrhage (GM/IVH) reported to the Australian and New Zealand Neonatal Network (ANZNN) varies between neonatal intensive care units (NICUs).

Hypothesis: Differences in the capture, storage, and interpretation of the cerebral ultrasound scans may account for some of this variation.

Methods: A total of 255 infants with birth weight <1500 g or gestation <32 weeks born between 1997 and 2002 were randomly selected from the ANZNN database, 44 from each of the six NICUs in New Zealand. Twenty two infants from each NICU had cerebral ultrasound scans previously reported to ANZNN as normal; another 22 had scans reported as abnormal. The original scans were copied using digital photography and anonymised and independently read by a panel of three experts using a standardised method of reviewing and reporting.

Results: There was considerable variation between NICUs in methods of image capture and quality and completeness of the scans. However, there was little variation in the reporting of scans between the reviewers and the reports to ANZNN (weighted κ 0.75–0.91). Grade 1 GM/IVH was generally over-reported and grade 4 under-reported to the ANZNN.

Conclusion: For all NICUs, a high level of agreement was found between the reviewers’ reports and the reports to the ANZNN. Thus the variation between NICUs in the incidence of GM/IVH reported to the ANZNN is unlikely to be due to differences in capture, storage, and interpretation of the cerebral ultrasound scans. Further investigation is warranted into the reasons for the variation in incidence of GM/IVH between NICUs.

  • ANZNN, Australian and New Zealand Neonatal Network
  • GM/IVH, germinal matrix/intraventricular haemorrhage
  • NICU, neonatal intensive care unit
  • echoencephalography
  • central nervous system
  • ultrasonography
  • premature
  • interobserver variability

Germinal matrix/intraventricular haemorrhage (GM/IVH) is the most common type of intracranial haemorrhage in the neonatal period, and contributes to long term neurological problems such as cerebral palsy and hydrocephalus.1 The incidence and severity of GM/IVH are closely related to gestational age and birth weight, with GM/IVH occurring in 25–30% of babies born at less than 1500 g or less than 32 weeks gestation.

International neonatal databases have consistently reported wide variation in the incidence of clinical outcomes, including GM/IVH, between different neonatal intensive care units (NICUs).2–5 In 1993 the International Neonatal Network called for comparisons of staffing and organisational policies of NICUs to investigate reasons for these variations.6

The Australian and New Zealand Neonatal Network (ANZNN) collates data from all 29 level 3 NICUs in Australia and New Zealand. Since 1995 each NICU has collected a dataset of 60 variables, using agreed definitions, on all babies born before 32 weeks gestation or with a birth weight less than 1500 g, and all babies needing major surgery or requiring assisted ventilation for over four hours. Grading of GM/IVH follows the scale established by Papile, and the highest diagnosed grade of GM/IVH is reported.7

In the 1995 dataset, the reported rates of severe GM/IVH varied from less than 5% to 20% in the different NICUs.8 This reported variation may be influenced by a number of factors.5 There is often wide variation between NICUs in case mix, infant risk factors, and the number of neonates, all of which may have an impact on both mortality and clinical outcome.9 In addition, measurement bias in assessing the outcome, such as differences in techniques and interpretation of diagnostic tests, may affect the reported variation.

To allow appropriate adjustment for risk factors in different NICUs, a recent study established a predictive model for major GM/IVH based on the ANZNN database.8 However, when the predictive risk model was used to adjust for risk factors within each NICU population, the variation between NICUs in the reported incidence of GM/IVH increased rather than decreased.10

An alternative explanation for the differences in reported incidence of GM/IVH is that the scans are performed, documented, and reported differently in each NICU. Our study was undertaken to test the hypothesis that the reported variation in the incidence of GM/IVH was due to differences in the capture and interpretation of cerebral ultrasound scans. As part of this study, we also assessed both intraobserver and interobserver variability in the reporting of cerebral ultrasound scans.

METHODS

The study population was randomly selected from the New Zealand subset of the ANZNN database for 1997–2002. Only babies from the six level 3 NICUs in New Zealand were included to allow population based data and easier access to all of the original scans. All babies met the high risk criteria for GM/IVH in that they were born at <32 weeks gestation or weighing <1500 g. Forty four babies from each NICU were selected by ANZNN staff using a computerised three digit, random number table. Half of each group had cerebral ultrasound scans that were reported as normal, one quarter were reported as having grade 1 or 2 GM/IVH, and the final quarter grade 3 or 4 GM/IVH. The reviewers were unaware of this selection pattern. When an ultrasound scan was unavailable because, for example, the films were lost or had been culled from hospital files, the baby was replaced from the ANZNN database with another baby from the same NICU with the same reported cerebral ultrasound result whenever possible.
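The selection pattern described above (22 normal, 11 grade 1 or 2, and 11 grade 3 or 4 per NICU) can be illustrated with a minimal sketch. It is not the ANZNN procedure: the study used a three digit random number table, whereas this sketch assumes a seeded pseudo-random generator, and the record layout, the names `QUOTAS`, `stratum`, and `select_babies`, and the coding of a normal scan as grade 0 are all hypothetical.

```python
import random
from collections import defaultdict

# Quotas per NICU, taken from the selection pattern described in the text.
QUOTAS = {"normal": 22, "grade_1_2": 11, "grade_3_4": 11}


def stratum(grade):
    """Map a reported grade (0 = normal, 1-4 = Papile grade) to a stratum name."""
    if grade == 0:
        return "normal"
    return "grade_1_2" if grade in (1, 2) else "grade_3_4"


def select_babies(records, seed=0):
    """Randomly pick up to the quota of babies for each (NICU, stratum) pair.

    records: iterable of (baby_id, nicu, reported_grade) tuples (assumed layout).
    Returns a dict mapping (nicu, stratum) to the list of selected baby ids.
    """
    rng = random.Random(seed)
    pools = defaultdict(list)
    for baby_id, nicu, grade in records:
        pools[(nicu, stratum(grade))].append(baby_id)
    chosen = {}
    for key, pool in sorted(pools.items()):
        quota = QUOTAS[key[1]]
        # If a pool is smaller than its quota, take every baby in it.
        chosen[key] = rng.sample(pool, min(quota, len(pool)))
    return chosen
```

Replacing a baby whose scan is unavailable would, under the same assumptions, amount to drawing one further id from the same (NICU, stratum) pool.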

Depending on the hospital, the cerebral ultrasound scans were stored as digital images, on thermal paper, or on x ray film. The hard copy cerebral ultrasounds were photographed using a digital camera and a standardised protocol and stored digitally as Joint Photographic Experts Group (JPEG) files. All identifying information was deleted from digital images using PHOTOSHOP 5.0 LE (Adobe Systems Inc, San Jose, California, USA) by an assistant who was not involved in any other aspect of the study. The resulting images were stored as Photoshop document files.

The definition used for reporting GM/IVH to the ANZNN is the worst level of GM/IVH seen on either side of the brain by ultrasonography or post mortem examination.11 As it was impossible to know which particular cerebral ultrasound scan was originally considered the worst and reported by the NICU to the ANZNN database, all scans performed between 4 and 14 days of age were copied for each baby (unless the baby had died, in which case all available scans were copied).

The three reviewers were a specialist paediatric radiologist, a neonatologist, and a neonatal nurse practitioner, who are all required to perform and report cerebral ultrasound scans as part of their clinical practice. The reviewers were unaware of the original reports. The only clinical information provided was the gestational age of each baby. The reviewers were asked to evaluate all photographed scans in a dimmed room with a standard computer screen and to report on only the worst scan for each baby. The reviewers were not asked to report a grade of GM/IVH. Rather, each reviewer completed a data collection sheet (fig 1) noting the presence of blood within the germinal matrix or ventricles, ventricular distension, and parenchymal abnormality. Parenchymal abnormality was classified as blood, white matter changes, or cysts. These data were then converted into Papile’s grading method using a computer algorithm (table 1).7
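As an illustration of how observational findings might be converted into a grade, the sketch below applies the usual reading of Papile's scale (grade 1, germinal matrix haemorrhage only; grade 2, intraventricular blood without ventricular distension; grade 3, intraventricular blood with distension; grade 4, parenchymal haemorrhage). It is an assumption-laden sketch and does not reproduce the study's table 1: the function name `papile_grade`, its boolean parameters, and the handling of combinations that fall outside the scale are hypothetical.

```python
def papile_grade(gm_blood, ventricular_blood, ventricular_distension, parenchymal_blood):
    """Return a Papile grade (0 = normal, 1-4), or None if the findings cannot be graded.

    Each argument is a boolean observation from a reviewer's data sheet.
    The rules are an assumed reading of Papile's scale, not the study's algorithm.
    """
    if parenchymal_blood:
        return 4
    if ventricular_blood:
        return 3 if ventricular_distension else 2
    if ventricular_distension:
        # Distension without visible intraventricular blood is not on the scale.
        return None
    if gm_blood:
        return 1
    return 0
```

White matter changes and cysts are recorded on the data collection sheet but are not part of Papile's scale, so this sketch ignores them.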

Table 1

 Algorithm used to convert the reviewers’ reports into Papile’s grading system

Figure 1

 Data collection sheet used for reporting of cerebral ultrasound scans for this study.

A complete scan was defined as a scan that included the standard six coronal views and five sagittal views.12,13 If any views were missing, the scan was defined as incomplete. The quality of the scan was defined as good if it allowed interpretation; otherwise the scan was defined as poor. Scans that the computer algorithm could not assign to a Papile grade were tabulated as not graded. Scans that were impossible for the reviewers to read were also tabulated.

To assess intraobserver variability, all three reviewers reviewed again a randomly selected subset of cerebral ultrasound scans from 22 babies two months after the initial review.

The chairperson of the Auckland ethics committee confirmed in writing that there was no need for ethical approval for this study as it constitutes a clinical audit. All identifying data were removed from the copied films, and there was no contact with any of the infants or their families.

All data were analysed using JMP 5 statistical discovery software (SAS Institute Inc, Cary, North Carolina, USA). Gestation and birth weight were compared between NICUs using factorial analysis of variance. The incidences of different abnormalities were compared between reviewers and NICUs using the χ² test. Interobserver and intraobserver variability were assessed using the weighted kappa statistic, κ (w). All data are presented as number (%) or median (range).
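For readers unfamiliar with the weighted kappa statistic, a minimal sketch of its calculation for two sets of ordinal grades follows. The text does not state which weighting scheme was used, so the choice between linear and quadratic weights here is an assumption, as are the function name `weighted_kappa` and the coding of grades as integers from 0 (normal) to 4.

```python
def weighted_kappa(rater_a, rater_b, n_categories, weights="linear"):
    """Weighted kappa for two equal-length lists of integer grades 0..n_categories-1.

    weights: "linear" or "quadratic" disagreement weights (assumed options).
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("rater lists must be non-empty and of equal length")
    n = len(rater_a)
    k = n_categories
    # Observed proportion of scans in each (grade by A, grade by B) cell.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1.0 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    power = 1 if weights == "linear" else 2
    disagree_obs = 0.0
    disagree_exp = 0.0
    for i in range(k):
        for j in range(k):
            # Larger grade differences are penalised more heavily.
            w = (abs(i - j) / (k - 1)) ** power
            disagree_obs += w * observed[i][j]
            disagree_exp += w * row[i] * col[j]  # expected under chance
    if disagree_exp == 0:
        raise ValueError("kappa is undefined when both raters use a single grade")
    return 1.0 - disagree_obs / disagree_exp
```

With grades coded this way, `weighted_kappa(reviewer_grades, anznn_grades, 5)` would give one entry of a table of agreement such as table 8, under the stated assumptions.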

RESULTS

A total of 2225 cerebral ultrasound scans from 255 babies were copied, then reviewed by the panel of three experts. The median gestational age of the babies was 28 weeks (range 22–31), and the median birth weight was 1025 g (470–2140). In this sample the incidence of normal scans reported to the ANZNN from different NICUs ranged from 45% to 55%, and of grade 4 GM/IVH from 8% to 23%.

Ultrasound equipment, probe sizes, and scanning techniques varied among NICUs (table 2). Sixty nine percent of the scans were taken via the anterior fontanel, 28% via both the anterior and posterior fontanels, and 3% were axial views.

Table 2

 Ultrasound equipment and technique used in each neonatal intensive care unit (A–F)

Quality and completeness

Thirty seven percent of scans were considered incomplete, and 33% were of poor quality. There was significant variation between the NICUs in the quality and completeness of the cerebral ultrasound scans (p<0.001). NICUs E and F had 90% of the incomplete and 70% of the poor quality scans, and it was only scans from these two NICUs that the reviewers found impossible to read (table 3).

Table 3

 Variation between neonatal intensive care units (NICUs) A–F in the reported quality and completeness of scans

Variation between NICUs

There was some variation in the reporting of GM/IVH between the reviewers and the ANZNN for each NICU. If the reviewers are considered the gold standard, then grade 1 GM/IVH was over-reported to the ANZNN by 5–12% by every unit apart from NICU F. Grade 2 GM/IVH had less variation between the reviewers and the ANZNN reports. Grade 3 GM/IVH showed the greatest variation in reporting, being both under-reported by 11% and over-reported by 10%. Grade 4 was under-reported by every NICU by 1–11% (table 4).

Table 4

 Comparison between the three reviewers and the ANZNN in the reported grades of germinal matrix/intraventricular haemorrhage for each neonatal intensive care unit (A–F)

Intraobserver variability

When the original reports for each reviewer were compared with the reports from the re-read subset, we found that the reports changed little, with the κ (w) statistics ranging from 0.78 to 0.96 (table 5).

Table 5

 Variation in the reported grades of germinal matrix/intraventricular haemorrhage in the re-read subset of scans (indicated by the R column for each reviewer X–Z)

Interobserver variability

The reviewers did not differ in their overall reporting of numbers of scans that were incomplete, poor quality, or impossible to read (table 6). However, they did not always agree on which specific scans were in each of these categories. For this reason the reports from each reviewer were compared individually with those reported to ANZNN. There were no consistent differences between reviewers in the reporting of grades of GM/IVH (table 7). κ (w) statistics ranged from 0.75 to 0.91, and all 95% confidence intervals overlapped, showing no significant differences in the level of agreement between any sources of reports (table 8).

Table 6

 Variation between the three reviewers (X–Z) in the reported quality and completeness of the cerebral ultrasound scans

Table 7

 Variation between the three reviewers (X–Z) and the Australian and New Zealand Neonatal Network (ANZNN) in the reported grades of germinal matrix/intraventricular haemorrhage

Table 8

 Weighted kappa (κ (w)) statistics showing the level of agreement between the three reviewers (X–Z) and the Australian and New Zealand Neonatal Network (ANZNN) in the reported grades of germinal matrix/intraventricular haemorrhage

DISCUSSION

We sought to discover whether some of the variation in the incidence of GM/IVH reported from the six NICUs in New Zealand could be accounted for by differences in the cerebral ultrasound scanning method and interpretation. We found that there was significant variation in the quality and completeness of cerebral ultrasound scans from each of the NICUs in New Zealand. Despite this, there was a very high level of agreement between the reviewers’ reports and the NICU reports to the ANZNN, and very low levels of intraobserver and interobserver variability. Thus variations in cerebral ultrasound scanning method and interpretation were unlikely to explain the reported variation in the incidence of GM/IVH between New Zealand NICUs.

Poor quality and incomplete cerebral ultrasound scans

Overall the reviewers reported one third of the scans to be of poor quality. This is concerning, as these scans had been previously reported on in clinical practice, and the reliability of cerebral ultrasound interpretation is reported to be higher when the scan is of high quality.14,15 However, we cannot tell from this study if these images were of poor quality at the time the scan was performed or if the poor quality is related to the method of storage.

The methods of storage included digital images, x ray film, and thermal paper. We found that digital images were much clearer than other forms of storage. NICU D, which used a digital system, had no scans reported as impossible to read, and only 4% reported as of poor quality. The images on the thermal paper were never as clear as either digital or x ray film, and faded over time. The earliest scans we used were five years old before being copied for the study. NICUs E and F, which stored scans on thermal paper, or a combination of thermal paper and x ray film, had the highest number of poor quality scans (70%).

These NICUs also accounted for 90% of the incomplete scans. One of these NICUs had no radiology support for cerebral ultrasound scanning, which may have contributed to this finding. If the clinician performing the cerebral ultrasound scan has had limited formal training, they may only take the images of the baby’s brain that they know how to interpret. Furthermore, there is no opportunity for collaboration between the neonatal and radiology experts in this NICU. Other recent studies have reported an increase in both the level of correct interpretation and the level of agreement between reviewers when radiology and neonatal staff have collaborated.16,17

Despite this considerable variation in the quality and completeness of the scans between the NICUs, no NICU stood apart from the others as having a greater degree of variation between the reviewers’ reports and those to the ANZNN, suggesting that these quality issues did not contribute to the reported variation in incidence of GM/IVH.

Grading of cerebral ultrasound scans

The variation between the NICUs and the reviewers in the reported grade of GM/IVH was small, and the level of agreement for each NICU ranged from good to excellent (κ (w) = 0.61 to 0.99). Agreement was highest for normal scans, and the greatest variations were in grades 1, 3, and 4. Others have also reported greater disagreement between reviewers for grade 1 GM/IVH,15–18 perhaps because it can be difficult to distinguish between germinal matrix congestion and haemorrhage.

What is already known on this topic

  • The reported incidence of germinal matrix/intraventricular haemorrhage in preterm babies varies between neonatal intensive care units

  • Case mix does not explain the variation between units within the Australian and New Zealand Neonatal Network. Other possible explanations include variations in clinical practice, or in the techniques and interpretation of the diagnostic cerebral ultrasonograms

What this study adds

  • Variation in the reported incidence of germinal matrix/intraventricular haemorrhage between New Zealand neonatal intensive care units is unlikely to be due to the differences in the capture, storage, and interpretation of the cerebral ultrasonograms

  • Further investigation into practice variations is warranted

Grade 3 was both under-reported and over-reported, perhaps because of difficulties in definitions, as was also suggested from the Vermont-Oxford Network.16 Papile defined grade 3 GM/IVH as haemorrhage in which the ventricle is distended by the blood within it. However, the ventricle is often distended with both blood and cerebrospinal fluid, and this is included in the definition of a grade 3 GM/IVH in some NICUs.

All three reviewers reported a greater number of grade 4 GM/IVH than had been reported to the ANZNN. This is of concern as this important outcome is sometimes used as a clinical indicator.

Intraobserver and interobserver variability

To measure the level of agreement between reviewers, we used the κ (w) statistic, which compares the observed agreement with that expected due to chance.19 We found excellent agreement both between the original and the re-read reports for each reviewer and between reviewers.
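The comparison of observed with chance-expected agreement can be written out as follows. This is the standard textbook form of weighted kappa rather than a formula taken from the study; the agreement weights $v_{ij}$ are shown with a linear scheme, which is an assumption because the weighting used is not stated.

```latex
% p_{ij}: proportion of scans given grade i by one source and grade j by the other
% p_{i\cdot}, p_{\cdot j}: the corresponding row and column totals
% k: number of grade categories; v_{ij}: agreement weight (linear scheme assumed)
\[
  v_{ij} = 1 - \frac{|i - j|}{k - 1}, \qquad
  p_o = \sum_{i,j} v_{ij}\, p_{ij}, \qquad
  p_e = \sum_{i,j} v_{ij}\, p_{i\cdot}\, p_{\cdot j},
\]
\[
  \kappa_w = \frac{p_o - p_e}{1 - p_e}.
\]
```

A value of 1 indicates perfect agreement and a value of 0 indicates agreement no better than chance; disagreements between adjacent grades reduce $p_o$ less than disagreements between distant grades.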

Four previous studies have assessed the accuracy of reporting of cerebral ultrasound scans in preterm babies.14,16,17,20 All reported that disagreement was common among reviewers. Others have assessed intraobserver and interobserver variability for other types of radiological images, and again described poorer levels of agreement than in our study, along with statistically significant differences between reviewers.21–23 Thus our study is unique in finding a relatively high level of agreement between reviewers across images from different NICUs.

Data collection sheet

We developed and adapted the data collection sheet used for this study from a previously published version.24 Its use may have contributed to the high levels of agreement in our study, as the reviewers were required to report the scan in a uniform way, answering simple observational questions but not determining the grade of GM/IVH. Pinto-Martin and colleagues17 assessed observation and interpretation separately, and found greater agreement between reviewers when they were asked only to report observations. We are not aware of other published studies that have used this approach to convert observational data into Papile’s grading system. We suggest that future studies should also consider separating observation from interpretation in this way. Introducing a similar data collection sheet into clinical practice may also improve the consistency of reporting of cerebral ultrasound scans.

Summary

A number of studies have reported significant variation in the incidence of GM/IVH between NICUs.2,3,5,8 However, in none of these studies did the researchers re-read the original scans, so they could not determine how much of this may be due to variations in recording and reporting of the scans themselves.

In this study we have shown that the quality and completeness of cerebral ultrasound scanning does vary widely between the NICUs in New Zealand. However, despite this, there was a high level of agreement between the reviewers and the reports to the ANZNN, with high intraobserver and interobserver agreement.

We conclude that the reported variation in the incidence of GM/IVH between NICUs in New Zealand is unlikely to be due to the differences in cerebral ultrasound scanning and interpretation. Thus further investigation is warranted into practice variations that may contribute to the differing incidence of GM/IVH in different NICUs.

Acknowledgments

This work was supported by the Maurice and Phyllis Paykel Trust, the Rebecca Roberts scholarship, and the Waikato Sick Babies Trust.

The ANZNN Advisory Committee and Executive members were: Australia: Centre for Perinatal Health Services Research (New South Wales): David Henderson-Smart and Deborah Donoghue; Flinders Medical Centre (South Australia): Peter Marshall; John Hunter Hospital (New South Wales): Chris Wake; King Edward Memorial and Princess Margaret Hospitals (Western Australia): Noel French, Ron Hagan, and Karen Simmer; Launceston General Hospital (Tasmania): Chris Bailey; Liverpool Health Service (New South Wales): Robert Guaran; Mater Mother’s Hospital (Queensland): David Tudehope; Mercy Hospital for Women (Victoria): Andrew Watkins; Monash Medical Centre (Victoria): Kaye Bawden, Andrew Ramsden, and Victor Yu; National Perinatal Statistics Unit (New South Wales): Paul Lancaster; Nepean Hospital (New South Wales): Lyn Downe; Newborn Emergency Transport Service (Victoria): Michael Stewart; New South Wales Newborn and Pediatric Emergency Transport Service: Andrew Berry; Perinatal Research Centre (Queensland): Paul Colditz; Royal Children’s Hospital (Victoria): Linda Johnstone and Peter McDougall; Royal Darwin Hospital (Northern Territory): Charles Kilburn; Royal Hobart Hospital (Tasmania): Peter Dargaville; Royal Hospital for Women (New South Wales): Kei Lui; Royal North Shore Hospital (New South Wales): Jennifer Bowen; Royal Prince Alfred Hospital (New South Wales): Nick Evans; Royal Women’s Hospital (Queensland): David Cartwright; Royal Women’s Hospital (Victoria): Lex Doyle, Colin Morley, and Neil Roy; Sydney Children’s Hospital (New South Wales): Barry Duffy; Canberra Hospital (Australian Capital Territory): Graham Reynolds; Children’s Hospital at Westmead (New South Wales): Robert Halliday; Townsville Hospital (Queensland): John Whitehall; Western Australia Neonatal Transport Service: Jenni Sokol; Westmead Hospital (New South Wales): William Tarnow-Mordi; Women’s and Children’s Hospital (South Australia): Ross Haslam; New Zealand: Christchurch Women’s Hospital: Nicola Austin; 
Christchurch School of Medicine: Brian Darlow; Dunedin Hospital: Roland Broadbent; Gisborne Hospital: Graeme Lear; Hastings Hospital: Jenny Corban; Hutt Hospital: Robyn Shaw; Middlemore Hospital: Lindsay Mildenhall; National Women’s Hospital: Carl Kushell; Nelson Hospital: Peter McIlroy; Palmerston North Hospital: Jeff Brown; Rotorua Hospital: Stephen Bradley; Southland Hospital: Paul Tomlinson; Taranaki Hospital: John Doran; Tauranga Hospital: Hugh Lees; Timaru Hospital: Philip Morrison; University of Auckland: Jane Harding; Waikato Hospital: David Bourchier; Wairau Hospital: Ken Dawson; Wanganui Hospital: John Goldsmith; Wellington Women’s Hospital: Vaughan Richardson; Whakatane Hospital: Chris Moyes; Whangarei Hospital: Peter Jankowitz.

REFERENCES

Footnotes

  • Competing interests: none declared
