A systematic review of administrative and clinical databases of infants admitted to neonatal units
Yevgeniy Statnikov¹, Buthaina Ibrahim², Neena Modi²

¹ Neonatal Data Analysis Unit, Section of Neonatal Medicine, Department of Medicine, Imperial College London, Chelsea & Westminster Hospital campus, London, UK
² Section of Neonatal Medicine, Department of Medicine, Imperial College London, Chelsea & Westminster Hospital campus, London, UK

Correspondence to Yevgeniy Statnikov, Neonatal Data Analysis Unit, Section of Neonatal Medicine, Department of Medicine, Imperial College London, Chelsea & Westminster Hospital campus, 4th Floor, Lift Bank D, London SW10 9NH, UK; y.statnikov{at}

Abstract


Objectives High quality information, increasingly captured in clinical databases, is a useful resource for evaluating and improving newborn care. We conducted a systematic review to identify neonatal databases, and define their characteristics.

Methods We followed a preregistered protocol using MeSH terms to search MEDLINE, EMBASE, CINAHL, Web of Science and OVID Maternity and Infant Care Databases for articles identifying patient-level databases covering more than one neonatal unit. Full-text articles were reviewed and information extracted on geographical coverage, criteria for inclusion, data source, and maternal and infant characteristics.

Results We identified 82 databases from 2037 publications. Of the country-specific databases there were 39 regional and 39 national. Sixty databases restricted entries to neonatal unit admissions by birth characteristic or insurance cover; 22 had no restrictions. Data were captured specifically for 53 databases; 21 drew on administrative sources and 8 on clinical sources. Two clinical databases hold the largest range of data on patient characteristics: the USA's Pediatrix BabySteps Clinical Data Warehouse and the UK's National Neonatal Research Database.

Conclusions A number of neonatal databases exist that have potential to contribute to evaluating neonatal care. The majority are created by entering data specifically for the database, duplicating information likely already captured in other administrative and clinical patient records. This repetitive data entry represents an unnecessary burden in an environment where electronic patient records are increasingly used. Standardisation of data items is necessary to facilitate linkage within and between countries.

  • neonatal unit
  • infant
  • database
  • electronic health records
  • international

What is already known on this topic?

  • High quality information on patient care captured in electronic databases can be used to improve service delivery and patient outcomes.

  • Globally the number of neonatal databases for monitoring and improving patient care is unknown.

What this study adds?

  • In this systematic review we identified 82 neonatal databases across the world that have potential to improve patient care.

  • We identified considerable variation between the identified databases in population coverage, data source, patient characteristics and availability of accompanying metadata.

Introduction


Neonatal units provide specialist care for approximately 1 in 10 newborn infants. They are generally high technology, data-rich environments. This has contributed to increasing interest in the possibilities offered by clinical neonatal databases to evaluate and improve patient care. A prerequisite for data sharing and comparison is a good understanding of the breadth and scope of databases, and the extent of consistency in the data held. We aimed to conduct a systematic review to identify existing neonatal databases and define their characteristics.


Methods

Literature search

The study was preregistered with the University of York Centre for Reviews and Dissemination (CRD42015017439).1 We searched MEDLINE via Ovid, EMBASE via Ovid and CINAHL via Athena using the following terms: ‘intensive care units, neonatal/’ OR ‘intensive care, neonatal/’ OR ‘neonatal intensive care units’ OR ‘NNU’ OR ‘NICU’ OR ‘neonatal ICU’ AND ‘infant/’ OR ‘neonat$’ AND ‘database$’ OR ‘registry’ OR ‘registries’ OR ‘dataset$’ OR ‘data set$’ OR ‘vital statistics’, covering the period 1 January 2000 to 15 March 2015. We included English, French, German, Italian, Russian and Spanish articles. Grey literature searches were carried out in Web of Science and OVID Maternity and Infant Care Databases. Free-text terms were ‘neonatal intensive care unit’ AND ‘infant’ AND ‘database’.
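
The Boolean structure of the search can be made explicit by assembling it from three concept blocks (care setting, population, data resource). The Python sketch below is illustrative only: the nesting of the OR terms inside parentheses is an assumption about how the printed search string was meant to group, and the helper names `or_block` and `build_query` are hypothetical. The terms themselves are reproduced as printed.

```python
# Illustrative reconstruction of the search string.
# ASSUMPTION: OR terms are grouped per concept and the groups are joined with AND.

SETTING_TERMS = [
    "intensive care units, neonatal/",
    "intensive care, neonatal/",
    "neonatal intensive care units",
    "NNU",
    "NICU",
    "neonatal ICU",
]
POPULATION_TERMS = ["infant/", "neonat$"]
DATA_RESOURCE_TERMS = [
    "database$",
    "registry",
    "registries",
    "dataset$",
    "data set$",
    "vital statistics",
]


def or_block(terms):
    """Join the terms of one concept with OR and wrap them in parentheses."""
    return "(" + " OR ".join(terms) + ")"


def build_query(*blocks):
    """Join the parenthesised concept blocks with AND."""
    return " AND ".join(or_block(block) for block in blocks)


QUERY = build_query(SETTING_TERMS, POPULATION_TERMS, DATA_RESOURCE_TERMS)
print(QUERY)
```

Writing the groups out this way removes the ambiguity that arises when AND and OR are mixed in a single unparenthesised line.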


We exported all identified abstracts into EndNote X7 where duplicate results were removed. Two researchers reviewed titles and abstracts to identify databases containing patient-level information and covering populations of infants from more than one neonatal unit. The articles were sorted, wherever possible, by the country and database name identified within the title and abstract. Full-text articles were obtained for all selected abstracts.

Data fields

We created a spreadsheet in MS Excel 2011 into which we extracted information on the data fields shown in online supplementary appendix 1. The data fields were prespecified in our PROSPERO registration; however, for ‘population coverage’ and ‘data source’ we widened the breadth of information extracted. The new ‘population limits’ field included ‘admission to neonatal unit, all infants included’; ‘admission to neonatal unit with gestational age and/or birth weight cut off’; ‘admissions or births in a hospital participating in submissions to the database’; ‘health insurance enrolment’, where a database comprises information for patients covered by a single insurance provider; and ‘all births including neonatal unit admissions’. The data source criterion was expanded to specify whether data were ‘extracted from a clinical source’, such as electronic patient records, or ‘extracted from an administrative source’. We also recorded whether data definitions for the variables held were available and how the databases are supported financially.
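
The extraction fields described above can be pictured as one structured record per database. The following Python sketch is a hypothetical representation, not the authors' spreadsheet layout: the class and attribute names are invented for illustration, while the category labels are taken from the text. The example record uses only details stated elsewhere in this article about the NNRD.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class PopulationLimit(Enum):
    """Categories of the 'population limits' field, as worded in the text."""
    ALL_NNU_ADMISSIONS = "admission to neonatal unit, all infants included"
    NNU_WITH_CUTOFF = (
        "admission to neonatal unit with gestational age and/or birth weight cut off"
    )
    PARTICIPATING_HOSPITAL = (
        "admissions or births in a hospital participating in submissions to the database"
    )
    HEALTH_INSURANCE = "health insurance enrolment"
    ALL_BIRTHS = "all births including neonatal unit admissions"


class DataSource(Enum):
    """Categories of the 'data source' field, as worded in the text."""
    RECORDED_FOR_DATABASE = "recorded specifically for the database"
    CLINICAL_EXTRACT = "extracted from a clinical source"
    ADMINISTRATIVE_EXTRACT = "extracted from an administrative source"


@dataclass
class DatabaseRecord:
    """One row of the extraction sheet (hypothetical field names)."""
    name: str
    country: str
    population_limit: PopulationLimit
    data_source: DataSource
    data_definitions_available: Optional[bool] = None  # None = not ascertained
    funding: Optional[str] = None


# Example built only from statements made in this article.
nnrd = DatabaseRecord(
    name="National Neonatal Research Database",
    country="UK",
    population_limit=PopulationLimit.ALL_NNU_ADMISSIONS,
    data_source=DataSource.CLINICAL_EXTRACT,
    funding="grants and commissions",
)
print(nnrd)
```

Using closed category lists for the two widened fields keeps each database in exactly one category, which makes the counts straightforward to tabulate.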

Data extraction

We populated the spreadsheet summarising results with information extracted from full-text articles. In instances where full-text articles did not provide sufficient details, such as metadata about variables captured in the database, we accessed the websites of organisations operating the databases to obtain additional information. Foreign language articles in French, Italian or Spanish were translated by the authors; German translations were carried out by an external researcher.

Results


The results of the literature search are shown in figure 1. The search yielded 2037 unique papers, from which 1622 were removed during screening. We identified 82 databases (table 1), of which 78 (39 regional, 39 national) were country-specific, covering 24 individual countries, and 4 contained data from multiple countries. Five countries accounted for more than half (48/82) of all identified databases: USA (n=24), Canada (n=11), UK (n=7) and Australia and New Zealand (n=5).

Table 1

Databases identified by systematic review by country and geographical coverage

Figure 1

Flow chart of the search strategy used in the review.

Of the 39 regional databases 23 were administrative, 15 were clinical and 1 was established for research purposes (see online supplementary appendix 2). Of the 39 national databases, the primary purpose was administrative in 13, clinical in 20 and research in 6. International databases included two for clinical, one for research and one for surveillance purposes.

Inclusion criteria

Twenty-seven databases were restricted to admissions to neonatal units with a gestational age and/or birthweight restriction; 23 were restricted to admissions or births in specific hospitals; 5 were limited by health insurance cover; 22 held data on all hospital births including neonatal unit admissions; 5 included all neonatal unit admissions without any restrictions by birth characteristics (see online supplementary appendix 2).

Data source

Data were recorded specifically for 53 databases (21 regional, 28 national and 4 international).

Twenty-one databases were created from extracts from administrative sources (14 regional and 7 national) (see online supplementary appendix 2). Eight databases were based upon extracts from clinical data sources: four in the USA (Consortium of Safe Labor Database; Intermountain Healthcare database; Kaiser Permanente Medical Care Program; Pediatrix BabySteps Clinical Data Warehouse), one in Denmark (NeoBase), one in France (Bourgogne database) and two in the UK (Neonatal Intensive Care Outcomes and Research Evaluation; National Neonatal Research Database (NNRD)). Three of these eight databases had some form of national population data coverage (the UK NNRD; the US Consortium of Safe Labor Database and the US Pediatrix BabySteps Clinical Data Warehouse).
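
As a quick arithmetic check, the category counts reported in this and the preceding subsections should each sum to the 82 databases identified. The Python sketch below simply transcribes the counts from the text; the dictionary keys are shorthand labels chosen for illustration, not the authors' field names.

```python
# Counts transcribed from the Results text; keys are illustrative shorthand.
TOTAL_DATABASES = 82

breakdowns = {
    "geographical coverage": {
        "regional": 39,
        "national": 39,
        "international": 4,
    },
    "inclusion criteria": {
        "neonatal unit admissions with gestational age/birthweight restriction": 27,
        "admissions or births in specific hospitals": 23,
        "health insurance cover": 5,
        "all hospital births including neonatal unit admissions": 22,
        "all neonatal unit admissions, unrestricted": 5,
    },
    "data source": {
        "recorded specifically for the database": 53,
        "administrative extract": 21,
        "clinical extract": 8,
    },
}

# Sum each breakdown and compare it with the overall total.
totals = {name: sum(counts.values()) for name, counts in breakdowns.items()}

for name, total in totals.items():
    status = "consistent" if total == TOTAL_DATABASES else "check"
    print(f"{name}: {total} ({status})")
```

All three breakdowns sum to 82, so every identified database is assigned to exactly one category in each classification.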

Database size and years active

The largest database created by extracts from clinical records was the National Perinatal Registry of the Netherlands with 903 000 infants reported between 2003 and 2007. The longest running national clinical database was that of the Australia and New Zealand Neonatal Network with 27 189 infants reported between 1994 and 2012; it is still active.

Maternal and infant characteristics

The range of variables captured in each database is summarised in table 1. Data dictionaries were accessible for 52 databases. Neonatal unit admission-based databases that capture infant gestational age, birth weight and sex are shown in table 2. Across the 27 databases in table 2 there is a wide range of data available; however, none contained all of the variables sought. The UK NNRD, followed by the US Pediatrix BabySteps Clinical Data Warehouse, contains the largest number of variables.

Table 2

Characteristics of databases of admissions to neonatal units; all hold data on gestational age, birth weight and sex

Funding


Of the 82 databases, 70 receive some form of public funding (see online supplementary appendix 2). Of the remainder, eight are funded through hospital subscription and one through private insurance; we were unable to identify the funding source for two. The NNRD has no core funding and is maintained through grants and commissions.

Discussion


We identified 82 databases that cover one or more neonatal units, contain patient-level information obtained from either administrative or clinical sources, and have variable geographical coverage, patient inclusion criteria and data items. The databases are roughly split between capturing information across regions and across whole countries. We were only able to locate eight databases created through extractions from electronic patient records, of which only two contain information on admissions to neonatal units across defined geographical areas. One of these is the NNRD that captures data from all neonatal units in England, Scotland and Wales; the other is the Pediatrix BabySteps Clinical Data Warehouse that contains data from neonatal units operated by the Pediatrix Medical Group, a private enterprise operating in over 30 US states. Both of these databases also have the largest variety of infant and maternal data fields. Furthermore, the NNRD and Pediatrix record all infants admitted to neonatal units in contrast with some long-standing databases such as the Vermont Oxford Network where only infants meeting particular birthweight or gestational age criteria are featured.

We acknowledge that we may not have identified all large multicentre databases as our search was limited by language; for example since completion of our search we have been made aware of a South Korean neonatal database of admissions to neonatal care of babies weighing <1500 g at birth, managed by the Korean Neonatal Network. It is also possible that countries have databases that are not widely known because outputs are not in the public domain or cited in peer-reviewed publications. Fifty-three databases were classified as having data ‘Recorded specifically for the database’ but we were unable to verify whether this equated to manual data entry and if so by whom, or some other method of obtaining data.

Linkage of databases offers the opportunity to explore between-country variations in care and patient outcomes. The International Network of Evaluations of Outcomes of Neonates (iNeo) and eNewborn are two examples: the former is an international quality improvement initiative that has linked data from Australia, Canada, Israel, Japan, New Zealand, Spain, Sweden, Switzerland and the UK; the latter is a platform for benchmarking, quality improvement and research that to date is confined to European countries.

The development of neonatal medications and the delivery of pragmatic clinical trials facilitated by large databases are two areas of growing interest. However, for the global neonatal community to realise the full potential offered by large databases, effort is required to ensure consistency of the clinical definitions and technical specifications of each variable captured, and to have a clear understanding of population coverage. For instance, the NNRD is formed of the Neonatal Data Set, an approved National Health Service information standard that comprises over 400 data fields standardised in accordance with the National Health Service Data Dictionary service. This enables data to be merged across multiple neonatal units and stored in a single repository. A number of international initiatives currently underway are attempting to address these and related challenges; for example, the International Neonatal Consortium led by the Arizona-based Critical Path Institute is attempting to develop standardised approaches for incorporating laboratory, physiological and imaging data into clinical databases. Work by a number of research groups to develop core outcome sets and evidence-based case definitions for common neonatal conditions that may be incorporated into clinical databases will also add to the growing strength of this approach.

Resourcing the infrastructural requirements for databases and overcoming regulatory restrictions on sharing data across countries are potential obstacles that need to be addressed to realise the full potential of large, high quality patient care databases. In the early days of neonatal medicine, before it became an established specialty, charities such as Bliss in the UK led the way in providing support, initially for equipment, followed by medical, then nurse training. Perhaps, while the power of high quality ‘big databases’ awaits realisation in mainstream channels, this philanthropy may once again be brought to bear to further advance the specialty of newborn care.

In conclusion we have identified a number of neonatal databases internationally that have been developed for differing purposes and contain widely varying data variables. A number of measures are now necessary if their potential to advance newborn care is to be harnessed. These include national and international collaboration to define standards for data quality assurance, technical specifications for variables, choice of international nomenclatures, details of population coverage, and provision of metadata, in addition to addressing inconsistencies in data and case definitions.



  • Twitter Follow Yevgeniy Statnikov @Y_Stat

  • Contributors NM conceived the study. NM and YS designed the protocol. YS performed the systematic search and data extraction. BI verified the systematic search. YS prepared the first draft of the paper; this and all subsequent drafts were reviewed and revised by all authors. All authors approved the final version submitted.

  • Funding This work represents independent research funded by the National Institute for Health Research (NIHR) under its Programme Grants for Applied Research Programme (Reference RP-PG-0707-10010).

  • Competing interests None declared.

  • Provenance and peer review Not commissioned; externally peer reviewed.