Randomised crossover study on pulse oximeter readings from different sensors in very preterm infants
  1. Christian Achim Maiwald1,2,
  2. Christoph E Schwarz2,3,
  3. Katrin Böckmann2,
  4. Laila Springer2,
  5. Christian F Poets2,
  6. Axel Franz1,2
  1. 1 Department of Pediatrics, Center for Pediatric Clinical Studies (CPCS), University Hospital Tübingen, Tübingen, Germany
  2. 2 Department of Neonatology, Tübingen University Hospital, Tübingen, Germany
  3. 3 Department of Neonatology, University of Heidelberg, Heidelberg, Germany
  1. Correspondence to Professor Christian F Poets, Department of Neonatology, University Hospital Tubingen Department of Pediatrics, Tübingen, 72076, Germany; christian-f.poets{at}


Objective In extremely preterm infants, different target ranges for pulse oximeter saturation (SpO2) may affect mortality and morbidity. Thus, the impact of technical changes potentially affecting measurements should be assessed. We studied SpO2 readings from different sensors for systematic deviations.

Design Single-centre, randomised, triple crossover study.

Setting Tertiary neonatal intensive care unit.

Patients 24 infants, born at <32 weeks’ gestation, with current weight <1500 g and without right-to-left shunt via a patent ductus arteriosus.

Interventions Simultaneous readings from three SpO2 sensors (Red Diamond (RD), Photoplethysmography (PPG), Low Noise Cabled Sensors (LNCS)) were logged at 0.5 Hz over 6 hours per infant and compared with LNCS as control using analysis of variance. Sensor position was randomly allocated and rotated every 2 hours. Seven different batches of each sensor type were used.

Outcomes Primary outcome was the difference in SpO2 readings. Secondary outcomes were differences between sensors in the proportion of time within the SpO2-target range (90–95 (100)%).

Results Mean gestational age at birth (±SD) was 27 4/7 (±2 3/7) weeks, postnatal age 20 (±20) days. 134 hours of recording were analysed. Mean SpO2 (±SD) was 94.0% (±3.8; LNCS) versus 92.2% (±4.0; RD; p<0.0001) and 94.5% (±3.9; PPG; p<0.0001), respectively. Mean SpO2 difference (95% CI) was −1.8% (−1.9 to −1.8; RD) and 0.5% (0.4 to 0.5; PPG). Proportion of time in target was significantly lower with RD sensors (84.8% vs 91.7%; p=0.0001) and similar with PPG sensors (91.1% vs 91.7%; p=0.63).

Conclusion There were systematic differences in SpO2 readings between RD sensors and LNCS. These findings may impact mortality and morbidity of preterm infants, particularly when aiming for higher SpO2-target ranges (eg, 90–95%).

Trial registration number DRKS00027285.

  • Intensive Care Units, Neonatal
  • Neonatology

Data availability statement

All data relevant to the study are included in the article or uploaded as supplementary information.

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:


  • SpO2 target ranges affect outcome of extremely preterm infants. Current recommendations on SpO2 targets are based on one instrument brand and sensor type.


  • Some new generation sensors resulted in SpO2 readings that were 2% lower than with the previous standard. This may lead to higher oxygen levels and thus potentially affect oxygen-related morbidity and mortality in extremely preterm infants.


  • Recommendations on SpO2 targets should refer to a specific instrument brand and sensor type. There is a need for better standardisation of SpO2 technology.


Medical oxygen is one of the most common drugs administered in neonatal intensive care units (NICUs).1 The majority of infants with a gestational age (GA) at birth <32 weeks require supplemental oxygen, and both too much and too little oxygen may impact on outcome. Therefore, considerable effort has been, and continues to be, employed for achieving an optimal oxygen supply strategy in these infants.

In a recent Cochrane report comparing two different SpO2-target ranges (85–89% vs 91–95%, effective difference 2.8%) that was largely based on the results of the NeoPROM collaboration, the higher range was associated with lower rates of death and necrotising enterocolitis (NEC), but higher rates of retinopathy of prematurity (ROP) and bronchopulmonary dysplasia.2 Therefore, the higher SpO2 targets (eg, 90–94%, with alarm limits of 89% and 95%) are recommended by experts in the field3 and also in consensus guidelines for the treatment of neonates.4

While the NeoPROM studies were performed exclusively with Masimo SET oximeters with ‘Low Noise Cabled’ Sensors (LNCS), the manufacturer currently recommends the use of ‘Red Diamond’ sensors (RD) because of reportedly improved accuracy when compared with arterial haemoglobin oxygen saturation by co-oximetry (SaO2; ±3% points vs ±1.5% points in SpO2 in LNCS vs RD sensors).5 6 The Photoplethysmography (PPG) sensors have the same accuracy as LNCS, but would have the advantage of enabling wireless transmission.7 In NeoPROM, a 2.8% difference in achieved SpO2 changed the outcome, and therefore, any change in measurement technology (or components thereof) should be carefully assessed for its potential impact on achieved SpO2 in this very vulnerable population of extremely preterm infants. Consequently, we performed a head-to-head comparison of SpO2 readings from two new sensor types (RD; PPG) against our local standard, the LNCS.

Material and methods

Study design

This is a single-centre, randomised, triple cross-over, prospective observational study of CE-marked medical devices applied according to their intended use.


Infants born at <32 weeks GA in whom a bidirectional or right-to-left shunt through a patent ductus arteriosus had been excluded on echocardiography were screened during their postnatal hospitalisation; those receiving fewer than 12 feeds per day (to align study-driven changes in sensor site with clinically indicated disturbance, ie, feeding and nursing) or on palliative care were excluded. Because there were six possible randomisation clusters, we initially planned to examine 18 infants (group 1; three infants per cluster) and then added another six infants to exclude sensor batch-related differences (group 2; study flow diagram (figure 1)). The study protocol required group 1 to include at least nine infants each with a current GA <28 weeks and receiving supplemental oxygen (FiO2>0.21).


This study took place in the tertiary NICU at the Department of Neonatology, University Hospital Tübingen, Germany.


‘Radical 7’ oximeters, 2012 version (MCU: 1064; Tech-card: 7e23 (RD and LNCS) and 7f10 (PPG); processor: V.) were used. Docking stations were RDS-1 (ASCII1 IAP Flexport 5143) and trends were downloaded using the Masimo Instrument Configuration Tool (V., 2020). Sensor types were LNCS as the local standard (Masimo internal Order No: 1862 and for 2 infants <800 g: 1901); for comparison, we used RD (Order No: 4003) and PPG (Order No: 4585). In group 1, we used a single batch per sensor type in all infants; in group 2 every recording was performed with different batches for all sensor types (online supplemental table 1) to exclude biased results due to production errors. All devices and sensors were produced by Masimo, Irvine, California, USA.


Parents of eligible infants were approached and written informed parental consent was obtained. The three different SpO2 sensors were simultaneously attached to three IV-access-free limbs. Limbs were numbered clockwise in supine position, starting on the right hand. Sensor types were randomly allocated to sensor sites (see: Randomisation).

Sensors were placed and repositioned every 2 hours, exclusively during care periods or meals. Data from all three sensors were simultaneously recorded at a sampling rate of 0.5 Hz for a total duration of 6 hours (ie, each sensor type at each position for at least 2 hours). The expected 10 800 measurements per patient were considered sufficient to demonstrate any clinically relevant difference. The 2-hour period was chosen to fit nursing practices and to avoid sensor changes outside care rounds. Averaging time was set to 2–4 s.
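The expected number of readings follows directly from the sampling rate and the recording duration. A minimal sketch of that arithmetic (the variable names are illustrative and not taken from the study software):

```python
# Expected number of logged readings per sensor and infant.
SAMPLING_RATE_HZ = 0.5   # one reading every 2 s, as stated in the text
RECORDING_HOURS = 6      # planned recording duration per infant
ROTATION_HOURS = 2       # planned time per sensor position

expected_readings = int(RECORDING_HOURS * 3600 * SAMPLING_RATE_HZ)
readings_per_position = int(ROTATION_HOURS * 3600 * SAMPLING_RATE_HZ)

print(expected_readings)      # 10800
print(readings_per_position)  # 3600
```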

FiO2 was manually or automatically controlled (if infant participated in our multicentre FiO2 controller trial8) to achieve SpO2 values within the target range of 90–95% according to the SpO2 readings of the LNCS.


Six different algorithms for changing the three sensors, each with different starting positions (see online supplemental table 2 for randomisation clusters), were randomly assigned with appropriate allocation concealment using consecutively numbered sealed opaque envelopes.
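The six clusters can be pictured as the six possible starting arrangements of three sensors on three sites, each followed by a clockwise rotation. The sketch below rests on that assumption; the actual clusters are defined in online supplemental table 2, which is not reproduced here, and the site labels and function name are hypothetical. The sealed-envelope allocation itself is not modelled.

```python
from itertools import permutations

SENSORS = ("LNCS", "RD", "PPG")
SITES = ("site A", "site B", "site C")  # hypothetical labels for the three free limbs

def rotation_schedule(start):
    """Return the sensor placed on each site for three consecutive 2-hour periods.

    After each period every sensor moves one site onward, so each sensor
    spends exactly one period on each site.
    """
    schedule = []
    order = list(start)
    for _ in range(len(SITES)):
        schedule.append(tuple(order))
        order = [order[-1]] + order[:-1]  # shift every sensor to the next site
    return schedule

# One cluster per starting arrangement (assumption: 3! = 6 clusters).
clusters = {i + 1: rotation_schedule(p) for i, p in enumerate(permutations(SENSORS))}

for cluster_id, schedule in clusters.items():
    print(cluster_id, schedule)
```

Under this scheme every sensor type is recorded once on every site within each infant, which is what allows limb-related deviations to average out.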


Since the different sensors have different patient cable/sensor interfaces, blinding was not feasible.

Efforts to reduce bias and to assess potentially influencing variables

  • Deviations based on limb allocation

    • Echocardiography: All infants had routine echocardiography at maximum 48 hours before start of recording to exclude right-to-left ductal shunting

    • Two-hourly, clockwise rotation of sensor positions.

    • Randomised assignment of starting position with adequate allocation concealment

  • Deviations based on signal quality

    • Bedside nurses were advised to check (and if necessary correct) sensor position in the event of persistently poor signal quality (‘low signal IQ’-alarm) but not if there were discrepancies between readings

    • Exclusion of data lines with invalid values (ie, if at any given time point any of the three SpO2 or pulse rate readings showed either ‘no value’ or ‘zero’ or an exception code such as ‘sensor OFF’ or ‘low signal IQ’) and of all data recorded during care periods (to exclude any impact of motion artefacts).

    • Comparison of pulse rate readings in the analysed data to check the validity of the recordings

  • Influence of batches

    • After recruitment of 18 infants with RD and PPG sensors from a single batch (group 1), we repeated measurements in six additional infants (group 2) using a different batch for all sensor types in each infant to rule out that the observation made was based on a single batch and possibly biased by production errors.
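The exclusion rule for invalid data lines described above can be sketched as a row filter. The row layout (a dictionary per time point with a `care_period` flag) and the function names are assumptions for illustration; the article does not describe the export format beyond CSV files containing time stamp, SpO2 and pulse rate.

```python
EXCEPTION_CODES = {"sensor OFF", "low signal IQ"}  # examples named in the text

def is_valid(value):
    """A reading counts as valid only if it is a non-zero number."""
    if value is None or value in EXCEPTION_CODES:
        return False
    return isinstance(value, (int, float)) and value != 0

def keep_row(row):
    """Keep a time point only if it lies outside a care period and all
    SpO2 and pulse rate readings of the three sensors are valid."""
    if row["care_period"]:
        return False
    readings = list(row["spo2"].values()) + list(row["pulse_rate"].values())
    return all(is_valid(v) for v in readings)

# Hypothetical example rows, not study data.
pulse = {"LNCS": 150, "RD": 150, "PPG": 151}
rows = [
    {"care_period": False, "spo2": {"LNCS": 94, "RD": 92, "PPG": 95}, "pulse_rate": pulse},
    {"care_period": False, "spo2": {"LNCS": 94, "RD": "low signal IQ", "PPG": 95}, "pulse_rate": pulse},
    {"care_period": True, "spo2": {"LNCS": 93, "RD": 91, "PPG": 93}, "pulse_rate": pulse},
    {"care_period": False, "spo2": {"LNCS": 0, "RD": 91, "PPG": 93}, "pulse_rate": pulse},
    {"care_period": False, "spo2": {"LNCS": 93, "RD": None, "PPG": 93}, "pulse_rate": pulse},
]

kept = [row for row in rows if keep_row(row)]
print(len(kept))  # 1
```

Dropping the whole time point when any one sensor is invalid keeps the three series paired, so every retained SpO2 value has simultaneous counterparts from the other two sensors.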

Outcome variables

Primary outcome was the SpO2 difference (95% CI) between RD or PPG sensors compared with LNCS as control. Therefore, mean values (±SD) were compiled for every infant over all sensor positions and compared between sensor types. Secondary outcomes were the proportion of time in SpO2 target (90–95% for infants in FiO2>0.21 and 90–100% for infants in FiO2=0.21) and the proportion of time above target (only for infants with FiO2>0.21). Infants whose FiO2 was 0.21 for part of the recording and >0.21 for the remainder were excluded from these analyses because FiO2 was not logged. Proportion of time with SpO2 below target was calculated for all sensors in all infants. FiO2 was controlled throughout the study according to LNCS readings.
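Because readings are equally spaced in time, the proportion of time in a range equals the proportion of readings in that range. A minimal sketch of the secondary outcomes under that assumption (function name and example values are hypothetical, not study data):

```python
def target_proportions(spo2_readings, on_supplemental_oxygen):
    """Proportion of readings in, above and below the SpO2 target range.

    Target is 90-95% with FiO2 > 0.21 and 90-100% with FiO2 = 0.21.
    """
    upper = 95 if on_supplemental_oxygen else 100
    n = len(spo2_readings)
    return {
        "in_target": sum(90 <= s <= upper for s in spo2_readings) / n,
        "above": sum(s > upper for s in spo2_readings) / n,
        "below": sum(s < 90 for s in spo2_readings) / n,
    }

# Illustrative values only: the second series reads 2 points lower than the first.
lncs = [93, 94, 96, 92, 91, 95, 90, 89]
rd = [s - 2 for s in lncs]

print(target_proportions(lncs, on_supplemental_oxygen=True))
# {'in_target': 0.75, 'above': 0.125, 'below': 0.125}
print(target_proportions(rd, on_supplemental_oxygen=True))
# {'in_target': 0.625, 'above': 0.0, 'below': 0.375}
```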

Statistical analysis

Time stamp, SpO2 and pulse rate readings were downloaded as CSV files and compiled using Microsoft Office Excel 2019 (V.1808). Analysis was descriptive using mean (±SD); a Friedman test was performed if the mean difference was >0.1 in any comparison, using Prism V.9.4.1 (GraphPad, Boston, USA). p<0.05 was considered statistically significant. Bland-Altman plots for visualisation of differences were created for individual values of SpO2 and pulse rate in both groups and sensor comparisons (RD vs LNCS and PPG vs LNCS).
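The quantities behind a Bland-Altman plot can be sketched as follows. The article states only that such plots were created; the conventional limits of agreement (bias ± 1.96 SD of the differences) are an assumption here, and the function name and example values are hypothetical.

```python
from statistics import mean, stdev

def bland_altman(test, reference):
    """Return bias, limits of agreement and the (mean, difference) plot points
    for paired simultaneous readings from two sensors."""
    diffs = [t - r for t, r in zip(test, reference)]
    means = [(t + r) / 2 for t, r in zip(test, reference)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return {
        "bias": bias,
        "lower_loa": bias - 1.96 * sd,  # assumed conventional 95% limits
        "upper_loa": bias + 1.96 * sd,
        "points": list(zip(means, diffs)),
    }

# Illustrative paired readings, not study data.
rd = [92, 91, 93, 90]
lncs = [94, 93, 94, 92]

result = bland_altman(rd, lncs)
print(result["bias"])  # -1.75
```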



Twenty-four infants (12 female) were recruited between 10/2021 and 11/2022.

In group 1, we recruited 10 girls and 8 boys; 8 infants had a GA<28 weeks. Mean GA (±SD) at birth was 28 0/7 (±2 3/7) weeks and mean birth weight (±SD) 925 (±345) g. Mean postnatal age (±SD) was 18 (±21) days.

In group 2, we recruited two girls and four boys with a mean GA at birth (±SD) of 26 4/7 (±2 1/7) weeks and a mean birth weight (±SD) of 714 (±241) g. Mean postnatal age (±SD) was 26 (±18) days.

For a more detailed description of weight and GA distributions, see online supplemental table 3: demographic data.


147.2 hours of data were recorded (group 1: 110.2 hours; group 2: 37.0 hours). After exclusion of invalid data, we analysed 241 595 data points (group 1: 178 426; group 2: 63 169), corresponding to 134.2 hours (91.2% of recorded data) and a mean duration (±SD) of 5.6 hours (±0.5) per patient.
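The reported hours can be reproduced from the number of analysed data points, assuming each data point is one time stamp at the 0.5 Hz sampling rate (ie, 2 s of recording). A short consistency check using only figures from the text:

```python
SAMPLE_INTERVAL_S = 2  # 0.5 Hz sampling rate
analysed_points = {"group 1": 178_426, "group 2": 63_169}
recorded_hours = 147.2
n_infants = 24

total_points = sum(analysed_points.values())
analysed_hours = total_points * SAMPLE_INTERVAL_S / 3600

print(total_points)                                      # 241595
print(round(analysed_hours, 1))                          # 134.2
print(round(100 * analysed_hours / recorded_hours, 1))   # 91.2
print(round(analysed_hours / n_infants, 1))              # 5.6
```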

Between sensor comparisons

For all measurements, mean pulse rates were identical for LNCS, RD and PPG sensors. These and between-sensor differences in pulse rate for individual measurements are represented in the online supplemental table 4 and figure 1.

Mean SpO2 values were significantly lower with RD sensors (92.2% vs 94.0%; p<0.0001) and significantly higher with PPG sensors (94.5% vs 94.0%; p<0.0001) compared with LNCS. Mean differences (95% CI) between simultaneous SpO2 values were −1.84% (–1.85% to −1.83%) for RD versus LNCS and 0.46% (0.45% to 0.47%) for PPG versus LNCS (table 1, online supplemental file 1). The graphical illustration of counts for all SpO2 values also showed a deviation towards lower values for the RD sensor compared with the LNCS and PPG sensor (figure 2). Additionally, all infants had a lower mean SpO2 with RD sensors compared with LNCS, while mean SpO2 was similar for PPG sensors versus LNCS (figure 3). In periods with SpO2 between 90% and 95% as measured by LNCS, the mean SpO2 was 93.0% (±1.5) for LNCS, 91.5% (±2.6) for RD and 93.7% (±2.7) for PPG.
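A mean difference with a 95% CI for paired simultaneous readings can be sketched as below. This uses a normal approximation and treats the paired differences as independent, which is an assumption of this sketch; the article does not state how the reported intervals were computed. Example values are illustrative, not study data.

```python
from math import sqrt
from statistics import mean, stdev

def mean_difference_ci(test, reference, z=1.96):
    """Mean paired difference (test - reference) with an approximate 95% CI."""
    diffs = [t - r for t, r in zip(test, reference)]
    m = mean(diffs)
    half_width = z * stdev(diffs) / sqrt(len(diffs))
    return m, m - half_width, m + half_width

# Illustrative paired readings.
rd = [92, 91, 93, 90, 92, 93]
lncs = [94, 93, 94, 92, 94, 95]

m, lower, upper = mean_difference_ci(rd, lncs)
print(round(m, 2))  # -1.83
```

With several hundred thousand paired readings the standard error becomes very small, which is why intervals as narrow as those reported above are plausible even when individual differences scatter widely.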

Figure 2

Counts of SpO2 values per sensor in all infants. LNCS, Low Noise Cabled Sensors; PPG, Photoplethysmography; RD, Red Diamond

Figure 3

Comparison of mean SpO2-values of all 24 infants. LNCS, Low Noise Cabled Sensors; PPG, Photoplethysmography; RD, Red Diamond.

Table 1

Outcome measurements

Proportion of time in SpO2 target (90–95% for 8 infants with FiO2 continuously >0.21 and 90–100% for 10 infants with FiO2 continuously =0.21)

Compared with LNCS (which had been used to control FiO2), mean proportion of time with SpO2 in target was significantly lower with RD, but similar with PPG sensors (table 1).

Only one infant at an FiO2 of 0.24–0.28 spent a higher proportion of time in target with RD compared with LNCS (figure 4). This infant had a high proportion of time above the target range with LNCS and a mean difference in SpO2 of −1.88% between RD and LNCS.

Figure 4

Distributions of proportions of time in- and outside of SpO2-target range. LNCS, Low Noise Cabled Sensors; PPG, Photoplethysmography; RD, Red Diamond.

Proportion of time above target (eight infants with FiO2 continuously>0.21)

The mean proportion of time spent above the target range was not statistically significantly different across sensors (table 1).

Proportion of time below target range (all 24 infants)

The mean proportion of time with SpO2 80–89% and with SpO2<80% was increased for RD sensors compared with that for LNCS and similar for PPG sensors compared with LNCS (table 1 and figure 4).


To our knowledge, this is the first study systematically comparing SpO2 readings obtained with different sensor types from the same manufacturer in the vulnerable population of extremely preterm infants most in need of tight oxygen targeting. Previous studies compared instruments from different manufacturers9–14 or SpO2 with SaO2 to verify, for example, the impact of skin colour or fetal haemoglobin.

Whereas most neonatologists will be familiar with the fact that simultaneous pulse oximetry readings from different limbs are not identical for substantial proportions of time, even if identical technology and equipment is used, our finding of a systematic deviation between LNCS and RD sensors is disturbing.

Both new sensors (PPG and RD) showed statistically significant differences in mean SpO2 compared with LNCS, but for the PPG sensor (differing from LNCS technology only in wireless transmission), this mean difference in SpO2 was smaller, less reproducible (figure 3) and there was no difference in the proportion of time outside the SpO2-target range, indicating that subsequent clinical practice of FiO2 control would not be different after changing sensors from LNCS to PPG. These findings agree with the expectation that the wireless transmission should have no effect on the SpO2 readings. In contrast, the difference between RD sensors and LNCS was of clinical importance and found in every infant. The relevant difference in proportion of time outside the target range may indicate that using RD sensors for FiO2 control would have resulted in relevantly higher oxygen exposure.

Since pulse detection is essential for pulse oximetry, the exact concordance of mean pulse rates between all sensor types confirms that care was taken to avoid any systematic bias in sensor application and that data collection and processing were of high quality. Whereas pulse rate measurements directly rely on the detection of an alternating signal, SpO2 measurements are more complex as they rely on the relative extinction of light of at least two wavelengths within this alternating signal over a non-alternating background to approximate arterial oxygen saturation, which is more sensitive to external perturbations. This is supported by the observation that the coefficient of variation (ie, the SD divided by the mean) for SpO2 measurements is much higher than for pulse rate measurements. According to the manufacturer, LNCS yield an SD of ±3% and RD sensors of 1.5% within 70–100% SaO2. This means that at an SaO2 of 90%, 95% of SpO2 readings will be between 84% and 96% for LNCS and between 87% and 93% for RD sensors.
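The ranges quoted above correspond to roughly ±2 SD around the true saturation. A minimal sketch of that arithmetic, assuming normally distributed, unbiased measurement error (the function name is illustrative):

```python
def approx_95_range(sao2, sd):
    """Approximate range containing 95% of SpO2 readings (mean +/- 2 SD)."""
    return sao2 - 2 * sd, sao2 + 2 * sd

print(approx_95_range(90, 3.0))  # LNCS: (84.0, 96.0)
print(approx_95_range(90, 1.5))  # RD:   (87.0, 93.0)
```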

This imprecision in SpO2 readings is worrying given the narrow target range of 90–95% currently recommended for extremely preterm infants, as is the systematic mean difference of almost 2% between readings from LNCS and RD sensors, which was independent of mean SpO2 and present across all batches tested.

This is particularly true because the Cochrane analysis of the NeoPROM studies reported significant and clinically relevant differences concerning the risk of death, NEC or ROP with an effective difference in SpO2 of only 2.8%.2 We believe that the difference in mean SpO2 between RD sensors and LNCS, although likely imperceptible during routine neonatal care, might be clinically relevant. Patients who are within the SpO2 target range based on LNCS readings are below target for substantial proportions of time based on RD sensor readings, likely resulting in systematically higher FiO2 settings with the use of RD sensors, which in turn may impact on clinical outcome. Therefore, a switch from LNCS to RD sensors may have the same clinical consequences as changing the SpO2-target range from 90–95% to 92–97%. Such a change may have only a debatable impact on the proportion of time with PaO2 values >80 mm Hg (eg, in the studies by Bachman et al,15 Wackernagel et al16 and Christie et al17), but its clinical impact on oxygen-related morbidity and mortality has not yet been explored.

One limitation of our study is that our data do not allow us to assess the accuracy of SpO2 readings with the different sensor types in comparison to SaO2. However, because current recommendations on SpO2 targeting are based on measurements with LNCS, we aimed to verify the agreement of newly introduced sensors with the previous ‘standard’.


Our study results show a systematic difference in SpO2 readings between RD sensors and LNCS. Particularly for NICUs that aim for the upper NeoPROM target range (91–95%, centre value 93%), this may result in an unintendedly high oxygen exposure when replacing LNCS by RD sensors without adjusting the SpO2 target range (ie, a median value of 93% with RD technology might represent a value of 95% with the LNCS). This may impact clinical outcomes in extremely preterm infants and should lead to caution when implementing changes in SpO2 technology in an NICU, irrespective of the manufacturer and also when transferring an SpO2 target range from one to another oximeter technology. Independent international standardisation of pulse oximetry technology would be desirable.

Ethics statements

Patient consent for publication

Ethics approval

This study involves human participants and was approved by the Ethics Committee of University Hospital Tuebingen, reference: 366/2021BO2. Participants gave informed consent to participate in the study before taking part.


We gratefully acknowledge the families for their willingness to participate in this study, as well as the contribution from Masimo Corporation, Irvine, for providing the required LNCS, RD and PPG sensors for this study.



  • Contributors CAM, AF, LS and CFP were involved in the study design. CAM, CES and KB collected the data and, together with AF, performed the echocardiographic examinations (all echocardiographic examinations were validated by AF). CAM analysed the data, drafted the first version of the manuscript and is responsible for the overall content as the guarantor. AF, CFP, LS, KB and CES reviewed the manuscript with respect to clinical interpretation of the data and made important contributions. All authors have reviewed and approved the final version of the manuscript.

  • Funding Masimo Corporation, Irvine, provided the required LNCS, RD and PPG sensors for this study. No other specific grants were received from any funding agency in the public, commercial or not-for-profit sectors.

  • Competing interests CFP received advisory board honoraria from Masimo, Irvine, California in 09/2020. All other authors have indicated they have no conflicts of interest relevant to this article to disclose. AF and CFP declare that Masimo generously supported SpO2 measurements in a previous and an ongoing clinical trial. In this study, Masimo also provided the required LNCS, RD and PPG sensors. However, Masimo had no influence on the design of this study, the analysis of the data or the writing of this manuscript.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.