The Safety Attitudes Questionnaire as a tool for benchmarking safety culture in the NICU
  1. Jochen Profit1,2,3,
  2. Jason Etchegaray4,
  3. Laura A Petersen2,3,
  4. J Bryan Sexton5,
  5. Sylvia J Hysong3,
  6. Minghua Mei2,3,
  7. Eric J Thomas4
  1. 1Department of Pediatrics, Baylor College of Medicine, Texas Children's Hospital, Houston, Texas, USA
  2. 2Section of Health Services Research, Department of Medicine, Baylor College of Medicine, Houston, Texas, USA
  3. 3Houston Veterans Affairs (VA) Health Services Research and Development Center of Excellence, Health Policy and Quality Program, Michael E DeBakey VA Medical Center, Houston, Texas, USA
  4. 4University of Texas – Memorial Hermann Center for Healthcare Quality and Safety, University of Texas Medical School, Houston, Texas, USA
  5. 5Department of Psychiatry, Duke University School of Medicine, Duke University Health System, Durham, North Carolina, USA
  1. Correspondence to Jochen Profit, Section of Neonatology, Houston Center for Quality of Care and Utilization Studies, Baylor College of Medicine, VA HSR&D (152), 2002 Holcombe Boulevard, Houston, TX 77030, USA; profit{at}


Background Neonatal intensive care unit (NICU) safety culture, as measured by the Safety Attitudes Questionnaire (SAQ), varies widely. Associations with clinical outcomes in the adult intensive care unit setting make the SAQ an attractive tool for comparing clinical performance between hospitals. Little information is available on the use of the SAQ for this purpose in the NICU setting.

Objectives To determine whether the dimensions of safety culture measured by the SAQ give consistent results when used as a NICU performance measure.

Methods Cross-sectional survey of caregivers in 12 NICUs, using the six scales of the SAQ: teamwork climate, safety climate, job satisfaction, stress recognition, perceptions of management and working conditions. NICUs were ranked by quantifying their contribution to overall risk-adjusted variation across the scales. Spearman rank correlation coefficients were used to test for consistency in scale performance. The authors then examined whether performance in the top four NICUs in one scale predicted top four performance in others.

Results There were 547 respondents in 12 NICUs. Of 15 NICU-level correlations in performance ranking, two were >0.7, seven were between 0.4 and 0.69, and the remaining six were <0.4. Comparing the distribution of top four performance across domains with a binomial distribution showed a trend towards significance (p=0.051), indicating generally consistent performance across dimensions of safety culture.

Conclusion A culture of safety permeates many aspects of patient care and organisational functioning. The SAQ may be a useful tool for comparative performance assessments among NICUs.


The National Initiative for Children's Healthcare Quality has highlighted that “promoting safety requires changing the culture of medicine to recognise that the potential for errors exists and that teamwork and communication are the basis to guarantee change”.1

A culture of safety is the shared values, attitudes, perceptions and patterns of behaviour that determine the observable degree of effort with which organisational members direct their attention and actions towards minimising patient harm.2 Of the several safety culture survey instruments described in the literature, the Safety Attitudes Questionnaire (SAQ) is widely used, has good psychometric properties3 and is associated with clinical outcomes.4–8

The SAQ measures clinician assessments of ‘the way we do things around here’, providing a snapshot of the unit-level care delivery context. Given that safety culture is associated with clinical outcomes, SAQ scores themselves might be used as a unit-level measure in comparative performance measurement. Whether the SAQ would be valuable for this purpose has not been studied; however, it does meet normative criteria in that (1) significant variation in quality of care among providers exists, (2) this variation is not random and (3) the measurement of provider performance will provide an impetus and path to improvement.

What is already known on this topic

  • Safety culture as measured by the Safety Attitudes Questionnaire (SAQ) varies across neonatal intensive care units (NICUs).

  • Most NICUs have opportunities for improvement on one or several SAQ domains.

  • Safety culture may become a target for comparative performance measurement.

What this study adds

  • NICU care giver safety culture assessments showed moderate to strong correlations across the dimensions of the SAQ.

  • The SAQ can be used to compare safety culture across NICUs.

When a performance measurement instrument measures multiple aspects of quality, it is important to determine whether performance across these aspects is consistent.9 10 High performance consistency suggests that care quality can be classified with a high degree of confidence, therefore supporting the use of the SAQ for purposes of comparative performance assessment. Performance consistency across the SAQ's domains would suggest that the instrument reflects care giver perceptions of a unified systems-based construct that permeates the care delivery system. Information from ongoing comparative measurement of safety culture in the neonatal intensive care unit (NICU) setting would offer important complementary information to current measurements based solely on clinical outcomes.

This study examines the extent to which the SAQ detects consistency of performance across NICUs.


Sample and procedure

The SAQ (intensive care unit (ICU) Version) was administered to all caregivers in 12 NICUs in a faith-based non-profit health system in July and August 2004. All staff with a ≥50% commitment to the NICU for at least the four consecutive weeks prior to survey administration were invited to participate. This included critical care and other staff physicians, fellows/residents, critical care registered nurses, charge nurses, nurse managers, pharmacists, respiratory therapists and nursing assistants/aides. There were no physician respondents in two NICUs because those physicians were assigned to complete surveys for other paediatric units where they met inclusion criteria more fully (ie, they spent significantly more time in units other than the NICU). Surveys were administered during pre-existing departmental and staff meetings, together with a pencil and a sealable return envelope to maintain confidentiality. Individuals not captured in pre-existing meetings were hand-delivered a survey, pencil and return envelope. This administration technique has generated high response rates.11 The original study was approved by the Johns Hopkins University Institutional Review Board, and the analysis of a de-identified data set was approved by the Institutional Review Board at Baylor College of Medicine.


The ICU version of the SAQ contains 65 items with response scales ranging from 1 (disagree strongly) to 5 (agree strongly). In previous SAQ development work, 30 items loaded on six domains: teamwork climate, safety climate, job satisfaction, perceptions of management, stress recognition and working conditions.

The SAQ also captures respondent characteristics including job position, time at institution, gender, race/ethnicity and predominant work shift. To facilitate analysis, we grouped respondents with different job positions as follows: (1) doctors – includes all medicine and critical care physicians across all levels of training; (2) nurses – includes critical care nurses, licensed vocational nurses, nurse managers and charge nurses; and (3) ancillary personnel – includes pharmacists, nursing aides and assistants, ward clerks and respiratory therapists. Respondents designated as ‘other’ (n=34) were excluded from the analyses.

Statistical analysis

We conducted a secondary data analysis of prospectively collected data. The NICU was the unit of analysis. Negatively worded items were reverse scored. Response scores were transformed to a 100-point scale using the following equation: scale score for a respondent=(mean of the scale items−1)×25.12
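The scoring rule above can be illustrated with a minimal Python sketch. The function name, the item labels and the choice of which item is negatively worded are hypothetical and are not taken from the SAQ itself; reverse scoring is assumed to map a response r on the 1 to 5 scale to 6 − r.

```python
from statistics import mean

def scale_score(responses, reverse_items=()):
    """Convert one respondent's 1-5 item responses to a 0-100 scale score.

    responses: dict mapping a (hypothetical) item label to a response in 1..5
    reverse_items: labels of negatively worded items to reverse score
    """
    adjusted = []
    for item, r in responses.items():
        if not 1 <= r <= 5:
            raise ValueError(f"response out of range for {item}: {r}")
        # Assumption: reverse scoring maps r to 6 - r on a 5-point scale.
        adjusted.append(6 - r if item in reverse_items else r)
    # Scale score = (mean of the scale items - 1) x 25, as stated in the text.
    return (mean(adjusted) - 1) * 25

# Illustrative responses only; "q3" is treated as negatively worded.
example = {"q1": 4, "q2": 5, "q3": 2}
print(round(scale_score(example, reverse_items={"q3"}), 2))
```

Under this rule a respondent who answers 1 on every item scores 0 and one who answers 5 on every item scores 100.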

Justification for aggregation

Given that culture is a unit-level phenomenon (ie, shared perceptions of different aspects of the work environment), it is necessary for researchers to demonstrate that aggregation of individual survey responses within each unit is warranted. Two statistical conditions for aggregation need to be satisfied: (1) respondents from each NICU reported similar scores for the NICU on a given item; and (2) there is significant variance for a given item between NICUs. The four main metrics to determine whether aggregation is appropriate are analysis of variance (ANOVA, one way), intraclass correlation coefficient (ICC(1)), ICC(2) and rwg(j).2 These metrics were computed for each scale of the SAQ domains. We then calculated a ‘composite score’ from the arithmetic mean of the six SAQ scale scores.
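For readers unfamiliar with these aggregation metrics, the sketch below shows one common way to compute them. It assumes the widely used ANOVA-based formulas for ICC(1) and ICC(2) and a uniform null distribution for rwg(j); the source does not state which exact formulas or software the authors used, and the function names and example data are hypothetical.

```python
from statistics import mean, variance

def anova_mean_squares(groups):
    """Between- and within-unit mean squares from a one-way ANOVA."""
    all_scores = [x for g in groups for x in g]
    grand = mean(all_scores)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    return ss_between / df_between, ss_within / df_within

def icc1_icc2(groups):
    """ICC(1) and ICC(2) from the ANOVA mean squares."""
    msb, msw = anova_mean_squares(groups)
    # Assumption: the average unit size stands in for k when sizes differ.
    k = mean([len(g) for g in groups])
    icc1 = (msb - msw) / (msb + (k - 1) * msw)
    icc2 = (msb - msw) / msb
    return icc1, icc2

def rwg_j(item_responses, n_options=5):
    """Within-unit agreement for a multi-item scale in one unit.

    item_responses: one list of respondent answers per item.
    """
    j = len(item_responses)
    sigma_eu = (n_options ** 2 - 1) / 12  # variance of a uniform null
    ratio = mean([variance(item) for item in item_responses]) / sigma_eu
    return j * (1 - ratio) / (j * (1 - ratio) + ratio)

# Illustrative data only: scale scores for respondents in three units.
units = [[70, 75, 80], [50, 55, 60], [85, 90, 95]]
print(icc1_icc2(units))
# Illustrative data only: three items answered by four respondents in one unit.
print(rwg_j([[4, 4, 5, 4], [5, 4, 4, 4], [4, 5, 4, 4]]))
```

In this formulation ICC(1) reflects how much of the variance in individual scores is attributable to unit membership, ICC(2) reflects the reliability of the unit means, and rwg(j) reflects agreement among respondents within a single unit.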

Basic descriptive analyses of unadjusted scale scores are published elsewhere.13 Here we present the variation in scale scores across NICUs adjusted for site and respondent mix, with which they are significantly associated. For each scale we ranked NICUs according to their composite score.

To what extent can the SAQ detect consistency of performance across NICUs?

We examined the degree to which superior NICU care giver assessments in one key scale (safety climate) were associated with superior perceptions in the other scales. We used two approaches to test for consistency. First, we transformed ratings on individual scales into ranks and tested for correlation across ranks using the Spearman rank correlation coefficient. Second, we compared the distribution of ranking in the top four NICUs across scales to a binomial distribution using a χ2 test.9 A negative test indicates independence of performance among scales. For all analyses, we considered two-sided p values of <0.05 as statistically significant.
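The first approach, rank transformation followed by a Spearman correlation, can be sketched as follows. This is a plain Python illustration rather than the authors' code; the function names and the unit scores are hypothetical, and tied scores are assumed to receive the average of the ranks they span.

```python
def ranks(values):
    """Rank values from highest (rank 1) to lowest, averaging tied ranks."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of positions tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Illustrative adjusted scale scores for five hypothetical units.
safety_climate = [80, 72, 65, 90, 70]
teamwork_climate = [78, 70, 68, 88, 60]
print(round(spearman(safety_climate, teamwork_climate), 3))
```

Repeating this calculation for every pair of the six scales yields the 15 pairwise correlations examined in this study.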


Characteristics of respondents and NICUs

We received completed surveys from 547 of 639 respondents for a response rate of 86% (range 69%–100%). Table 1 describes the characteristics of the survey sample. Of note is the strong preponderance of the aggregated category of nursing respondents (82%) in this sample, especially when compared to doctors (5%). NICUs A and I had only 10 and 12 respondents, respectively.

Table 1

Respondent and NICU characteristics

Justification of aggregation of scale scores to a composite score

Table 2 shows the details of the psychometric results justifying aggregation to a composite score. These metrics were acceptable for all of the SAQ dimensions except stress recognition. For stress recognition, the one-way ANOVA was not significant (p<0.06), ICC(1)=0.02 was outside typical values of 0.05–0.30, and ICC(2)=0.69 and rwg(j)=0.64 were just below the traditional 0.70 cut-off level needed. While we decided to aggregate all scales, we are cautious in our interpretation of stress recognition.

Table 2

Statistical justification for aggregation of scale scores to a composite score

Performance on composite safety culture score and individual domains

Table 3 describes the range of NICU performance on the studied quality domains. We display adjusted results and NICU ranks. Rankings across domains were quite stable, indicating that performance tracks across domains. NICU performance within domains was quite variable except for stress recognition (range 54–64).

Table 3

Adjusted composite and mean scale scores and (ranks)

To what extent can the SAQ detect consistency of performance across NICUs?

Table 4 displays the NICU level rank correlation matrix among quality domains. Except for the stress recognition domain, correlations were moderate to strong. Of 15 NICU level rank correlations, six were significant at p<0.05. Correlations between pairs of safety culture domains were strong (ρ≥0.7) for two pairs, moderate (ρ=0.4–0.69) for seven pairs, weak (ρ=0.2–0.39) for three pairs and absent (ρ≤0.2) for three pairs.

Table 4

Correlation of NICU ranks across scales

High performance of NICUs was consistent across SAQ domains. The number of times NICUs were among the top four NICUs (a ‘high performer’) for the six safety attitudes domains ranged from none (never in the top four) to five. Figure 1 shows the observed and expected distribution under an assumption that ‘high performance’ on different domains occurs at random (according to a binomial distribution in which the probability of success on each trial is 0.25 and the six trials are independent). There was a trend towards significance in the difference between the actual and the binomial distributions (p=0.051), suggesting that one can infer high overall performance based on performance on individual domains.

Figure 1

Distribution of rankings in the top four NICUs across scales compared to a binomial distribution. The binomial distribution indicates expected random performance of NICUs across safety culture scales. If actual NICUs are consistently high or poor performers, the distribution should be U-shaped. We indeed found an approximate U-shaped distribution among our study sample (χ2 value=9.43, p=0.051). NICUs, neonatal intensive care units.
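The expected distribution under random performance can be reproduced with a short Python sketch. It uses the parameters stated in the text (six domains, a success probability of 0.25 and 12 NICUs). The observed counts below are hypothetical placeholders, since the actual counts appear only in Figure 1, and the source does not describe how sparse categories were handled in the χ2 test, so this sketch is not expected to reproduce the reported χ2 value.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def expected_counts(n_units=12, n_domains=6, p_top=0.25):
    """Expected number of units with 0..n_domains top-four rankings."""
    return [n_units * binomial_pmf(k, n_domains, p_top)
            for k in range(n_domains + 1)]

def chi_square_statistic(observed, expected):
    """Pearson chi-square statistic comparing observed and expected counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

expected = expected_counts()
for k, e in enumerate(expected):
    print(f"{k} top-four rankings: expected {e:.2f} NICUs")

# Hypothetical observed counts (must sum to 12); not the study's data.
observed = [4, 2, 1, 2, 2, 1, 0]
print(chi_square_statistic(observed, expected))
```

Under the binomial model most units would be expected to rank in the top four on one or two domains, so an excess of units with either no or many top-four rankings is what signals consistent performance.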


In this study, we examined the SAQ as a tool for comparative performance assessment of safety culture among 12 NICUs. The most notable conclusion is that while there is wide variation of performance within domains of the SAQ, NICUs were quite consistent in their performance across domains.

The consistency of NICU performance across domains of the SAQ implies that performance on one subscale predicts performance on another. This suggests that the different scales of the SAQ may measure a cohesive underlying construct. NICUs with high performance on safety also value teamwork and have better working conditions, relationships with management and job satisfaction. This result makes the SAQ an attractive tool for comparative measurement of safety culture among NICUs.

Comparative measurement of safety culture in the NICU setting may be particularly salient as preterm infants are fragile, often very ill and exposed to complex and prolonged intensive healthcare interventions. These circumstances make preterm infants vulnerable to lapses in patient safety.14 In a study of voluntarily reported errors in the NICU setting, poor teamwork and poor communication contributed to errors in 9% and 22% of incidents, respectively.15 In the labour and delivery setting, poor teamwork and communication breakdowns were a root cause of perinatal deaths and injuries in 55% and 72% of cases, respectively.16 Team performance is especially important in emergent situations where a rescue team must assemble quickly, communicate clearly and collaborate effectively to avoid needless morbidity or mortality.17

Safety culture has not been widely studied in the NICU setting. Despite a clear rationale to improve safety culture and encouraging literature on positive associations with improved clinical outcomes in other areas of healthcare,4–8 it is not yet known whether and how improvements in NICU safety culture will translate into improved quality of care and outcomes for infants. In this study two of the SAQ domains, stress recognition and perceptions of management, did not link well to the others. This finding may be explained in a number of ways. First, realistically acknowledging threats to safety and quality (stress recognition) and having the requisite trust in leadership to engage meaningfully in quality improvement efforts (perceptions of management) may act as gatekeepers that subsequently allow better teamwork and safety-related norms to flourish. As such, we could expect associations between these two domains and the remaining four domains to be lower. In specific NICUs, where intense and successful quality improvement has taken place over many years, we would expect the relationships to be higher for perceptions of management in particular. Second, improvements in stress recognition and perceptions of management may only represent a first step in a series of actions an NICU needs to take to improve clinical outcomes. For example, one study found associations between a non-punitive approach to error, hospital management support for patient safety and overall perceptions of safety with incident reporting behaviour in the NICU.18 Possibly, organisations which facilitate openness in error detection and encourage learning may eventually achieve better clinical results. Third, the questions asked in this version of the SAQ related to hospital management, not unit management (current versions of the SAQ distinguish between various levels of leadership).

In an accompanying paper, we demonstrated wide variations in safety culture among this sample of NICUs.13 On the other hand, in previous work, we found little performance consistency among NICUs across various common measures of clinical quality.19 Clearly, more work in the NICU setting, including prospective hypothesis testing, is required to better understand the correlation between safety culture, clinical processes, operational processes and health outcomes.

Despite these unresolved areas of inquiry, the ability of the SAQ to capture NICU safety culture makes it attractive for comparative measurement, especially given that individual scales and items of the SAQ can be linked to specific safety interventions. For example, collaborative rounds,20 aviation based crew resource management training21 or improved communication across hierarchies22 23 improve teamwork, whereas Leadership WalkRounds24 25 or a Comprehensive Unit-based Safety Program26 improve safety. In addition, ICU care giver safety culture assessments have been shown to predict their ability to implement complex safety practices.27

Since the SAQ measures frontline worker assessments of safety culture, we believe its use for comparative performance measurement is most valid for the purposes of internal benchmarking and quality improvement. By internal benchmarking we mean here the use of the SAQ within individual NICUs or within neonatal quality collaboratives that already collect and compare clinical data. In this environment, the SAQ provides useful and complementary data to clinical quality of care measures.

Traditionally, NICUs have focused on disease-specific aspects of clinical care and devised remedies for improvement.28 29 Although this approach is intuitive and necessary, it may not address underlying preconditions which may enable many adverse outcomes. In contrast, systematic monitoring and efforts to improve safety culture may improve the system of care delivery by promoting safe and teamwork-based care of infants throughout their hospital stay.

We emphasise the importance of interpreting our results in light of the intended context of the study. For this proof of concept study, we used the mean score across the SAQ's domains as a composite index for benchmarking. Although aggregation works technically, such a score implies that all domains are equally important and that poor performance in one domain (safety climate) can be offset with good performance in another (stress recognition). A better solution would be a composite that encourages high performance in all domains. Methods are available to accomplish this,30 and we are testing these in our work on a clinical composite index for NICU care.31 In order to ensure that an SAQ composite score would be actionable, reliable and valid in the eyes of frontline workers, future research will need to test the links between safety culture domain scores and NICU outcomes that include clinical and operational metrics.

Finally, our study sample was quite small and from a single health system. Although it is possible that our results do not generalise to the wider universe of NICUs, our study is strengthened by finding consistency among even a small sample.


We found moderate to strong correlations in NICU care giver safety culture assessments across the dimensions of the SAQ. The results of the SAQ may provide a useful starting point for assessing and trending of safety culture among NICUs and can serve as a yardstick by which to assess the need for, pace of and opportunities underlying quality improvement initiatives.


In addition to thanking the NICU personnel who participated by sharing their assessments, the authors would like to acknowledge the contribution of the study staff, Christen Fullwood, Chris Holzmueller, Angelina Barbosa and Linda Marcellino.



  • Funding JP's contribution is supported in part by the Eunice Kennedy Shriver National Institute of Child Health and Human Development (#1 K23 HD056298-01, PI: Profit). LAP was a recipient of the American Heart Association Established Investigator Award (#0540043N) at the time that this work was carried out. LAP, SJH and MM also receive support from a Veterans Administration Center Grant (VA HSR&D CoE HFP90-20). SJH's contribution is supported in part by the Department of Veterans Affairs Health Services Research and Development Program (#CD 2-07-0818). EJT's effort is supported in part by grants from the Eunice Kennedy Shriver National Institute of Child Health and Human Development (#1 K24 HD053771-01, PI: Thomas and #1 PO1 HS1154401, PI: Thomas). JBS received support from the Agency for Healthcare Research and Quality (AHRQ) (grant # 1UC1HS014246). JE's effort is supported by a K02 award from AHRQ (#1 K02 HS017145-02) and the University of Texas at Houston – Memorial Hermann Center for Quality and Safety.

  • Competing interest None.

  • Ethics approval The original study was approved by the Johns Hopkins University Institutional Review Board, and the analysis of a de-identified data set was approved by the Institutional Review Board at Baylor College of Medicine.

  • Provenance and peer review Not commissioned; externally peer reviewed.