The Berlin Questionnaire (BQ), an English-language screening tool for obstructive sleep apnea (OSA) in primary care, has been applied in tertiary settings, with variable results.
Aims: To develop a Portuguese version of the BQ and to evaluate its utility in a sleep disordered breathing clinic (SDBC).
Material and methods: The BQ was translated using back translation methodology and prospectively applied, prior to a cardiorespiratory sleep study, to 95 consecutive subjects referred to an SDBC with suspected OSA. OSA risk assessment was based on responses to 10 items organized in 3 categories: snoring and witnessed apneas (category 1), daytime sleepiness (category 2), high blood pressure (HBP)/obesity (category 3).
Results: In the studied sample, 67.4% were male, with a mean age of 51 ± 13 years. Categories 1, 2 and 3 were positive in 91.6, 24.2 and 66.3%, respectively. The BQ identified 68.4% of the patients as being in the high risk group for OSA and the remaining 31.6% in the low risk group. BQ sensitivity and specificity were 72.1 and 50%, respectively, for an apnea-hypopnea index (AHI) > 5, 82.6 and 44.8% for AHI > 15, and 88.4 and 39.1% for AHI > 30. Being in the high risk group for OSA did not significantly influence the probability of having the disease (positive likelihood ratio [LR] between 1.44 and 1.49). Only the items related to snoring loudness, witnessed apneas and HBP/obesity showed a statistically significant positive association with AHI, and the model combining them showed a greater discrimination capability, especially for an AHI > 5 (sensitivity 65.2%, specificity 80%, positive LR 3.26).
Conclusions: The BQ is not an appropriate screening tool for OSA in an SDBC, although snoring loudness, witnessed apneas and HBP/obesity proved to be significant questionnaire elements in this population.
Obstructive sleep apnea (OSA) syndrome is a common disorder that affects 9 to 24 % of the middle-aged adult population,1 and is characterised by repeated episodes of upper airway obstruction during sleep, intermittent arterial oxygen desaturation, increased respiratory effort and sleep disruption.2
Clinical suspicion of OSA is particularly relevant, since the syndrome has been associated with high medical 3–6 and peri-operative 7,8 morbidity and mortality, and an effective treatment is available.9
Polysomnography is the gold standard test for OSA diagnosis,10 although its availability, like that of the cardiorespiratory sleep study, may be limited. In fact, with the growing recognition of the epidemiological relevance and pathophysiological consequences of OSA, the medical community has been confronted with a rise in requests for diagnostic testing, with implications for resource rationalization.11 With this in mind, screening tools such as questionnaires and clinical models have been developed. They combine different OSA risk factors and could, in a cost-effective way, help clinicians identify patients who should be referred or given priority on the waiting list for diagnostic testing.12,13
The Berlin Questionnaire (BQ), an outcome of the Sleep in Primary Care Conference held in Berlin in April 1996, is one of the most widely recognized screening tools in this area.14 It includes 10 items organized in 3 categories concerning snoring and witnessed apneas (5 items), daytime sleepiness (4 items) and high blood pressure (HBP)/obesity (1 item). Patients are also asked to provide information on age, gender, weight, height, neck circumference and ethnicity. Classification as high or low risk for OSA is based on the responses in each category of items.14
Netzer and colleagues validated the BQ as a screening tool in primary care, where it demonstrated high internal consistency (Cronbach alpha 0.86-0.92) and performed accurately, with high sensitivity and specificity in OSA identification (86 and 77 %, respectively, for an apnea/hypopnea index [AHI] > 5, with a positive likelihood ratio [LR] of 3.79; 54 and 97 % for an AHI > 15; 17 and 97 % for an AHI > 30).14
More recently, the BQ has been applied in tertiary settings with variable results.15–19
In general cardiovascular patients, namely in a subgroup with atrial fibrillation, the BQ performed well (sensitivity of 86 % and specificity of 89 % for an AHI > 5),15 whereas in another subgroup with resistant HBP, an adapted BQ version showed a lower specificity (sensitivity of 85.5 % and specificity of 65 % for an AHI > 10).16
The same occurred in surgical patients, in whom the BQ demonstrated a moderate to high sensitivity and, again, a lower specificity (65.6 and 60 % respectively for an AHI > 5, 74.3 and 53.3 % for AHI > 15, 79.5 and 48.6 % for AHI > 30).17
Lastly, low sensitivity and specificity were found in patients undergoing pulmonary rehabilitation (62.5 and 53.8 %, respectively, for an AHI > 10, 67.2 and 52.8 % for an AHI > 15),18 as well as in a sleep clinic population (68 and 49 % respectively for an AHI > 5, 62 and 43 % for an AHI > 10, 57 and 41 % for an AHI > 15).19 It should be noted that the last study, besides being retrospective, took place in a general sleep clinic population, so the BQ had never been applied as an OSA screening tool in a specific sleep disordered breathing clinic (SDBC).
In addition, the BQ was originally developed in English and must be translated into other languages before it can be applied in other countries and contexts.
The main goals of this study were: 1) to develop a Portuguese version of the BQ; 2) to assess the usefulness of the BQ in OSA identification, compared with the AHI obtained by cardiorespiratory sleep study, in an SDBC. As a secondary objective, the authors analyzed the association between the BQ items and the AHI.
Material and methods
Translation of BQ to Portuguese language
As suggested by back translation methodology,20 the BQ was submitted to a process of translation, back translation and final review of the versions obtained. The questionnaire was translated into Portuguese by two bilingual translators (two pulmonology specialists) working independently, thus generating two Portuguese versions. These versions were translated back into English by two other bilingual individuals, compared with the original English version and discussed in order to make the adjustments necessary to obtain a single Portuguese BQ version (Annex 1), thus ensuring equivalence of meaning.
Berlin Questionnaire
Altura _____ m | Peso _____ kg | Idade _____ | Sexo Masculino/Feminino |
Escolha a resposta correcta para cada questão |
Categoria 1 |
1. Ressona?
|
Se ressona: |
2. O seu ressonar é:
|
3. Com que frequência ressona?
|
4. O seu ressonar alguma vez incomodou outras pessoas?
|
5. Alguma pessoa notou que parava de respirar durante o sono?
|
Categoria 2 |
6. Com que frequência se sente cansado ou fatigado depois de uma noite de sono?
|
7. Durante o dia, sente-se cansado, fatigado ou sem capacidade para o enfrentar?
|
8. Alguma vez “passou pelas brasas” ou adormeceu enquanto guiava?
|
Se respondeu sim |
9. Com que frequência é que isso ocorre?
|
Categoria 3 |
10. Tem tensão arterial alta?
|
Pontuação do Questionário de Berlim: |
Categoria 1: itens 1, 2, 3, 4 e 5 |
Item 1 se a resposta foi sim - 1 ponto |
Item 2 se a resposta foi c ou d - 1 ponto |
Item 3 se a resposta foi a ou b - 1 ponto |
Item 4 se a resposta foi a - 1 ponto |
Item 5 se a resposta foi a ou b - 2 pontos |
Categoria 1 é positiva se a pontuação é maior ou igual a 2 pontos |
Categoria 2: itens 6, 7 e 8 (item 9 deve ser considerado separadamente) |
Item 6 se a resposta foi a ou b - 1 ponto |
Item 7 se a resposta foi a ou b - 1 ponto |
Item 8 se a resposta foi a - 1 ponto |
Categoria 2 é positiva se a pontuação é maior ou igual a 2 pontos |
Categoria 3 é positiva se a resposta ao item 10 é sim ou se o índice de massa corporal (IMC) do doente é superior a 30 kg/m2 |
Doente de alto risco para SAOS: duas ou mais categorias com pontuação positiva |
Doente de baixo risco para SAOS: nenhuma ou apenas uma categoria com pontuação positiva |
A prospective observational study was conducted during November and December 2008. Ninety-five consecutive patients referred to the SDBC of a university hospital pulmonology department with suspected OSA, who underwent a cardiorespiratory sleep study and completed the Portuguese version of the BQ, were included.
OSA diagnosis
A domiciliary six-channel cardiorespiratory sleep study (Alphascreen; Vyasis) was the test performed for OSA diagnosis. The variables analysed were oronasal airflow and snoring (nasal cannula and microphone), pulse rate and arterial oxygen saturation (finger pulse oximetry), and thoracoabdominal movements and body position (impedance belts). The sleep data recorded by the device were manually scored by counting apnea events (airflow cessation lasting at least 10 seconds) and hypopnea episodes (airflow reduction to 20 to 50 % of the previously observed level lasting at least 10 seconds, accompanied by a 4 % dip in oxygen saturation). The total number of these episodes was divided by the sleep time in hours, giving the manual AHI. According to established recommendations,2 the AHI was the gold standard criterion used for OSA diagnosis and severity grading (AHI > 5/h, > 15/h and > 30/h were considered mild, moderate and severe OSA, respectively).
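The AHI calculation and severity grading described above can be sketched as follows. This is a minimal illustration, not the scoring software used in the study; the function names and the event counts in the example are hypothetical.

```python
def apnea_hypopnea_index(apneas: int, hypopneas: int, sleep_hours: float) -> float:
    """Return events per hour: (apneas + hypopneas) / sleep time in hours."""
    if sleep_hours <= 0:
        raise ValueError("sleep_hours must be positive")
    return (apneas + hypopneas) / sleep_hours


def osa_severity(ahi: float) -> str:
    """Grade severity with the cut-offs stated in the text (> 5, > 15, > 30 per hour)."""
    if ahi > 30:
        return "severe"
    if ahi > 15:
        return "moderate"
    if ahi > 5:
        return "mild"
    return "no OSA"


# Hypothetical recording: 60 apneas and 84 hypopneas over 6 hours of sleep.
ahi = apnea_hypopnea_index(60, 84, 6.0)
print(ahi, osa_severity(ahi))  # 24.0 moderate
```

An AHI exactly equal to a cut-off falls into the lower band here, because the text defines the bands with strict inequalities.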
BQ score
The BQ was scored (Annex 1) as previously reported by Netzer and colleagues.14 For items in categories 1 and 2, one point is assigned when a symptom is present or occurs persistently or frequently (3–4 times per week). Item 5, about witnessed apneas, is an exception: under the same conditions, two points are awarded. Category 2 includes an additional item concerning the frequency of drowsiness behind the wheel (item 9), to which no points are assigned. Categories 1 and 2 are positive when the sum of the item scores is equal to or greater than two, and category 3 is positive in the presence of HBP 21 and/or obesity (Body Mass Index [BMI] > 30 kg/m2). Positivity in two or three categories defines a high risk score for OSA, while positivity in only one or none defines low risk.
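The scoring rules above, together with the answer key printed in Annex 1, can be expressed as a short function. This is a sketch under stated assumptions: the answer letters follow the Annex 1 key, but encoding items 1 and 10 as the string "yes" and passing answers as a dictionary keyed by item number are choices made here for illustration only.

```python
def body_mass_index(weight_kg: float, height_m: float) -> float:
    """BMI in kg/m2 from the weight and height recorded on the questionnaire."""
    return weight_kg / height_m ** 2


def berlin_score(answers: dict, bmi: float) -> dict:
    """Score the BQ; `answers` maps item number to the chosen option."""
    def hit(item: int, options: set) -> bool:
        return answers.get(item) in options

    # Category 1: items 1-5, with item 5 (witnessed apneas) worth two points.
    cat1 = (hit(1, {"yes"}) + hit(2, {"c", "d"}) + hit(3, {"a", "b"})
            + hit(4, {"a"}) + 2 * hit(5, {"a", "b"}))
    # Category 2: items 6-8; item 9 is recorded but not scored.
    cat2 = hit(6, {"a", "b"}) + hit(7, {"a", "b"}) + hit(8, {"a"})
    # Category 3: reported high blood pressure or BMI above 30 kg/m2.
    cat3_positive = hit(10, {"yes"}) or bmi > 30

    positive = [cat1 >= 2, cat2 >= 2, cat3_positive]
    return {
        "category1": positive[0],
        "category2": positive[1],
        "category3": positive[2],
        "high_risk": sum(positive) >= 2,
    }


# Hypothetical patient: loud, frequent snorer, not sleepy, normotensive, obese.
answers = {1: "yes", 2: "c", 3: "a", 4: "b", 5: "c", 6: "c", 7: "c", 8: "b", 10: "no"}
result = berlin_score(answers, body_mass_index(95, 1.72))
print(result)
```

In this example categories 1 and 3 are positive and category 2 is negative, so the patient is classified as high risk.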
Statistical analysis
Data were described as mean ± standard deviation for quantitative variables and as counts and proportions for categorical variables. Statistical analysis was performed using the SPSS software (SPSS, Inc., Chicago, Illinois, USA).
BQ performance, for each cut-off of the diagnostic gold standard, was evaluated by calculating sensitivity, specificity, predictive values, LR, odds ratio and their 95 % confidence intervals, as well as the area under the receiver operating characteristic (ROC) curve (AUC).
The LR provides a direct estimation of how the BQ score changes the odds of disease: > 10 large change, 5–10 moderate change, 2–5 small change, 0.5-2 little or no change, 0.2-0.5 small change, 0.1-0.2 moderate change, < 0.1 large change.
BQ discrimination capability was considered insufficient if the area under the ROC curve was less than 0.6, acceptable when between 0.6 and 0.8 and excellent if above 0.8.
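The way an LR changes the odds of disease, and the interpretation bands given above, can be illustrated with a small sketch. The band limits are those stated in the text; how values falling exactly on a limit are assigned, the function names and the pre-test probability in the example are assumptions made for illustration.

```python
def post_test_probability(pre_test_probability: float, likelihood_ratio: float) -> float:
    """Post-test odds = pre-test odds x LR, converted back to a probability."""
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


def lr_change(lr: float) -> str:
    """Classify the size of the change in disease odds produced by an LR."""
    if lr > 10 or lr < 0.1:
        return "large"
    if lr >= 5 or lr <= 0.2:
        return "moderate"
    if lr >= 2 or lr <= 0.5:
        return "small"
    return "little or no"


def auc_band(auc: float) -> str:
    """Classify discrimination capability from the area under the ROC curve."""
    if auc < 0.6:
        return "insufficient"
    if auc <= 0.8:
        return "acceptable"
    return "excellent"


# Hypothetical case: pre-test probability 0.5 and a positive LR of 3.
print(post_test_probability(0.5, 3.0))  # 0.75
print(lr_change(3.0), auc_band(0.7))    # small acceptable
```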
Linear regression and the respective regression coefficients were used to estimate the association between each item and the gold standard. The AHI was log-transformed because it showed a skewed distribution. A separate model was estimated for each category of items, as well as a global model using only the items that showed a statistically significant positive association with the AHI (P < .1). This last model was used to calculate the modified Berlin score.
Results
General characteristics of population
Ninety-five individuals, more often male (67.4 %; n = 64), with ages between 20 and 79 years (51 ± 13) were evaluated. Studied population characteristics are presented in Table 1.
Table 1. General population characteristics
Age, years | 51 ± 13 |
Male gender, n (%) | 64 (67.4) |
Caucasian race, n (%) | 95 (100) |
Neck circumference, cm | 42 ± 4 |
Epworth sleepiness scale | 10 ± 6 |
High blood pressure, n (%) | 41 (43.2) |
Body mass index, kg/m2 | 31 ± 6 |
> 30, n (%) | 52 (54.7) |
Cardiorespiratory study | |
Minimum SaO2, % | 77 ± 12 |
Mean SaO2, % | 92 ± 4 |
Apnea/hypopnea index, events per hour | 24 ± 17 |
< 5, n (%) | 16 (16.8) |
5-15, n (%) | 33 (34.7) |
15-30, n (%) | 20 (21.1) |
> 30, n (%) | 26 (27.4) |
Berlin Questionnaire | |
Category 1, n (%) | 87 (91.6) |
Category 2, n (%) | 23 (24.2) |
Category 3, n (%) | 63 (66.3) |
Quantitative variables are shown as mean ± standard deviation. SaO2: peripheral arterial oxygen saturation.
Cardiorespiratory sleep study confirmed OSA diagnosis in most of the included individuals, 83.2 % (n = 79).
BQ categories 1 and 3 were the most frequently positive, in 91.6 % (n = 87) and 66.3 % (n = 63) of subjects respectively, whereas category 2 was positive in 24.2 % (n = 23).
BQ performance
The BQ identified 68.4 % (n = 65) of the patients as being in the high risk group for OSA and the remaining 31.6 % (n = 30) in the low risk group (Table 2). In subjects with a high risk score, OSA diagnosis was confirmed by cardiorespiratory study in 87.7 % (n = 57), while in those with a low risk score, diagnosis was excluded in only 26.7 % (n = 8). In the high risk group, the largest share of subjects, 35.4 % (n = 23), had severe OSA, 29.2 % (n = 19) had mild OSA and 23.1 % (n = 15) moderate OSA. In the low risk group, the largest share, 46.7 % (n = 14), had mild OSA, followed by the 26.7 % (n = 8) in whom the syndrome was excluded. The global agreement between BQ score and AHI was 68.4 % (n = 65).
The sensitivity and the specificity of BQ for OSA diagnosis were 72.1 and 50 %, respectively, for an AHI > 5, 82.6 and 44.8 % for an AHI > 15, 88.4 and 39.1 % for an AHI > 30 (Table 3).
Table 3. Predictive parameters of BQ
Parameter | AHI > 5 (95 % CI) | AHI > 15 (95 % CI) | AHI > 30 (95 % CI) |
Sensitivity, % | 72.1 (61.7-81.1) | 82.6 (70.2-91.5) | 88.4 (72.9-96.9) |
Specificity, % | 50 (27.8-72.2) | 44.8 (31.7-58.6) | 39.1 (28.3-50.7) |
PPV, % | 87.7 (78.3-94.1) | 58.4 (46.4-69.8) | 35.4 (24.6-47.3) |
NPV, % | 26.7 (13.4-43.5) | 73.3 (56.4-86.6) | 90 (76.1-97.4) |
Positive LR | 1.44 (0.87-2.40) | 1.49 (1.13-1.99) | 1.45 (1.15-1.84) |
Negative LR | 0.56 (0.3-1.02) | 0.39 (0.19-0.78) | 0.29 (0.1-0.89) |
Odds ratio | 2.59 (0.86-7.9) | 3.87 (1.55-10.5) | 4.93 (1.52-22.2) |
AUC | 0.611 | 0.637 | 0.638 |
AHI: apnea/hypopnea index; AUC: area under the curve; BQ: Berlin Questionnaire; CI: confidence interval; LR: likelihood ratio; NPV: negative predictive value; PPV: positive predictive value.
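As a worked check, the point estimates for the AHI > 5 cut-off can be recomputed from the counts given in the text: 65 high risk subjects, of whom 57 had OSA, and 30 low risk subjects, of whom 8 did not. The sketch below is illustrative only; the function name is hypothetical and confidence intervals are not computed.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Point estimates of test performance from a 2x2 table."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "positive_lr": sensitivity / (1 - specificity),
        "negative_lr": (1 - sensitivity) / specificity,
        "odds_ratio": (tp * tn) / (fp * fn),
        "agreement": (tp + tn) / (tp + fp + fn + tn),
    }


# AHI > 5: 57 of 65 high risk subjects had OSA; 8 of 30 low risk subjects did not.
bq_ahi5 = diagnostic_metrics(tp=57, fp=65 - 57, fn=30 - 8, tn=8)
for name, value in bq_ahi5.items():
    print(f"{name}: {value:.3f}")
```

The results agree with the reported values up to rounding: sensitivity 57/79, specificity 8/16, positive LR about 1.44, odds ratio about 2.59 and global agreement 65/95. The function does not guard against empty cells, which would cause a division by zero.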
In subjects with a low risk BQ score, there was a small change in the probability of not having moderate or severe OSA (negative LR of 0.39 [AHI > 15] and 0.29 [AHI > 30]). In the remaining situations, the BQ score produced little or no change in the probability of disease.
The AUC values of 0.611, 0.637 and 0.638 obtained for the three AHI cut-offs showed that BQ discrimination capability is at the lower limit of the acceptable range.
Discrimination capability of BQ items
Univariate linear regression demonstrated that, in category 1, only items 2 and 5 showed a statistically significant positive association with AHI that remained after multivariable adjustment (Table 4). None of the items in category 2 showed a significant crude or adjusted association with the gold standard. Category 3 showed a significant positive association with AHI in both the univariate and the multivariable analysis (Table 4).
Table 4. Linear regression of BQ items with a statistically significant positive association with log-transformed AHI
Item | Crude coefficient (95 % CI) | Adjusted coefficient (95 % CI) |
Category 1 | | |
2. Snoring loudness | 0.67 (0.2-1.14) | 0.79 (0.34-1.25) |
5. Witnessed apnea | 0.53 (0.07-0.99) | 0.36 (0.04-0.77) |
Category 3 | | |
10. HBP/obesity | 0.91 (0.53-1.28) | 0.77 (0.4-1.14) |
AUC | | |
AHI > 5 | 0.812 | |
AHI > 15 | 0.73 | |
AHI > 30 | 0.695 | |
AHI: apnea/hypopnea index; AUC: area under the curve; BQ: Berlin Questionnaire; CI: confidence interval; HBP: high blood pressure.
A final model, considering only the items that showed a statistically significant positive association with AHI, was estimated. It showed an increase in discrimination capability for all AHI cut-offs (AUC of 0.812, 0.73 and 0.695, respectively) (Table 4).
Based on this last model, a modified Berlin score was calculated: two points each for items 2 and 10, and one point for item 5. The modified Berlin score again demonstrated a greater discrimination capability than the original BQ, especially for an AHI > 5 (sensitivity 65.2 %, specificity 80 %, positive LR 3.26, AUC 0.795) (Table 5).
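The weighting of the modified Berlin score can be written out as a small sketch. The point values are those given in the text; the function name and boolean inputs are hypothetical, and no cut-off for a positive result is applied because none is stated.

```python
def modified_berlin_score(loud_snoring: bool, witnessed_apnea: bool,
                          hbp_or_obesity: bool) -> int:
    """Two points each for items 2 and 10, one point for item 5 (range 0-5)."""
    return 2 * loud_snoring + 1 * witnessed_apnea + 2 * hbp_or_obesity


# Hypothetical patient: loud snorer with HBP/obesity, no witnessed apneas.
print(modified_berlin_score(True, False, True))  # 4
```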
Table 5. Performance of modified BQ
Parameter | AHI > 5 (95 % CI) | AHI > 15 (95 % CI) | AHI > 30 (95 % CI) |
Sensitivity, % | 65.2 (54-75.5) | 78.5 (64.8-88.9) | 81.8 (63.1-93.7) |
Specificity, % | 80 (56.8-94.2) | 62.2 (47.9-75.2) | 50.7 (38.8-62.6) |
PPV, % | 94 (85.1-98.4) | 66 (52.4-77.9) | 36 (23.7-49.6) |
NPV, % | 32.4 (19.1-48) | 75.7 (60.7-87.3) | 89.2 (76.7-96.5) |
Positive LR | 3.26 (1.17-9.1) | 2.07 (1.38-3.12) | 1.66 (1.21-2.28) |
Negative LR | 0.43 (0.28-0.65) | 0.34 (0.18-0.64) | 0.35 (0.14-0.9) |
Odds ratio | 7.52 (2.15-35.3) | 6.04 (2.4-16.34) | 4.64 (1.53-17.4) |
AUC | 0.795 | 0.72 | 0.671 |
AHI: apnea/hypopnea index; AUC: area under the curve; BQ: Berlin Questionnaire; CI: confidence interval; LR: likelihood ratio; NPV: negative predictive value; PPV: positive predictive value.
Given that most SDBCs have long waiting lists, it seems attractive to use a screening tool to prioritize patients needing an OSA diagnostic test according to the probability that they will have a positive result.
When applied to an SDBC population, the BQ proved not to be a good test for either OSA diagnosis or severity grading. Across the different AHI cut-off points it showed a moderate to high sensitivity (72.1 to 88.4 %), a specificity that was low overall (39.1 to 50 %) and a low positive LR (1.44 to 1.49).
Our findings reflect, on the one hand, the high frequency of OSA in this population (83.2 %), which influences the operating characteristics of the screening test, and, on the other, the large number of both false positives and false negatives produced by the BQ.
Most enrolled subjects (68.4 %) had a high risk BQ score as a result of pre-selection at the time of referral: the referred population is predominantly composed of snoring, hypertensive and obese subjects, as shown by the high prevalence of positivity in categories 1 (91.6 %) and 3 (66.3 %).
In addition, although 87.7 % of patients with a high risk BQ score had OSA, the same did not hold for individuals at low risk, in whom the sleep study ruled out OSA in only 26.7 %, contributing to the poor overall agreement between the two tests (68.4 %).
Although polysomnography remains the gold standard, the cardiorespiratory sleep study, which has demonstrated a high accuracy in OSA diagnosis 22 and was also used in the original study by Netzer,14 was the test applied.
BQ performance in this SDBC population is in accordance with results obtained in other studies undertaken in tertiary health care settings, particularly in resistant HBP 16 and surgical patients.17
In patients undergoing pulmonary rehabilitation 18 and in the retrospective study that took place in a general sleep clinic,19 the results were even less satisfactory than those in our population. Of note, in the latter study the sleep clinic was associated with the psychiatry department of a tertiary hospital, which contributed to the high proportion of patients with positivity in category 2 (74.6 %) and which the authors postulated may have influenced BQ performance. To test this hypothesis, a shortened BQ version was developed, excluding category 2 and item 4 of category 1, the latter because it was poorly answered. Although BQ sensitivity increased, specificity remained low (80 and 42 % respectively, for an AHI > 5), similar to that of the original questionnaire, corroborating the importance of factors other than daytime sleepiness in the clinical picture of this syndrome. In fact, when analyzing the association of category 2 with AHI, the same authors found no significant correlations.
As previously reported, in our population none of the category 2 items was associated with the AHI. Only items 2 and 5 of category 1, referring to snoring loudness and witnessed apneas, and category 3, referring to HBP/obesity, were significantly associated with the diagnostic gold standard.
The model combining these three BQ items had a greater ability to discriminate OSA than the original BQ at all AHI cut-off points, especially for an AHI > 5. The abbreviated BQ version and the modified score confirmed these findings, although their predictive accuracy was not sufficient for these items alone to be recommended for OSA identification in an SDBC.
A recent meta-analysis of screening tests for OSA 12 corroborated some of these data, having found that BMI, history of hypertension and nocturnal choking are significant test elements in the more accurate prediction models. It also demonstrated that the Epworth sleepiness scale was the least accurate predictive questionnaire used in this setting, possibly because excessive daytime sleepiness occurs commonly in obese individuals without OSA, driven by mechanisms other than nighttime sleep deprivation.23
Although the BQ remains the most accurate questionnaire for predicting OSA diagnosis,12 it is not an appropriate screening tool for a high risk population in an SDBC. The BQ was translated into Portuguese and can thus be used by clinicians in our country in the contexts in which it has been validated.
Conflict of interest
The authors declare that they have no conflict of interest.