The purpose of this post is to explain the concepts of sensitivity, specificity, predictive values, and likelihood ratios.
Screening tests (surveillance tests) are tools used to assess the likelihood that a patient may have a certain disease. They are not definitive, but a positive result heightens suspicion enough to warrant a gold standard diagnostic test to rule the diagnosis in or out. The goal of screening tests is to reduce morbidity and mortality in a population group (Maxim, Niebo, & Utell, 2014). Examples of screening tests include routine EKGs, PSA, Pap smears, and mammograms. For example, a male with an elevated PSA may have prostate cancer, BPH, or prostatitis. Positive screening results need to be compared against the established gold standard test that is regarded as definitive; in this case, a prostate biopsy is the definitive test, as it will reveal the etiology of the elevated PSA. Screening tests are less invasive and less costly, whereas the gold standard test may be more invasive, more expensive, or come too late (e.g., a finding at autopsy). Ideally, gold standard tests such as coronary angiography, breast biopsy, or colposcopy would have 100% sensitivity and specificity. In reality this is often not the case; the gold standard is simply the best test available given the clinical picture at the time (Maxim, Niebo, & Utell, 2014).
Sensitivity and Specificity
Sensitivity and specificity are two measures that evaluate the performance of medical tests. Sensitivity refers to the ability of a test to correctly identify patients who have the disease, whereas specificity refers to the ability of a test to correctly identify patients who do NOT have the disease (Akobeng, 2007). A perfect clinical test would have 100% sensitivity and 100% specificity; most tests do not achieve such performance.
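As a concrete sketch (the 2x2 counts below are hypothetical, not drawn from any of the cited studies), sensitivity and specificity fall directly out of a table of test results versus the gold standard:

```python
def sensitivity(tp, fn):
    """Proportion of diseased patients correctly flagged: TP / (TP + FN)."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """Proportion of disease-free patients correctly cleared: TN / (TN + FP)."""
    return tn / (tn + fp)

# Hypothetical 2x2 table: 100 patients with the disease, 900 without
tp, fn = 90, 10    # of the diseased, 90 test positive and 10 test negative
tn, fp = 810, 90   # of the disease-free, 810 test negative and 90 test positive

print(sensitivity(tp, fn))   # 0.9
print(specificity(tn, fp))   # 0.9
```

Note that neither number changes if the 100:900 disease ratio changes; both are properties of the test itself rather than of the population.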
The terms “sensitivity,” “specificity,” and “positive/negative predictive value” all refer to the diagnostic utility of a test. It is important to remember that predictive values depend on disease prevalence, which varies with the population being tested, whereas sensitivity and specificity are properties of the test itself.
A highly sensitive test is useful for ruling out a disease when the result is negative; likewise, a highly specific test is useful for ruling in a disease when the result is positive. In practice, many tests trade one for the other. For example, the sensitivity and specificity of the urine nitrite dipstick are 27% and 94%, respectively. If a woman whose symptomatology is suggestive of a UTI tests negative, does that mean she does not have a UTI? Because the sensitivity is so low, the clinician cannot tell. However, if the dipstick comes back positive, she may well have a UTI, as the specificity is high (van Stralen et al., 2009). When assessing sensitivity and specificity, remember the mnemonics:
- SnNOut – a highly Sensitive test, when Negative, rules OUT the disease
- SpPIn – a highly Specific test, when Positive, rules IN the disease
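The dipstick example above can be made numeric; a minimal sketch using the quoted 27%/94% figures:

```python
# Urine dipstick figures quoted above (van Stralen et al., 2009)
sens, spec = 0.27, 0.94

miss_rate = 1 - sens     # 0.73: fraction of true UTIs a negative dipstick misses
false_alarm = 1 - spec   # 0.06: fraction of UTI-free women who test positive

print(f"A negative dipstick misses {miss_rate:.0%} of true UTIs -> cannot rule out")
print(f"Only {false_alarm:.0%} of UTI-free women test positive -> a positive result rules in")
```

With a 73% miss rate, the test fails the SnNOut criterion; with only a 6% false-alarm rate, it satisfies SpPIn.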
The limitation of sensitivity and specificity alone is that they do not directly tell the clinician the probability that a given patient has the disease being tested for. If a result comes back positive, the patient will usually ask what the chance (i.e., probability) is that he or she actually has the disease; the converse question follows a negative result. These questions are answered by the positive and negative predictive values (PPV, NPV) of a diagnostic test; that is, they describe a patient’s probability of having a disease once the result is known. The drawback of PPV and NPV is that they vary with the population chosen and with disease prevalence, and should not be transferred from one setting or patient to another (Attia, 2003). For example, screening for SLE utilizes the ANA test. In the general population, this yields a low PPV. However, screening for SLE with the same test in patients who present with malar rash and joint pain yields a high PPV (Lalkhen & McCluskey, 2008).
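To see how predictive values move with prevalence, here is a small sketch; the 95%/95% test characteristics are invented for illustration, echoing the ANA example:

```python
def ppv(sens, spec, prev):
    """P(disease | positive test): true positives over all positives."""
    tp = sens * prev
    fp = (1 - spec) * (1 - prev)
    return tp / (tp + fp)

def npv(sens, spec, prev):
    """P(no disease | negative test): true negatives over all negatives."""
    tn = spec * (1 - prev)
    fn = (1 - sens) * prev
    return tn / (tn + fn)

# Same hypothetical test (95% sensitive, 95% specific) in two populations
print(ppv(0.95, 0.95, 0.01))   # general population, 1% prevalence: ~0.16
print(ppv(0.95, 0.95, 0.50))   # symptomatic patients, 50% prevalence: ~0.95
```

The test itself never changed; only the prevalence did, which is why predictive values cannot be carried from one setting to another.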
Speaking of probability, what is the probability that a patient presenting with chest pain and dyspnea has a pulmonary embolus? That depends on the patient’s pre-test probability: the estimated probability that a patient has a disease before any test is performed, based on symptoms, complaints, and/or predetermined probability. For example, compare the pre-test probability of a positive stress test in a 19-year-old female versus a 78-year-old male with diabetes and an extensive pack-year smoking history. Pre-test probability draws on clinical intuition, clinical experience, known risk factors, and gestalt. It is anchored in the prevalence of a disease, which changes as the population changes. “Population” can mean the general population or a defined group of people at risk for a particular disease. For example, the incidence of appendicitis is low in the general population, but the prevalence of appendicitis is much higher among ED patients who present with RLQ abdominal pain (The NNT, 2019).
Based on further “testing” (which may include physical assessment, history, ECG, radiologic testing, and laboratory workup), a patient’s pre-test probability may be low or high, leading the clinician to rule out or rule in the disease, respectively. Thus, if the above patient presents with complaints of SOB and chest pain and also had surgery 2 weeks ago, what is your pre-test probability of a PE? The patient’s Wells score is at least 4.5 (3 for PE being more likely than the alternatives and 1.5 for surgery in the last 4 weeks). We have not yet assessed the vital signs, criteria that would add further to the pre-test probability. The combination of those factors establishes a moderate probability of PE (20.5%), with a positive likelihood ratio of 1.3 and a negative likelihood ratio of 0.7 (Family Practice Notebook, 2018). A definitive test such as a CT angiogram is then needed. However, if the pre-test probability is low (a Wells score of 0–2 points) and the patient’s d-dimer is negative, the clinician can exclude PE.
Based on the results of this additional testing, the clinician arrives at a new probability of disease, referred to as the post-test probability. Applying likelihood ratios allows clinicians to estimate the probability that the patient has the disease in question. Likelihood ratios (LRs) do not depend on disease prevalence; they are derived from the known sensitivity and specificity of the test. LRs let clinicians quantify the probability that a patient has the disease: the larger the positive LR, the greater the likelihood of disease, and vice versa. Applying Bayes’ theorem then gives the clinician the odds of having or not having the disease. Bayes’ theorem is, in essence, the use of context in decision making: no test is 100% accurate, but applying context improves the accuracy of the diagnostic conclusion. It converts the result of the test into an actual probability by relating the test result to the pre-test probability.
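The odds form of Bayes’ theorem described above can be sketched in a few lines; the 20.5% pre-test probability and the 1.3/0.7 likelihood ratios are the figures quoted earlier, used purely for illustration:

```python
def post_test_probability(pre_test_prob, lr):
    """Odds form of Bayes' theorem:
    post-test odds = pre-test odds x likelihood ratio."""
    pre_odds = pre_test_prob / (1 - pre_test_prob)
    post_odds = pre_odds * lr
    return post_odds / (1 + post_odds)

# LRs come straight from sensitivity and specificity:
def lr_positive(sens, spec):
    return sens / (1 - spec)

def lr_negative(sens, spec):
    return (1 - sens) / spec

# Moderate pre-test probability of PE (20.5%) with the quoted LRs
print(post_test_probability(0.205, 1.3))   # positive result: ~0.25
print(post_test_probability(0.205, 0.7))   # negative result: ~0.15
```

With LRs this close to 1, neither result moves the probability far from 20.5%, which is why a definitive test such as a CT angiogram is still needed.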
Measurements of a test’s accuracy are useful, but clinicians also hold prior assumptions about a patient’s chance of having a disease based on the clinical picture, gestalt, and disease prevalence in the population. This totality forms the clinician’s impression (i.e., the probability of disease) and guides the workup needed to raise the post-test probability. While these measures of accuracy and probability help rule a disease in or out, they do not override clinical judgment and the “gut” feeling.
Akobeng, A.K. (2007). Understanding diagnostic tests 1: sensitivity, specificity and predictive values. Acta Paediatrica, 96: 338-341.
Attia, J. (2003). Moving beyond sensitivity and specificity: using likelihood ratios to help interpret diagnostic tests. Australian Prescriber, 26: 111-113.
Lalkhen, A.G., & McCluskey, A. (2008). Clinical tests: sensitivity and specificity. Continuing Education in Anesthesia, Critical Care, & Pain, 8(6): 221-223.
Maxim, L.D., Niebo, R., & Utell, M.J. (2014). Screening tests: a review with examples. Inhalation Toxicology, 26(13): 811-828.
van Stralen, K.J., Stel, V.S., Reitsma, J.B., Dekker, F.W., Zoccali, C., & Jager, K.J. (2009). Diagnostic methods I: sensitivity, specificity, and other measures of accuracy. Kidney International, 75: 1257-1263.
The NNT. (2019). Diagnostics and likelihood ratios, explained. Retrieved January 17, 2019 from http://www.thennt.com/diagnostics-and-likelihood-ratios-explained/
Family Practice Notebook. (2018). Pulmonary embolism pre-test probability. Retrieved January 21, 2019 from https://fpnotebook.com/Lung/Exam/PlmnryEmblsmPrtstPrblty.htm