| Sci-Tech Dictionary: odds ratio |
(statistics) The ratio of the probability of occurrence of an event to the probability of the event not occurring.
| Encyclopedia of Public Health: Odds Ratio |
The odds ratio (OR) provides a measure of the strength of relationship between two variables, most commonly an exposure and a dichotomous outcome. It is most commonly used in a case-control study, where it is defined as "the ratio of the odds of being exposed in the group with the outcome to the odds of being exposed in the group without the outcome." In the standard 2×2 epidemiological table, this ratio can be expressed as the "cross-product" (ad/bc), as seen in Table 1.
Table 1. Frequencies in a 2 × 2 Table.
| | Outcome +ve | Outcome –ve |
|---|---|---|
| Exposure +ve | a | b |
| Exposure –ve | c | d |
SOURCE: Courtesy of author.
This concept can be extended to a situation with multiple levels of exposure (e.g., low, moderate, or high exposure to an environmental contaminant). One exposure level is assigned as the "reference" level. For each of the remaining exposure levels, one divides the odds of that exposure level in the outcome positive group (compared with the reference level) by the odds of that exposure level in the outcome negative group.
The OR ranges in value from 0 to infinity. Values close to 1.0 indicate no relationship between the exposure and the outcome. Values less than 1.0 suggest a protective effect, while values greater than 1.0 suggest a causative or adverse effect of exposure.
The OR is closely connected to logistic regression. This analytic method models the natural logarithm of the odds of the outcome as a linear function of the predictor variables, so that each regression coefficient is the natural logarithm of an OR. It is a powerful and very common method for the analysis of epidemiological studies.
The OR is one of the most common measures encountered in observational epidemiology. The value of the OR for case-control research was first recognized by Jerome Cornfield in 1951. His work provided the theoretical base for the application of the case-control approach to studying disease etiology. The OR estimates the incidence-density ratio or the cumulative incidence ratio that would have been observed if it had been feasible to perform a cohort study rather than a case-control study. Depending on the method used to obtain control subjects, the OR either is identical to one of the incidence ratios or is close to them if the disease is rare. Some epidemiologists modify the term to reflect the type of study being done (e.g., prevalence odds ratio, exposure odds ratio, or disease odds ratio).
Table 2. Frequencies of Erysipelas by Obesity
| | Erysipelas | No erysipelas |
|---|---|---|
| Obese | 68 | 97 |
| Non-obese | 61 | 197 |
SOURCE: Courtesy of author.
Although mainly used for the analysis of case-control studies, the odds ratio can also be applied in cross-sectional and cohort studies. It also plays a major role in certain approaches to the meta-analysis of randomized clinical trials (e.g., the Peto method).
An example of the use of the odds ratio can be found in a paper published by A. Dupuy et al. This paper studied 129 patients with erysipelas of the leg and a control group of 294 people without erysipelas of the leg. Obesity was considered as a risk factor. Analysis of the data produced the 2×2 table shown in Table 2.
This gives an OR of (68×197)/(61×97), or about 2.3. That is, the odds of being obese are about 2.3 times as high among people with erysipelas as among people without erysipelas. This supports the suggestion that obesity increases the risk of developing erysipelas.
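The cross-product calculation can be sketched in a few lines of Python. This is only an illustration: the function name `odds_ratio` and the argument order (a, b, c, d, following the cell labels of Table 1) are choices made here, not part of the original entry.

```python
def odds_ratio(a, b, c, d):
    """Cross-product odds ratio ad/bc for a 2x2 table with cells a, b, c, d."""
    if b == 0 or c == 0:
        raise ZeroDivisionError("odds ratio is undefined when b or c is zero")
    return (a * d) / (b * c)


# Counts from the erysipelas example:
# obese cases, obese controls, non-obese cases, non-obese controls.
or_erysipelas = odds_ratio(68, 97, 61, 197)
print(round(or_erysipelas, 1))  # 2.3
```

The explicit check for a zero cell is a design choice of this sketch; other treatments of empty cells are possible.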
(SEE ALSO: Case-Control Study; Epidemiology; Statistics for Public Health)
Bibliography
Dupuy, A.; Benchikhi, H.; Roujeau, J. C.; Bernard, P.; Vaillant, L.; Chosidow, O.; Sassolas, B.; Guillaume, J. C.; Grob, J. J.; and Bastuji-Garin, S. (1999). "Risk Factors for Erysipelas of the Leg (Cellulitis): Case-Control Study." British Medical Journal 318:1591–1594.
— GEORGE WELLS
| Wikipedia: Odds ratio |
The odds ratio [1][2][3] is a measure of effect size, describing the strength of association or non-independence between two binary data values. It is used as a descriptive statistic, and plays an important role in logistic regression. Unlike other measures of association for paired binary data such as the relative risk, the odds ratio treats the two variables being compared symmetrically, and can be estimated using some types of non-random samples.
The odds ratio is the ratio of the odds of an event occurring in one group to the odds of it occurring in another group, or to a sample-based estimate of that ratio. These groups might be men and women, an experimental group and a control group, or any other dichotomous classification. If the probabilities of the event in each of the groups are p1 (first group) and p2 (second group), then the odds ratio is:
$$\frac{p_1/(1-p_1)}{p_2/(1-p_2)} = \frac{p_1/q_1}{p_2/q_2} = \frac{p_1 q_2}{p_2 q_1},$$
where qx = 1 − px. An odds ratio of 1 indicates that the condition or event under study is equally likely to occur in both groups. An odds ratio greater than 1 indicates that the condition or event is more likely to occur in the first group. And an odds ratio less than 1 indicates that the condition or event is less likely to occur in the first group. The odds ratio must be greater than or equal to zero if it is defined. It is undefined if p2q1 equals zero.[1]
The odds ratio can also be defined in terms of the joint probability distribution of two binary random variables. The joint distribution of binary random variables X and Y can be written
| | Y = 1 | Y = 0 |
|---|---|---|
| X = 1 | p11 | p10 |
| X = 0 | p01 | p00 |
where p11, p10, p01 and p00 are non-negative "cell probabilities" that sum to one. The odds for Y within the two subpopulations defined by X = 1 and X = 0 are defined in terms of the conditional probabilities given X:
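| | Y = 1 | Y = 0 |
|---|---|---|
| X = 1 | p11 / (p11 + p10) | p10 / (p11 + p10) |
| X = 0 | p01 / (p01 + p00) | p00 / (p01 + p00) |
Thus the odds ratio is
$$\frac{\dfrac{p_{11}/(p_{11}+p_{10})}{p_{10}/(p_{11}+p_{10})}}{\dfrac{p_{01}/(p_{01}+p_{00})}{p_{00}/(p_{01}+p_{00})}} = \frac{p_{11}\,p_{00}}{p_{10}\,p_{01}}.$$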
The simple expression on the right, above, is easy to remember as the product of the probabilities of the "concordant cells" (X = Y) divided by the product of the probabilities of the "discordant cells" (X ≠ Y). However note that in some applications the labeling of categories as zero and one is arbitrary, so there is nothing special about concordant versus discordant values in these applications.
If we had calculated the odds ratio based on the conditional probabilities given Y,
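| | Y = 1 | Y = 0 |
|---|---|---|
| X = 1 | p11 / (p11 + p01) | p10 / (p10 + p00) |
| X = 0 | p01 / (p11 + p01) | p00 / (p10 + p00) |
we would have gotten the same result
$$\frac{\dfrac{p_{11}/(p_{11}+p_{01})}{p_{01}/(p_{11}+p_{01})}}{\dfrac{p_{10}/(p_{10}+p_{00})}{p_{00}/(p_{10}+p_{00})}} = \frac{p_{11}\,p_{00}}{p_{10}\,p_{01}}.$$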
Other measures of effect size for binary data such as the relative risk do not have this symmetry property.
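The symmetry property can be checked numerically. The sketch below is illustrative only: the function names are invented here, and the cell probabilities (0.1, 0.3, 0.2, 0.4) are simply one valid joint distribution chosen for the check.

```python
def or_given_x(p11, p10, p01, p00):
    """Odds ratio from the conditional probabilities of Y given X."""
    odds_x1 = (p11 / (p11 + p10)) / (p10 / (p11 + p10))
    odds_x0 = (p01 / (p01 + p00)) / (p00 / (p01 + p00))
    return odds_x1 / odds_x0


def or_given_y(p11, p10, p01, p00):
    """Odds ratio from the conditional probabilities of X given Y."""
    odds_y1 = (p11 / (p11 + p01)) / (p01 / (p11 + p01))
    odds_y0 = (p10 / (p10 + p00)) / (p00 / (p10 + p00))
    return odds_y1 / odds_y0


def rr_given_x(p11, p10, p01, p00):
    """Relative risk P(Y=1 | X=1) / P(Y=1 | X=0)."""
    return (p11 / (p11 + p10)) / (p01 / (p01 + p00))


def rr_given_y(p11, p10, p01, p00):
    """Relative risk P(X=1 | Y=1) / P(X=1 | Y=0)."""
    return (p11 / (p11 + p01)) / (p10 / (p10 + p00))


cells = (0.1, 0.3, 0.2, 0.4)  # p11, p10, p01, p00
print(or_given_x(*cells), or_given_y(*cells))  # the two odds ratios agree
print(rr_given_x(*cells), rr_given_y(*cells))  # the two relative risks differ
```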
If X and Y are independent, their joint probabilities can be expressed in terms of their marginal probabilities px = P(X = 1) and py = P(Y = 1), as follows
| | Y = 1 | Y = 0 |
|---|---|---|
| X = 1 | pxpy | px(1 − py) |
| X = 0 | (1 − px)py | (1 − px)(1 − py) |
In this case, the odds ratio equals one, and conversely the odds ratio can only equal one if the joint probabilities can be factored in this way. Thus the odds ratio equals one if and only if X and Y are independent.
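The odds ratio is a function of the cell probabilities, and conversely, the cell probabilities can be recovered given knowledge of the odds ratio and the marginal probabilities P(X = 1) = p11 + p10 and P(Y = 1) = p11 + p01. If the odds ratio R differs from 1, then
$$p_{11} = \frac{1 + (p_{1\bullet} + p_{\bullet 1})(R - 1) - S}{2(R - 1)},$$
where p1• = p11 + p10, p•1 = p11 + p01, and
$$S = \sqrt{\bigl(1 + (p_{1\bullet} + p_{\bullet 1})(R - 1)\bigr)^2 + 4R(1 - R)\,p_{1\bullet}\,p_{\bullet 1}}.$$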
In the case where R = 1, we have independence, so p11 = p1•p•1.
Once we have p11, the other three cell probabilities can easily be recovered from the marginal probabilities.
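As a rough sketch of this recovery, p11 can be obtained as the smaller root of the quadratic (R − 1)p11² − [1 + (p1• + p•1)(R − 1)]p11 + R p1• p•1 = 0, which follows from writing R = p11p00 / (p10p01) in terms of p11 and the margins. The function name `cells_from_or` and its argument names are assumptions made for this illustration.

```python
import math


def cells_from_or(R, p1_dot, p_dot1):
    """Return (p11, p10, p01, p00) given odds ratio R, P(X = 1) and P(Y = 1)."""
    if math.isclose(R, 1.0):
        # Independence: the joint probability factors into the margins.
        p11 = p1_dot * p_dot1
    else:
        t = 1 + (p1_dot + p_dot1) * (R - 1)
        s = math.sqrt(t * t + 4 * R * (1 - R) * p1_dot * p_dot1)
        p11 = (t - s) / (2 * (R - 1))
    p10 = p1_dot - p11
    p01 = p_dot1 - p11
    p00 = 1 - p11 - p10 - p01
    return p11, p10, p01, p00


# Margins of 0.5 and 0.5 with R = 16 give back cells close to 0.4, 0.1, 0.1, 0.4.
print(cells_from_or(16, 0.5, 0.5))
```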
Suppose that in a sample of 100 men, 90 have drunk wine in the previous week, while in a sample of 100 women only 20 have drunk wine in the same period. The odds of a man drinking wine are 90 to 10, or 9:1, while the odds of a woman drinking wine are only 20 to 80, or 1:4 = 0.25:1. The odds ratio is thus 9/0.25, or 36, showing that men are much more likely to drink wine than women. Using the above formula for the calculation yields the same result:
$$\frac{0.9/0.1}{0.2/0.8} = \frac{0.9 \times 0.8}{0.1 \times 0.2} = \frac{0.72}{0.02} = 36.$$
The above example also shows how odds ratios are sometimes sensitive in stating relative positions: in this sample men are 90/20 = 4.5 times more likely to have drunk wine than women, but have 36 times the odds. The logarithm of the odds ratio, the difference of the logits of the probabilities, tempers this effect, and also makes the measure symmetric with respect to the ordering of groups. For example, using natural logarithms, an odds ratio of 36/1 maps to 3.584, and an odds ratio of 1/36 maps to −3.584.
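The wine example can be reproduced with a short script. The helper name `odds` and the variable names are chosen here for illustration.

```python
import math


def odds(p):
    """Odds corresponding to a probability p (with 0 <= p < 1)."""
    return p / (1 - p)


p_men, p_women = 90 / 100, 20 / 100

odds_ratio = odds(p_men) / odds(p_women)   # about 36
relative_risk = p_men / p_women            # about 4.5
log_or = math.log(odds_ratio)              # about 3.584

# The log odds ratio equals the difference of the logits, and reversing
# the order of the groups only flips its sign.
logit_diff = math.log(odds(p_men)) - math.log(odds(p_women))
log_or_reversed = math.log(odds(p_women) / odds(p_men))

print(round(odds_ratio, 3), round(relative_risk, 3), round(log_or, 3))
```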
Several approaches to statistical inference for odds ratios have been developed.
One approach to inference uses large sample approximations to the sampling distribution of the log odds ratio (the natural logarithm of the odds ratio). If we use the joint probability notation defined above, the population log odds ratio is
$$\log\left(\frac{p_{11}\,p_{00}}{p_{01}\,p_{10}}\right) = \log p_{11} + \log p_{00} - \log p_{10} - \log p_{01}.$$
If we observe data in the form of a contingency table
| | Y = 1 | Y = 0 |
|---|---|---|
| X = 1 | n11 | n10 |
| X = 0 | n01 | n00 |
then the probabilities in the joint distribution can be estimated as
| | Y = 1 | Y = 0 |
|---|---|---|
| X = 1 | p̂11 | p̂10 |
| X = 0 | p̂01 | p̂00 |
where p̂ij = nij / n, with n = n11 + n10 + n01 + n00 being the sum of all four cell counts. The sample log odds ratio is
$$L = \log\left(\frac{\hat{p}_{11}\,\hat{p}_{00}}{\hat{p}_{10}\,\hat{p}_{01}}\right) = \log\left(\frac{n_{11}\,n_{00}}{n_{10}\,n_{01}}\right).$$
The standard error for the log odds ratio is approximately
$$SE = \sqrt{\frac{1}{n_{11}} + \frac{1}{n_{10}} + \frac{1}{n_{01}} + \frac{1}{n_{00}}}.$$
This is an asymptotic approximation, and will not give a meaningful result if any of the cell counts are very small. If L is the sample log odds ratio, an approximate 95% confidence interval for the population log odds ratio is L ± 2SE. This can be mapped to exp(L − 2SE), exp(L + 2SE) to obtain a 95% confidence interval for the odds ratio. If we wish to test the hypothesis that the population odds ratio equals one, the two-sided p-value is 2P(Z < −|L|/SE), where P denotes a probability, and Z denotes a standard normal random variable.
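A minimal sketch of this large-sample procedure follows, assuming all four counts are positive. The function name `log_or_inference` is invented here, and the multiplier `z` defaults to 2 to match the approximation in the text (1.96 is the more precise normal quantile). The example counts reuse the erysipelas table, with X taken as obesity and Y as erysipelas purely for illustration.

```python
import math


def log_or_inference(n11, n10, n01, n00, z=2.0):
    """Sample log odds ratio, its approximate standard error, an approximate
    confidence interval for the odds ratio, and a two-sided p-value for OR = 1."""
    L = math.log((n11 * n00) / (n10 * n01))
    se = math.sqrt(1 / n11 + 1 / n10 + 1 / n01 + 1 / n00)
    ci = (math.exp(L - z * se), math.exp(L + z * se))
    # 2 * P(Z < -|L|/SE) for a standard normal Z equals erfc(|L| / (SE * sqrt(2))).
    p_value = math.erfc(abs(L) / (se * math.sqrt(2)))
    return L, se, ci, p_value


L, se, ci, p_value = log_or_inference(68, 97, 61, 197)
print(L, se, ci, p_value)
```

Because the interval is built on the log scale and then exponentiated, it is not symmetric around the sample odds ratio.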
An alternative approach to inference for odds ratios looks at the distribution of the data conditionally on the marginal frequencies of X and Y. An advantage of this approach is that the sampling distribution of the odds ratio can be expressed exactly.
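One way to illustrate the conditional approach: with both margins fixed, the count in the X = 1, Y = 1 cell follows Fisher's noncentral hypergeometric distribution, which reduces to the ordinary hypergeometric distribution when the odds ratio is one. The sketch below is an illustration for small tables only; the function names are invented here, and the two-sided p-value uses the common convention of summing all outcomes no more probable than the observed one.

```python
from math import comb


def conditional_pmf(n11, n10, n01, n00, psi=1.0):
    """Exact distribution of the (X=1, Y=1) count given both margins,
    for a hypothesised odds ratio psi."""
    r1, r0 = n11 + n10, n01 + n00   # row totals
    c1 = n11 + n01                  # total of the Y = 1 column
    lo, hi = max(0, c1 - r0), min(r1, c1)
    weights = {k: comb(r1, k) * comb(r0, c1 - k) * psi ** k
               for k in range(lo, hi + 1)}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


def exact_two_sided_p(n11, n10, n01, n00):
    """Two-sided exact p-value for the hypothesis that the odds ratio is one."""
    pmf = conditional_pmf(n11, n10, n01, n00, psi=1.0)
    p_obs = pmf[n11]
    # Small relative tolerance so ties with the observed probability are included.
    return sum(p for p in pmf.values() if p <= p_obs * (1 + 1e-9))


print(exact_two_sided_p(20, 10, 10, 20))
```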
Logistic regression is one way to generalize the odds ratio beyond two binary variables. Suppose we have a binary response variable Y and a binary predictor variable X, and in addition we have other predictor variables Z1, ..., Zp that may or may not be binary. If we use multiple logistic regression to regress Y on X, Z1, ..., Zp, then the estimated coefficient $\hat{\beta}_x$ for X is related to a conditional odds ratio. Specifically, at the population level
$$\exp(\beta_x) = \frac{P(Y=1 \mid X=1, Z_1, \ldots, Z_p) \,/\, P(Y=0 \mid X=1, Z_1, \ldots, Z_p)}{P(Y=1 \mid X=0, Z_1, \ldots, Z_p) \,/\, P(Y=0 \mid X=0, Z_1, \ldots, Z_p)},$$
so $\exp(\hat{\beta}_x)$ is an estimate of this conditional odds ratio. The interpretation of $\exp(\hat{\beta}_x)$ is as an estimate of the odds ratio between Y and X when the values of Z1, ..., Zp are held fixed.
If the data form a "population sample", then the cell probabilities p̂ij are interpreted as the frequencies of each of the four groups in the population as defined by their X and Y values. In many settings it is impractical to obtain a population sample, so a selected sample is used. For example, we may choose to sample units with X = 1 with a given probability f, regardless of their frequency in the population (which would necessitate sampling units with X = 0 with probability 1 − f). In this situation, our data would follow the following joint probabilities:
| | Y = 1 | Y = 0 |
|---|---|---|
| X = 1 | fp11 / (p11 + p10) | fp10 / (p11 + p10) |
| X = 0 | (1 − f)p01 / (p01 + p00) | (1 − f)p00 / (p01 + p00) |
The odds ratio p11p00 / p01p10 for this distribution does not depend on the value of f. This shows that the odds ratio (and consequently the log odds ratio) is invariant to non-random sampling based on one of the variables being studied. Note however that the standard error of the log odds ratio does depend on the value of f. This fact is exploited in two important situations:
In both these settings, the odds ratio can be calculated from the selected sample, without biasing the results relative to what would have been obtained for a population sample.
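The invariance to the sampling fraction f can be checked directly. In this sketch the population cell probabilities (0.1, 0.3, 0.2, 0.4) are illustrative values, and the function names are assumptions of the example rather than anything defined in the article.

```python
def odds_ratio(p11, p10, p01, p00):
    """Odds ratio of a 2x2 joint distribution."""
    return (p11 * p00) / (p10 * p01)


def selected_sample(p11, p10, p01, p00, f):
    """Joint probabilities when units with X = 1 are sampled with probability f
    and units with X = 0 with probability 1 - f."""
    row1, row0 = p11 + p10, p01 + p00
    return (f * p11 / row1, f * p10 / row1,
            (1 - f) * p01 / row0, (1 - f) * p00 / row0)


population = (0.1, 0.3, 0.2, 0.4)
for f in (0.1, 0.5, 0.9):
    sample = selected_sample(*population, f)
    # The odds ratio is unchanged although the cell probabilities differ.
    print(f, sample, odds_ratio(*sample))
```

The factors f and 1 − f, like the row totals, cancel in the cross-product, which is why every line prints the same odds ratio.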
Due to the widespread use of logistic regression, the odds ratio is widely used in many fields of medical and social science research. The odds ratio is commonly used in survey research, in epidemiology (for example, in case-control studies), and to express the results of some clinical trials. It is often abbreviated "OR" in reports. When data from multiple surveys are combined, it will often be expressed as "pooled OR".
In clinical studies, as well as in some other settings, the parameter of greatest interest is often the relative risk rather than the odds ratio. The relative risk is best estimated using a population sample, but if the rare disease assumption holds, the odds ratio is a good approximation to the relative risk — the odds is p / (1 − p), so when p moves towards zero, 1 − p moves towards 1, meaning that the odds approaches the risk, and the odds ratio approaches the relative risk[4]. When the rare disease assumption does not hold, the odds ratio can overestimate the relative risk[5][6][7].
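If the absolute risk in the control group is available, the odds ratio can be converted to the relative risk by:[5]
$$RR = \frac{OR}{1 - R_C + R_C \times OR}$$
where R_C is the absolute risk in the control (unexposed) group.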
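As an illustration, a conversion consistent with the definitions above is RR = OR / (1 − R_C + R_C × OR), where R_C denotes the absolute risk in the control group; it follows algebraically from OR = [p1/(1 − p1)] / [p2/(1 − p2)] and RR = p1/p2 with p2 = R_C. The function name `or_to_rr` and the example risks below are chosen for this sketch.

```python
def or_to_rr(odds_ratio, control_risk):
    """Convert an odds ratio to a relative risk, given the absolute risk
    in the control group."""
    return odds_ratio / (1 - control_risk + control_risk * odds_ratio)


def odds(p):
    return p / (1 - p)


# Common outcome: risks of 10% (exposed) and 40% (control), so RR = 0.25.
or_common = odds(0.1) / odds(0.4)
print(or_common, or_to_rr(or_common, 0.4))

# Rare outcome: risks of 0.2% and 0.1%; here the odds ratio is already
# close to the relative risk of 2.
or_rare = odds(0.002) / odds(0.001)
print(or_rare, or_to_rr(or_rare, 0.001))
```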
The sample odds ratio n11n00 / n10n01 is easy to calculate, and for moderate and large samples performs well as an estimator of the population odds ratio. When one or more of the cells in the contingency table can have a small value, the sample odds ratio can be biased and exhibit high variance. A number of alternative estimators of the odds ratio have been proposed to address this issue. One alternative estimator is the conditional maximum likelihood estimator, which conditions on the row and column margins when forming the likelihood to maximize (as in Fisher's exact test)[8]. Another alternative estimator is the Mantel-Haenszel estimator.
The following four contingency tables contain observed cell counts, along with the corresponding sample odds ratio (OR) and sample log odds ratio (LOR):
| | OR = 1, LOR = 0 | | OR = 1, LOR = 0 | | OR = 4, LOR = 1.39 | | OR = 0.25, LOR = −1.39 | |
|---|---|---|---|---|---|---|---|---|
| | Y = 1 | Y = 0 | Y = 1 | Y = 0 | Y = 1 | Y = 0 | Y = 1 | Y = 0 |
| X = 1 | 10 | 10 | 100 | 100 | 20 | 10 | 10 | 20 |
| X = 0 | 5 | 5 | 50 | 50 | 10 | 20 | 20 | 10 |
The following joint probability distributions contain the population cell probabilities, along with the corresponding population odds ratio (OR) and population log odds ratio (LOR):
| | OR = 1, LOR = 0 | | OR = 1, LOR = 0 | | OR = 16, LOR = 2.77 | | OR = 0.67, LOR = −0.41 | |
|---|---|---|---|---|---|---|---|---|
| | Y = 1 | Y = 0 | Y = 1 | Y = 0 | Y = 1 | Y = 0 | Y = 1 | Y = 0 |
| X = 1 | 0.2 | 0.2 | 0.4 | 0.4 | 0.4 | 0.1 | 0.1 | 0.3 |
| X = 0 | 0.3 | 0.3 | 0.1 | 0.1 | 0.1 | 0.4 | 0.2 | 0.4 |
| | Example 1: risk reduction | | | Example 2: risk increase | |
|---|---|---|---|---|---|
| | Experimental group (E) | Control group (C) | Total | (E) | (C) |
| Events (E) | EE = 15 | CE = 100 | 115 | EE = 75 | CE = 100 |
| Non-events (N) | EN = 135 | CN = 150 | 285 | EN = 75 | CN = 150 |
| Total subjects (S) | ES = EE + EN = 150 | CS = CE + CN = 250 | 400 | ES = 150 | CS = 250 |
| Event rate (ER) | EER = EE / ES = 0.1, or 10% | CER = CE / CS = 0.4, or 40% | N/A | EER = 0.5 (50%) | CER = 0.4 (40%) |
| Equation | Variable | Abbr. | Example 1 | Example 2 |
|---|---|---|---|---|
| EER − CER | < 0: absolute risk reduction | ARR | (−)0.3, or (−)30% | N/A |
| | > 0: absolute risk increase | ARI | N/A | 0.1, or 10% |
| (EER − CER) / CER | < 0: relative risk reduction | RRR | (−)0.75, or (−)75% | N/A |
| | > 0: relative risk increase | RRI | N/A | 0.25, or 25% |
| 1 / (EER − CER) | < 0: number needed to treat | NNT | (−)3.33 | N/A |
| | > 0: number needed to harm | NNH | N/A | 10 |
| EER / CER | relative risk | RR | 0.25 | 1.25 |
| (EE / EN) / (CE / CN) | odds ratio | OR | 0.167 | 1.5 |
| EE / (EE + CE) − EN / (EN + CN) | attributable risk | AR | (−)0.34, or (−)34% | 0.095, or 9.5% |
| (RR − 1) / RR | attributable risk percent | ARP | N/A | 20% |
| 1 − RR (or 1 − OR) | | PF | 0.75, or 75% | N/A |
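The main rows of the table can be recomputed from the four event counts. This is an illustrative sketch: the function name `risk_measures` and the dictionary keys are chosen here, and the function assumes the two event rates differ (otherwise the reciprocal is undefined).

```python
def risk_measures(ee, en, ce, cn):
    """Risk measures from events / non-events in the experimental (ee, en)
    and control (ce, cn) groups."""
    eer = ee / (ee + en)   # experimental event rate
    cer = ce / (ce + cn)   # control event rate
    diff = eer - cer       # ARR if negative, ARI if positive
    return {
        "EER": eer,
        "CER": cer,
        "risk difference": diff,
        "relative change": diff / cer,   # RRR if negative, RRI if positive
        "NNT or NNH": 1 / diff,          # NNT if negative, NNH if positive
        "RR": eer / cer,
        "OR": (ee / en) / (ce / cn),
    }


example_1 = risk_measures(15, 135, 100, 150)   # risk reduction
example_2 = risk_measures(75, 75, 100, 150)    # risk increase
for name, value in example_1.items():
    print(name, round(value, 3), round(example_2[name], 3))
```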
This entry is from Wikipedia, the leading user-contributed encyclopedia. It may not have been reviewed by professional editors.
Copyrights:
Sci-Tech Dictionary. McGraw-Hill Dictionary of Scientific and Technical Terms. Copyright © 2003, 1994, 1989, 1984, 1978, 1976, 1974 by McGraw-Hill Companies, Inc. All rights reserved.
Encyclopedia of Public Health. Copyright © 2002 by The Gale Group, Inc. All rights reserved.
Wikipedia. This article is licensed under the Creative Commons Attribution/Share-Alike License. It uses material from the Wikipedia article "Odds ratio".