| Sci-Tech Dictionary: biased sample |
(statistics) A sample obtained by a procedure that incorporates a systematic error introduced by taking items from a wrong population or by favoring some elements of a population.
| Investment Dictionary: Sample Selection Bias |
A type of bias caused by choosing non-random data for statistical analysis. The bias exists due to a flaw in the sample selection process, where a subset of the data is systematically excluded due to a particular attribute. The exclusion of the subset can influence the statistical significance of the test, or produce distorted results.
Investopedia Says:
Survivorship bias is a common type of sample selection bias. For example, when back-testing an investment strategy on a large group of stocks, it may be convenient to look for securities that have data for the entire sample period. If we were going to test the strategy against 15 years' worth of stock data, we might be inclined to look for stocks that have complete information for the entire 15-year period. However, eliminating a stock that stopped trading, or that left the market after a short time, would introduce a bias into our data sample. Since we are only including stocks that lasted the full 15-year period, our final results would be flawed, because these stocks performed well enough to survive in the market.
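The effect can be illustrated with a small simulation. This is only a sketch under invented assumptions: the return distribution, the delisting threshold, and the function name `simulate_backtest` are hypothetical and do not describe any real market data. It compares the average final value of every stock that started the period with the average over survivors only.

```python
import random

def simulate_backtest(n_stocks=1000, n_years=15, delist_below=0.3, seed=0):
    """Simulate stock values; a stock is delisted once its value drops below a threshold.

    All parameters are illustrative assumptions, not calibrated to real markets.
    Returns (mean final value of all stocks, mean final value of survivors,
    number of survivors).
    """
    rng = random.Random(seed)
    final_values = []
    survived = []
    for _ in range(n_stocks):
        value = 1.0
        alive = True
        for _ in range(n_years):
            # Hypothetical annual return, floored so the value stays positive.
            annual_return = max(rng.gauss(0.05, 0.30), -0.95)
            value *= 1.0 + annual_return
            if value < delist_below:
                # The stock stops trading; its value is frozen at this level.
                alive = False
                break
        final_values.append(value)
        survived.append(alive)

    survivors = [v for v, ok in zip(final_values, survived) if ok]
    mean_all = sum(final_values) / len(final_values)
    mean_survivors = sum(survivors) / len(survivors)
    return mean_all, mean_survivors, len(survivors)

mean_all, mean_survivors, n_survivors = simulate_backtest()
print(f"all stocks:     {mean_all:.3f}")
print(f"survivors only: {mean_survivors:.3f} ({n_survivors} stocks)")
```

Because every delisted stock ends below the threshold and every survivor ends at or above it, the survivors-only average is higher than the average over the full starting universe whenever at least one stock is delisted. A back-test restricted to survivors would therefore overstate performance in this toy setting.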
| Sports Science and Medicine: biased sample |
In statistics, a population sample that is not a fair reflection of the parent population.
| Wikipedia: Sampling bias |
A biased sample is a statistical sample of a population (or non-human factors) in which not all participants are equally balanced or objectively represented.[1] It results from sampling bias (systematic error due to a non-random sample of a population[2]), which causes some members of the population to be less likely to be included than others. If the bias makes estimation of population parameters impossible, the sample is a non-probability sample.
It is also called ascertainment bias.[3][4] Ascertainment bias has basically the same definition,[5][6] but is still sometimes classified as a separate type of bias.[5]
Sampling bias is mostly classified as a subtype of selection bias,[7] sometimes specifically termed sample selection bias,[8][9] but some classify it as a separate type of bias.[10] A distinction, albeit not universally accepted, is that sampling bias undermines the external validity of a test (the ability of its results to be generalized to the rest of the population), while selection bias mainly addresses internal validity for differences or similarities found in the sample at hand. In this sense, errors occurring in the process of gathering the sample or cohort cause sampling bias, while errors in any process thereafter cause selection bias.
However, selection bias and sampling bias are often used synonymously.[11]
A biased sample causes problems because any statistic computed from that sample has the potential to be consistently erroneous. The bias can lead to an over- or underestimate of the corresponding parameter in the population. Almost every sample in practice is biased because it is practically impossible to ensure a perfectly random sample. If the degree of under-representation is small, the sample can be treated as a reasonable approximation to a random sample. Also, if the group that is under-represented does not differ markedly from the other groups in the quantity being measured, then the sample can still be treated as a reasonable approximation to a random sample.
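Both conditions can be seen in a simple two-group calculation. The sketch below assumes a population split into one group and "everyone else", with a known mean in each; the function name `expected_bias` and the numbers are hypothetical. The expected error of the sample mean is the product of two factors: how far the group's share in the sample is from its share in the population, and how far the group's mean is from the mean of the rest.

```python
def expected_bias(pop_share, sample_share, mean_group, mean_rest):
    """Expected (sample mean - population mean) for a two-group population.

    pop_share    : fraction of the population that belongs to the group
    sample_share : fraction of the sample that belongs to the group
    mean_group   : mean of the measured quantity within the group
    mean_rest    : mean of the measured quantity among everyone else
    """
    population_mean = pop_share * mean_group + (1 - pop_share) * mean_rest
    expected_sample_mean = sample_share * mean_group + (1 - sample_share) * mean_rest
    return expected_sample_mean - population_mean

# Hypothetical numbers: the group is half the population.
print(expected_bias(0.5, 0.2, 70.0, 60.0))   # group heavily under-represented
print(expected_bias(0.5, 0.48, 70.0, 60.0))  # group slightly under-represented
print(expected_bias(0.5, 0.2, 60.0, 60.0))   # group does not differ from the rest
```

The error shrinks toward zero either when the sample share approaches the population share or when the two group means coincide, which matches the two cases described above.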
The word bias in common usage has a strong negative connotation and implies a deliberate intent to mislead or other scientific fraud. In statistical usage, bias merely represents a mathematical property, whether it is deliberate, unconscious, or due to imperfections in the instruments used for observation. While some individuals might deliberately use a biased sample to produce misleading results, more often a biased sample is just a reflection of the difficulty in obtaining a truly representative sample.
Some samples use a biased statistical design which nevertheless allows the estimation of parameters. The U.S. National Center for Health Statistics, for example, deliberately oversamples from minority populations in many of its nationwide surveys in order to gain sufficient precision for estimates within these groups.[12] These surveys require the use of sample weights (see below) to produce proper estimates across all racial and ethnic groups. Provided that certain conditions are met (chiefly that the sample is drawn randomly from the entire population), these samples permit accurate estimation of population parameters.
A classic example of a biased sample and the misleading results it produced occurred in 1936. In the early days of opinion polling, the American Literary Digest magazine collected over two million postal surveys and predicted that the Republican candidate in the U.S. presidential election, Alf Landon, would beat the incumbent president, Franklin Roosevelt, by a large margin. The result was the exact opposite. The Literary Digest survey represented a sample collected from readers of the magazine, supplemented by records of registered automobile owners and telephone users. This sample over-represented rich individuals, who, as a group, were more likely to vote for the Republican candidate. In contrast, a poll of only 50,000 citizens selected by George Gallup's organization successfully predicted the result, leading to the popularity of the Gallup poll.
Another classic example occurred in the 1948 presidential election. On election night, the Chicago Tribune printed the headline DEWEY DEFEATS TRUMAN, which turned out to be mistaken. In the morning the grinning President-Elect, Harry S. Truman, was photographed holding a newspaper bearing this headline. The Tribune was mistaken because its editor trusted the results of a phone survey. Survey research was then in its infancy, and few academics realized that a sample of telephone users was not representative of the general population. Telephones were not yet widespread, and those who had them tended to be prosperous and have stable addresses. (In many cities, the Bell System telephone directory contained the same names as the Social Register.) In addition, the Gallup poll on which the Tribune based its headline was over two weeks old at the time of printing.[14]
If entire segments of the population are excluded from a sample, then there are no adjustments that can produce estimates that are representative of the entire population. But if some groups are underrepresented and the degree of underrepresentation can be quantified, then sample weights can correct the bias.
For example, a hypothetical population might include 10 million men and 10 million women. Suppose that a biased sample of 100 patients included 20 men and 80 women. A researcher could correct for this imbalance by attaching a weight of 2.5 for each male and 0.625 for each female. This would adjust any estimates to achieve the same expected value as a sample that included exactly 50 men and 50 women, unless men and women differed in their likelihood of taking part in the survey.
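The weights in this example can be reproduced as the ratio of each group's population share to its sample share, a scheme commonly called post-stratification weighting. The sketch below is a minimal illustration: the function names and the measured values (10 for every man, 20 for every woman) are invented for the purpose, and only the group counts come from the example above.

```python
def poststratification_weights(sample_counts, population_counts):
    """Weight per group = (population share) / (sample share)."""
    n_sample = sum(sample_counts.values())
    n_population = sum(population_counts.values())
    return {
        group: (population_counts[group] / n_population) * n_sample / sample_counts[group]
        for group in sample_counts
    }

def weighted_mean(values, weights):
    """Mean of values in which each observation counts in proportion to its weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Group counts from the example: 20 men and 80 women sampled
# from a population of 10 million men and 10 million women.
sample_counts = {"men": 20, "women": 80}
population_counts = {"men": 10_000_000, "women": 10_000_000}
weights = poststratification_weights(sample_counts, population_counts)
print(weights)

# Hypothetical measurements: every man scores 10, every woman scores 20.
groups = ["men"] * 20 + ["women"] * 80
values = [10.0 if g == "men" else 20.0 for g in groups]
obs_weights = [weights[g] for g in groups]

unweighted = sum(values) / len(values)
weighted = weighted_mean(values, obs_weights)
print(unweighted, weighted)
```

With these made-up values the unweighted mean is pulled toward the over-sampled women, while the weighted mean equals what a sample of 50 men and 50 women would give. The correction relies on the sample being representative within each group.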
This entry is from Wikipedia, the leading user-contributed encyclopedia. It may not have been reviewed by professional editors (see full disclaimer)
Copyrights:
Sci-Tech Dictionary. McGraw-Hill Dictionary of Scientific and Technical Terms. Copyright © 2003, 1994, 1989, 1984, 1978, 1976, 1974 by McGraw-Hill Companies, Inc. All rights reserved.
Investment Dictionary. Copyright ©2000, Investopedia.com - Owned and Operated by Investopedia Inc. All rights reserved.
Sports Science and Medicine. The Oxford Dictionary of Sports Science & Medicine. Copyright © Michael Kent 1998, 2006, 2007. All rights reserved.
Wikipedia. This article is licensed under the Creative Commons Attribution/Share-Alike License. It uses material from the Wikipedia article "Sampling bias".