
Loss function

 
Sci-Tech Dictionary: loss function
(′lös ′fəŋk·shən)

(mathematics) In decision theory, the function, dependent upon the decision and the true underlying distributions, which expresses the loss produced in taking the decision.


Wikipedia: Loss function

In statistics, decision theory and economics, a loss function is a function that maps an event onto a real number representing the economic cost or regret associated with the event.

Less technically, in statistics a loss function represents the loss (cost in money, or loss in utility in some other sense) associated with an estimate being "wrong" (different from either a desired or a true value), as a function of a measure of the degree of wrongness (generally the difference between the estimated value and the true or desired value).

Definition

Given a random variable X over the probability space (𝒳, Σ, P_θ) determined by a parameter θ ∈ Θ, and a set A of possible actions, a decision rule is a function δ : 𝒳 → A.

A loss function is a real lower-bounded function L on Θ × A. The value L(θ, δ(X)) is the cost of action δ(X) under parameter θ.[1]
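The definition can be made concrete with a minimal sketch. The names `sample_mean_rule` and `squared_error_loss`, the sample, and the parameter value are all illustrative assumptions, not part of the definition; the action set A is taken to be the real numbers (point estimates of θ).

```python
from typing import Sequence


def sample_mean_rule(x: Sequence[float]) -> float:
    """Decision rule delta: maps an observation x to an action (here, an estimate)."""
    return sum(x) / len(x)


def squared_error_loss(theta: float, action: float) -> float:
    """Loss function L(theta, a): real-valued and bounded below by 0."""
    return (theta - action) ** 2


# Hypothetical observation and (normally unknown) parameter value.
x = [1.0, 2.0, 6.0]
theta = 2.0

action = sample_mean_rule(x)               # delta(X)
loss = squared_error_loss(theta, action)   # L(theta, delta(X))
print(action, loss)
```

Any other real lower-bounded function of (θ, a) could be substituted for the squared error used here.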

Expected loss

Because the result of the decision rule depends on the outcome of a random variable X, the value of the loss function is itself a random quantity. Both frequentist and Bayesian statistical theory base decisions on the expected value of the loss function; however, this quantity is defined differently under the two paradigms.

Frequentist risk

The expected loss in the frequentist context is obtained by taking the expected value with respect to the probability distribution P_θ of the observed data X. This is also referred to as the risk function of the decision rule δ and the parameter θ:

R(\theta, \delta) = \mathbb{E}_\theta L\big( \theta, \delta(X) \big) = \int_\mathcal{X} L\big( \theta, \delta(x) \big) \, \mathrm{d} P_\theta (x). [2]
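When the integral is not available in closed form, the risk can be approximated by simulation. The following is a sketch under stated assumptions: the data are taken to be 10 independent draws from a normal distribution with mean θ and unit variance, the decision rule is the sample mean, and the loss is squared error. The function name `risk_monte_carlo` and all parameter values are hypothetical. Under these assumptions the exact risk is 1/10, so the simulated value should be close to 0.1.

```python
import random


def sample_mean(x):
    return sum(x) / len(x)


def squared_error(theta, action):
    return (theta - action) ** 2


def risk_monte_carlo(theta, rule, loss, n_obs=10, n_sims=20000, seed=0):
    """Approximate R(theta, delta) by averaging the loss over simulated data sets."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_sims):
        # Assumed sampling model P_theta: n_obs independent N(theta, 1) draws.
        x = [rng.gauss(theta, 1.0) for _ in range(n_obs)]
        total += loss(theta, rule(x))
    return total / n_sims


estimate = risk_monte_carlo(theta=2.0, rule=sample_mean, loss=squared_error)
print(f"estimated risk: {estimate:.4f}")
```

The risk is a function of θ, so in practice the simulation would be repeated over a range of parameter values.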

A decision rule is then chosen using an optimality criterion. Some commonly used criteria are:

Minimax: Choose the decision rule with the lowest worst-case risk:
 \underset{\delta}{\operatorname{arg\,min}} \ \max_{\theta \in \Theta} \ R(\theta,\delta).
Bayes risk: If a prior for the parameter exists, choose the decision rule with the lowest risk averaged over that prior.
Invariance: Choose the optimal decision rule which satisfies an invariance requirement.
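The first two criteria can disagree, as the following sketch illustrates. It assumes a binomial model with 100 trials and unknown success probability p, squared-error loss, three hypothetical candidate rules, and a discrete uniform prior over a grid of p values standing in for a continuous prior. None of these choices come from the text above; they are only one convenient setting in which the risk can be computed exactly.

```python
from math import comb, sqrt

N_TRIALS = 100


def exact_risk(p, rule, n=N_TRIALS):
    """R(p, delta) under squared error: sum the loss over the binomial distribution."""
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k) * (rule(k, n) - p) ** 2
        for k in range(n + 1)
    )


# Hypothetical candidate decision rules mapping k successes to an estimate of p.
rules = {
    "mle": lambda k, n: k / n,
    "shrunk": lambda k, n: (k + sqrt(n) / 2) / (n + sqrt(n)),
    "constant_half": lambda k, n: 0.5,
}

grid = [i / 100 for i in range(101)]  # parameter values 0.00, 0.01, ..., 1.00
risk_table = {name: [exact_risk(p, rule) for p in grid] for name, rule in rules.items()}

# Minimax: smallest worst-case risk over the grid.
worst_case = {name: max(risks) for name, risks in risk_table.items()}
minimax_choice = min(worst_case, key=worst_case.get)

# Bayes risk: smallest risk averaged over the (assumed uniform) prior on the grid.
average = {name: sum(risks) / len(risks) for name, risks in risk_table.items()}
bayes_choice = min(average, key=average.get)

print("minimax:", minimax_choice, "| Bayes (uniform prior):", bayes_choice)
```

In this setting the "shrunk" rule has constant risk and wins under minimax, while the plain proportion k/n has the lowest average risk among the three candidates.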

Bayesian expected loss

In a Bayesian approach, the expectation is calculated using the posterior distribution π* of the parameter θ:

\rho(\pi^*,a) = \int_\Theta L(\theta, a) \, \operatorname{d} \pi^* (\theta).

One should then choose the action a* that minimises the expected loss. This results in the same action as would be chosen using the Bayes risk, but the emphasis of the Bayesian approach is on choosing the optimal action for the data actually observed. Choosing the Bayes optimal decision rule, which is a function of all possible observations, is a much more difficult problem.
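For a discrete posterior the integral reduces to a weighted sum, which can be sketched as follows. The posterior weights, the candidate actions, and the choice of absolute-error loss are hypothetical values chosen for illustration.

```python
# Hypothetical posterior pi*(theta) over three parameter values (weights sum to 1).
posterior = {0.2: 0.1, 0.5: 0.6, 0.8: 0.3}

# Hypothetical finite set of candidate actions (point estimates of theta).
actions = [0.2, 0.5, 0.8]


def loss(theta, action):
    """Absolute-error loss, assumed here only for illustration."""
    return abs(theta - action)


def posterior_expected_loss(action):
    """rho(pi*, a): the loss averaged over the posterior distribution of theta."""
    return sum(weight * loss(theta, action) for theta, weight in posterior.items())


expected = {a: posterior_expected_loss(a) for a in actions}
best_action = min(expected, key=expected.get)
print(expected, "->", best_action)
```

Only the observed data enter, through the posterior; no averaging over unobserved data sets is needed.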

Selecting a loss function

Sound statistical practice requires selecting an estimator consistent with the actual loss experienced in the context of a particular applied problem. The choice of statistical method therefore depends on knowing the losses that being wrong will incur under the problem's particular circumstances, which introduces an element of teleology into problems of scientific decision-making.

A common example involves estimating "location." Under typical statistical assumptions, the mean or average is the statistic for estimating location that minimizes the expected loss experienced under the Taguchi or squared-error loss function, while the median is the estimator that minimizes expected loss experienced under the absolute-difference loss function. Still different estimators would be optimal under other, less common circumstances.
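The contrast between the two estimators can be checked numerically. The sketch below uses a small hypothetical sample containing one outlier and searches a grid of integer candidate estimates; it minimises the loss summed over the sample, which stands in for the expected loss under the empirical distribution of the data.

```python
from statistics import mean, median

# Hypothetical sample with one large outlier.
sample = [1, 2, 3, 4, 100]
candidates = range(0, 101)  # integer candidate estimates of location


def total_squared_error(estimate):
    return sum((x - estimate) ** 2 for x in sample)


def total_absolute_error(estimate):
    return sum(abs(x - estimate) for x in sample)


best_squared = min(candidates, key=total_squared_error)
best_absolute = min(candidates, key=total_absolute_error)

print(best_squared, mean(sample))     # squared-error minimiser vs. the mean
print(best_absolute, median(sample))  # absolute-error minimiser vs. the median
```

The outlier pulls the squared-error minimiser far from the bulk of the data, while the absolute-error minimiser stays with it.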


Loss functions in economics are typically expressed in monetary terms. For example:

 \$ = \frac{\mathrm{loss}}{\mathrm{time\ period}}

Other measures of cost are possible, for example mortality or morbidity in the field of public health or safety engineering.

Loss functions are complementary to utility functions which represent benefit and satisfaction. Typically, for utility U:

\mathrm{loss} = f(k - U)

where k is some arbitrary constant.

Loss functions in Bayesian statistics

One consequence of Bayesian inference is that the loss function, even together with the experimental data, does not by itself wholly determine a decision. What matters is the relationship between the loss function and the prior probability. It is therefore possible for two different loss functions to lead to the same decision when the prior probability distributions associated with each compensate for the details of each loss function.

Combining the three elements of the prior probability, the data, and the loss function then allows decisions to be based on maximizing the subjective expected utility, a concept introduced by Leonard J. Savage.

Regret

Savage also argued that when using non-Bayesian methods such as minimax, the loss function should be based on the idea of regret: the loss associated with a decision should be the difference between the consequences of the best decision that could have been taken had the underlying circumstances been known and the consequences of the decision that was in fact taken before they were known.
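A regret table can be derived from a loss table by subtracting, within each state, the loss of the best action for that state. The sketch below uses a hypothetical two-state, two-action problem with invented loss values, then applies the minimax criterion to the regrets.

```python
# Hypothetical losses: loss[state][action].
loss = {
    "rain": {"umbrella": 1.0, "no_umbrella": 10.0},
    "sun": {"umbrella": 2.0, "no_umbrella": 0.0},
}

# Regret: loss of the action taken minus the loss of the best action for that state.
regret = {
    state: {action: value - min(row.values()) for action, value in row.items()}
    for state, row in loss.items()
}

actions = ["umbrella", "no_umbrella"]

# Minimax regret: choose the action whose worst-case regret is smallest.
worst_regret = {a: max(regret[state][a] for state in regret) for a in actions}
minimax_regret_action = min(worst_regret, key=worst_regret.get)

print(regret)
print(minimax_regret_action)
```

By construction every state has at least one action with zero regret, so regret measures only the avoidable part of the loss.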

Quadratic loss function

The use of a quadratic loss function is common, for example when using least squares techniques or Taguchi methods. It is often more mathematically tractable than other loss functions because of the properties of variances, as well as being symmetric: an error above the target causes the same loss as the same magnitude of error below the target. If the target is t, then a quadratic loss function is

\lambda(x) = C |t-x|^2

for some constant C; often the value of the constant makes no difference to a decision, and can then be ignored by setting it equal to 1.
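The link to variances can be illustrated numerically: the average quadratic loss over a set of outcomes equals their variance plus the squared distance of their mean from the target. The sketch below checks this identity, and the symmetry property, on hypothetical values; `quadratic_loss` and the numbers used are assumptions for illustration.

```python
from statistics import mean, pvariance


def quadratic_loss(x, target, c=1.0):
    """lambda(x) = C * |t - x|^2."""
    return c * abs(target - x) ** 2


target = 10.0
outcomes = [9.0, 10.0, 11.0, 12.0]  # hypothetical outcomes

# Symmetry: equal errors above and below the target cost the same.
loss_below = quadratic_loss(9.5, target)
loss_above = quadratic_loss(10.5, target)

# Average loss = population variance + squared bias relative to the target.
average_loss = mean(quadratic_loss(x, target) for x in outcomes)
decomposition = pvariance(outcomes) + (mean(outcomes) - target) ** 2

# The constant C rescales every loss but does not change which outcome is best.
best_c1 = min(outcomes, key=lambda x: quadratic_loss(x, target, c=1.0))
best_c7 = min(outcomes, key=lambda x: quadratic_loss(x, target, c=7.0))

print(loss_below, loss_above, average_loss, decomposition, best_c1, best_c7)
```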

Many common statistical methods, including t-tests, regression models, design of experiments, and much else, are built on least-squares linear model theory, which is based on the quadratic (Taguchi) loss function.

References

  1. ^ Nikulin, M.S. (2001), "Loss function", in Hazewinkel, Michiel, Encyclopaedia of Mathematics, Kluwer Academic Publishers, ISBN 978-1556080104 
  2. ^ Nikulin, M.S. (2001), "Risk of a statistical procedure", in Hazewinkel, Michiel, Encyclopaedia of Mathematics, Kluwer Academic Publishers, ISBN 978-1556080104 

Copyrights:

Sci-Tech Dictionary. McGraw-Hill Dictionary of Scientific and Technical Terms. Copyright © 2003, 1994, 1989, 1984, 1978, 1976, 1974 by McGraw-Hill Companies, Inc. All rights reserved.
Wikipedia. This article is licensed under the Creative Commons Attribution/Share-Alike License. It uses material from the Wikipedia article "Loss function".