Multivariate statistics is a form of statistics encompassing the simultaneous observation and analysis of more than one statistical variable. The application of multivariate statistics is multivariate analysis. Methods of bivariate statistics, for example ANOVA and t-tests, are special cases of multivariate statistics in which two variables are involved.
There are many different models, each with its own type of analysis:
- Clustering systems assign objects into groups (called clusters) so that objects from the same cluster are more similar to each other than objects from different clusters.
- Hotelling's T-square is a generalization of Student's t statistic that is used in multivariate hypothesis testing.
- Multivariate analysis of variance (MANOVA) methods extend analysis of variance methods to cover cases where there is more than one dependent variable and where the dependent variables cannot simply be combined.
- Discriminant function or canonical variate analysis attempt to establish whether a set of variables can be used to distinguish between two or more groups.
- Regression analysis attempts to determine a linear formula that can describe how some variables respond to changes in others. Regression analyses are based on forms of the general linear model.
- Principal components analysis attempts to determine a smaller set of synthetic variables that could explain the original set.
- Redundancy analysis a canonical (constrained) variant of Principal components analysis; the synthetic variables are a linear combination of a set of explaining variables.
- Correspondence analysis or Reciprocal Averaging also attempts to determine a smaller set of synthetic variables that could explain the original set.
- Linear discriminant analysis (LDA) computes a linear predictor from two sets of normally distributed data to allow for classification of new observations.
- Logistic regression allows regression analysis to estimate and test the influence of covariates on a binary response.
- Artificial neural networks extend regression methods to non-linear multivariate models.
- Multidimensional scaling covers various algorithms to determine a set of synthetic variables that best represent the pairwise distances between records. The original method is principal coordinate analysis.
- Canonical correlation analysis tries to establish whether or not there are linear relationships among two sets of variables (covariates and response).
- Recursive partitioning creates a decision tree that strives to correctly classify members of the population based on a dichotomous dependent variable
Contents |
Software & Tools
- The Unscrambler (free-to-try commercial MVA software for Windows)
See also
- Multivariate analysis
- A/B testing
- Estimation of covariance matrices
- Hotelling's T-square distribution
- Important publications in multivariate analysis
- Multivariate testing
- Multivariate normal distribution
- Wishart distribution
- Structured data analysis (statistics)
- Univariate
References
- Professor Kim H. Esbensen. Multivariate Data Analysis -in practice (5th Edition). http://www.camo.com/introducer/.
- KV Mardia, JT Kent, and JM Bibby (1979). Multivariate Analysis. Academic Press. ISBN 0-124-712525.
External links
This entry is from Wikipedia, the leading user-contributed encyclopedia. It may not have been reviewed by professional editors (see full disclaimer)




