Cluster analysis

(′kləs·tər ə′nal·ə·səs)

(statistics) A general approach to multivariate problems whose aim is to determine whether the individuals fall into groups or clusters.



Method of statistical analysis that groups people or things by common characteristics of interest to the researcher. Can be used to characterize the behavior or interests of various customer clusters such as yuppies so that promotion copy and design can be specifically targeted to them. Cluster analyses are frequently based upon geographic criteria so that mailings can be sent to the best clusters.



A type of multivariate analysis which aims to group a set of variables or individuals into classes, so that the objects in each class are as like each other as possible and as unlike the other classes as possible, as defined by a designated list of characteristics and indicators. In social geography, the technique can be used to create classifications of, for example, urban areas by type. In general, the classification process begins by drawing up a table of correlation coefficients of dis/similarity between each pair of objects. From here, the objects can be combined into larger and larger groups, or broken down into smaller and smaller ones.
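
The starting table of pairwise dis/similarity can be sketched as follows. This is a minimal illustration that assumes Euclidean distance as the dissimilarity measure; the area names and indicator values are made up, and `dissimilarity_table` is a hypothetical helper, not part of any library.

```python
import math

def dissimilarity_table(objects):
    """Map each unordered pair of object names to their Euclidean distance."""
    names = sorted(objects)
    table = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            table[(a, b)] = math.dist(objects[a], objects[b])
    return table

# Hypothetical urban areas scored on two invented indicators
areas = {
    "A": (1.0, 2.0),
    "B": (1.0, 3.0),
    "C": (4.0, 6.0),
}

table = dissimilarity_table(areas)

# The least dissimilar pair is the natural first candidate for grouping
closest_pair = min(table, key=table.get)
```

From a table like this, objects can be combined step by step, beginning with the closest pair.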

A technique used to differentiate between subgroups within a single collection of information gathered about a group of people or objects.


An investment approach that places securities into groups based on the correlation found among their returns. Securities with high positive correlations are grouped together and segregated from those with negative correlation. Between clusters, very little correlation should exist. Holding stocks in each cluster provides the investor with a diversified portfolio.

Investopedia Says:
Cluster analysis enables the investor to eliminate any overlap in his or her portfolio by identifying securities with related returns. This approach increases diversification, which provides the investor with a less risky portfolio. Cluster analysis has uncovered certain categories of stocks, such as cyclical and growth stocks.
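
Grouping securities by the correlation of their returns might look like the sketch below. The tickers, return series, and the 0.8 threshold are all invented for illustration, and the greedy grouping rule is only one simple possibility, not a method the definition above prescribes.

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation of two equal-length series (assumes neither is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def group_by_correlation(returns, threshold):
    """Greedy grouping: a security joins the first group in which its
    correlation with every existing member is at least `threshold`."""
    groups = []
    for name in returns:
        for group in groups:
            if all(pearson(returns[name], returns[m]) >= threshold for m in group):
                group.append(name)
                break
        else:
            groups.append([name])  # no suitable group found, start a new one
    return groups

# Hypothetical periodic returns for four made-up securities
returns = {
    "CYC1": [0.02, -0.01, 0.03, -0.02],
    "CYC2": [0.04, -0.02, 0.06, -0.04],
    "GRO1": [0.01, 0.02, 0.01, 0.02],
    "GRO2": [0.02, 0.04, 0.02, 0.04],
}

groups = group_by_correlation(returns, threshold=0.8)
```

With this data the two "CYC" series move together and the two "GRO" series move together, so they end up in separate groups; picking one holding from each group is the diversification idea described above.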




n

A complex statistical technique of data analysis of numeric scale scores, producing clusters of variables related to one another.

Wikipedia on Answers.com:

Cluster analysis (in marketing)


Cluster analysis is a class of statistical techniques that can be applied to data that exhibit “natural” groupings. Cluster analysis sorts through the raw data and groups them into clusters. A cluster is a group of relatively homogeneous cases or observations. Objects in a cluster are similar to each other. They are also dissimilar to objects outside the cluster, particularly objects in other clusters.

The diagram below illustrates the results of a survey that studied drinkers’ perceptions of spirits (alcohol). Each point represents the results from one respondent. The research indicates there are four clusters in this market. Note that the two axes represent two traits of the market; more complex cluster analyses may involve more than two traits.

[Figure: perceptual map illustrating the clusters]

Another example is the vacation travel market. Recent research has identified three clusters or market segments: 1) the demanders, who want exceptional service and expect to be pampered; 2) the escapists, who want to get away and just relax; 3) the educationalists, who want to see new things, go to museums, go on a safari, or experience new cultures.

Cluster analysis, like factor analysis and multi-dimensional scaling, is an interdependence technique: it makes no distinction between dependent and independent variables. The entire set of interdependent relationships is examined. It is similar to multi-dimensional scaling in that both examine inter-object similarity by examining the complete set of interdependent relationships. The difference is that multi-dimensional scaling identifies underlying dimensions, while cluster analysis identifies clusters. Cluster analysis is the obverse of factor analysis. Whereas factor analysis reduces the number of variables by grouping them into a smaller set of factors, cluster analysis reduces the number of observations or cases by grouping them into a smaller set of clusters.


In marketing, cluster analysis is used for

Basic procedure

  1. Formulate the problem - select the variables to which you wish to apply the clustering technique
  2. Select a distance measure - various ways of computing distance:
  3. Select a clustering procedure (see below)
  4. Decide on the number of clusters
  5. Map and interpret clusters - draw conclusions - illustrative techniques like perceptual maps, icicle plots, and dendrograms are useful
  6. Assess reliability and validity - various methods:
    • repeat analysis but use different distance measure
    • repeat analysis but use different clustering technique
    • split the data randomly into two halves and analyze each part separately
    • repeat analysis several times, deleting one variable each time
    • repeat analysis several times, using a different order each time
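
For step 6, the partitions produced by repeated runs need to be compared in some way. One option, assumed here rather than prescribed by the procedure, is the Rand index: the fraction of object pairs on which two partitions agree (both place the pair together, or both place it apart). The label lists below are invented.

```python
from itertools import combinations

def rand_index(labels_a, labels_b):
    """Fraction of object pairs treated the same way by both partitions."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same objects")
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)

# Hypothetical cluster labels for six objects from three separate runs
run_1 = [0, 0, 0, 1, 1, 1]
run_2 = [1, 1, 1, 0, 0, 0]  # same grouping as run_1, label names swapped
run_3 = [0, 0, 1, 1, 1, 1]  # one object assigned to the other cluster

same = rand_index(run_1, run_2)
shifted = rand_index(run_1, run_3)
```

A value of 1 means the two runs produced identical groupings regardless of how the clusters were numbered; lower values indicate that the solution is sensitive to the change made between runs.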

Clustering procedures

There are several types of clustering methods:

  • Non-Hierarchical clustering (k-means clustering is the best-known example)
    • first determine a cluster center, then group all objects that are within a certain distance
    • examples:
      • Sequential Threshold method - first determine a cluster center, then group all objects that are within a predetermined threshold from the center - one cluster is created at a time
      • Parallel Threshold method - simultaneously several cluster centers are determined, then objects that are within a predetermined threshold from the centers are grouped
      • Optimizing Partitioning method - first a non-hierarchical procedure is run, then objects are reassigned so as to optimize an overall criterion.
  • Hierarchical clustering
    • objects are organized into a hierarchical structure as part of the procedure
    • examples:
      • Divisive clustering - start by treating all objects as if they are part of a single large cluster, then divide the cluster into smaller and smaller clusters
      • Agglomerative clustering - start by treating each object as a separate cluster, then group them into bigger and bigger clusters
        • examples:
          • Centroid methods - clusters are merged according to the distance between their centers, with the pair whose centers are closest merged first (a centroid is the mean value for all the objects in the cluster)
          • Variance methods - clusters are generated that minimize the within-cluster variance
            • example:
              • Ward’s Procedure - at each step, merge the two clusters whose union gives the smallest increase in the total squared Euclidean distance of objects to their cluster means
          • Linkage methods - cluster objects based on the distance between them
            • examples:
              • Single Linkage method - cluster objects based on the minimum distance between them (also called the nearest neighbour rule)
              • Complete Linkage method - cluster objects based on the maximum distance between them (also called the furthest neighbour rule)
              • Average Linkage method - cluster objects based on the average distance between all pairs of objects (the two members of each pair must come from different clusters)
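
The agglomerative linkage rules in the outline above can be sketched in a few lines. This is a naive illustration that assumes Euclidean distance and stops at a chosen number of clusters; `cluster_distance` and `agglomerate` are hypothetical names, and the one-dimensional points are made up so that the rules visibly disagree.

```python
import math

def cluster_distance(c1, c2, linkage):
    """Distance between two clusters under the named linkage rule."""
    dists = [math.dist(p, q) for p in c1 for q in c2]
    if linkage == "single":      # nearest neighbour
        return min(dists)
    if linkage == "complete":    # furthest neighbour
        return max(dists)
    if linkage == "average":     # mean over all cross-cluster pairs
        return sum(dists) / len(dists)
    raise ValueError(f"unknown linkage: {linkage}")

def agglomerate(points, n_clusters, linkage="single"):
    """Start with one cluster per point and merge the closest pair
    of clusters until only n_clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        i, j = min(
            ((a, b) for a in range(len(clusters))
                    for b in range(a + 1, len(clusters))),
            key=lambda ab: cluster_distance(clusters[ab[0]], clusters[ab[1]], linkage),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Invented points with steadily widening gaps between neighbours
points = [(0.0,), (2.0,), (4.5,), (7.5,), (11.5,)]

by_single = agglomerate(points, 2, "single")
by_complete = agglomerate(points, 2, "complete")
by_average = agglomerate(points, 2, "average")
```

On this data single linkage chains the first four points together and leaves the last one alone, while complete linkage splits the points into the first two and the last three, which is one reason step 6 of the basic procedure suggests repeating the analysis with a different technique.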



