Multivariate statistics help the researcher to summarize data and reduce the number of variables necessary to describe it.
How are these techniques used?
Most commonly, multivariate statistics are employed:
- to develop taxonomies or systems of classification
- to investigate useful ways to conceptualize or group items
- to generate hypotheses
- to test hypotheses
One researcher has this to say about factor analysis, a comment that could apply to all three techniques:
When I think of factor analysis, two words come to mind: "curiosity" and "parsimony." This seems a rather strange pair -- but not in relation to factor analysis. Curiosity means wanting to know what is there, how it works, and why it is there and why it works ... Scientists are curious. They want to know what's there and why. They want to know what is behind things. And they want to do this in as parsimonious a fashion as possible. They do not want an elaborate explanation when it is not needed ... This ideal we can call the principle of parsimony (Kerlinger, 1979).
How do these techniques differ from regression?
In multiple regression and analysis of variance, several variables are used, but one -- the dependent variable -- is predicted or explained by means of the others -- the independent variables and covariates. These are called dependence methods.
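To make the contrast concrete, here is a minimal sketch of a dependence method -- an ordinary least squares regression in Python with the statsmodels library. The data and variable names are purely illustrative.

```python
import numpy as np
import statsmodels.api as sm

# Illustrative data: two independent variables and one dependent variable
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))                               # independent variables
y = 1.5 * X[:, 0] - 0.8 * X[:, 1] + rng.normal(size=100)    # dependent variable

# Fit the dependence model: y is predicted from X
model = sm.OLS(y, sm.add_constant(X)).fit()   # add_constant supplies the intercept
print(model.summary())                        # coefficients and p-values appear here --
                                              # unlike the interdependence methods below
```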
Factor analysis, multidimensional scaling (MDS), and cluster analysis instead examine the interrelationships among variables. They are not generally used for prediction, they produce no p-value, and the researcher must interpret the output of the analysis and decide which model is best. This can be frustrating! (See cautions for novice researchers.)
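For comparison, here is a minimal sketch of the three interdependence methods using scikit-learn. The data matrix and all settings (numbers of factors, dimensions, and clusters) are illustrative assumptions, not recommendations.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis
from sklearn.manifold import MDS
from sklearn.cluster import KMeans

# Illustrative data: 50 cases measured on 6 variables
rng = np.random.default_rng(0)
data = rng.normal(size=(50, 6))

factors = FactorAnalysis(n_components=2).fit_transform(data)   # factor scores
coords  = MDS(n_components=2).fit_transform(data)               # MDS configuration
labels  = KMeans(n_clusters=3, n_init=10).fit_predict(data)     # cluster membership

# None of these outputs includes a p-value; the researcher inspects the
# loadings, the spatial configuration, and the cluster memberships and
# judges which solution is most interpretable.
```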
What are the assumptions of multivariate analyses?
All of the models require input data in the form of interrelationships: for factor analysis this means correlations. MDS and cluster analysis can accept a wider variety of input -- distances, or measures of similarity or proximity -- which makes them somewhat more flexible than factor analysis.
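As an illustration, the two kinds of input can be computed as follows (a sketch in Python with NumPy and SciPy; the data matrix is made up for the example).

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

# Illustrative data: 50 cases measured on 6 variables
rng = np.random.default_rng(0)
data = rng.normal(size=(50, 6))

# Correlation matrix (variable x variable) -- the input factor analysis works from
corr = np.corrcoef(data, rowvar=False)

# Distance matrix (case x case) -- one of the proximity forms MDS and
# cluster analysis can accept
dist = squareform(pdist(data, metric="euclidean"))

print(corr.shape, dist.shape)   # (6, 6) (50, 50)
```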
A big assumption of these methods is that the data themselves are valid. (See Trochim's Knowledge Base for a discussion of validity, especially construct validity.) Because these methods do not use the same logic of statistical inference that dependence methods do, there are no robust measures that can overcome problems in the data. These methods are only as good as the input you give them. The "garbage in, garbage out" rule definitely applies.