The first few techniques discussed are sensitive to the assumptions of linearity, normality, and equal variance. Examining skewness and kurtosis helps characterize each variable's distribution. It is also important to understand the extent of missing values in the observations and to decide whether to ignore them or impute values for the missing observations. Outliers are another data-quality concern, and it is important to determine whether they should be removed.
If outliers are kept, they may distort the data; if they are eliminated, the data may better satisfy the assumption of normality. The key is to understand what the outliers represent. Multiple regression is the most commonly used multivariate technique. It examines the relationship between a single metric dependent variable and two or more metric independent variables. The technique fits the linear relationship that minimizes the sum of squared residuals; therefore, the assumptions of normality, linearity, and equal variance should be checked carefully.
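The fitting step described above, minimizing the sum of squared residuals, can be sketched with NumPy's least-squares solver. The data, coefficients, and seed below are synthetic illustrations, not taken from the text.

```python
import numpy as np

# Toy data: two metric predictors and a metric response generated
# from known (hypothetical) coefficients.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
y = 1.5 + 2.0 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.1, size=50)

# Add an intercept column and solve the least-squares problem,
# i.e. minimize the sum of squared residuals.
X1 = np.column_stack([np.ones(len(X)), X])
beta, *_ = np.linalg.lstsq(X1, y, rcond=None)

print(beta)  # approximately [1.5, 2.0, -0.5]
```

With low noise, the recovered beta coefficients are close to the values used to generate the data, which is what makes the weights directly interpretable as marginal impacts.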
The beta coefficients (weights) are the marginal impacts of each variable, and the size of each weight can be interpreted directly. Multiple regression is often used as a forecasting tool. A variation, logistic regression, allows a nonmetric (typically binary) dependent variable, as the objective is to arrive at a probabilistic assessment of a binary choice. The independent variables can be either discrete or continuous. A contingency table is produced, which shows the classification of observations according to whether the observed and predicted events match.
The sum of the events that were predicted to occur and actually did occur, plus the events that were predicted not to occur and actually did not occur, divided by the total number of events, is a measure of the effectiveness of the model. This tool helps predict the choices consumers might make when presented with alternatives.
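That effectiveness measure can be computed directly from the contingency table. The counts below are made up for illustration.

```python
# Classification (contingency) table for a binary choice model:
# keys are (observed outcome, predicted outcome); counts are hypothetical.
table = {
    ("occurred", "predicted_occurred"): 40,
    ("occurred", "predicted_not"): 10,
    ("not_occurred", "predicted_occurred"): 5,
    ("not_occurred", "predicted_not"): 45,
}

# Correct predictions of both kinds, divided by the total number of events.
hits = (table[("occurred", "predicted_occurred")]
        + table[("not_occurred", "predicted_not")])
total = sum(table.values())
hit_ratio = hits / total
print(hit_ratio)  # (40 + 45) / 100 = 0.85
```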
The purpose of discriminant analysis is to correctly classify observations or people into homogeneous groups. The independent variables must be metric and must have a high degree of normality. Discriminant analysis builds a linear discriminant function, which can then be used to classify the observations.
The overall fit is assessed by looking at the degree to which the group means differ (Wilks' lambda or D2) and at how well the model classifies. To determine which variables have the most impact on the discriminant function, it is possible to look at the partial F values: the higher the partial F, the more impact that variable has on the discriminant function.
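For the two-group case, the linear discriminant function can be sketched as Fisher's discriminant: the weight vector is the inverse of the pooled within-group scatter matrix times the difference in group mean vectors. The groups (for example, buyers versus nonbuyers) and their data below are synthetic assumptions.

```python
import numpy as np

# Two hypothetical groups measured on two metric variables.
rng = np.random.default_rng(1)
g1 = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(40, 2))
g2 = rng.normal(loc=[3.0, 1.0], scale=1.0, size=(40, 2))

# Fisher's two-group discriminant: w = Sw^{-1} (m1 - m2),
# where Sw is the pooled within-group scatter matrix.
m1, m2 = g1.mean(axis=0), g2.mean(axis=0)
Sw = (np.cov(g1, rowvar=False) * (len(g1) - 1)
      + np.cov(g2, rowvar=False) * (len(g2) - 1))
w = np.linalg.solve(Sw, m1 - m2)

# Classify by projecting onto w and cutting at the midpoint of the
# projected group means (group 1 projects above the cut).
cut = ((g1 @ w).mean() + (g2 @ w).mean()) / 2
pred_g1 = (g1 @ w) > cut
pred_g2 = (g2 @ w) > cut
accuracy = (pred_g1.sum() + (~pred_g2).sum()) / (len(g1) + len(g2))
print(accuracy)
```

Because the synthetic groups are well separated, the discriminant function classifies most observations correctly, which is the kind of fit the classification table summarizes.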
This tool helps categorize people, such as buyers and nonbuyers. Multivariate analysis of variance (MANOVA) examines the relationship between several categorical independent variables and two or more metric dependent variables. Whereas analysis of variance (ANOVA) assesses the differences between groups by using t-tests for two means and F tests for three or more means, MANOVA examines the dependence relationship between a set of dependent measures across a set of groups. Typically this analysis is used in experimental designs, and usually a hypothesized relationship between the dependent measures is tested.
This technique differs slightly in that the independent variables are categorical and the dependent variables are metric. Sample size is an issue, with a sufficient number of observations needed per cell; however, with too many observations per cell (over 30), the technique loses its practical significance. Cell sizes should be roughly equal, with the largest cell not substantially larger than the smallest, because normality of the dependent variables is important in this technique. Model fit is determined by examining the equivalence of the mean vectors across groups.
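The comparison of mean vectors can be sketched with Wilks' lambda, the ratio of the determinant of the within-groups scatter matrix to that of the total scatter matrix. The two treatment groups and their data below are synthetic.

```python
import numpy as np

# Two treatment groups measured on two dependent variables.
rng = np.random.default_rng(2)
groups = [
    rng.normal(loc=[0.0, 0.0], size=(30, 2)),
    rng.normal(loc=[1.0, 0.5], size=(30, 2)),
]

# Within-groups (W) and total (T) sum-of-squares-and-cross-products
# matrices; Wilks' lambda = det(W) / det(T). Values near 1 mean the
# group mean vectors are similar; values near 0 mean they differ.
all_obs = np.vstack(groups)
grand_mean = all_obs.mean(axis=0)
W = sum((g - g.mean(axis=0)).T @ (g - g.mean(axis=0)) for g in groups)
T = (all_obs - grand_mean).T @ (all_obs - grand_mean)
wilks_lambda = np.linalg.det(W) / np.linalg.det(T)
print(wilks_lambda)
```

Because the synthetic group means differ, the statistic falls noticeably below 1; in practice it is converted to an F approximation to test significance.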
If there is a significant difference in the means, the null hypothesis can be rejected and treatment differences can be determined. When there are many variables in a research design, it is often helpful to reduce them to a smaller set of factors. Factor analysis is an interdependence technique, in which there is no dependent variable. Rather, the researcher is looking for the underlying structure of the data matrix. Ideally, the independent variables are normal and continuous, with at least three to five variables loading onto each factor. The sample size should be over 50 observations, with more than five observations per variable. Multicollinearity among the variables is generally desirable here, because the correlations among variables are the key to data reduction.
The overall measure of sampling adequacy (MSA) indicates whether the variables are sufficiently intercorrelated for factoring. There are two main factor analysis methods: common factor analysis, which extracts factors based on the variance the variables share, and principal component analysis, which extracts factors based on the total variance of the variables. Common factor analysis is used to look for the latent underlying factors, whereas principal component analysis is used to find the fewest components that explain the most variance. The first factor extracted explains the most variance. Typically, factors are extracted as long as their eigenvalues are greater than 1.
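The eigenvalue-greater-than-1 rule can be sketched by eigen-decomposing the correlation matrix. The six variables below are built from two synthetic latent factors, so two components should be retained.

```python
import numpy as np

# Six synthetic variables driven by two latent factors.
rng = np.random.default_rng(3)
f1, f2 = rng.normal(size=(2, 200))
data = np.column_stack([
    f1 + 0.3 * rng.normal(size=200),
    f1 + 0.3 * rng.normal(size=200),
    f1 + 0.3 * rng.normal(size=200),
    f2 + 0.3 * rng.normal(size=200),
    f2 + 0.3 * rng.normal(size=200),
    f2 + 0.3 * rng.normal(size=200),
])

# Each eigenvalue of the correlation matrix is the variance explained
# by one component (the eigenvalues sum to the number of variables).
corr = np.corrcoef(data, rowvar=False)
eigenvalues = np.sort(np.linalg.eigvalsh(corr))[::-1]
n_retained = int((eigenvalues > 1.0).sum())
print(eigenvalues.round(2), n_retained)  # first two eigenvalues dominate
```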
The factor loadings are the correlations between the factors and the variables. Typically, a minimum loading is required before a variable is considered to define a factor. An orthogonal rotation assumes no correlation between the factors, whereas an oblique rotation is used when some relationship between them is believed to exist. The purpose of cluster analysis is to reduce a large data set to meaningful subgroups of individuals or objects. The division is accomplished on the basis of the similarity of the objects across a set of specified characteristics. Outliers are a problem with this technique, and they are often caused by too many irrelevant variables.
The sample should be representative of the population, and it is desirable for the clustering variables to be uncorrelated. There are three main clustering methods: hierarchical, a treelike process appropriate for smaller data sets; nonhierarchical, which requires the number of clusters to be specified a priori; and a combination of the two.
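A nonhierarchical method can be sketched with k-means (Lloyd's algorithm), a common choice assumed here since the text does not name a specific algorithm; note that the number of clusters is fixed a priori. The two subgroups below are synthetic.

```python
import numpy as np

# Two well-separated synthetic subgroups on two characteristics.
rng = np.random.default_rng(4)
data = np.vstack([
    rng.normal(loc=[0.0, 0.0], scale=0.3, size=(25, 2)),
    rng.normal(loc=[4.0, 4.0], scale=0.3, size=(25, 2)),
])

def kmeans(points, k, iters=20):
    # Seed centroids with evenly spaced observations, then alternate
    # between assigning each point to its nearest centroid and moving
    # each centroid to the mean of its assigned points.
    centroids = points[:: max(len(points) // k, 1)][:k].copy()
    for _ in range(iters):
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if (labels == j).any():
                centroids[j] = points[labels == j].mean(axis=0)
    return labels, centroids

labels, centroids = kmeans(data, k=2)
print(centroids.round(1))
```

Each recovered centroid sits near the center of one subgroup; in a market-segmentation setting the centroids describe the "typical" member of each segment.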
There are four main rules for developing clusters: the clusters should be different, they should be reachable, they should be measurable, and they should be profitable (big enough to matter). This is a great tool for market segmentation.