After running many segmentation models, researchers are left with the conundrum of deciding which ones to show the client.
While selecting a good segmentation solution can and should consider statistical outcomes, relying only on mathematical results can cause problems – namely, base size and differentiation issues. Thus, it is imperative to consider each of the following criteria when selecting the options that will be shown to the client.
Good Statistical Outcomes
First up for consideration are statistical outcomes. Segmentation models should be scrutinized for convergence and BIC, the Bayesian Information Criterion. This is the first line of defense in removing poor segmentation solutions. BIC is an index that weighs the model’s fit to the data against the number of parameters the model used, which allows the researcher to compare multiple models. When comparing segment solutions, a lower BIC indicates a better model. Models that do not converge, and models whose BIC is larger than that of a one-segment solution, should be removed from the consideration set; these solutions need no further examination. The solutions that pass this criterion are then examined with regard to base sizes.
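The statistical screen can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed workflow: it assumes each fitted model reports a log-likelihood, a parameter count, and a convergence flag, and it uses the standard formula BIC = k ln(n) - 2 ln(L). The names `Solution`, `bic`, and `statistical_screen`, and all of the numbers, are hypothetical.

```python
import math
from collections import namedtuple

# Hypothetical summary of one fitted segmentation model.
Solution = namedtuple(
    "Solution", ["n_segments", "log_likelihood", "n_params", "converged"]
)


def bic(solution, n_obs):
    """Standard BIC: k * ln(n) - 2 * ln(L). Lower is better."""
    return solution.n_params * math.log(n_obs) - 2.0 * solution.log_likelihood


def statistical_screen(solutions, n_obs):
    """Keep converged multi-segment solutions whose BIC does not exceed
    the BIC of the one-segment solution, sorted from best to worst BIC."""
    baseline = next(s for s in solutions if s.n_segments == 1)
    baseline_bic = bic(baseline, n_obs)
    kept = [
        s for s in solutions
        if s.n_segments > 1 and s.converged and bic(s, n_obs) <= baseline_bic
    ]
    return sorted(kept, key=lambda s: bic(s, n_obs))


# Made-up numbers, for illustration only.
candidates = [
    Solution(1, -5000.0, 10, True),   # one-segment baseline
    Solution(3, -4800.0, 32, True),
    Solution(4, -4750.0, 43, True),
    Solution(5, -4700.0, 54, False),  # did not converge
    Solution(6, -4990.0, 65, True),   # BIC worse than the baseline
]

shortlist = statistical_screen(candidates, n_obs=1000)
print([s.n_segments for s in shortlist])  # [4, 3]
```

In this made-up example the five-segment model is dropped for failing to converge, and the six-segment model is dropped because its BIC exceeds the one-segment baseline.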
Segment Base Sizes
Like Goldilocks, we want to find the right base sizes for the segments. Solutions with segment base sizes that are too large (>35% of the sample) or too small (<10% of the sample) become problematic. The smaller a segment is, the harder it is to find, which makes advertising to it an arduous task. Even if it can be located, a very small segment represents a very small portion of consumers, which in most cases turns into a very small return on investment. On the other hand, a very large segment may indicate a need for further delineation. Advertising to a large segment may produce messages that are too broad and too generic, missing the opportunity for tailored messaging. Thus, we want to keep solutions which feature segment base sizes in that “just right” zone. Solutions with base sizes that are too large and/or too small should be set aside unless there is compelling evidence to keep them for consideration. The solutions that pass this criterion are then examined with regard to differentiation.
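The base-size check is simple to automate. The sketch below assumes each respondent carries a single segment label and uses the 10% and 35% cutoffs mentioned above as defaults; the function names and the sample data are hypothetical.

```python
from collections import Counter


def segment_shares(labels):
    """Share of the sample that falls into each segment."""
    counts = Counter(labels)
    total = len(labels)
    return {segment: n / total for segment, n in counts.items()}


def base_size_flags(labels, min_share=0.10, max_share=0.35):
    """Return only the segments whose share falls outside the target zone."""
    flags = {}
    for segment, share in segment_shares(labels).items():
        if share < min_share:
            flags[segment] = "too small"
        elif share > max_share:
            flags[segment] = "too large"
    return flags


# Made-up assignment of 100 respondents to four segments.
labels = ["A"] * 40 + ["B"] * 30 + ["C"] * 25 + ["D"] * 5

size_flags = base_size_flags(labels)
print(size_flags)  # {'A': 'too large', 'D': 'too small'}
```

A flagged solution is a candidate for being set aside, not an automatic rejection; the cutoffs are defaults that can be relaxed when there is compelling evidence to keep a solution.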
Differentiation on Crucial Variables
In addition to base sizes and statistical criteria, differentiation between the segments is of utmost importance. Without differentiation, there is no way to distinguish one segment from the next. While not all variables may show differentiation between the segments, the most important variables should. These variables often include attitudes, needs, behaviors, and metrics such as likelihood to purchase the client’s product or brand. Variables that will be used to market to the segments should show differences. Likewise, variables that describe the segments’ personalities should be distinguishable from segment to segment. Some of these variables may have been used as inputs while others are profiling variables. Regardless, the main ones should still be distinct among segments. The solutions that have good differentiation between the segments are then examined with regard to high and low rating.
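One way to put a number on differentiation is to ask how much of a variable's total variation lies between segments rather than within them. The sketch below computes eta-squared (between-segment sum of squares divided by total sum of squares) for a single variable. This metric is an assumption chosen for illustration, not one the discussion above prescribes; comparing segment means directly or running significance tests are equally reasonable choices. The function name and data are hypothetical.

```python
from statistics import fmean


def eta_squared(values, labels):
    """Share of a variable's variance explained by segment membership.
    0.0 means identical segment means; 1.0 means all variation is
    between segments."""
    grand_mean = fmean(values)
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    if ss_total == 0:
        return 0.0  # the variable does not vary at all

    # Group the values by segment label.
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)

    ss_between = sum(
        len(vs) * (fmean(vs) - grand_mean) ** 2 for vs in groups.values()
    )
    return ss_between / ss_total


# Made-up data: four respondents in two segments.
segments = ["A", "A", "B", "B"]
purchase_intent = [1, 1, 5, 5]  # segments differ sharply
brand_awareness = [1, 5, 1, 5]  # segment means are identical

strong = eta_squared(purchase_intent, segments)
weak = eta_squared(brand_awareness, segments)
print(strong, weak)  # 1.0 0.0
```

Running this over the most important variables, and ranking solutions by how many of them show a meaningful value, gives a rough way to compare differentiation across candidate solutions.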
Identify High/Low Rating Segments
Some segmentation programs are notorious for creating segments that are high raters and low raters. This is especially true when the inputs being used in the segmentation analysis are Likert-style questions. Essentially, the program picks up on acquiescence bias and groups respondents based on how they use the rating scale. Hence, it is wise to check each of the segments to make sure that there is variability in the resulting output. Finding segments that are the highest raters or the lowest raters on many variables indicates the need to discard that solution. Admittedly, the best way to tackle this issue is at the beginning of the segmentation analysis, by reducing the number of Likert-type variables included as inputs and utilizing more binary, continuous, and even categorical variables.
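A quick screen for high and low raters is to count, for each segment, the share of variables on which it has the highest (or lowest) segment mean. The sketch below assumes a table of segment means on the rating variables; the 0.8 threshold, the function names, and the data are hypothetical and would need tuning for a real study.

```python
def extreme_rater_shares(segment_means):
    """For each segment, the share of variables on which it has the
    highest mean and the share on which it has the lowest mean.
    `segment_means` maps segment -> list of means, one per variable."""
    segments = list(segment_means)
    n_vars = len(segment_means[segments[0]])
    top = {s: 0 for s in segments}
    bottom = {s: 0 for s in segments}
    for j in range(n_vars):
        column = {s: segment_means[s][j] for s in segments}
        top[max(column, key=column.get)] += 1
        bottom[min(column, key=column.get)] += 1
    return (
        {s: top[s] / n_vars for s in segments},
        {s: bottom[s] / n_vars for s in segments},
    )


def rater_flags(segment_means, threshold=0.8):
    """Flag segments that are highest or lowest on most variables."""
    top_share, bottom_share = extreme_rater_shares(segment_means)
    flags = {}
    for segment in segment_means:
        if top_share[segment] >= threshold:
            flags[segment] = "high rater"
        elif bottom_share[segment] >= threshold:
            flags[segment] = "low rater"
    return flags


# Made-up segment means on five Likert-style items (1-5 scale).
means = {
    "A": [4.6, 4.5, 4.7, 4.4, 4.8],
    "B": [3.1, 3.4, 2.9, 3.6, 3.0],
    "C": [1.8, 2.0, 1.7, 2.1, 1.9],
}

scale_flags = rater_flags(means)
print(scale_flags)  # {'A': 'high rater', 'C': 'low rater'}
```

Here segment A is highest and segment C is lowest on every item, which is the pattern that suggests the solution is grouping respondents by scale use rather than by substance.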
At this point, the number of viable segmentation solutions should be whittled down to a handful of good options. From here, it is important to select the ones with the best differentiation for client perusal. Typically, two or three options are enough for the client to examine without being overwhelmed. Because these solutions are all viable options, the final decision then rests on which one makes the most sense to the client.
Conclusion
While statistical criteria are key to helping reduce the number of segmentation solutions down to the best ones, other non-mathematical factors do play a pivotal role. Base sizes of the segments are indeed a chief consideration. Most clients intend to target specific segments with tailored messaging. In order to do so effectively, the segment will need to be large enough to be found “in the wild” and small enough that the messages received are customized for them. And in order to create customized messaging, the segments must be different from each other on crucial variables.
Author
Audrey Guinn
Statistical Consultant, Advanced Analytics Group
Audrey utilizes her knowledge in both inferential and Bayesian statistics to solve real-world marketing problems. She has experience in research design, statistical methods, data analysis, and reporting. As a Statistical Consultant, she specializes in market segmentation, SEM, MaxDiff, GG, TURF, and Key Driver analysis. Audrey earned a Ph.D. and Master of Science in Experimental Psychology with an emphasis on emotional decision-making from The University of Texas at Arlington.
Copyright © 2025 by Decision Analyst, Inc.
This posting may not be copied, published, or used in any way without written permission of Decision Analyst.