LPA: Top 3 Problems
By QuantFish
Summary
## Key takeaways
- **Overextraction of classes is common**: Researchers often extract too many latent classes because determining the optimal number is difficult when continuous indicators create a continuum rather than clear class boundaries, especially with large samples that have high power to detect even trivial distinctions. [01:35], [02:40]
- **Theory-free LPA leads to over-extraction**: Going into LPA without clear theoretical expectations causes analysts to chase declining BIC values and keep adding classes that capitalize on chance rather than reflect real population subgroups. [03:22], [04:14]
- **Ordered classes favor dimensional models**: When all extracted profiles are purely ordered (low/medium/high), LPA adds unnecessary complexity; a dimensional factor analysis would better represent how individuals differ continuously rather than categorically. [05:06], [07:03]
- **Mplus defaults to equal variances**: Mplus automatically imposes equality of variances across classes by default, assuming means differ but variability doesn't, which may be unrealistic and is often unreported in published research. [07:57], [08:49]
- **Report all parameters, not just means**: Authors should report complete parameter estimates including variances, within-class covariances, and class sizes rather than only mean profiles, to allow readers to evaluate the full solution quality. [08:58], [09:21]
Topics Covered
- Extracting Too Many Classes Without Theory
- Ordered Classes Signal Dimensional Models Instead
- Report All Model Constraints Transparently
- Variance Assumptions May Be Unrealistic
- Report Complete Parameters Not Just Means
Full Transcript
hello and welcome to this video on what I see as the top three problems in many applications of latent profile analysis my name is Christian Geiser I'm an instructor and statistical consultant
with QuantFish and on this channel I present weekly statistics tutorials on Tuesdays I usually talk about an analysis in the Mplus software and
on Thursdays I present more general issues in multivariate statistical analysis including structural equation modeling factor analysis class modeling and multi-level analysis if this is
something that interests you then please subscribe to this channel also don't forget to hit the like button and to check out the description for additional resources including a link to my free
weekly statistics newsletter as well as courses that I offer through QuantFish in this video I want to talk about what I see as the top three problems with
many applications of latent profile analysis this does not mean that I am against latent profile analysis not at all but I want to highlight some
of the issues that I find suboptimal in many applications and then hopefully this will help you if you think about applying latent profile analysis
yourself to avoid some of these problematic issues now maybe number one of what I find problematic is
that individuals tend to extract too many classes or that they tend to be unsure about how many classes they should extract because it is often
difficult to determine the number of profiles or the number of latent classes in latent profile analysis latent profile analysis extracts classes or clusters from continuous variables
continuous observed variables and therefore often times there is sort of a continuum as well with regard to the classes meaning you often have many
ordered classes and there's often not a clear minimum of for example the BIC value for a certain number of classes or there's not a clear solution according
to a bootstrap likelihood ratio test for comparing the number of classes so that people are unsure about how many classes they should extract and
then that can lead to the extraction of too many classes many of which are maybe not really distinct or maybe some of which capitalize on chance meaning
they don't correspond to actual groups in the population so that's one problem and that's particularly relevant for latent profile analysis because of the
continuous nature of the observed variables with a classical latent class analysis where you have binary or ordinal indicators it tends to be less of an issue it can also be a problem
there but with latent profile analysis it's especially difficult especially also when you have a large sample then you have a lot of power so to say to
detect even small classes and it can be problematic to have too many classes and so a problem so to say related to that is that many people go into a latent
profile analysis without a clear theory about what they are expecting to find or clear hypotheses about what typological
structure they aim to uncover and instead they throw in a bunch of variables in their profile analysis so to say without clear theory without
clear expectations or hypotheses and then as a result they are lost a little bit when they extract three four five
six seven eight classes and it keeps going so to say the BIC tends to decrease further and new classes look interesting additional classes might be
interesting and then they tend to extract more classes so it's a really good idea to first of all develop a theory and develop some hypotheses about
what groups you are expecting to find so that you're not completely lost and so that the whole analysis is not purely
data-driven another issue that I frequently see and that's related to my first point is that often times I see
presentations of latent profile analyses with clients or in the published literature where you have many ordered classes or where maybe even all of
the classes are ordered now what does this mean ordered classes means for example you have a group of low functioning individuals medium
functioning individuals and high functioning individuals for example if you have competence scales or competence related scales as indicators
of latent profiles or for example when individuals assess risk classes then you have low medium high risk and that's really not so interesting often times
because if the classes are purely ordered and the profiles do not cross or intersect with one another at any
point then the solution could be better represented often times with a dimensional model like for example a factor analysis where maybe all the
indicators are more or less indicators of a general factor of incompetence or competence for example or of risk versus non-risk or problematic behavior versus
non-problematic behavior and then really a latent profile analysis makes things unnecessarily complicated relative to a more straightforward analysis that works
with dimensions and that has to do with the fact that the indicators in latent profile analysis are continuous variables so they are measured on a continuum and now we're extracting
something categorical from continuous indicators and that may or may not be useful and often times actually it's not that useful in my opinion because often
times you would be better off looking at dimensions meaning underlying factors continuous latent variables that would tell you more about how individuals
differ rather than throwing individuals into profiles where the assumption is that the individuals in a given profile so to say are all the
same except for some variability and that may not be the most useful thing so when you see ordered classes in your application and you extract more classes
and they're all ordered and it just becomes so to say a continuum where the profiles don't intersect at any point then you should really reconsider whether latent profile analysis is the
best type of analysis latent profile analysis and latent class analysis are most interesting when you have non-ordered profiles where really a class
is qualitatively different or some classes are qualitatively different from other classes then you can uncover something that you would not easily
uncover with a dimensional type of statistical analysis and then lastly one problem that I often see in
presentations of latent profile analysis is that it's unclear what constraints if any were imposed on the analysis some programs for latent profile
analysis impose certain constraints on the solutions by default and then people may not even be aware of that and they report a solution and they don't make it
clear that those restrictions were or were not in the model for example the Mplus software by default when you run a latent profile analysis imposes equality
of the variances across classes for the same indicators so then the assumption would be that the mean
profiles can differ for the indicator variables across classes but not the variances and that may or may not be a realistic and meaningful assumption and
so you should talk about that you should also assess models where the variability can differ between classes because why should all classes have the exact same
variability that's not always a realistic or meaningful assumption at least you should discuss this you should look into this you should let readers
know what constraints were imposed on the profile solution in the program that you used so transparency with regard to what
parameters were estimated versus what parameters were set equal across classes is something that you should report you should also report the complete set of
parameter values meaning not just the mean profiles but also report the variances report any within class covariances if they were admitted
report the class sizes and so on so be clear in your presentation also be clear about how you arrived at a given number of
classes what indices or what considerations did you use for selecting the number of classes why did you not extract more or fewer classes in
your latent profile analysis I hope you found this video useful if you did then please hit the like button don't forget to subscribe to this channel check out the description for additional resources
including other videos and workshops on latent class and latent profile analysis and I'll see you next time
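
The model-comparison and reporting points in the transcript can be illustrated with a small sketch. What follows is not the speaker's procedure and not the estimator any particular program uses. It is a hand-written EM routine in plain Python, under stated assumptions: the function names (`fit_lpa`, `_e_step`) are hypothetical, the data are simulated with two crossing (non-ordered) profiles, within-class covariances are fixed at zero, and start values come from a simple deterministic split rather than the multiple random starts a real analysis would likely need. It fits 1 to 3 classes with variances either held equal across classes or freely estimated, computes BIC as -2 log L + (number of free parameters) x ln(n), and prints class sizes, means, and variances together.

```python
import math
import random


def _e_step(data, weights, means, variances):
    """Return posterior class probabilities per case and the log-likelihood."""
    resp, loglik = [], 0.0
    for row in data:
        logs = []
        for w, mu, var in zip(weights, means, variances):
            dens = sum(-0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
                       for x, m, v in zip(row, mu, var))
            logs.append(math.log(w) + dens)
        top = max(logs)  # log-sum-exp for numerical stability
        lse = top + math.log(sum(math.exp(v - top) for v in logs))
        loglik += lse
        resp.append([math.exp(v - lse) for v in logs])
    return resp, loglik


def fit_lpa(data, k, equal_var, n_iter=100):
    """EM for a profile model with no within-class covariances (assumption).

    equal_var=True holds each indicator's variance equal across classes.
    """
    n, p = len(data), len(data[0])
    # Assumed start values: sort cases on the first indicator, cut into k chunks.
    ordered = sorted(data, key=lambda r: r[0])
    chunks = [ordered[c * n // k:(c + 1) * n // k] for c in range(k)]
    means = [[sum(r[j] for r in ch) / len(ch) for j in range(p)] for ch in chunks]
    grand = [sum(r[j] for r in data) / n for j in range(p)]
    total = [sum((r[j] - grand[j]) ** 2 for r in data) / n for j in range(p)]
    variances = [list(total) for _ in range(k)]
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        resp, _ = _e_step(data, weights, means, variances)
        nk = [max(sum(r[c] for r in resp), 1e-9) for c in range(k)]
        weights = [x / n for x in nk]
        means = [[sum(r[c] * row[j] for r, row in zip(resp, data)) / nk[c]
                  for j in range(p)] for c in range(k)]
        ss = [[sum(r[c] * (row[j] - means[c][j]) ** 2
                   for r, row in zip(resp, data))
               for j in range(p)] for c in range(k)]
        if equal_var:
            # One pooled variance per indicator, shared by all classes.
            pooled = [max(sum(ss[c][j] for c in range(k)) / n, 1e-6)
                      for j in range(p)]
            variances = [list(pooled) for _ in range(k)]
        else:
            variances = [[max(ss[c][j] / nk[c], 1e-6) for j in range(p)]
                         for c in range(k)]
    _, loglik = _e_step(data, weights, means, variances)
    # Free parameters: means, variances, and k - 1 class proportions.
    n_par = k * p + (p if equal_var else k * p) + (k - 1)
    bic = -2 * loglik + n_par * math.log(n)
    return {"bic": bic, "n_par": n_par, "sizes": weights,
            "means": means, "variances": variances}


# Simulated example data (assumption): two crossing profiles, unequal spread.
rng = random.Random(1)
data = ([[rng.gauss(0, 1), rng.gauss(6, 1)] for _ in range(150)] +
        [[rng.gauss(6, 2), rng.gauss(0, 2)] for _ in range(150)])

results = {(k, eq): fit_lpa(data, k, eq)
           for k in (1, 2, 3) for eq in (True, False)}
for (k, eq), r in results.items():
    print(f"k={k} variances={'equal' if eq else 'free'} "
          f"BIC={r['bic']:.1f} params={r['n_par']}")
    for c in range(k):
        print("   size", round(r["sizes"][c], 3),
              "means", [round(m, 2) for m in r["means"][c]],
              "variances", [round(v, 2) for v in r["variances"][c]])
```

Printing sizes, means, and variances side by side for each candidate model mirrors the reporting advice in the transcript, and fitting the equal-variance and free-variance versions next to each other makes the constraint an explicit modeling choice instead of a silent default.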