If I'm calculating a correlation from a 1-7 agree-disagree scale where 90% of people picked 1 or 2, is it good practice to artificially weight each bin equally before calculating the correlation, or should I not do that?
Probably not a good idea, since that would in some sense be equivalent to synthetically generating data in buckets where none existed.
It's possible I misunderstood your question, or you're doing something specific where it's okay, but the above is my first-pass intuition.
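To make the "synthetically generating data" point concrete, here is a rough sketch with made-up numbers (the data, `weighted_pearson`, and the variable names are all hypothetical, not from the original question). It weights each response bin equally, then shows that the result matches what you get by simply duplicating the rows in the sparse bins.

```python
from collections import Counter
from math import sqrt

def weighted_pearson(x, y, w):
    """Pearson correlation of x and y with non-negative observation weights w."""
    total = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / total
    my = sum(wi * yi for wi, yi in zip(w, y)) / total
    cov = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    vx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    vy = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y))
    return cov / sqrt(vx * vy)

# Made-up data: x is the 1-7 response (90% are 1 or 2), y is some other measure.
x = [1, 1, 1, 1, 1, 1, 2, 2, 2, 7]
y = [3, 4, 2, 5, 3, 4, 4, 5, 3, 9]

counts = Counter(x)  # bin sizes: 6 ones, 3 twos, 1 seven

# Plain correlation: every respondent counts once.
r_unweighted = weighted_pearson(x, y, [1] * len(x))

# Equal-bin weighting: each respondent gets weight 1 / (size of their bin),
# so every occupied bin carries the same total weight.
r_equal_bins = weighted_pearson(x, y, [1 / counts[xi] for xi in x])

# Same thing done by copying rows until every bin has as many rows as the
# largest one (the bin sizes here divide evenly, so the match is exact).
biggest = max(counts.values())
x_rep, y_rep = [], []
for xi, yi in zip(x, y):
    copies = biggest // counts[xi]
    x_rep += [xi] * copies
    y_rep += [yi] * copies
r_replicated = weighted_pearson(x_rep, y_rep, [1] * len(x_rep))

print(r_unweighted, r_equal_bins, r_replicated)
```

In this toy example the single respondent who answered 7 ends up counting as much as all six who answered 1, which is why the equal-bin number drifts away from the plain correlation.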
I'm presuming you want to assess how two or more other variables correlate with choosing a given number on the scale. In that case, it seems you wouldn't want to artificially equalise the population size of those variables as it equalises /2
If you artificially weight the bins, then you’re no longer calculating the correlation in your population. (You are still calculating a correlation for a subset of your population, though.)
I have this bookmarked for whenever I need to do things using a Likert scale, maybe useful to you:
You might want to use regression instead of correlation. Assuming you have two variables X and Y, and X influences Y roughly linearly, the coefficient from regressing Y on X should in theory not depend on the distribution of X, whereas the correlation does.
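A quick simulation can illustrate this. It is only a sketch under strong assumptions: Y is treated as continuous and exactly linear in X with Gaussian noise, and the intercept 2.0, slope 0.5, sample size, and helper names `slope_and_r` and `simulate` are all invented for the example. It draws X once with 90% of answers on 1 and 2, and once uniformly over 1-7, then compares the fitted slope and the correlation.

```python
import random
from math import sqrt

def slope_and_r(x, y):
    """Return (OLS slope of y regressed on x, Pearson correlation of x and y)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / sxx, sxy / sqrt(sxx * syy)

def simulate(x_weights, n=20000, seed=0):
    """Draw X on a 1-7 scale with the given bin weights, then Y = 2 + 0.5*X + noise."""
    rng = random.Random(seed)
    x = rng.choices(range(1, 8), weights=x_weights, k=n)
    y = [2.0 + 0.5 * xi + rng.gauss(0, 1) for xi in x]
    return slope_and_r(x, y)

# 90% of responses on 1 and 2, the rest spread thinly over 3-7.
slope_skewed, r_skewed = simulate([45, 45, 2, 2, 2, 2, 2])

# Every response option equally likely.
slope_uniform, r_uniform = simulate([1] * 7)

print("skewed  X: slope", slope_skewed, "r", r_skewed)
print("uniform X: slope", slope_uniform, "r", r_uniform)
```

Under these assumptions both slopes should land near the true 0.5, while the correlation comes out noticeably lower for the skewed X, because a narrower spread in X leaves less variance for it to explain. If the real relationship is not linear, or Y is itself a coarse ordinal item, the slope will not be this well behaved.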
If most people picked 1 or 2, the differences between 1 and 2 probably have more to do with floor or ceiling effects, such as willingness to give an extreme answer, than with what you are trying to measure. The correlations may therefore say little about the variables you are studying.