OK, I have a survey with around 1100 responses per question. I wanna do a PCA on this, except... I also have around 1100 questions. ChatGPT says I should probably have at least 5x as many responses as questions. Is this still true for a large number of questions?
Replying to
Well, it depends, but you could also cut 80% of the questions and do PCA on the 220 most interesting ones?
Replying to
Not a statistician so if one turns up and I'm wrong please call me out on this.
But at least when it comes to genetics, you might have 200 subjects and 50k variants and get a good PCA result.
If it works similarly with subject/question then it should work OK with high numbers.
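For what it's worth, here is a minimal sketch of the genetics-style case, assuming made-up data: 200 subjects, 5000 variables (smaller than 50k so it runs quickly), and two hypothetical groups that differ slightly on every variable. The `pca_svd` helper is just an illustrative name. PCA is done through an SVD of the centered data, so having far more columns than rows is not a problem computationally; you simply get at most n - 1 components with nonzero variance.

```python
import numpy as np

def pca_svd(X, n_components):
    """PCA via SVD of the centered data; works when columns outnumber rows."""
    Xc = X - X.mean(axis=0)                      # center each variable
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    explained = s**2 / np.sum(s**2)              # variance share per component
    return scores, Vt[:n_components], explained

rng = np.random.default_rng(0)
n_subjects, n_variants = 200, 5000               # assumed sizes, for illustration only

# Hypothetical structure: two groups whose means differ a little on every variable.
group = np.repeat([0, 1], n_subjects // 2)
shift = rng.normal(scale=0.3, size=n_variants)
X = rng.normal(size=(n_subjects, n_variants)) + np.outer(group, shift)

scores, components, explained = pca_svd(X, n_components=2)

# How far apart the two groups sit on PC1, relative to the within-group spread.
pc1 = scores[:, 0]
gap = abs(pc1[group == 0].mean() - pc1[group == 1].mean())
spread = max(pc1[group == 0].std(), pc1[group == 1].std())
print("PC1 group separation (gap / spread):", gap / spread)
```

This works here because the simulated signal is spread across thousands of variables at once. Whether survey questions behave the same way is exactly the open question in this reply, so treat it as an illustration, not a guarantee.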
Replying to
Another question I'd ask (I don't know the answer) is whether PCA is appropriate on the types of answers people are giving - are these ordinal data? Might matter.
Replying to
You’re still fitting a thousand parameters with a thousand data points. Like fitting a line with two points.
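One way to see the concern is to run PCA on pure noise, where every true eigenvalue of the covariance matrix is 1. This is a rough sketch under assumed conditions (independent standard-normal answers, 300 questions instead of 1100 to keep it fast, and a hypothetical helper called `top_noise_eigenvalue`). Random matrix theory (the Marchenko-Pastur law) puts the largest sample eigenvalue near (1 + sqrt(p/n))^2, which is about 4 when responses equal questions and about 2.1 with 5x as many responses.

```python
import numpy as np

def top_noise_eigenvalue(n_responses, n_questions, seed=0):
    """Largest eigenvalue of the sample covariance of pure-noise data."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_responses, n_questions))   # no real structure at all
    Xc = X - X.mean(axis=0)
    s = np.linalg.svd(Xc, compute_uv=False)           # singular values only
    return s[0] ** 2 / (n_responses - 1)              # top covariance eigenvalue

n_questions = 300                                     # assumed size, for speed
square = top_noise_eigenvalue(n_questions, n_questions)       # n == p
tall = top_noise_eigenvalue(5 * n_questions, n_questions)     # n == 5p

print("top eigenvalue, responses == questions:", square)
print("top eigenvalue, responses == 5 x questions:", tall)
```

So with as many questions as responses, a "component" several times larger than average can show up from noise alone. Real structure has to stand out above that level to be trusted.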
Replying to
I would do EFA of sets of conceptually similar questions. Compute factor scores for each factor you identify for each set, then run EFA on all those factor scores.
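A rough sketch of that two-stage flow, with loud caveats: it uses leading principal-component scores as a crude stand-in for EFA factor scores (a real analysis would use a dedicated factor-analysis routine with rotation), the question sets are hypothetical, and the data are simulated with a single shared latent trait. The function names `leading_scores` and `two_stage_scores` are made up for illustration.

```python
import numpy as np

def leading_scores(X, n_factors):
    """Scores on the leading components of the standardized columns of X.
    NOTE: a principal-component stand-in, not a true EFA fit."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    U, s, _ = np.linalg.svd(Z, full_matrices=False)
    return U[:, :n_factors] * s[:n_factors]

def two_stage_scores(X, question_sets, n_per_set, n_top):
    # Stage 1: extract a few scores from each set of similar questions.
    stage_one = np.hstack(
        [leading_scores(X[:, cols], n_per_set) for cols in question_sets]
    )
    # Stage 2: run the same extraction on the stacked stage-one scores.
    return stage_one, leading_scores(stage_one, n_top)

rng = np.random.default_rng(0)
n_responses = 500

# Hypothetical data: 60 questions, all partly driven by one shared latent trait.
latent = rng.normal(size=(n_responses, 1))
X = latent @ rng.normal(size=(1, 60)) + rng.normal(size=(n_responses, 60))

# Hypothetical grouping into 3 sets of 20 conceptually similar questions.
question_sets = [np.arange(0, 20), np.arange(20, 40), np.arange(40, 60)]

stage_one, top = two_stage_scores(X, question_sets, n_per_set=2, n_top=1)
recovery = abs(np.corrcoef(top[:, 0], latent[:, 0])[0, 1])
print("stage-one shape:", stage_one.shape, "top-level shape:", top.shape)
print("|correlation| of top-level score with the simulated trait:", recovery)
```

The point of splitting it up is that each stage only ever fits a small number of variables against all the responses, instead of 1100 against 1100.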
Replying to
I guess to get the right answer, you need to describe in more detail what the actual final goal is.
Replying to
Is this about the chaos survey? There are 1100 questions in there? Did I not finish it entirely when it said I did, then?