
I need to create a word frequency database in order to compare groups of data (for instance, one user's tweets / YouTube comments) and extract words / phrases that are used more often by that particular user / group compared to the universal word usage frequency. Do you think
A lot of great advice here -- thank you! What I'm trying to do is create a universal word frequency table so that we can scan users and see which ones use terms / words at a much higher frequency than you would expect from the universal set. For instance, if you
search for users who use the word "vaccine(s)", you will get X users back, and many of them are just using it occasionally (perhaps because there is a discussion of vaccines). But what I want to do is pull out users who use the term much more often than the global frequency table would predict. Then you can detect users who are deliberately trying to spread disinfo, users who are experts in the topic and like to talk about it, users who have a strong stance on the subject, etc.
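A minimal sketch of that idea: compare each user's relative frequency of a term against the global rate and flag the outliers. Everything here is illustrative -- whitespace tokenization, the `flag_users` / `term_rate` names, and the `ratio_threshold=5.0` cutoff are all assumptions, not anything from the thread.

```python
from collections import Counter

def term_rate(texts, term):
    """Fraction of all tokens in `texts` that equal `term` (case-insensitive,
    naive whitespace tokenization)."""
    counts = Counter()
    for text in texts:
        counts.update(text.lower().split())
    total = sum(counts.values())
    return counts[term] / total if total else 0.0

def flag_users(user_texts, global_texts, term, ratio_threshold=5.0):
    """Return (user, ratio) pairs for users whose relative use of `term`
    exceeds the global rate by at least `ratio_threshold`."""
    global_rate = term_rate(global_texts, term)
    flagged = []
    for user, texts in user_texts.items():
        rate = term_rate(texts, term)
        if global_rate and rate / global_rate >= ratio_threshold:
            flagged.append((user, rate / global_rate))
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)
```

In practice you would want smoothing (a user with one tweet containing the term once gets an enormous ratio otherwise) and a proper tokenizer, but the ratio-against-global structure stays the same.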
Just because a user uses a term X times doesn't mean they're discussing that topic more frequently than others -- they might just be prolific and write a lot of comments / tweets -- but if you calculate the frequency of use relative to their entire corpus, then you can correct for that.
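That normalization step can be shown in a few lines: divide a user's raw count of the term by their total token count, so a prolific user with many mentions in a huge corpus scores below a quiet user whose small corpus is dominated by the term. The sample users and counts below are made up for illustration.

```python
def normalized_rate(texts, term):
    """Uses of `term` divided by the user's total token count
    (naive whitespace tokenization)."""
    tokens = [tok for text in texts for tok in text.lower().split()]
    return tokens.count(term) / len(tokens) if tokens else 0.0

# Hypothetical users: "prolific" mentions the term 4 times across a large
# corpus; "focused" mentions it 3 times in a tiny one.
prolific = ["vaccine talk"] * 4 + ["unrelated chatter about sports"] * 46
focused = ["vaccine", "the vaccine", "vaccine again"]
```

Here the focused user's normalized rate (3 of 5 tokens) far exceeds the prolific user's (4 of 192), even though the prolific user has the higher raw count.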
When we compared sentiment dictionaries, shorter text formats tended to have more emotional words. We saw problems even when comparing different types of messages within the same platform. I would go for a table per medium.