goddamnit i think im gonna have to actually learn a programming language to properly process my data
i hear R sucks for odds ratio stuff? is python any better?
Python, Pandas, Scipy, Seaborn... no need for PySpark unless she's regularly dealing with >1TB datasets. :-)
Python has the lowest barrier to entry and can do basically everything. What format is your data in? How much data is there?
R or Python both fine. Decide whether you want counting to start with 1 or 0 then decide.
it’s about the same, I use both, python better for ml + big data stuff, R better for markdown, visualizations and stats
agree. i’d just send a normie to Jupyter with seaborn for graphs
would only recommend R to non-tech person if it’s actual big data, like 1M columns and 1M rows, because of perf speedup
really depends on how big the dataset is: how many dimensions, how many rows and columns. if it's very large, R could be faster
did not understand “odds ratio stuff” reference
You'll be able to do the analysis you want in R in about 2 hours after a little Googling.
It will take you at least 2 days to (probably) get Python working on your computer and the libraries installed.
I'd go with R.
just install Anaconda. It doesn't take 2 days to get Python running on a machine.
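for the "odds ratio stuff" that started this: a minimal sketch of what that could look like in Python with SciPy, assuming the data boils down to a 2x2 table of counts. the numbers and the row/column labels below are made up for illustration, not anyone's real data.

```python
from scipy.stats import fisher_exact

# Hypothetical 2x2 contingency table (made-up counts):
#                  case  control
# exposed           20     80
# unexposed         10     90
table = [[20, 80],
         [10, 90]]

# Sample odds ratio computed by hand: (a * d) / (b * c)
(a, b), (c, d) = table
manual_odds_ratio = (a * d) / (b * c)

# fisher_exact returns the same sample odds ratio plus a two-sided p-value
odds_ratio, p_value = fisher_exact(table)

print(f"odds ratio (by hand) = {manual_odds_ratio:.2f}")
print(f"odds ratio (scipy)   = {odds_ratio:.2f}")
print(f"p-value              = {p_value:.4f}")
```

if the counts live in a spreadsheet or CSV, pandas can probably build the 2x2 table first (e.g. with `pd.crosstab`) before handing it to SciPy.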