Gael Varoquaux

@GaelVaroquaux

Researcher and coder: ► Data, Brain, & Computational science ► contributor, (co)founder of scikit-learn & joblib ► Photography on

Paris, France
Joined September 2011

Tweets


  1. Retweeted
    7 hours ago

    The first paper on SciPy, after 19 years - feels so good to finally see this published! Credit goes to all SciPy contributors over all those years, amazing team effort! And thanks to Nature Methods for making it open access!

  2. Retweeted
    Feb 2
    Replying to

    There's something that many people in science and technology don't seem to get: you can either work on better methods, testing them on well-understood problems, or use well-understood methods for new problems. Innovating in multiple layers at the same time just produces noise.

  3. Feb 1

Most questions are not about "prediction". But machine learning is about estimating functions that approximate conditional expectations / probabilities. We need to get better at integrating it into our scientific inference pipelines. For more, push me to write a paper on this. 8/8

  4. Feb 1

For thousands of data points and large dimensionality, linear models (ridge) are needed. But applying them without thousands of data points (as I tried for many years) is hazardous. Get more data, or change the question (e.g. analyze across cohorts). 7/8

  5. Feb 1

For thousands of data points and moderate dimensionality (99% of cases), gradient-boosted trees provide the necessary regression model. They are robust to the data distribution and support missing values (even outside MAR settings). 6/8

  6. Feb 1

If there are fewer than a thousand data points, all but the simplest statistical questions can and will be gamed (sometimes unconsciously), partly for lack of model selection. An example in neuroimaging: I no longer trust such endeavors, including mine. 5/8

  7. Feb 1

We need non-parametric model selection and testing that do not break if the model is wrong. Cross-validation and permutation importance provide these, once we have chosen input (exogenous) and output (endogenous) variables. 4/8

  8. Feb 1

We need weakly-parametric models that can fit data as raw as possible, without relying on non-testable assumptions. Machine learning provides these, and tree-based models require little data transformation. 3/8

  9. Feb 1

First, analytic variability is a killer: e.g. in "standard" analysis for brain mapping, for machine learning in brain imaging, or more generally in "hypothesis-driven" statistical testing. 2/8

  10. Feb 1

Even for science and medical applications, I am becoming wary of fine statistical modeling efforts, and believe that we should standardize on a handful of powerful and robust methods. An opinionated thread to give context for 1/8

  11. Retweeted
    Jan 30

    Scientific Python lecture notes

  12. Retweeted
    Jan 30

Happy birthday!! Ten years since the first release and still young!

  13. Retweeted
    Jan 30

    Pandas 1.0 is here! * Read the release notes: * Read the blogpost reflecting on what 1.0 means to our project: * Install with conda / PyPI: Thanks to our 300+ contributors to this release.

  14. Jan 29

    Some papers share copy-pasted paragraphs: authors are submitting multiple variants to increase their chances. While beneficial for the individual, this behavior is disastrous for the group: fatigue of the selection process & dilution of publications. We should penalize it. 3/3

  15. Jan 29

Many papers on the same topic, with similar (but differing) contributions. Reading 6 papers on one-shot learning (or multi-view clustering) brings boredom 😴. By contrast, a good paper on a less trendy topic stimulates me 🤩. 2/3

  16. Jan 29

Screening 40 (!) papers as senior program committee for The sheer volume of submissions is exhausting and time-consuming. As a result, human factors will influence my decisions 😞. Some thoughts 1/3

  17. Retweeted
    Jan 29

10 PRs already submitted and the first one just merged! Having fun at the Paris Sprint of the Decade!

  18. Retweeted
    Jan 28

Attention OHBM attendees who are also parents: please let us know if you plan to attend with your children. We are planning to provide some creative and engaging activities for your little ones. Please RT and share with other OHBM parents.

  19. Retweeted
    Jan 28

    "What's wrong with computational notebooks?" The 9 pain points this user study highlighted me, both as a notebook user and from our experience running Kaggle's hosted notebook implementation (commentary in thread)

  20. Retweeted
    Jan 27

    Rereading Lykken: Lykken, D. T. (1991). What’s Wrong with Psychology Anyway? In D. Cicchetti & W. Grove (Eds.), Thinking clearly about psychology (Vol. 1, pp. 3–39). What a remarkable paper. A few things that stood out to me. 🧵

