Kittens raised in cylinders of vertical stripes only learn to see vertical stripes. What happens if we raise our children in VR with non-Euclidean geometries?
Just heard the most horrifying story of a physics professor who thinks imaginary numbers are not actually physical but purely descriptive.
How can we entrust these dangerous ideologues to educate the young and impressionable?
From what I can tell, looking at both English and Portuguese sources, the Wright brothers probably do deserve the credit. Sorry Brazil.
Anyway what else did my American schooling lie to me about?
Just submitted a safety paper to NeurIPS? Want to visit London to chat about it this July? Submit to the Safe and Trustworthy AI workshop (deadline tonight!) → we accept published papers, and papers under review, in their original format.
Only reason ML moves so much faster than other fields is that conference deadlines force researchers to be actually productive three weeks of the year.
10/ You could have had all that and more, Europe! The ball was in your court!
Unfortunately, my grandfather didn’t get the position. Still, today we’ll share a virtual toast to everyone who made the web what it is.
9/ P.S. My favorite part of the story takes place years later when my grandfather was campaigning to become director of CERN.
During a tour of the member states, he got chewed out by the Spanish contingent for having wasted the opportunity of the web.
8/ Now, 30 years later, the open web seems like it may have been the best possible outcome. But it was never guaranteed, so today is worth celebrating.
Happy birthday to you, World Wide Web! 🎉
7/ There too, talks fell flat. There was interest, but the process was hopelessly slow and bureaucratic.
Which is why, on April 30th, 1993, CERN was left with little choice but to open-source the web.
5/ But CERN also realized that they weren’t well-positioned to bring the web to the masses.
So Michael Sendall, TBL’s boss, went around Europe pitching the web to various actors from private industry.
4/ Though CERN certainly shared in many of the ideals of TBL and co., the administrators were a tad more pragmatic.
Here was an opportunity for profit.
2/ You see, the web was never meant to be open-sourced.
Sure, Tim Berners-Lee and the other founders were idealists who believed the web could bridge our divides and that everyone had a right to access it.
I turned down an AI slowdown this week.
Sam Altman was shocked when I told him no & that I want him to accelerate 20% harder instead. What he doesn't understand is that someone like me just wants the thrill of the race instead of easy AI safety research. I'm just built different
So… late last night, a friend called me in tears
He just lost his job at a 1000+ person paperclip factory.
The culprit? ChatGPT.
This is a wake-up call to all people made of atoms everywhere (read this to stay alive):
ai is going to solve organizational problems far before it’s at the level of top humans. someone’s going to call the “summarize meeting notes” function and gpt will settle a debate with the (undeserved) weight of scientific authority. an “objective” arbiter, automated McKinsey
Memorizing facts, quotes, and definitions gets unfair flak. There's no creativity without synthesis. No generalization without interpolation.
So in humans, so in AI.
The next time you feel like dunking on interpolation, remember that you just don't have the imagination to deal with high-dimensional interpolation. Maybe keep it to yourself and go interpolate somewhere else.
7/ Here I've presented the visuals in terms of regression, but the story is pretty similar for classification, where the function being fit is a classification boundary. In this case, there's extra pressure to maximize margins, which further encourages generalization.
6/ Meanwhile, despite the whole "interpolation, not extrapolation" framing, NNs can and do extrapolate outside the convex hull of training samples. Again, the bias towards simple linear extrapolation is locally the least biased option. There's no beating the polytopes.
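A minimal sketch of that linear-extrapolation behavior (my own illustration, not from the thread; the data, model, and hyperparameters are all made-up assumptions): train a small ReLU net on sin(x) over [-3, 3], then probe it well outside that range. Since a ReLU net is piecewise linear, far from the data its successive prediction differences should flatten into a roughly constant slope.

```python
# Sketch only: a trained ReLU net extrapolating (roughly) linearly
# outside its training range. Setup is illustrative, not from the thread.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
x_train = rng.uniform(-3, 3, size=(200, 1))
y_train = np.sin(x_train[:, 0])

net = MLPRegressor(hidden_layer_sizes=(64, 64), activation="relu",
                   max_iter=5000, random_state=0)
net.fit(x_train, y_train)

# Inside [-3, 3] the net tracks sin(x); far outside, a ReLU net is
# piecewise linear, so successive differences become ~constant.
xs = np.arange(6.0, 10.0).reshape(-1, 1)   # well outside the data
preds = net.predict(xs)
print("predictions:", np.round(preds, 3))
print("successive diffs (~constant slope):", np.round(np.diff(preds), 3))
```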
5/ Bonus: this explains double descent: test loss goes down, then up until the interpolation threshold, where there's exactly one interpolating solution, and it's a bad one. But as you increase model capacity further, you end up with many interpolating solutions, and some generalize better.
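For the curious, here's a toy version of that curve (again my own sketch, not from the thread): minimum-norm least squares on random Fourier features. Test error typically dips, spikes near the interpolation threshold (features ≈ training points), then falls again as extra capacity lets the solver pick smoother interpolants.

```python
# Sketch only: toy double descent with random-feature regression.
# All numbers here are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
n_train = 40

def target(x):
    return np.sin(3 * x)

x_tr = rng.uniform(-1, 1, n_train)
y_tr = target(x_tr) + 0.1 * rng.standard_normal(n_train)
x_te = rng.uniform(-1, 1, 500)
y_te = target(x_te)

def features(x, W, b):
    # random Fourier features: cos(w*x + b)
    return np.cos(np.outer(x, W) + b)

for n_feat in [5, 10, 20, 40, 80, 200, 800]:   # 40 == interpolation threshold
    W = 5.0 * rng.standard_normal(n_feat)
    b = rng.uniform(0, 2 * np.pi, n_feat)
    # pinv gives least squares below the threshold and the
    # minimum-norm interpolating solution above it.
    w = np.linalg.pinv(features(x_tr, W, b)) @ y_tr
    mse = np.mean((features(x_te, W, b) @ w - y_te) ** 2)
    print(f"features={n_feat:4d}  test MSE={mse:.3f}")
```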
4/ In reality, we don't even train LLMs long enough to hit that interpolation threshold. In this under-interpolated sweet spot, NNs seem to learn features from coarse to fine with increasing model size. E.g.: https://arxiv.org/abs/1903.03488