NN training goes through two distinct phases: (1) reducing classification error through drift (2) optimally compressing mutual information of hidden layers through diffusion https://arxiv.org/abs/1703.00810 pic.twitter.com/He7dNY3NSv
When I first saw this a few days ago I thought “peculiar, wonder if it’s true.” Looking at this graph, I wondered what happened at 8-9k epochs (sudden dive).
Skimmed bits of the paper, got increasingly suspicious that the result is an artifact of an artificial task. Went looking for the task… found this. pic.twitter.com/MmN7Lq6U3I
If I’m reading right that they’ve spun a general theory of DL based on runs discriminating inside/outside of a hypersphere, everyone should be laughing at them.
I’m not sure what is in the supplementary material (can’t easily find it), but based on this bit, it sounds like all studies were done with trivial synthetic data. If so, this is a nice intro class project, not a publishable result, much less a breakthrough. pic.twitter.com/msWSGp9NM8
I might be missing something, and they also might happen to have found something real even though there’s no meaningful evidence for it here.
Should be easy to run on other data, so I'd guess they use the synthetic data only for the particular paper? Tishby has a number of presentations on this: https://www.youtube.com/watch?v=FSfN2K3tnJU …
I mentioned in a different subthread: it's easy to run, but hard to define the distribution over input data that you need to do the analysis
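To make the point about the input distribution concrete, here is a minimal sketch of a binned plug-in estimate of the mutual information I(X; T) between inputs X and a hidden layer T. It is an illustration only, not necessarily the estimator used in the paper: the function names, the 30-bin default, and the assumption that activations lie in [-1, 1] (as with tanh units) are all hypothetical choices made for this example.

```python
import math
from collections import Counter


def entropy_bits(symbols):
    """Shannon entropy (in bits) of the empirical distribution of `symbols`."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def bin_activations(activations, n_bins=30, lo=-1.0, hi=1.0):
    """Map a vector of activations to a tuple of bin indices.

    Assumes activations lie in [lo, hi]; values outside are clamped
    to the first or last bin. The bin count is an arbitrary choice.
    """
    width = (hi - lo) / n_bins
    return tuple(
        max(0, min(int((a - lo) / width), n_bins - 1)) for a in activations
    )


def binned_mutual_information(inputs, hidden, n_bins=30):
    """Plug-in estimate of I(X; T) = H(X) + H(T) - H(X, T).

    `inputs` and `hidden` are paired samples; the empirical frequencies
    of `inputs` stand in for the input distribution.
    """
    x = [tuple(v) for v in inputs]
    t = [bin_activations(h, n_bins) for h in hidden]
    return entropy_bits(x) + entropy_bits(t) - entropy_bits(list(zip(x, t)))


# Toy one-unit "layer" that only keeps the first input bit.
uniform_inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
uniform_hidden = [[-0.9], [-0.9], [0.9], [0.9]]
mi_uniform = binned_mutual_information(uniform_inputs, uniform_hidden)

# Same layer, but the sample is heavily skewed towards one input.
skewed_inputs = [(0, 0)] * 7 + [(1, 1)]
skewed_hidden = [[-0.9]] * 7 + [[0.9]]
mi_skewed = binned_mutual_information(skewed_inputs, skewed_hidden)

print(mi_uniform, mi_skewed)
```

The same toy layer gives 1 bit under the uniform sample and less than 1 bit under the skewed one, so the number you get depends on which input distribution you assume. On a small synthetic task every input can be enumerated; on real data that choice is much less obvious.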
At large risk of sounding like an idiot because I haven’t looked at this stuff seriously in 25 years… isn’t dropout pretty much a method for minimizing mutual info? Not explicitly, but that’s more-or-less the understanding of what it does?