As you can see in the comments, you are confusing the audience. But to your argument: why do you think that the priors can be found by organic evolution, but not by any other statistical search method?
-
-
Replying to @Plinz @GaryMarcus
I tried to make a similar point to Gary a while back. I don't quite see the applicability of the idea of innateness in an artificial system. In biology it's simple: developmental features are innate. There's no development in AI though - it's all computational/statistical.
1 reply 0 retweets 0 likes -
Replying to @DeanSHorak @GaryMarcus
How many bits of innate knowledge do you think were accumulated in our evolutionary lineage, and what amount of data about our world would be required to extract the same amount of knowledge with principled statistical methods?
1 reply 0 retweets 0 likes -
Not sure this kind of accounting is informative. The problem is that one way to construct a prior is to assemble a distribution, but I could also just change the support. Specifying the change in support has a low bit cost, but incurs an infinite KL divergence from the original prior.
1 reply 0 retweets 1 like -
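A minimal sketch of that point (mine, not from the thread): shrinking a prior's support costs only a few bits to describe, yet the KL divergence from the original prior to the restricted one is infinite, because the original still puts mass where the restricted prior puts none.

```python
import numpy as np

def kl_divergence(p, q):
    """KL(p || q) in bits; infinite whenever p puts mass where q puts none."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    mask = p > 0
    if np.any(q[mask] == 0):
        return np.inf
    return float(np.sum(p[mask] * np.log2(p[mask] / q[mask])))

# Uniform prior over 4 hypotheses vs. the same prior with one hypothesis excluded.
uniform = [0.25, 0.25, 0.25, 0.25]
restricted = [1/3, 1/3, 1/3, 0.0]  # support shrunk by one element, cheap to specify

print(kl_divergence(restricted, uniform))  # finite: ~0.415 bits
print(kl_divergence(uniform, restricted))  # inf: the excluded hypothesis had prior mass
```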
For example, in physics we assume 'physics does not change with time' - that excludes an infinite subset of possibilities from consideration (so it's 'worth' infinite bits compared to a uniform prior), but it doesn't take infinite bits to specify or learn.
2 replies 0 retweets 0 likes -
We don't have to make such an assumption a priori. The unchangeability of physics is a hypothesis that has to be supported by observations.
1 reply 0 retweets 0 likes -
We need far fewer observations to attempt the assumption than we would need to render it certain. E.g. there's an infinity of hypotheses of the form 'physics is constant... until 3 seconds from now' that we can't directly disprove by observation, but we discard them anyhow.
2 replies 0 retweets 1 like -
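One hypothetical way to make that discarding precise (my illustration, not something stated in the thread) is an MDL-style weighting: give each hypothesis prior mass 2^(-description length). The single 'physics is constant' hypothesis gets a short code, while every 'constant until time t' variant pays roughly log2(t) extra bits to encode t, so each member of the infinite family is individually negligible and their total weight stays bounded.

```python
import math

def prior_weight(description_bits):
    """Hypothetical MDL-style prior: mass 2^(-description length in bits)."""
    return 2.0 ** (-description_bits)

BASE_BITS = 10  # illustrative cost of describing "physics is constant"

constant_weight = prior_weight(BASE_BITS)

# Each "constant until time t" variant pays the base cost plus ~2*log2(t) + 1
# extra bits to encode the switch time t with a simple prefix code.
family_weight = sum(prior_weight(BASE_BITS + 1 + 2 * math.log2(t))
                    for t in range(1, 10**6))

print(constant_weight)  # 2^-10, about 0.00098
print(family_weight)    # converges to a bounded total of the same order
```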
We already observe that physics is changing: the universe expands, etc. When we observe change, we can of course always construct an underlying regularity. We don't need to discard any possibility.
1 reply 0 retweets 0 likes -
In practice we do discard possibilities in order to actually get stuff done. Then, after the fact, if we find that what we've built is sufficiently incompatible with observation, we discard it and start rebuilding.
1 reply 0 retweets 0 likes -
It's the 'getting stuff done' metric that regularizes the hypothesis space in this case, compared to e.g. updating priors based on evidence. Hypotheses that are non-actionable, or impossible to distinguish between, can be merged or discarded without loss of performance.
2 replies 0 retweets 0 likes
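A toy sketch of that merging step (hypothetical, not from the thread): hypotheses that make identical predictions on every query we can actually act on fall into the same equivalence class, and keeping one representative per class loses nothing.

```python
# The only time points we can probe before we have to act.
queries = [0, 1, 2, 3]

hypotheses = {
    "physics constant":             lambda t: "same",
    "constant until t = 10**9":     lambda t: "same" if t < 10**9 else "different",
    "constant until t = 10**9 + 1": lambda t: "same" if t < 10**9 + 1 else "different",
}

# Group hypotheses by their predictions on the actionable queries.
classes = {}
for name, h in hypotheses.items():
    predictions = tuple(h(q) for q in queries)
    classes.setdefault(predictions, []).append(name)

for predictions, names in classes.items():
    print(predictions, "->", names)  # all three land in a single equivalence class
```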
Besides the amount of work, there seem to be two other reasons to prefer models with lower Kolmogorov complexity: the number of introduced inaccuracies, and that an emergent or created universe is more likely to be fundamentally simple.
-
-
Yeah, some kind of Occam's razor is essential. Even something like L1 regularization reduces the amount of necessary data from O(N) to O(log N) in the feature count (Andrew Ng's 2004 paper: https://dl.acm.org/citation.cfm?id=1015435). K-complexity is difficult to use in practice though.
0 replies 0 retweets 1 like -
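A rough sketch of the sample-efficiency point (an assumed scikit-learn setup, not Ng's original experiment): with many irrelevant features and few samples, an L1-penalized classifier tends to recover the sparse signal, while an otherwise identical L2-penalized one tends to overfit.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# 100 samples, 1000 features, only 10 of which carry signal.
X, y = make_classification(n_samples=100, n_features=1000, n_informative=10,
                           n_redundant=0, random_state=0)

for penalty in ("l1", "l2"):
    clf = LogisticRegression(penalty=penalty, solver="liblinear", max_iter=1000)
    score = cross_val_score(clf, X, y, cv=5).mean()
    print(penalty, round(score, 3))  # the L1 model typically scores higher here
```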