For this trick we introduce a new kind of number, called an infinitesimal hyperreal number. Imagine we have some number epsilon, such that epsilon > 0 but epsilon < x for all positive real numbers x. Imagine that we can use algebra to manipulate epsilon like any other variable.
-
Now we can compute derivatives just by using "plug-and-chug" algebra on the expression f'(x) = real_part((f(x+epsilon) - f(x)) / epsilon), where the "real_part" function just rounds off the infinitesimal part of an expression.
-
For example, let's take the derivative of x^2: ((x+eps)^2 - x^2) / eps = (x^2 + 2 eps x + eps^2 - x^2) / eps = 2x + eps. We round off the infinitesimal eps and are left with 2x.
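The same plug-and-chug steps work for other polynomials. As a further worked example (not from the original thread), here is x^3, writing the thread's "real_part" as an operator and using \varepsilon for epsilon:

```latex
\begin{aligned}
\frac{(x+\varepsilon)^3 - x^3}{\varepsilon}
  &= \frac{x^3 + 3x^2\varepsilon + 3x\varepsilon^2 + \varepsilon^3 - x^3}{\varepsilon} \\
  &= 3x^2 + 3x\varepsilon + \varepsilon^2, \\
\operatorname{real\_part}\!\left(3x^2 + 3x\varepsilon + \varepsilon^2\right) &= 3x^2 .
\end{aligned}
```

Every term that still carries a factor of \varepsilon after the division is infinitesimal, so rounding it off leaves 3x^2.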
-
This is a lot more "automatic" than setting up a limit as epsilon approaches zero and proving that the limit converges.
-
It might seem like cheating to just make up infinitesimal hyperreal numbers and say that we can manipulate them with the standard rules of algebra... but it turns out that such algebra rules are logically consistent if and only if the same rules for the real numbers are.
-
Hyperreal numbers formalize some of the intuitions that Newton and Leibniz used without much formal justification in the early development of calculus. Hyperreal numbers stay much closer to the original intuition than earlier formalizations based on limits do.
-
There are also infinite hyperreals, larger than any real number, but I don't personally find them quite as useful for the math I need to do for machine learning research.
-
It is also possible to compute integrals using hyperreal numbers, but I don't personally find as much of an advantage to hyperreal numbers over limits for integration as I do in the case of differentiation.
-
Indeed, hyperreals are cool. My former student Haosui Duanmu and I used them to prove a new complete class theorem in statistical decision theory, essentially completing the 70-year-old problem kicked off by Wald in the 1930s/40s. https://arxiv.org/abs/1612.09305 pic.twitter.com/z15FySQN7w
-
Was always surprised NSA (nonstandard analysis) is not taught in applied maths courses, although it can become very abstract very quickly.
-
Sweet! I found out about this and covered a little in my post on Auto-diffs. https://www.sanyamkapoor.com/machine-learning/autograd-magic/#dual-numbers …. Digesting the math is more concise I guess!
-
"Radically Elementary Probability Theory" by Nelson is old but has a good intro to using infinitesimals in a range of proofs.
-
There's a similar technique using dual numbers. I implemented a demo in this notebook about autodiff if you want to see it in action: https://github.com/ageron/handson-ml/blob/master/extra_autodiff.ipynb …
-
Also, Dual Numbers (a + b ε) where ε^2 = 0; f(x + ε) = f(x) + f’(x) ε; (x + ε)^2 = x^2 + 2xε + ε^2 = x^2 + 2x ε.
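The rule f(x + ε) = f(x) + f'(x) ε can be turned into a small program. Below is a minimal sketch in Python, not taken from any of the linked posts: the `Dual` class and the `derivative` helper are hypothetical names, and only addition, subtraction, and multiplication are supported, so it handles polynomials only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """Hypothetical dual number a + b*eps, where eps**2 == 0."""
    real: float
    eps: float = 0.0

    @staticmethod
    def _lift(other):
        # Treat plain numbers as duals with a zero eps part.
        return other if isinstance(other, Dual) else Dual(float(other))

    def __add__(self, other):
        other = Dual._lift(other)
        return Dual(self.real + other.real, self.eps + other.eps)

    __radd__ = __add__

    def __sub__(self, other):
        other = Dual._lift(other)
        return Dual(self.real - other.real, self.eps - other.eps)

    def __rsub__(self, other):
        return Dual._lift(other) - self

    def __mul__(self, other):
        other = Dual._lift(other)
        # (a + b eps)(c + d eps) = ac + (ad + bc) eps; the eps^2 term is dropped.
        return Dual(self.real * other.real,
                    self.real * other.eps + self.eps * other.real)

    __rmul__ = __mul__


def derivative(f, x):
    """Evaluate f at x + eps and read off the coefficient of eps."""
    return f(Dual(x, 1.0)).eps


if __name__ == "__main__":
    print(derivative(lambda x: x * x, 3.0))              # derivative of x^2 at 3
    print(derivative(lambda x: x * x * x + 2 * x, 2.0))  # derivative of x^3 + 2x at 2
```

Because ε^2 = 0 is built into `__mul__`, no rounding-off step is needed: the coefficient of ε in the result is the derivative.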
-
thanks, I was about to ask about the relationship to similar tricks via dual numbers, https://blog.demofox.org/2014/12/30/dual-numbers-automatic-differentiation/ … etc... are they both equally practical, or suited to certain uses?
-
Mathematically the "problem" with this approach is that what you get is not an ordered field, since ε is not invertible and ε^2 can't be bigger than zero even though ε should be. This leads to Synthetic Differential Geometry instead of NSA.
-
If you're just using them as a bookkeeping device for a program they are fine of course. This kind of ε is restricted to the first derivative (or up to an nth derivative fixed at the beginning), which is good for space but maybe less flexible?
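To illustrate the "nth derivative fixed at the beginning" remark, here is a rough Python sketch of one way to do it, assuming we represent a value as a list of Taylor coefficients in ε and drop every power above a chosen order. The function names (`jet_variable`, `jet_mul`, `derivatives`) are hypothetical, and only multiplication is shown.

```python
import math


def jet_variable(x, order):
    """Coefficients of x + eps, truncated at eps**order (order >= 1)."""
    return [float(x), 1.0] + [0.0] * (order - 1)


def jet_mul(a, b):
    """Multiply two truncated polynomials in eps of the same length.

    Any term whose power of eps exceeds the fixed order is discarded,
    which plays the role of eps**(order + 1) == 0.
    """
    n = len(a)
    out = [0.0] * n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j < n:
                out[i + j] += ai * bj
    return out


def derivatives(coeffs):
    """Convert Taylor coefficients c_k into derivatives k! * c_k."""
    return [math.factorial(k) * c for k, c in enumerate(coeffs)]


if __name__ == "__main__":
    x = jet_variable(2.0, order=2)
    cube = jet_mul(jet_mul(x, x), x)   # x^3 expanded around x = 2
    print(derivatives(cube))           # value, first and second derivative
```

The storage cost is one coefficient per derivative order, chosen up front, which matches the space-versus-flexibility trade-off described above.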
-
Is this the same thing as forward mode automatic differentiation?
-
Is it just me or has automatic differentiation only become widespread in the last couple of years? Definitely not covered in my uni course 15 years ago.

-
I believe it wasn’t usable enough for people to consider it interesting.