Nicholas Teague
@_NicT_
Founder of Automunge, author of essays "From the Diaries of John Henry".
Altamonte Springs, FL · automunge.com · Joined February 2009

Nicholas Teague’s Tweets

A really interesting question partially addressed by my geometric regularization paper was associated with the universality of the double descent phenomenon in high dimensional geometry, and whether we can expect arbitrary systems to experience phase changes of benefits of scale.
I am pretty sure that I read somewhere that on New Year’s Eve you are supposed to listen to an album by the black eyed peas for good luck.
Going forward I will reserve Medium for less technical essays here and there. As my machine learning focus has progressed I finally realized arXiv is probably the better venue, both for forcing higher quality output and for fit with the audience. Coding is fun and invites fewer misinterpretations.
Wishing everyone a merry Christmas and happy holidays and etc. My 2022 partly inspired by the book of Jobs. Looking forward to a new chapter in 2023.
Consumer facing large language models sacrifice the integration of things like reinforcement learning for objective aligned output. Or at least they should. Part of the challenge for adopting a foundation api standard is ensuring a black box api retains such neutrality alignment.
The coolest thing about coding in Rust is that you can have a ginormous code base, but comprehensive validations for a single support function can be conducted independently of the context of surrounding system. (Exception when incorporating custom data structures, still useful.)
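As a minimal sketch of that point, a support function in Rust can carry its own validations in an inline test module, compiled and run independently of the surrounding system. The function and names here are illustrative, not from any particular code base:

```rust
/// Clamp a value into an inclusive range (hypothetical support function).
fn clamp_to_range(value: i64, min: i64, max: i64) -> i64 {
    value.max(min).min(max)
}

fn main() {
    // The function is usable anywhere, but its validations below
    // need no context from the rest of the code base.
    println!("{}", clamp_to_range(42, 0, 10));
}

#[cfg(test)]
mod tests {
    use super::*;

    // `cargo test` runs these in isolation from the wider system.
    #[test]
    fn within_range_is_unchanged() {
        assert_eq!(clamp_to_range(5, 0, 10), 5);
    }

    #[test]
    fn outside_range_clamps_to_bounds() {
        assert_eq!(clamp_to_range(-3, 0, 10), 0);
        assert_eq!(clamp_to_range(42, 0, 10), 10);
    }
}
```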
A counterintuitive property of elevators is that the bigger the room, the more awkward the whole thing is. Where do you stand in relation to other passengers when there are so many choices?
A downside of language model APIs whitewashing for offensive or undesired content is that the Turing test benchmark may be harder to demonstrate. Given all the free publicity for chess, jeopardy, go, etc is most obvious low hanging fruit if any marketing team has the initiative.
My single favorite part of NeurIPS had nothing to do with the conference; it was the chance encounter with a New Orleans performer singing the most amazing rendition of Hallelujah on the shore of the Mississippi.
I expect music generation transformers are going about token encodings all wrong. There is a lot of inherent structure that could be integrated beyond just note and duration, e.g. could vectorize based on scale, chord, and progressions, then apply attention between constituents.
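To make that idea concrete, here is a hedged sketch of what a structured token might look like, with the note factored into scale degree and chord context rather than absolute pitch alone. The types and field names are hypothetical, invented for illustration:

```rust
// Hypothetical structured token for a music transformer: instead of
// encoding (absolute pitch, duration) alone, factor the note into
// scale degree, chord function, and position in a progression, so
// attention can operate between these constituents.
#[derive(Debug, Clone, PartialEq)]
struct NoteToken {
    scale_degree: u8,     // degree within the active scale (0-6)
    chord_function: u8,   // e.g. 0 = tonic, 4 = dominant
    progression_step: u8, // index within the chord progression
    duration_beats: f32,  // duration channel is still retained
}

/// Map a MIDI pitch to its degree in a major scale with the given
/// tonic, or None if the pitch is outside the scale.
fn scale_degree(midi_pitch: u8, tonic: u8) -> Option<u8> {
    // Semitone offsets of the major scale from the tonic.
    const MAJOR: [u8; 7] = [0, 2, 4, 5, 7, 9, 11];
    let pc = (midi_pitch + 12 - tonic % 12) % 12;
    MAJOR.iter().position(|&s| s == pc).map(|i| i as u8)
}

fn main() {
    // E4 (MIDI 64) in C major (tonic MIDI 60) is scale degree 2.
    println!("{:?}", scale_degree(64, 60));
    // C#4 (MIDI 61) is outside C major, so no degree.
    println!("{:?}", scale_degree(61, 60));
}
```

The point of the factored encoding is that a chromatic or out-of-scale note becomes an explicit signal rather than just another pitch index.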
Good note taking is the best memory aid I have come across. Even if Whisper or whatever gadget can transcribe a dialogue ad verbum, the simple act of recording your impressions will mean that you’re more likely to retain it where it matters.
My professional objectives include identifying and implementing means of sustainable competitive advantage. I like to surround myself with novelty and stretch domains because I find it improves my thinking. Truth be told most people in ML are smarter than me. #neurips2022
Looking forward to #NeurIPS2022 this week. Hoping to get caught up on state of art. Currently looking for opportunities at intersection of software engineering and research. Interests include quantum computing, tabular modality, deep learning theory. Please feel free to say hi!👋
The appeal of the tech sector isn’t just the tech, it is the type of people that are themselves drawn to state of the art. It is a great crowd. People are smart, friendly, it’s just an awesome vibe. Don’t know how else to explain it.
Vivendi qui rect prorogat horam, Rusticus expectat dum defluat amnis; at ille Labitur, et labetur in omne volubilis ævum. The man who postpones the hour of living properly is like the peasant waiting for the river to abate; but it flows on, and will go rolling on forever. -Horace
Dataframe preprocessing is obviously less groundbreaking than what most ML researchers focus on. The rationale for my interest was that it was within my circle of competence, was feasible given my resources on hand, and did not require huge compute. No one else was working on it.
One of the highlights of my early career was the opportunity to spend an impromptu day job shadowing an IBM executive in south Florida. The experience made quite an impression. I was unqualified beyond measure, but it inspired me to attempt my first (somewhat laughable) startup.
Academia is not well structured to support the types of multi-year engineering focus that may go into an open source software library. If we don’t encourage and reward such risk taking we are left to rely on infrastructure designed around monetization instead of utilization.
Since I was having difficulty with a speech impediment I wrote a book of essays. It was initially intended to establish a basis for social interest, and eventually morphed into a channel for entrepreneurship and research. At a minimum I hoped to establish credibility and trust. It was fun.
Hey recruiters, it is not too late to have me update the affiliation listing on my groundbreaking paper associated with dataframe preprocessing. Just think of all of that prestige.
One reason that people volunteer to contribute open source software is that it is a form of public service, for giving back to the community and advancing a field. Not all projects become sustainable, but some do. Open source and entrepreneurship are not contradictory.
I have been following quantum computing as a hobby for several years. Read nearly every introductory text offered on Amazon. I recognize my writings on the subject have at times been kind of simplistic; I have tried to write in a manner suitable for a broad audience, to impart an intuition.
Here is the poster for the paper I will be sharing at the #NeurIPS2022 “has it trained yet?” workshop. Not a big deal, just a tabular preprocessing platform offered as infrastructure for the community. AGI it is not, useful is the goal. If you use Pandas or XGBoost, worth a look.
I told both a government official and the patent office that if granted I intended to donate the primary Automunge patent to the public domain. It is MIT licensed open source software. The goal was originally to build a company, short of that hope at least can advance the field.
I had thought about submitting one of my CVPR 2022 essays to the #CVPR2023 main proceedings, but I don’t think they would align with the call for papers. Perhaps there might be a blogging workshop or something, who knows.
Something I discovered subsequent to ICML submittal is that the term “geometric regularization” isn’t novel in the literature. Gropp 2020 used the (non-overlapping) phrase in the context of describing networks learning shapes. I used it b/c I thought it sounded better than “volumetric regularization”.