Yilun Du
@du_yilun
PhD student at MIT, making machine learning tools for Embodied AI. Generative Models / Neural Fields.
Science & Technology · MIT · yilundu.github.io · Joined September 2012

Yilun Du’s Tweets

We've added how to use composable diffusion to compose text-guided 3D shapes from Point-E! Make interesting hybrids of 3D shapes using the demo below! huggingface.co/spaces/Shuang5
Quote Tweet
Compositional Visual Generation with Composable Diffusion Models – ECCV 2022 @nanliuuu @du_yilun Antonio Torralba, Joshua B. Tenenbaum. We present Composable-Diffusion, an approach for compositional visual generation by composing multiple diffusion models together.
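For readers who want the mechanics: the conjunction ("AND") operator behind this demo combines per-concept noise predictions relative to the unconditional one. A minimal sketch, where `toy_denoise` and its shapes are illustrative stand-ins for a real text-conditioned denoiser such as Point-E's, not its actual API:

```python
import torch

def composed_noise(denoise, x, t, concepts, weights):
    """denoise(x, t, c) -> predicted noise; c=None means unconditional."""
    eps_uncond = denoise(x, t, None)
    eps = eps_uncond.clone()
    for c, w in zip(concepts, weights):
        # Each concept pushes the sample toward its own conditional score.
        eps = eps + w * (denoise(x, t, c) - eps_uncond)
    return eps

# Toy stand-in model so the sketch runs end to end.
def toy_denoise(x, t, c):
    shift = 0.0 if c is None else float(hash(c) % 7) / 7.0
    return x * 0.1 + shift

x = torch.randn(1, 3, 64)  # e.g. a noised 3D point cloud
eps = composed_noise(toy_denoise, x, t=10,
                     concepts=["a chair", "a rocket"], weights=[1.5, 1.5])
print(eps.shape)
```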
Excited to share recent work showing that by interpreting an LM as an EBM, we can design de novo proteins that fold in real life!
Quote Tweet
Language models understand natural proteins. But can they generalize beyond, to design completely new proteins from scratch? New preprint: biorxiv.org/content/10.110 A 🧵
[Embedded video]
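A rough sketch of the underlying idea, assuming only that the protein LM's negative log-likelihood is used as an energy: Metropolis MCMC over point mutations then samples low-energy (high-likelihood) sequences. `sequence_energy` below is a toy stand-in, not the paper's actual model:

```python
import math, random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def sequence_energy(seq):
    # Stand-in energy; a real version would be -log p_LM(seq).
    return sum((ord(a) - 65) * 0.01 for a in seq)

def mcmc_design(length=50, steps=2000, temperature=1.0):
    seq = [random.choice(AMINO_ACIDS) for _ in range(length)]
    energy = sequence_energy(seq)
    for _ in range(steps):
        i = random.randrange(length)
        proposal = seq.copy()
        proposal[i] = random.choice(AMINO_ACIDS)
        e_new = sequence_energy(proposal)
        # Metropolis rule: always accept downhill, sometimes uphill.
        if e_new < energy or random.random() < math.exp((energy - e_new) / temperature):
            seq, energy = proposal, e_new
    return "".join(seq), energy

designed, e = mcmc_design()
print(designed, e)
```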
Check out our latest work showing how we can extend NDFs to manipulate multiple objects so that they satisfy specified relations!
Quote Tweet
A system for performing *relational rearrangement*, that achieves both generalization and data efficiency: anthonysimeonov.github.io/r-ndf/ R-NDFs enable 6-DoF rearrangement of unseen object *pairs*, presented in arbitrary initial poses, using just 5-10 task demonstrations.
[Embedded video]
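A simplified sketch of the descriptor-matching idea behind (R-)NDFs: recover an object transform by minimizing the mismatch between descriptors at transformed query points and the descriptors recorded from a demonstration. The descriptor field here is a toy stand-in, and only a translation is optimized rather than a full 6-DoF pose:

```python
import torch

def descriptor_field(points):
    # Stand-in for a trained NDF; a real one conditions on a point cloud.
    return torch.cat([torch.sin(points), torch.cos(2 * points)], dim=-1)

# Query points and the descriptors they had in the demonstration pose.
query = torch.randn(32, 3)
true_offset = torch.tensor([0.3, -0.2, 0.5])
demo_desc = descriptor_field(query + true_offset)

# Optimize a translation so descriptors at the transformed queries
# match the demonstration.
offset = torch.zeros(3, requires_grad=True)
opt = torch.optim.Adam([offset], lr=0.05)
for step in range(300):
    opt.zero_grad()
    loss = (descriptor_field(query + offset) - demo_desc).pow(2).mean()
    loss.backward()
    opt.step()
print(offset.detach(), true_offset)  # recovered vs. true transform
```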
We further show benefits of Decision Diffuser by considering two other conditioning variables: constraints and skills. Conditioning on a constraint or skill during training leads to test-time behaviors that satisfy several constraints together or compose multiple skills. 3/5
[Image]
By modeling a policy as a return-conditional diffusion model, we illustrate how we may circumvent the need for dynamic programming and thereby eliminate many of the complexities of traditional offline RL. 2/5
[Image]
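A minimal sketch of what return-conditioning can look like at sampling time, assuming classifier-free guidance over a trajectory denoiser; `toy_denoise` and the update rule are illustrative stand-ins, not the paper's exact sampler:

```python
import torch

def guided_noise(denoise, traj, t, target_return, w=2.0):
    eps_uncond = denoise(traj, t, None)
    eps_cond = denoise(traj, t, target_return)
    # Classifier-free guidance: extrapolate toward the conditional score.
    return eps_uncond + w * (eps_cond - eps_uncond)

def toy_denoise(traj, t, ret):
    bias = 0.0 if ret is None else 0.1 * ret
    return 0.05 * traj + bias

# One (toy) reverse-diffusion step over a state trajectory.
traj = torch.randn(1, 100, 17)  # batch x horizon x state_dim
eps = guided_noise(toy_denoise, traj, t=50, target_return=1.0)
traj = traj - 0.1 * eps         # stand-in update; real samplers differ
print(traj.shape)
```

Note there is no value function or Bellman backup anywhere in this loop; the desired return enters only as a conditioning variable.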
Excited to share our recent work on Neural Acoustic Fields (NAF)! In a similar way to how NeRF can synthesize novel views in a scene, NAF is a neural field that simulates sound at any location in a scene. Project Page and Code: andrew.cmu.edu/user/afluo/Neu
Quote Tweet
A new machine-learning model simulates how a listener would hear a sound from any point in a room. “I think this work opens up an exciting research direction on better utilizing sound to model the world,” Yilun Du says. mitsha.re/LHHL50LqJAf
[Image: Two figures show the floorplan of the same apartment, with a red dot in the bottom right corner. The right figure includes furniture and room details; the left figure shows how loud the sound emitter is in every room: close to the red dot the area is yellow, and further away it is blue.]
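A hedged sketch of what such a field can look like: an MLP that maps an emitter position, a listener position, and a time-frequency coordinate to the impulse-response log-magnitude at that point. The architecture and input parameterization here are assumptions for illustration, not NAF's exact design:

```python
import torch
import torch.nn as nn

class AcousticField(nn.Module):
    def __init__(self, hidden=256):
        super().__init__()
        # Inputs: emitter xyz + listener xyz + (time, frequency) = 8 dims.
        self.net = nn.Sequential(
            nn.Linear(8, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),  # log-magnitude of one STFT bin
        )

    def forward(self, emitter, listener, tf):
        return self.net(torch.cat([emitter, listener, tf], dim=-1))

field = AcousticField()
emitter = torch.tensor([[1.0, 0.5, 1.2]])
listener = torch.tensor([[3.0, 2.0, 1.2]])
tf = torch.tensor([[0.01, 440.0]])   # time (s), frequency (Hz)
print(field(emitter, listener, tf))  # predicted log-magnitude
```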
Check out our new work on composing large pretrained models together! We illustrate how we can generalize EBM composition (energy-based-model.github.io) across different large pretrained models to solve diverse tasks (reasoning, generation, manipulation, VQA).
Quote Tweet
How can we utilize pre-trained models from different modalities? We introduce a unified framework for composing ensembles of different pre-trained models to solve various multimodal problems in a zero-shot manner. @du_yilun @IMordatch, Antonio, and Josh. energy-based-model.github.io/composing-pret
[Embedded video]
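One way to picture the framework is a generator/scorer loop: one pretrained model proposes candidates and the others act as energy functions that rank them. The three functions below are toy stand-ins for real pretrained models, so this shows the pattern only:

```python
import random

def generator(prompt, n=8):
    # Stand-in for a pretrained generative model proposing candidates.
    return [f"{prompt} (candidate {i}, seed {random.random():.2f})" for i in range(n)]

def lm_energy(text):
    # Stand-in for -log p under a language model.
    return len(text) * 0.01

def vlm_energy(text):
    # Stand-in for an image-text alignment score from a vision-language model.
    return random.random()

def compose(prompt, weights=(1.0, 1.0)):
    candidates = generator(prompt)
    def energy(c):
        # Composed energy: a weighted sum over the scorers.
        return weights[0] * lm_energy(c) + weights[1] * vlm_energy(c)
    return min(candidates, key=energy)

print(compose("a robot stacking blocks"))
```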
Our work on Composable Diffusion at ECCV is now on MIT News! You can try our model here: huggingface.co/spaces/Shuang5 Here's what you get when you enter the prompt: A white church. AND Lightning in the background. AND A flood.
[Image]
Quote Tweet
[Image: Eight images showing "a train on a bridge" and "a river under the bridge," all generated using a new method developed by MIT researchers.]
A new system makes models like DALL-E 2 — an artificial intelligence-based image generator — more creative. CSAIL researchers developed the new method, which uses multiple models to create more complex images with better understanding. mitsha.re/Oynp50KI0Fi
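The AND syntax maps naturally onto the composition operator sketched earlier: split the prompt into concepts and give each its own conditional noise prediction. A minimal, assumed parsing (the demo's internal handling may differ):

```python
prompt = "A white church. AND Lightning in the background. AND A flood."
concepts = [p.strip() for p in prompt.split("AND")]
weights = [1.0] * len(concepts)  # per-concept guidance weights
print(concepts)
# These (concepts, weights) pairs would feed composed_noise(...) above.
```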
[Image]
Quote Tweet
[Embedded video]
Compositional Visual Generation with Composable Diffusion Models – ECCV 2022 @nanliuuu @du_yilun Antonio Torralba, Joshua B. Tenenbaum. We present Composable-Diffusion, an approach for compositional visual generation by composing multiple diffusion models together.
The workshop on Foundation Models for Decision Making is open to submissions! Submit your (preliminary) work (4-9 pages) at the link below! openreview.net/group?id=NeurI
Quote Tweet
We are open for submissions! sites.google.com/corp/view/fmdm I know there are lots of people working on large models, pretraining, cross-domain/agent generalization for RL. Please submit your papers to the 1st FMDM workshop at NeurIPS 2022!
[Image]
Excited to share Composable Diffusion at #ECCV! We present an approach to compositional visual generation that explicitly composes multiple different diffusion models together to generate an image. Code + Models + Paper: energy-based-model.github.io/Compositional-!
Quote Tweet
Compositional Visual Generation with Composable Diffusion Models – ECCV 2022 @nanliuuu @du_yilun Antonio Torralba, Joshua B. Tenenbaum. We present Composable-Diffusion, an approach for compositional visual generation by composing multiple diffusion models together.
[Embedded video]
How can large pre-trained sources of world knowledge help inform decision making? Check out the FMDM workshop (sites.google.com/view/fmdm-neur) at NeurIPS 2022!
Quote Tweet
Interested in foundation models + RL? Keep an eye out for the 1st "Foundation Models for Decision Making" workshop at NeurIPS 2022: sites.google.com/view/fmdm-neur. Call for submissions will soon follow. w. @du_yilun @jparkerholder @siddkaramcheti @IMordatch @shaneguML @ofirnachum
[Image]
Reasoning as energy minimization! One of the key points of my recent position paper on autonomous machine intelligence openreview.net/forum?id=BZ5a1 (and of my 2006 "tutorial on energy-based learning").
Quote Tweet
(1/6) How can we learn to iteratively reason about problems in the world? In our #ICML2022 paper, we introduce an approach to iterative reasoning based on energy minimization: Website: energy-based-model.github.io/iterative-reas w/ @ShuangL13799063 Josh Tenenbaum @IMordatch
[Embedded video]
(5/6) Finally, we may recursively solve sequences of reasoning problems by nesting optimization. Below, we illustrate the predictions of our approach when four separate matrices are added together (first 9 entries of 400 shown).
[Image]
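A toy sketch of the nesting idea: if one energy-minimization reasoner can add two matrices, a sequence of additions is handled by recursively folding that reasoner over the operands. `reason_add` below stands in for the learned inner loop:

```python
import numpy as np

def reason_add(a, b, steps=100, lr=0.1):
    # Toy inner reasoner: gradient descent on E(y) = ||y - (a + b)||^2.
    y = np.zeros_like(a)
    for _ in range(steps):
        y -= lr * 2 * (y - (a + b))
    return y

def reason_sum(matrices):
    # Outer recursion: nest the two-operand reasoner over the sequence.
    result = matrices[0]
    for m in matrices[1:]:
        result = reason_add(result, m)
    return result

ms = [np.random.randn(20, 20) for _ in range(4)]
print(np.abs(reason_sum(ms) - sum(ms)).max())  # error near zero
```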
(4/6) This iterative optimization procedure enables us to solve complex matrix operations such as matrix inversion, addition, and low-rank factorization. Below we illustrate predictions when adding two matrices as we run optimization. Over time, our prediction error drops to zero.
[Embedded GIF]
(3/6) We can solve graph problems such as connected components/shortest paths. We show shortest path predictions over optimization steps (left). We can further predict shortest paths on 30-node graphs at test time when trained on 10-node graphs (IREM refers to our method) (right).
[Embedded GIF]
(2/6) We formulate reasoning as an iterative optimization procedure on an energy landscape, with each step of reasoning an optimization step. This enables us to adaptively use more computation, through more complex optimization procedures, to solve harder problems at test time.
[Embedded GIF]
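A minimal sketch of this formulation: reasoning is gradient descent on the answer y over a learned energy E(x, y), and harder problems simply get more optimization steps. The energy network below is untrained, so this shows the mechanics only:

```python
import torch
import torch.nn as nn

class Energy(nn.Module):
    def __init__(self, dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x, y):
        # Scalar energy of a candidate answer y for problem x.
        return self.net(torch.cat([x, y], dim=-1))

def reason(energy, x, steps=20, step_size=0.1):
    y = torch.zeros_like(x)
    for _ in range(steps):
        y = y.detach().requires_grad_(True)
        grad, = torch.autograd.grad(energy(x, y).sum(), y)
        y = y - step_size * grad  # each reasoning step = optimization step
    return y.detach()

energy = Energy(dim=16)
x = torch.randn(4, 16)           # problem instances
y = reason(energy, x, steps=50)  # more steps -> more test-time computation
print(y.shape)
```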
Check out our work on applying diffusion to data-driven control (#icml2022 long talk)! We cast trajectory optimization as conditional sampling on a diffusion model. This enables test-time adaptation (through conditioning) to tasks such as planning, block stacking, and RL.
Quote Tweet
Can we do model-based RL just by treating a trajectory like a huge image, and training a diffusion model to generate trajectories? Diffuser does exactly this, guiding generative diffusion models over trajectories with Q-values! diffusion-planning.github.io 🧵->
[Embedded GIF]
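A hedged sketch of the sampling loop: treat the whole trajectory as one array, denoise it step by step, and nudge each step with the gradient of a return estimate. Both models below are toy stand-ins (the paper uses a trained trajectory diffusion model and learned Q/return guides):

```python
import torch

def toy_denoise_step(traj, t):
    return traj * 0.95  # stand-in reverse-diffusion step

def toy_return(traj):
    # Stand-in differentiable return: prefer states near the origin.
    return -(traj ** 2).sum()

def guided_sample(horizon=64, dim=6, steps=100, guide_scale=0.05):
    traj = torch.randn(horizon, dim)
    for t in reversed(range(steps)):
        traj = traj.detach().requires_grad_(True)
        grad, = torch.autograd.grad(toy_return(traj), traj)
        # Denoise, then climb the return gradient.
        traj = toy_denoise_step(traj, t) + guide_scale * grad
    return traj.detach()

plan = guided_sample()
print(plan.shape)  # horizon x (state+action) plan
```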
Starting right now on YouTube!
Quote Tweet
Monday (March 28th) at 4PM EST @OriolVinyalsML (@DeepMind) will give an in-person talk at the @MIT_CSAIL Embodied Intelligence seminar about The Deep Learning Toolbox: From AlphaFold to AlphaCode! Check it out in person in 34-101 and the livestream at: youtube.com/channel/UCnXGb
[Image]