Boris Dayma 🖍️
@borisdayma
🎨 Using AI for Art/Memes 👉 Follow me for latest updates on craiyon
craiyon.com
Joined February 2012

Boris Dayma 🖍️’s Tweets

You can now easily search through millions of images generated through the new Craiyon model
Quote Tweet
🚨Announcing Craiyon Search engine! Search through millions of Craiyon's AI generated images and prompts. Link➡️ craiyon.com/search #craiyon #aiart #ai #aiartcommunity #aiartwork
Amazing video 😍 Strongly recommend! I couldn't have thought of a smarter way to gather data to reproduce ChatGPT! Pretty confident it's gonna work 👏
Quote Tweet
It's surprisingly fun to collect data for OpenAssistant - Our open-source alternative to ChatGPT! Check out the video: youtu.be/64Izfm24FKA #openassistant #chatgpt
Excellent thread on training guidelines ❤️ This type of knowledge is essential for training large models well
Quote Tweet
Excited to announce our Deep Learning Tuning Playbook, a writeup of tips & tricks we employ when designing DL experiments. We use these techniques to deploy numerous large-scale model improvements and hope formalizing them helps the community do the same! github.com/google-researc
So glad people had fun with dalle-mini this year. More cool stuff coming in 2023!
Quote Tweet
Replying to @ClementDelangue @borisdayma and 7 others
My favourite this year was dalle-mini by @borisdayma. It was just amazing and really quick, and while almost everyone was waiting for their chance to get access to DALL-E, he let us use it before OpenAI did.
CLIP is a fun model to experiment with once trained because it's visual. It can be challenging to train because the loss needs large batches in which every sample can see all the other samples, so you cannot rely on per-device gradient averaging + accumulation and instead need all-gather operations for the loss.
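The all-gather point can be sketched in JAX as follows. This is a minimal illustration, not the author's training code: the function names `clip_loss` and `global_clip_loss`, the axis name `"batch"`, the temperature value, and the toy shapes are all assumptions made for the example. Each device gathers every other device's embeddings before computing the contrastive loss, so all samples act as negatives for each other.

```python
import functools
import jax
import jax.numpy as jnp

def clip_loss(img, txt, temperature=0.07):
    # Symmetric contrastive loss: matching pairs sit on the diagonal.
    logits = img @ txt.T / temperature
    loss_img = -jnp.mean(jnp.diagonal(jax.nn.log_softmax(logits, axis=1)))
    loss_txt = -jnp.mean(jnp.diagonal(jax.nn.log_softmax(logits, axis=0)))
    return (loss_img + loss_txt) / 2

@functools.partial(jax.pmap, axis_name="batch")
def global_clip_loss(img, txt):
    # all_gather returns (n_devices, per_device_batch, dim) on every device.
    all_img = jax.lax.all_gather(img, "batch")
    all_txt = jax.lax.all_gather(txt, "batch")
    # Flatten to one global batch so the loss sees every sample.
    all_img = all_img.reshape(-1, all_img.shape[-1])
    all_txt = all_txt.reshape(-1, all_txt.shape[-1])
    return clip_loss(all_img, all_txt)

# Toy data (hypothetical sizes): 4 samples per device, 8-dim embeddings.
n_dev = jax.local_device_count()
k_img, k_txt = jax.random.split(jax.random.PRNGKey(0))
img = jax.random.normal(k_img, (n_dev, 4, 8))
txt = jax.random.normal(k_txt, (n_dev, 4, 8))
img = img / jnp.linalg.norm(img, axis=-1, keepdims=True)
txt = txt / jnp.linalg.norm(txt, axis=-1, keepdims=True)
per_device_losses = global_clip_loss(img, txt)  # same value on each device
```

Computing `clip_loss` separately on each device's shard and averaging would instead restrict each sample's negatives to its own shard, which is the limitation the tweet describes.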
Great guide on parallelism with JAX pjit. Glad to see the API getting simpler and more concise. Note that for a large range of smaller models pmap is super easy to use.
Quote Tweet
New Flax guide on auto-parallelism with JAX's pjit! We are adding a guide that shows how to scale up your Flax Module on multiple devices using JAX's latest auto-SPMD API, pjit, and to explore various partition layouts with custom dimension names. flax.readthedocs.io/en/latest/guid
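As a rough illustration of the remark that pmap is easy to use for smaller models, here is a minimal data-parallel gradient computation. The linear model, the names `loss_fn` and `parallel_grad`, the axis name `"devices"`, and the shapes are assumptions for the example only; it replicates the weights, shards the batch across devices, and averages gradients with `jax.lax.pmean`.

```python
import functools
import jax
import jax.numpy as jnp

def loss_fn(w, x, y):
    # Toy linear regression loss standing in for a real model.
    return jnp.mean((x @ w - y) ** 2)

# w is replicated (in_axes=None); x and y are split along their leading axis.
@functools.partial(jax.pmap, axis_name="devices", in_axes=(None, 0, 0))
def parallel_grad(w, x, y):
    grads = jax.grad(loss_fn)(w, x, y)
    # Average the per-device gradients across all devices.
    return jax.lax.pmean(grads, axis_name="devices")

n_dev = jax.local_device_count()
k_w, k_x, k_y = jax.random.split(jax.random.PRNGKey(0), 3)
w = jax.random.normal(k_w, (5,))
x = jax.random.normal(k_x, (n_dev, 16, 5))  # (devices, per-device batch, features)
y = jax.random.normal(k_y, (n_dev, 16))
grads = parallel_grad(w, x, y)  # shape (n_dev, 5), identical on every device
```

Because the shards have equal size, the averaged gradient matches the gradient over the full batch.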
Using transformers for diffusion, awesome results and great detailed ablations! Interesting that the code is released in PyTorch when the model was trained in JAX on TPU pods 🤔 Transformers are ideal in JAX because you can "scan" over the repeated blocks for fast compilation.
Quote Tweet
Scalable Diffusion Models with Transformers abs: arxiv.org/abs/2212.09748 largest DiT-XL/2 models outperform all prior diffusion models on the class conditional ImageNet 512×512 and 256×256 benchmarks, achieving a state-of-the-art FID of 2.27 on the latter
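The "scan" idea mentioned above can be sketched with `jax.lax.scan`. This is an assumed toy setup, not code from the paper or from dalle-mini: `block` is a small residual MLP standing in for a transformer block, and all names and sizes are hypothetical. Stacking the per-layer weights along a leading axis lets JAX trace the block once instead of unrolling it for every layer.

```python
import jax
import jax.numpy as jnp

def block(x, params):
    # Toy residual block standing in for a transformer block.
    w1, w2 = params
    return x + jnp.tanh(x @ w1) @ w2

def init_stacked(key, n_layers, d_model, d_hidden):
    # One array per weight, with a leading axis indexing the layer.
    k1, k2 = jax.random.split(key)
    w1 = jax.random.normal(k1, (n_layers, d_model, d_hidden)) / jnp.sqrt(d_model)
    w2 = jax.random.normal(k2, (n_layers, d_hidden, d_model)) / jnp.sqrt(d_hidden)
    return w1, w2

@jax.jit
def forward_scan(x, stacked):
    # The block body is traced once; scan iterates over the layer axis.
    def step(carry, layer_params):
        return block(carry, layer_params), None
    out, _ = jax.lax.scan(step, x, stacked)
    return out

@jax.jit
def forward_unrolled(x, stacked):
    # Equivalent Python loop: the block is traced once per layer.
    w1, w2 = stacked
    for i in range(w1.shape[0]):
        x = block(x, (w1[i], w2[i]))
    return x

stacked = init_stacked(jax.random.PRNGKey(0), n_layers=4, d_model=8, d_hidden=16)
x = jax.random.normal(jax.random.PRNGKey(1), (2, 8))
y_scan = forward_scan(x, stacked)
y_loop = forward_unrolled(x, stacked)
```

Both functions compute the same output; the difference is that the unrolled version produces a program that grows with the number of layers, which is where the compilation-time saving from scan comes from.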
Really nice talk to watch! Bringing back memories of dalle-mini's birth gave me chills!
Quote Tweet
To inspire you for our just-released Diffusion Models Course 🎓 with @johnowhitaker we are excited to share the free online event with @hardmaru, @deviparikh, @Buntworthy, @robrombach, @pess_r and @multimodalart on Nov 30th at 18h CET🎋 Register here: huggingface.us17.list-manage.com/subscribe?u=7f
Does the number of channels per head matter in an attention layer? I see it's typically 64. Would it make a big difference to go to 32 or 128?
Very cool approach for video editing using atlas + edit layer 😍 Shows how beneficial it is to learn existing techniques and enhance them
Quote Tweet
Played with optimizing Neural Atlases through Stable Diffusion. So much fun! Here are a few examples of video edits: @RafailFridman @DanahYatim
I think something else that can be dangerous with LeCun init is adding absolute position embeddings, which would have a very different initialization scale (due to vocab size vs. max number of positions).
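The scale mismatch can be checked numerically with a small sketch. It assumes both embedding tables are initialized directly with `jax.nn.initializers.lecun_normal()`, whose default takes fan-in from the second-to-last axis (the number of rows of the table); the sizes below are hypothetical. Other libraries or embedding layers may define fan-in differently, so this only illustrates the case described in the tweet.

```python
import jax
import jax.numpy as jnp

# Hypothetical sizes chosen for illustration.
vocab_size, max_positions, d_model = 8192, 128, 64

# LeCun normal: variance = 1 / fan_in, with fan_in = number of rows here.
init = jax.nn.initializers.lecun_normal()
k_tok, k_pos = jax.random.split(jax.random.PRNGKey(0))
tok_emb = init(k_tok, (vocab_size, d_model), jnp.float32)
pos_emb = init(k_pos, (max_positions, d_model), jnp.float32)

tok_std = float(jnp.std(tok_emb))  # about 1 / sqrt(vocab_size)
pos_std = float(jnp.std(pos_emb))  # about 1 / sqrt(max_positions)
ratio = pos_std / tok_std          # about sqrt(vocab_size / max_positions) = 8
print(tok_std, pos_std, ratio)
```

With these sizes the position embeddings start out roughly eight times larger than the token embeddings, so their sum is dominated by position information at initialization.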