I learned a valuable lesson today: don't sign up for a brand new Hotmail, upload a 3GB file to Microsoft OneDrive, and post it to the Orange Site front page.
Anyways, I had to point the download link back to my DigitalOcean Spaces...
Zhuowei Zhang
@zhuowei
link in bio

Zhuowei Zhang’s Tweets
New blog post: worthdoingbadly.com/bsky/
I downloaded all 1,680,399 posts on Bluesky as of 2023-05-01.
Here's my dump of all the posts on Bluesky as of 2023-05-01:
1680399 posts and 45457 accounts in a Postgres database.
keets-org.nyc3.digitaloceanspaces.com/bsky_20230501.
Bluesky's official search API is accessible even if you don't have an account.
I tried calling the API, formatting the results nicely, and linking them to Amazingca's unofficial Bluesky viewer (which doesn't have its own search yet):
zhuoweizhang.net/bskysearch/
Apparently federated posts DO show up in Bluesky's search right now, just without usernames: e.g. boobee.blue's creator got a post from that self-hosted instance showing in Bluesky search:
If you're interested in Bluesky's services, crt.sh/?q=bsky.dev and crt.sh/?q=bsky.social might be interesting: the Certificate Transparency logs show some subdomains for these domains, corresponding to each component of a Bluesky/Atproto server
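A minimal sketch of pulling those subdomains out programmatically. It assumes crt.sh's JSON output (`output=json`), where each entry has a `name_value` field holding newline-separated names. The function names are my own, and the parsing is kept apart from the fetch so it can be checked offline:

```python
import json
import urllib.parse
import urllib.request


def extract_subdomains(entries, domain):
    """Collect unique hostnames under `domain` from crt.sh-style JSON entries."""
    names = set()
    for entry in entries:
        # Assumption: each entry carries newline-separated names in "name_value".
        for name in entry.get("name_value", "").splitlines():
            name = name.strip().lower()
            if name.startswith("*."):
                name = name[2:]  # fold wildcard certs into their base name
            if name == domain or name.endswith("." + domain):
                names.add(name)
    return sorted(names)


def fetch_crtsh(domain):
    """Query crt.sh for certificates matching `domain` (needs network access)."""
    query = urllib.parse.urlencode({"q": domain, "output": "json"})
    with urllib.request.urlopen("https://crt.sh/?" + query, timeout=30) as resp:
        return json.load(resp)
```

Something like `extract_subdomains(fetch_crtsh("bsky.social"), "bsky.social")` should then list the hostnames that have shown up in certificates.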
It looks like Bluesky hosts a BGS (Big Graph Server, the server responsible for crawling all federated servers) at bgs.bsky.social.
What... what happens if you ask it to index your server: would it show up anywhere?
23027 accounts now: halfway there.
What should I do with this data once I get it?
"Tangled" made only half the money of "Frozen" for one simple reason:
Rapunzel is right handed; Elsa is left handed.
Thus, "Wish" is going to be a smash hit with left-handed protagonist Asha.
"Well, actually, Rapunzel is ambidextr" Don't you DARE ruin my box office headcanon
Update: I am an idiot and the above tweet is wrong; it has four beats, not three. (That'll teach me to buy music on sale at 75% off...)
Using AI to separate songs into their stems (voice/drums/instruments) is like Inspect Element on music.
I'm going to enjoy learning how songs are put together.
(Example: Facebook's Hybrid Demucs separated the "Wish" trailer quite well, speaking parts and all!)
I'll let it run overnight, then dump the PostgreSQL database. Not sure what to do with it yet.
I was going to try to make a third-party viewer, but it turns out @thatamazingca already made an amazing Bluesky viewer (blue.amazingca.dev), so I'm not sure what I can do better...
OK, first 100 accounts done in... 9 minutes and 3 seconds. This is going to take a _week_!
Anyone know any way to make it go faster?
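A rough back-of-the-envelope check of that pace, assuming the rate stays constant and using the 54,246-account total mentioned elsewhere in this thread. The function name is my own:

```python
def estimate_days(sample_accounts, sample_seconds, total_accounts):
    """Extrapolate total crawl time in days from a timed sample."""
    seconds_per_account = sample_seconds / sample_accounts
    return total_accounts * seconds_per_account / 86_400  # seconds per day


# 100 accounts took 9 min 3 s; 54,246 accounts is the total quoted in this thread.
days = estimate_days(100, 9 * 60 + 3, 54_246)
print(f"{days:.1f} days")
```

At a constant 5.43 seconds per account this works out to several days of continuous downloading; accounts with long histories would push it higher.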
I'm now downloading the posts from all 54,246 accounts, one at a time. This'll take a while.
I wanted to make a joke about a "VBA Package Manager":
I discovered there are already three package managers for VBA.
Why
(Me singing along to the Disney "Wish" trailer: youtu.be/ctlz0R1tSZE?t=)
I won't let this "Docker" contain me
Why would I want a Kubernetes cluster?
I don't need a cloud to scale me
Just one little server is enough!
No YA-YA-YAML! (Oh, no!)
No YA-YA-YAML!
In Disney's "Wish" (their next animated musical), the main character's obligatory "I wish" song sounds like a waltz, with 3 beats / measure.
The only other "I wish" song with this structure: "Waiting on a Miracle" from Encanto.
What did you do to Disney, Lin-Manuel Miranda?!
OK, plan "D" for trying to get a Bluesky one-way viewer thingy without melting my cloud server:
- Don't bother downloading old posts
- When receiving notifications of new posts/replies/etc, ignore if target isn't known to my server (nobody's searched for them)
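The filtering step in that plan can be sketched roughly as follows. The event shape (a dict with a `target` account ID) and the class and method names are hypothetical, not taken from the Atproto codebase:

```python
from dataclasses import dataclass, field


@dataclass
class OneWayViewer:
    """Keeps only events that touch accounts someone has searched for."""

    known: set = field(default_factory=set)      # account IDs this server tracks
    stored: list = field(default_factory=list)   # events that passed the filter

    def track(self, account):
        """Called when someone searches for an account."""
        self.known.add(account)

    def on_event(self, event):
        """Store the event if its target is known; otherwise drop it."""
        # Assumption: every event is a dict with a "target" account ID.
        if event.get("target") not in self.known:
            return False
        self.stored.append(event)
        return True
```

Old posts are never backfilled here: an account's history starts at the moment someone first searches for it.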
I downloaded 107 accounts' full histories and gave up. I have 30,239 Bluesky posts (14MB) in a CSV:
It takes one whole minute (!!!!!!!!) to download all of an account's posts into my own Bluesky Atproto/bsky server.
This is a problem if I want to federate with another server and want to see its old posts...
There are currently 50321 accounts registered on Bluesky.
(curl https://bsky.social/xrpc/com.atproto.sync.listRepos?limit=1000 and keep counting)
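The "keep counting" part can be automated. A minimal sketch, assuming each response is JSON with a `repos` list and an optional `cursor` to pass back for the next page; the helper names are my own, and the fetch is injectable so the loop can be tried without hitting the server:

```python
import json
import urllib.parse
import urllib.request

LIST_REPOS = "https://bsky.social/xrpc/com.atproto.sync.listRepos"


def fetch_page(cursor=None, limit=1000):
    """Fetch one page of repos from the endpoint (needs network access)."""
    params = {"limit": limit}
    if cursor:
        params["cursor"] = cursor
    url = LIST_REPOS + "?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def count_repos(fetch=fetch_page):
    """Follow cursors until the server stops returning one, summing repos."""
    total, cursor = 0, None
    while True:
        page = fetch(cursor)
        repos = page.get("repos", [])  # assumption: accounts are listed under "repos"
        total += len(repos)
        cursor = page.get("cursor")
        if not cursor or not repos:
            return total
```

Calling `count_repos()` walks every page and returns the total number of accounts seen.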
OK, this is weird: Bluesky bsky repo sync hangs unless I make it single threaded (put an await in github.com/bluesky-social).
But with single threaded, it downloads posts at 1/second...
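One middle ground between fully parallel (which hangs) and one-at-a-time might be a small cap on in-flight downloads. A minimal Python sketch of the idea using `asyncio.Semaphore`; this is not the code in the Bluesky repo, `download` is a hypothetical stand-in for the per-account sync call, and whether the server tolerates any concurrency at all is untested:

```python
import asyncio


async def sync_all(accounts, download, max_in_flight=4):
    """Run `download(account)` for every account, at most `max_in_flight` at once."""
    gate = asyncio.Semaphore(max_in_flight)

    async def worker(account):
        async with gate:  # wait for a free slot before starting
            return await download(account)

    # gather returns results in the same order as `accounts`
    return await asyncio.gather(*(worker(a) for a in accounts))
```

Setting `max_in_flight=1` reproduces the single-threaded behaviour, so the cap can be raised gradually to find where the hang begins.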
did some digging and found this, may be of use? github.com/KingYoSun/atpr
(KingYoSun runs boobee.blue)
Check out Bluesky posts by jay.bsky.social & jack.bsky.social with this 3rd party viewer:
blue.amazingca.dev
Seriously considering making a read-only Bluesky mirror, where you can read Bluesky posts without an invite, but not log in.
Anyone down?
As far as I can tell, Bluesky's main instance does not support federation yet.
Federation requires a server to connect to BGS (Big Graph Server) and download a firehose of all posts on the network.
I don't think Bluesky's main instance does this yet.
OK, I'm an idiot. bsky didn't work because I set my domain name to be non-localhost and it was trying to hit my actual server instead of my computer...
New plan:
Run Bluesky / Atproto PDS + Bsky + BGS all locally; only PLC gets pointed at Bluesky's shared instance. See if that works
(PDS and PLC were there the last time I looked at Atproto/Bluesky last October. Bsky and BGS are new)
PDS:
- Connects to PLC to register/look up usernames
- Connects to BGS to subscribe to all events/posts from federated servers
- Connects to Bsky to render the timeline/user profile pages/posts