Conversation

This Tweet was deleted by the Tweet author.
Thanks Megan! That is helpful. Also, did you see my tweet about the structure of YouTube IDs? I'm not entirely convinced that their IDs are truly random. If there were any structure to the IDs, that might give clues as to the level of activity for a time range.
That's interesting -- my assumption was that they're random. I think that's the assumption the aforementioned paper relies on as well... I think it's entirely possible that the IDs are randomly selected since the ID space is so large, but I have no clue if that's true...
In my previous tweet, I showed that the 11-character IDs are a slightly modified base64 representation of a 64-bit integer. We know Twitter uses the Snowflake algorithm to embed time data plus server/node and datacenter IDs, so I'm wondering if YouTube's ID scheme uses something similar.
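A rough sketch of what that looks like in Python, assuming the "slightly modified" base64 is the URL-safe alphabet (`-` and `_` in place of `+` and `/`) with the trailing `=` dropped. The function names are made up for illustration. The Snowflake split uses Twitter's published layout (41-bit millisecond timestamp, 5-bit datacenter, 5-bit worker, 12-bit sequence); whether YouTube's integers carry any fields like that is exactly the open question, so nothing here claims they do.

```python
import base64

# Twitter's Snowflake epoch in milliseconds (2010-11-04).
TWITTER_EPOCH_MS = 1288834974657


def youtube_id_to_int(video_id: str) -> int:
    """Decode an 11-character ID into a 64-bit integer.

    Assumes URL-safe base64 with the '=' padding stripped.
    """
    if len(video_id) != 11:
        raise ValueError("expected an 11-character ID")
    raw = base64.urlsafe_b64decode(video_id + "=")  # 8 bytes
    return int.from_bytes(raw, "big")


def int_to_youtube_id(value: int) -> str:
    """Inverse of youtube_id_to_int, under the same assumption."""
    raw = value.to_bytes(8, "big")
    return base64.urlsafe_b64encode(raw).decode("ascii").rstrip("=")


def split_snowflake(snowflake: int) -> dict:
    """Split a Twitter Snowflake ID into its embedded fields."""
    return {
        "timestamp_ms": (snowflake >> 22) + TWITTER_EPOCH_MS,
        "datacenter_id": (snowflake >> 17) & 0x1F,
        "worker_id": (snowflake >> 12) & 0x1F,
        "sequence": snowflake & 0xFFF,
    }
```

If the IDs had Snowflake-like structure, sorting the decoded integers by upload date would show the high bits increasing over time. If they are random, the high bits should look uniform.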
For me it makes sense to aggregate data at the channel level. A large-scale crawl could use "subscribed lists" and get the largest component of the graph. A good place to start is Social Blade data. I crawled some (72M videos).