and then converted the int64 to a string of 1s and 0s to examine each bit position. When I looked at a generic sample, I saw very little deviation from 0.5 for each bit (meaning each bit had close to a 50% probability of being a 1 or a 0), which is what we would expect
if the bits were uniformly distributed. However, I then took a different sample in which the published time of each video had a specific seconds value (for example, a second ending in 1), so that the sample was made up of correlated timestamp values. When I ran the
distribution for each bit, I found bits well outside the expected range around 0.5, and those bits always had the same value when the sample was restricted to specific timestamps. What this means is that YouTube IDs appear to have structure that correlates with the published time. It
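
A minimal sketch of the per-bit check described above, assuming the IDs have already been turned into 64-bit integers. The function names, the 0.05 tolerance, and the synthetic samples are illustrative assumptions and do not come from the thread. The "fixed" sample only imitates a timestamp-correlated sample by pinning four bits; it is not real YouTube data.

```python
import random


def bit_frequencies(values, width=64):
    """For each bit position (most significant first), return the
    fraction of values in which that bit is set."""
    counts = [0] * width
    total = 0
    mask = (1 << width) - 1  # also maps negative int64 values to two's complement
    for value in values:
        bits = format(value & mask, f"0{width}b")  # string of 1s and 0s
        for position, bit in enumerate(bits):
            if bit == "1":
                counts[position] += 1
        total += 1
    if total == 0:
        raise ValueError("need at least one value")
    return [count / total for count in counts]


def biased_positions(frequencies, tolerance=0.05):
    """Bit positions whose set-frequency is further than `tolerance` from 0.5."""
    return [i for i, f in enumerate(frequencies) if abs(f - 0.5) > tolerance]


if __name__ == "__main__":
    rng = random.Random(0)
    # Stand-in for a generic sample: every bit is an independent coin flip.
    generic = [rng.getrandbits(64) for _ in range(20000)]
    # Stand-in for a correlated sample: the top four bits are pinned to 1010.
    fixed = [(0b1010 << 60) | rng.getrandbits(60) for _ in range(20000)]
    print("generic:", biased_positions(bit_frequencies(generic)))
    print("fixed:  ", biased_positions(bit_frequencies(fixed)))
```

With real data, the interesting comparison would be between the positions flagged for a generic sample and those flagged for a sample filtered to one published-time value.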
researchers to walk through what I'm doing to see if this is indeed what I suspected at first -- that YouTube IDs are like Twitter's Snowflake algorithm, but more obfuscated. WOW -- this would be a HUGE finding because we can reduce the ID space substantially and associate