
and then converted the int64 to a string of 1s and 0s to examine each bit position. When I looked at a generic sample, I saw very little deviation from 0.5 for each bit (meaning each bit had close to a 50% probability of being a 1 or a 0), which we would expect
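
A rough sketch of that per-bit check, under stated assumptions: each ID is taken to be 11 base64url characters that decode to 8 bytes (one int64), and the sample here is synthetic random data standing in for real IDs. The names `id_to_bits`, `bit_frequencies`, and `random_id` are hypothetical, chosen for illustration, and are not from the thread.

```python
import base64
import secrets


def id_to_bits(video_id: str) -> str:
    """Decode an 11-character base64url ID into a 64-character string of 1s and 0s."""
    # Assumption: 11 characters plus one '=' of padding decode to exactly 8 bytes.
    raw = base64.urlsafe_b64decode(video_id + "=")
    return format(int.from_bytes(raw, "big"), "064b")


def bit_frequencies(video_ids):
    """For each of the 64 bit positions, return the fraction of IDs with that bit set."""
    counts = [0] * 64
    total = 0
    for vid in video_ids:
        for pos, bit in enumerate(id_to_bits(vid)):
            counts[pos] += bit == "1"
        total += 1
    if total == 0:
        raise ValueError("need at least one ID")
    return [c / total for c in counts]


def random_id() -> str:
    """Synthetic stand-in for a real ID: 8 random bytes, base64url, padding stripped."""
    return base64.urlsafe_b64encode(secrets.token_bytes(8)).decode().rstrip("=")


# Generic (here: purely random) sample; every position should sit near 0.5.
sample = [random_id() for _ in range(20000)]
freqs = bit_frequencies(sample)
print("largest deviation from 0.5:", max(abs(f - 0.5) for f in freqs))
```

Running the same `bit_frequencies` call on a sample restricted to one narrow range of published times, and comparing it position by position against the generic result, is one way to look for the kind of fixed bits described in this thread.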
distribution for each bit, I found bits well outside the expected distribution, and those bits were always the same when using a sample with specific timestamps. What this means is that YouTube IDs appear to have structure that correlates with the published time. It
appears they took the binary data for the timestamp and placed the bits in specific positions to obfuscate it enough that the IDs would appear random when analyzing a generic sample that isn't correlated with anything specific (like the published time). I need to get some
researchers to walk through what I'm doing to see if this is indeed what I suspected at first -- that YouTube IDs are like Twitter's Snowflake algo but more obfuscated. WOW -- this would be a HUGE finding because we can reduce the ID space substantially and associate