A few weeks ago I was hosting our IT team’s weekly podcast, Full Frontal Nerdity, and we talked about a viral video "breaking" YouTube. You can listen to the podcast by visiting www.ffntech.com. It's a little hyperbolic to say that YouTube was broken, but South Korea's favorite rapper, Psy, did create an interesting problem. Apparently his music video for "Gangnam Style" was so incredibly popular that it accrued over two billion views.
Yes, that's billion with a "b." I'm sure I personally account for 2 or 3 of those, but that sure is a lot of people watching an Asian man impersonate a horse! The issue with the two billion views is that YouTube's counter, which keeps track of how many hits each video receives, caps out at 2,147,483,647. After that, it doesn't know what to do. And that's the part that broke YouTube.
Two Billion, One Hundred & Forty-Seven Million, Four Hundred & Eighty-Three Thousand, Six Hundred & Forty-Seven. Whew!
So why that number? 2,147,483,647 seems like a very specific but random number, does it not? It's because YouTube uses a 32-bit integer to store the view counts. It sounds intimidating, but you don't have to be a computer programmer to understand what it means, and I want to take a few minutes in this article to break it down.
It all boils down to how computers count. As humans, we usually use a Base 10 counting system, otherwise known as decimal. It's probably because we're born with 10 fingers (well... most of us are). So as a species we like the number 10 in general. We count up… and every time we reach the number 10 we "roll over" one number and keep on counting. That's why we don't say "twenty-eleven." Instead, when we get to "twenty-ten" we call it "thirty" and start over with the ones column. It's also why most of the world uses the metric system: it's easier to think in Base 10.
Counting in Binary
Computers can't count in Base 10. It seems like a simple thing, but not for computers. They use a Base 2 system, also known as binary. A computer starts counting at 0, gets to 1... and then has to roll that one into the next column and start over, the same way we do when we get to 9. Each of those columns is called a bit in technical parlance. One bit can only hold one of two values, like this:
0 = 0
1 = 1
There are only two states that bit can exist in: 0 or 1. It's either on or off. Picture it as a light switch that can only be up or down. But if we add a second light switch (bit), the computer can now represent 4 different values (counting from 0 to 3), like this:
00 = 0
01 = 1
10 = 2
11 = 3
The nerdiest joke you’ll ever hear goes like this: “There are only 10 types of people in this world. Those who understand binary and those who don’t!” Groan. We can go further by adding one more bit, which will now let us represent 8 different values:
000 = 0
001 = 1
010 = 2
011 = 3
100 = 4
101 = 5
110 = 6
111 = 7
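If you'd rather let a computer write out those tables, here is a minimal Python sketch that lists every pattern for a given number of light switches. The function name `bit_patterns` is made up for this illustration.

```python
def bit_patterns(num_bits):
    """Return every on/off pattern for num_bits bits, paired with its decimal value."""
    rows = []
    for value in range(2 ** num_bits):
        # format(..., "03b") writes the value in binary, padded with leading zeros
        pattern = format(value, f"0{num_bits}b")
        rows.append((pattern, value))
    return rows


# Reproduce the 3-bit table, one "pattern = value" row per line
for pattern, value in bit_patterns(3):
    print(pattern, "=", value)
```

Change the 3 to a 4 and the table grows to 16 rows, from 0000 to 1111.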
As you can see, every time we add one more bit to the mix, it doubles how many numbers we can count. 8 bits are called a byte and can represent 256 unique values (2 to the power of 8), 9 bits can represent 512, 10 bits can represent 1,024, and so on. These numbers get enormous very quickly. When YouTube added the view counter to their site, they (incorrectly) assumed nobody could create a video that would attract more than 2,147,483,647 views. So they told the database that stores this information, "dedicate 32 bits (light switches) to remember how many times the video has been seen." And then Psy destroyed that by singing "heeeeeeeeeey, sexy lady!"
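To see the doubling for yourself, here is a short Python sketch. The helper name `unique_values` is only for illustration; all it does is raise 2 to the number of bits.

```python
def unique_values(num_bits):
    """How many different values num_bits light switches can represent."""
    return 2 ** num_bits


# Each extra bit doubles the count
for bits in (1, 2, 3, 8, 9, 10, 32):
    print(f"{bits:>2} bits -> {unique_values(bits):,} values")
```

The last line of output shows that 32 bits give 4,294,967,296 different values in total.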
I'm Gonna Need to See Your John Hancock
There's one last thing I want to say about 32-bit integers. If you were paying close attention, and following along with a calculator (nerd alert!), you may have noticed that 32 bits give us 4,294,967,296 unique values (2 to the power of 32), so a 32-bit integer should actually count up to 4,294,967,295, NOT 2,147,483,647. The reason the number is cut in half is a thing called signed integers.
The problem with plain binary counting is that it doesn't allow us to represent negative numbers. Say YouTube wants to perform a calculation: there were 100 views today, but 200 yesterday. What happens if we subtract yesterday's views from today's? Well, that comes out to -100. That's easy for us humans, but an unsigned integer has no way of saying that, so things break again. A signed integer, on the other hand, says that the leftmost bit is only there to signify whether the number is positive or negative. So that turns an 8-bit integer (capable of counting from 0 to 255) into a 7-bit number plus a sign bit (now capable of counting from -127 to 127). Real computers use a slightly cleverer variation on this idea called two's complement, which squeezes in one extra negative value, so an 8-bit signed integer actually runs from -128 to 127 and a 32-bit one from -2,147,483,648 to 2,147,483,647.
I know it's a little confusing, but it does make sense, and it's very important in programming. If a program gets the number 101 in binary, it doesn't know whether that means 5 (unsigned) or -1 (with a simple sign bit) in decimal unless it has been told which kind of integer it's looking at. Many programmers default to signed integers for that reason; that way everyone can assume their programs can count into negative numbers if necessary.
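Here is a rough Python sketch of what signed integers look like on most modern hardware. It assumes the two's complement scheme, which is a refinement of the plain sign-bit idea described above, so the 8-bit range it reports is -128 to 127. The helper names are made up for this example, and the article doesn't say how YouTube's counter actually behaved at the limit, so treat the last line as an illustration of a generic signed 32-bit counter.

```python
def signed_range(num_bits):
    """Smallest and largest values of a two's complement signed integer."""
    return -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1


def to_signed(raw, num_bits):
    """Read the low num_bits of raw as a two's complement signed value."""
    raw &= (1 << num_bits) - 1           # keep only the bits that fit
    if raw >= 1 << (num_bits - 1):       # leftmost bit is on: negative number
        return raw - (1 << num_bits)
    return raw


print(signed_range(8))    # (-128, 127)
print(signed_range(32))   # (-2147483648, 2147483647)

# One more view than a signed 32-bit counter can hold wraps around
print(to_signed(2_147_483_647 + 1, 32))   # -2147483648
```

Python's own integers never overflow, which is why the sketch has to mask off the extra bits by hand to imitate a fixed-width counter.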
It's Safe to Watch Again
(But for the love of everything that's holy, please don't)
So in the end, YouTube changed the code on the backend of their website and made the counter a 64-bit integer. This means that people can keep on watching Gangnam Style to their hearts' content, and YouTube won't have to change the code again until it reaches 9,223,372,036,854,775,807 views, which works out to about 1,317,624,576 views for each of the roughly 7 billion people on the planet. And nobody can handle that much Psy!
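For the skeptics, the arithmetic can be checked in a few lines of Python. The world population figure of 7 billion is my assumption here; it's the round number that reproduces the per-person figure quoted above.

```python
# Largest value a signed 64-bit integer can hold: 63 bits for the number itself
INT64_MAX = 2 ** 63 - 1

# Assumption: roughly 7 billion people on the planet
WORLD_POPULATION = 7_000_000_000

views_per_person = INT64_MAX // WORLD_POPULATION

print(f"{INT64_MAX:,}")          # 9,223,372,036,854,775,807
print(f"{views_per_person:,}")   # 1,317,624,576
```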
At Bit-Wizards we're experts at building high-availability websites that are robust, stylish, revenue-generating, and can withstand your video going viral.