Doshi's blog


Little piece of *nix trivia

~3 min read

Korbak, a friend of mine, wrote a joke article about Jeff Bezos, presenting the “Jeff Bezos bomb”, a little shell script. He didn’t think of it as a real bomb; the name was only a joke. But, as it turns out…

Introduction to the “problem”

Here is the original code:

for i in {1..150000000000}; do
   echo "Jeff Bezos"
done

When writing that, Korbak thought it would run as if it were equivalent to the following:

i=0
while [ "$i" -lt 150000000000 ]; do
	echo "Jeff Bezos"; i=$((i + 1))
done

That is to say, it would increment a variable i 150000000000 times, printing “Jeff Bezos” each time.

Such code is harmless: even if it keeps a whole core busy, I have 4 of those, so it’s not a problem. Incrementing a variable is easy.
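For comparison, bash also has a C-style arithmetic loop that keeps a single counter around instead of a list of words. Here is a minimal sketch, assuming bash (this syntax is not plain POSIX sh) and a deliberately tiny bound so that it finishes instantly; `limit` and `count` are names made up for this example.

```bash
# Hypothetical small bound; the script in this article uses 150000000000.
limit=5
count=0
# Only the counter i lives in memory; no list of numbers is built.
for ((i = 1; i <= limit; i++)); do
    echo "Jeff Bezos"
    count=$((count + 1))
done
echo "printed $count lines"
```

Memory use here does not depend on `limit`; only the running time does.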


Actually, running Korbak’s script on my machine really did make it crash [1].

Why doesn’t it behave as we expected?

Here’s my conjecture:

With that {} syntax, bash computes the full list beforehand. That computation eats all of the RAM, and that’s what made my computer crash.
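This is easy to observe at a safe scale. The sketch below assumes bash and uses a tiny range; `word_count` is just a name picked for the example.

```bash
# Brace expansion happens before the command runs, so `set` receives
# five separate words here, not the literal string "{1..5}".
set -- {1..5}
word_count=$#
echo "$word_count words: $*"   # prints: 5 words: 1 2 3 4 5
```

With the range from the script, bash would try to build 150 billion such words before running a single `echo`.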

A list of all the integers from 1 to 150,000,000,000, counting each integer as a 4-byte value [2], would take 150,000,000,000 × 4 = 600,000,000,000 bytes = 600 gigabytes.

Yeah, computing that entire list would take up 600 GB of RAM. That’s a hundred times more than I have on my machine, so no surprise it crashed :)
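The back-of-the-envelope figure can be double-checked with shell arithmetic. This only redoes the estimate, it measures nothing; `count`, `bytes_per_int` and `total_bytes` are names chosen for this example, and it assumes a shell with 64-bit arithmetic.

```bash
count=150000000000      # upper bound of the range in the script
bytes_per_int=4         # the 4-byte assumption made above
total_bytes=$((count * bytes_per_int))
total_gb=$((total_bytes / 1000000000))
echo "$total_bytes bytes"   # prints: 600000000000 bytes
echo "$total_gb GB"         # prints: 600 GB
```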

“But why does it do that, it’s stupid ‽”

Well, here goes another one of my conjectures:

That brace notation {} doesn’t only accept numeric ranges; I could for instance combine it with a file pattern, like this:

    for img in *.{jpg,png}; do   # expands to *.jpg *.png, then to the matching files
      # do something
      :
    done

It has to compute the list of image files before entering the loop, so I think it does that for every word in the list, not just for candidates for brace expansion.
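A small experiment seems to support this. The sketch below works in a throwaway directory made with `mktemp -d` and adds new matching files while the loop is running; the file names and the variables `iterations` and `matches_after` are invented for the example.

```bash
tmp=$(mktemp -d)
touch "$tmp/a.jpg" "$tmp/b.jpg"
iterations=0
for img in "$tmp"/*.jpg; do
    # This new file matches *.jpg too, but the loop never visits it:
    # the word list was fixed before the first iteration.
    touch "$img.copy.jpg"
    iterations=$((iterations + 1))
done
set -- "$tmp"/*.jpg
matches_after=$#
echo "iterations: $iterations, matching files afterwards: $matches_after"
rm -rf "$tmp"
```

The loop runs twice even though four files match the pattern by the time it ends.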

Yeah, very vague answer to the question, but hey

EDIT 2018-20-10 :

Another way of writing an efficient loop like that would be the following:

    seq 150000000000 | while read i; do
      echo "Jeff Bezos"
    done

One would think that the seq command has to finish before the while loop starts, but that’s not the case, since there is a pipe! The pipe turns the problem into an asynchronous task, a perfect example of a “producer-consumer problem”: the seq command feeds data into the pipe, and the while loop reads it from the pipe as it arrives. You can learn more on that here. Also, a great example of the power of the pipe in action is the tar pipe.
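One way to convince yourself that seq does not have to finish first is to put a consumer that stops early on the other side of the pipe. This is only a sketch: `head` stands in for the while loop, and `first_three` is a name chosen for the example.

```bash
# seq is asked for 150 billion numbers, but head exits after three lines;
# seq is then stopped by the closed pipe instead of running to completion.
first_three=$(seq 150000000000 | head -n 3)
echo "$first_three"
```

This returns immediately and prints 1, 2 and 3 on separate lines. If seq had to produce its whole output before the reader could start, the command would take ages.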

  1. Yeah, I tried to run it on my machine, because I had made the same assumption that it would only increment a variable.

  2. Usually, that is true. But 4 bytes only go up to 2,147,483,647, which is less than our final value of 150,000,000,000. So we would have to use either a mix of int32 and int64 or an entire array of int64s, but my point still holds even with int32, keep reading. Furthermore, bash doesn’t treat them as ints but as strings, so they take way more space.

There are no comments on this blog, but feel free to send me your thoughts on this article (or on anything else, for that matter). Whether it's via Mastodon, Twitter, XMPP or even by email, I'll be delighted to see that people actually read me.