Little piece of *nix trivia
— programming, bash, computing, english — ~3 min read
Korbak, a friend of mine, wrote a joke article about Jeff Bezos, presenting the Jeff Bezos bomb, a little shell script. He didn’t think of it as a real bomb; the name was only a joke. But, as it turns out…
Introduction to the “problem”
Here is the original code:
#!/bin/bash
for i in {1..150000000000}
do
echo "Jeff Bezos"
done
When writing that, Korbak thought it would run as if it were the same as the following:
i=0
# Count up one step at a time; only a single integer is ever held in memory.
while [ "$i" -lt 150000000000 ]; do
    echo "Jeff Bezos"
    ((i++))
done
That is to say, it would increment a variable i 150,000,000,000 times, printing “Jeff Bezos” each time.
Such code is harmless: even if it keeps a whole core busy, I have 4 of those, so it’s not a problem. Incrementing a variable is easy.
BUT
Actually, running Korbak’s script on my machine really did crash it. [1]
Why doesn’t it behave as we expected
Here’s my conjecture :
With that {} syntax, bash computes the full list beforehand. That computation eats all of the RAM, and that’s what made my computer crash.
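To see the expansion happen before anything runs, here is a minimal sketch on a small range. `count_words` is a made-up helper of mine, not something from Korbak’s script; it only reports how many arguments it received.

```bash
#!/bin/bash
# count_words is a hypothetical helper: it prints the number of
# arguments it was called with, nothing more.
count_words() {
    echo "$#"
}

# bash expands {1..5} into the five words 1 2 3 4 5 *before*
# count_words is called, so the function sees five arguments.
count_words {1..5}

# Same thing with a bigger (but still safe) range.
count_words {1..100000}
```

By the time the function body starts, bash has already turned the braces into separate words. A `for` loop gets its word list the same way, only here with 150 billion words instead of five.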
A list of all the integers from 1 to 150,000,000,000, treating each integer as a 4-byte value [2], would take 150,000,000,000 × 4 = 600,000,000,000 bytes = 600 gigabytes.
Yeah, holding that entire list would take up 600 GB of RAM. That’s a hundred times more than I have on my machine, so no surprise it crashed :)
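The back-of-the-envelope figure can be checked with bash’s own arithmetic. This is only a sketch of the estimate above: the 4 bytes per integer is the simplifying assumption from the text, not what bash really stores, and the variable names are mine.

```bash
#!/bin/bash
count=150000000000   # number of items in the range
bytes_per_int=4      # assumption from the text: one 32-bit integer each

# bash arithmetic is 64-bit, so this product does not overflow.
total_bytes=$((count * bytes_per_int))

echo "$total_bytes bytes"
echo "$((total_bytes / 1000**3)) GB"
```

Since bash keeps each number as a string rather than a packed integer, the real footprint would be even bigger than this estimate.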
“But why does it do that, it’s stupid ‽”
Well, here goes another one of my conjectures :
That brackety notation {} doesn’t only allow numbers in it; I could for instance write something like this:
#!/bin/bash
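# A lone {*.jpg} is not brace-expanded (braces need a comma or a .. range),
# so list the extensions instead: {jpg,jpeg} expands first, then each glob.
for img in *.{jpg,jpeg}
do
    # do something with "$img"
    :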
done
Bash has to compute the list of jpg files before entering the loop, I guess, so I think it builds the whole list up front for everything, whatever kind of expansion produces it.
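One way to poke at that guess is to create new matching files while the loop is running and see whether the loop notices them. This is a small experiment of mine, run in a throwaway directory from `mktemp -d`; `demo_snapshot` and the file names are made up for the occasion.

```bash
#!/bin/bash
# demo_snapshot is a hypothetical helper: it loops over *.jpg in a
# temporary directory while creating more .jpg files inside the loop.
demo_snapshot() {
    local tmp img count=0 total
    tmp=$(mktemp -d) || return 1
    touch "$tmp/a.jpg" "$tmp/b.jpg"

    for img in "$tmp"/*.jpg; do
        # This new file also matches *.jpg, but the loop's word list
        # was expanded once, before the first iteration.
        touch "$tmp/copy-of-$(basename "$img")"
        count=$((count + 1))
    done

    # Expand the glob again to count what is on disk now.
    set -- "$tmp"/*.jpg
    total=$#

    rm -rf "$tmp"
    echo "iterations=$count files=$total"
}

demo_snapshot
```

The loop only goes around twice even though four files match by the end, which fits the idea that the list is fully built before the loop body ever runs.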
Yeah, very vague answer to the question, but hey
EDIT 2018-20-10 :
Another way of writing an efficient loop like that would be the following :
#!/bin/bash
seq 150000000000 | while read i; do
echo "Jeff Bezos"
done
One would think that we would have to complete the seq command before moving on to the while loop, but that’s not the case, since there is a pipe!
The pipe turns the problem into an asynchronous task, a perfect example of a “producer-consumer problem”. The seq command feeds data into the pipe, while the while loop reads data coming out of the pipe as it goes along. You can learn more on that here. Also, a great example of the power of the pipe in action is the tar pipe.
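A quick way to convince yourself that seq is not run to completion first: stop reading after a few lines and check that the whole thing returns right away. This is a sketch, with `first_lines` being a name I made up; it assumes a seq that copes with numbers this large, as the GNU one does.

```bash
#!/bin/bash
# first_lines is a hypothetical helper: it consumes only the first
# three numbers that seq produces, then stops reading.
first_lines() {
    seq 150000000000 | while read -r i; do
        echo "Jeff Bezos $i"
        # Leave the loop early; the read end of the pipe is then closed
        # and seq is stopped the next time it tries to write.
        if [ "$i" -ge 3 ]; then
            break
        fi
    done
}

first_lines
```

If seq had to generate all 150 billion lines before the loop could start, this would never finish in any reasonable time. Instead it prints three lines and exits.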
[1] Yeah, I tried to run it on my machine because I had made the same assumption, that it would only increment a variable.
[2] Usually, that is true. But 4 bytes only go up to 2,147,483,647, which is less than our final value of 150,000,000,000. So we would have to use either a mix of int32 and int64, or an entire array of int64s, but my point is still valid if we use int32, keep reading. Furthermore, bash doesn’t treat them as ints but as strings, so it takes way more space.
There are no comments on this blog, so if you want to react to this article, feel free to come and talk to me about it on the Fediverse, Twitter or by email. Kisses!