I found a fascinating post last week on /r/ProgrammerHumor: optimize later.
Here's the original image:
Yes, this piece of code sorts a specific string by randomizing it infinitely and comparing it to the desired result. Once the desired result has been achieved, it prints the string and stops execution.
Without even getting into why one would need to sort a known string and compare it to a known result before returning, I want to look into the (in)efficiency of the code in question.
Let's start with some Python code that operates more/less the same way:
#!/usr/bin/env python
import random
import sys
string_source = "typewriter"
string_target = "eeiprrttwy"
for i in xrange(0, sys.maxint):
string_list = list(string_source)
random.shuffle(string_list)
string_result = "".join(string_list)
if string_result == string_target:
break
if i == sys.maxint - 1:
i = 0
Since simply rewriting the code in Python isn't particularly interesting, we should run it a few times and keep track of some numbers:
#!/usr/bin/env python
import random
import sys
string_source = "typewriter"
string_target = "eeiprrttwy"
runs = []
for run in xrange(0, 100):
for i in xrange(0, sys.maxint):
string_list = list(string_source)
random.shuffle(string_list)
string_result = "".join(string_list)
if string_result == string_target:
runs.append(i)
break
if i == sys.maxint - 1:
i = 0
print(runs)
The raw result is a 100-element list full of hilariously large numbers:
[269549, 137496, 240377, 484345, 673976, 1613987, 92846, 352143, 1136790, 2108562, 876595, 642806, 323282, 1513, 109430, 591031, 515341, 49572, 1079214, 1718, 336642, 85076, 187450, 34702, 1220284, 684380, 1053765, 1793595, 300108, 1459472, 636922, 276027, 201474, 706707, 492543, 1462999, 425504, 330468, 539478, 563176, 623868, 245793, 7367, 23726, 623156, 353938, 6188, 33855, 1148584, 7989, 840779, 52267, 236986, 429023, 672752, 986512, 345828, 987815, 987144, 26443, 1232278, 2447360, 276129, 319292, 1180522, 99683, 627885, 348592, 828685, 181212, 113842, 194562, 1433020, 352353, 1437230, 60914, 712325, 286642, 348062, 516207, 539552, 154839, 65763, 17699, 36090, 911231, 26066, 1898790, 407310, 625485, 77086, 453708, 1246709, 154345, 844968, 395168, 181864, 13757, 566216, 282737]
After 100 runs in 4 minutes, 47 seconds, we have a high of 2,447,360, low of 1,513, mean of 546,275, median of 374,553, and average run time of 2.87 seconds . These are numbers of shuffles necessary to sort via randomization.
If it's still not obvious why this is an absolutely appalling way to sort strings, consider this far simpler solution (again in Python):
#!/usr/bin/env python
string_source = "typewriter"
string_result = "".join(sorted(string_source))
print(string_result)
This gives us 100 runs in 0.019 seconds, for an average run time of 0.00019 seconds, which is roughly 15,105.26 times faster.
When you're sorting things, make sure you're, uhh, actually sorting them…
The presumed author of the code is Beck3570 and I'm not 100% sure if the whole thing (along with their previous posts) is a joke or not.
Image saved from Imgur.
Thanks for reading! You can keep up with my writing via the feed or newsletter, or you can get in touch via email or Mastodon.