[HN Gopher] Recent Performance Improvements in Function Calls in...
       ___________________________________________________________________
        
       Recent Performance Improvements in Function Calls in CPython
        
       Author : rbanffy
       Score  : 53 points
       Date   : 2024-08-08 19:23 UTC (3 hours ago)
        
 (HTM) web link (blog.codingconfessions.com)
 (TXT) w3m dump (blog.codingconfessions.com)
        
       | winrid wrote:
       | AFAIK this is because CPython has to walk the scope up to find
       | the import for every call in your loop, and still applies to
       | python3, right? You can still use the built-in min, just create a
       | "closer" reference before your loop for the same speedup:
       | 
       | inline_min = min
       | 
       | while expr:
       |     if inline_min(blah):
        
         | boxed wrote:
         | That's not how imports work in python at all.
         | 
         | Imports are super imperative operations that first try to
         | find the module in sys.modules, and otherwise execute the
         | module and put the resulting module in sys.modules. Then they
         | grab the symbols you asked for and shove all those symbols
         | into the global dict for the module.
         | 
         | It does have to walk the locals and the globals scopes (there
         | are ONLY exactly two scopes in Python!) to find the function,
         | that's true.
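
A minimal sketch (not from the thread) for poking at the lookup described above, assuming CPython and only the standard `dis`, `builtins`, and `sys` modules. The names `smaller_global`, `smaller_local`, and `opnames` are made up for illustration. One detail worth checking for yourself: `min` is a builtin, so a miss in the module globals falls through to the builtins namespace.

```python
import builtins
import dis
import math
import sys


def smaller_global(a, b):
    # `min` is not a local name here, so the compiler emits a
    # global-style lookup: module globals first, then builtins.
    return min(a, b)


def smaller_local(a, b, _min=min):
    # Hypothetical variant: `_min` is bound once as a default argument,
    # so each call reads it from a local slot instead.
    return _min(a, b)


def opnames(func):
    """Return the opcode names the compiler emitted for func."""
    return [ins.opname for ins in dis.get_instructions(func)]


# An import leaves the module object cached in sys.modules.
print(sys.modules["math"] is math)
# `min` lives in the builtins module.
print(builtins.min is min)
# Compare which functions contain a LOAD_GLOBAL instruction.
print("LOAD_GLOBAL" in opnames(smaller_global))
print("LOAD_GLOBAL" in opnames(smaller_local))
```

Whether the local alias is measurably faster depends on the interpreter version, so this only shows that the bytecode differs, not that it matters.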
        
           | winrid wrote:
           | > It does have to walk the locals and the globals scopes
           | 
           | right, this is what I meant.
        
           | winrid wrote:
           | Also I imagine the dict lookup for the module is the slow
           | part? So declaring in local scope just removes a dict lookup?
           | 
           | I am by no means a python expert :) I just use it
           | occasionally.
        
           | hansvm wrote:
           | TIL. Apparently `nonlocal` and closures are implemented with
           | copies, at least for my copy of dis.dis().
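
A small sketch (not from the thread) of what closures hold on to, assuming CPython. As far as I can tell, what gets copied into the frame is a reference to a cell object rather than the value itself, so two closures over the same variable see each other's updates. `make_counter`, `bump`, and `peek` are hypothetical names.

```python
def make_counter():
    count = 0

    def bump():
        # `nonlocal` rebinds the variable stored in the enclosing cell.
        nonlocal count
        count += 1
        return count

    def peek():
        # Reads the same cell that bump() writes to.
        return count

    return bump, peek


bump, peek = make_counter()
bump()
bump()

# Both inner functions list `count` as a free variable...
print(bump.__code__.co_freevars, peek.__code__.co_freevars)
# ...and both closures point at the very same cell object.
print(bump.__closure__[0] is peek.__closure__[0])
print(peek(), bump.__closure__[0].cell_contents)
```

Running `dis.dis(bump)` on your own interpreter shows which opcodes are involved; the exact names vary between CPython versions.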
        
         | ipsum2 wrote:
         | Benchmark it, it won't be any faster.
         | 
         | Using the if statement: Execution time: 0.00648 seconds
         | 
         | Using the min function directly: Execution time: 0.02298
         | seconds
         | 
         | Using the min function with an intermediate variable: Execution
         | time: 0.02959 seconds
        
           | necovek wrote:
           | How about using the `min` builtin directly over the entire
           | list?
        
           | winrid wrote:
           | Sure, let's benchmark it :)
           | 
           | It is consistently around 8% to 15% faster on 3.10.12 and
           | 3.11 for me. On 3.12.5 (latest) I seem to get the same
           | result.
           | 
           | https://www.online-python.com/B6AgKW5zod
           | 
           | (please copy the code to your local to not ddos this site :D)
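
Since the two sets of numbers above disagree, here is a rough sketch for reproducing the comparison locally. It is not the code behind either result; the function names and the list size are assumptions, and timings will vary by machine and Python version.

```python
import random
import timeit

heights = [random.randint(0, 10000) / 100 for _ in range(10_000)]


def with_if(values):
    # Plain comparison, no function call per element.
    smallest = values[0]
    for v in values:
        if v < smallest:
            smallest = v
    return smallest


def with_global_min(values):
    # Looks up `min` through globals/builtins on every iteration.
    smallest = values[0]
    for v in values:
        smallest = min(v, smallest)
    return smallest


def with_local_min(values):
    # Same as above, but `min` is aliased to a local name first.
    local_min = min
    smallest = values[0]
    for v in values:
        smallest = local_min(v, smallest)
    return smallest


if __name__ == "__main__":
    for fn in (with_if, with_global_min, with_local_min):
        elapsed = timeit.timeit(lambda: fn(heights), number=100)
        print(f"{fn.__name__:16s} {elapsed:.4f} s")
```

Run it a few times before drawing conclusions; a single run of a microbenchmark like this is noisy.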
        
       | necovek wrote:
       | I wonder how simply doing a `return min(heights)` would compare
       | to any of the options given?
       | 
       | (It sure doesn't demonstrate the improvements between interpreter
       | versions, but that's the classic, Python way of optimizing: let
       | builtins do all the looping)
        
         | necovek wrote:
         | I was curious myself, so I ran a quick benchmark:
         | 
         | import random
         | import timeit
         | 
         | heights = [random.randint(0, 10000)/100 for i in range(10000)]
         | 
         | def benchmark1(heights):
         |     smallest = heights[0]
         |     count = len(heights) - 1
         |     while count > 0:
         |         if heights[count] < smallest:
         |             smallest = heights[count]
         |         count -= 1
         |     return smallest
         | 
         | def benchmark1b(heights):
         |     a = 1
         |     b = len(heights) - 1
         |     min_height = heights[0]
         |     while a < b:
         |         if heights[a] < min_height:
         |             min_height = heights[a]
         |         a += 1
         |     return min_height
         | 
         | def benchmark2(heights):
         |     smallest = heights[0]
         |     count = len(heights) - 1
         |     while count > 0:
         |         smallest = min(heights[count], smallest)
         |         count -= 1
         |     return smallest
         | 
         | print(timeit.timeit('min(heights)', number=1000,
         |                     globals={'heights': heights}))
         | print(timeit.timeit('benchmark1(heights)', number=1000,
         |                     globals={'heights': heights,
         |                              'benchmark1': benchmark1}))
         | print(timeit.timeit('benchmark1b(heights)', number=1000,
         |                     globals={'heights': heights,
         |                              'benchmark1b': benchmark1b}))
         | print(timeit.timeit('benchmark2(heights)', number=1000,
         |                     globals={'heights': heights,
         |                              'benchmark2': benchmark2}))
         | 
         | Here are the results in Python 3.11:
         | 0.04471710091456771
         | 0.21777329698670655
         | 0.22779683792032301
         | 0.6679719020612538
         | 
         | So, using min over the whole list is ~5x faster than the
         | hand-written loop, counting a single variable down to a
         | constant 0 is ~5% faster than using two variables for the
         | bounds, and using min inside the loop instead of the if check
         | is about 3 times slower again. So the old approach of looking
         | for opportunities to use a builtin instead of looping still
         | likely "wins" in the newer interpreters too, but if someone's
         | got a 3.14 alpha up, I'd love to see the results.
         | 
         | I might install 3.13 to check it out there too.
        
       | L-four wrote:
       | Loops should be avoided in python. Only constant time operations
       | should be performed.
        
         | kaldah wrote:
         | Careful! Discussing Python performance could provoke the
         | Steering Council to issue a Fatwa.
        
       | hoten wrote:
       | So there's only three super-instructions? I wonder if there are
       | plans for more.
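
One way to see which instructions your own interpreter emits is to disassemble a tiny function. This is only a sketch using the standard `dis` module; `add` and `opnames` are hypothetical names, and the exact opcode names are version-dependent. My understanding is that newer CPython releases may show a fused load for `a` and `b`, while older ones show two separate `LOAD_FAST` instructions, but check the output rather than taking that on faith.

```python
import dis
import sys


def add(a, b):
    # Two local loads back to back: a natural candidate for fusing.
    return a + b


def opnames(func):
    """Return the opcode names the compiler emitted for func."""
    return [ins.opname for ins in dis.get_instructions(func)]


print(sys.version_info[:2], opnames(add))
```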
        
       ___________________________________________________________________
       (page generated 2024-08-08 23:00 UTC)