Can you explain this difference of depth recursion in Python using those seemingly equivalent codes?

Can you explain this difference of depth recursion in Python using those seemingly equivalent codes?

6

I noticed that on my machine, the following reaches the max of depth recursion for n = 2960:

m = {0:0, 1:1}
def f(n):
    if n not in m:
        m[n] = f(n - 1) + f(n - 2)
    return m[n]

while this version reaches it for n = 988:

m = {0:0, 1:1}
def f(n):
    if n not in m:
        m[n] = sum(f(n - i) for i in [1, 2])
    return m[n]

Can anyone explain what happens ‘under the hood’ that explains such difference?

More precisely, it would be very nice to understand why the factor is 3 in this example and be able to derivate it for summations with more terms.

Share
Improve this question

New contributor

dormeur20092010 is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.

7

  • 4

    It sounds like there are two extra function calls on the stack for each recursive call. Calling sum() accounts for one extra stack frame. My guess is that sum() performs another function call internally, or perhaps evaluating the generator expression takes an additional stack frame. It could be the PyIter_Next call here.

    – John Kugelman

    10 hours ago

  • You can inspect the size of the stack with inspect.stack() which will show the second one does create a substantially larger stack with each call. I think this supports the comment above from @JohnKugelman.

    – Mark

    10 hours ago

  • @JohnKugelman No, sum() does not perform another function call internally, but it's the generator expression that makes that extra function. Switch the generator expression to a list comprehension and that extra function is removed: sum([f(n - i) for i in [1, 2]])

    – blhsing

    10 hours ago

  • 1

    @juanpa.arrivillaga Perhaps I worded it imprecisely. I meant that by using a list comprehension the extra function call is not performed simultaneously with the call of sum(), i.e. the list comprehension function is removed from the call stack before the call of sum().

    – blhsing

    10 hours ago

  • 1

    @blhsing ah, that is a good point!

    – juanpa.arrivillaga

    10 hours ago

3 Answers
3

Reset to default

Highest score (default)

Trending (recent votes count more)

Date modified (newest first)

Date created (oldest first)

3

TL;DR: sum is an extra function call, and the generator expression it is summing over is also implemented with a nested function scope (docs)

… the comprehension is executed in a separate implicitly nested scope. This ensures that names assigned to in the target list don’t “leak” into the enclosing scope.

Therefore the second way uses two extra stack frames during the recursion, as well as the recursive call itself, which is explaining the factor of 3 here.

Note that the default recursion limit is 1000, so you should really be seeing the stack overflow at exactly 1000 for the first case, and at 334 for the second case (on Python 3.10 or lower). To get 2960 and 988 here, you may have:

  • imported something which called sys.setrecursionlimit(3000), and
  • spent a few stack frames already, somewhere.

Running this code within an IDE or a tricked-out interactive REPL, for example, will likely have done both of the above.


Interesting recent developments:

There is PEP 709 – Inlined comprehensions which is all about removing this hidden function from comprehensions. Generator expressions are currently not inlined in the reference implementation for the PEP, although it may be considered in future.

CPython 3.11 already has some optimizations to reduce the number of function calls in this code, which changes the point of stack overflow. It will happen at 501 instead of 334 in CPython 3.11. The reason is described in the changelog at What’s New In Python 3.11: Inlined Python function calls.

Share
Improve this answer

1

  • "Generator expressions are currently not inlined in the reference implementation for the PEP, although it may be considered in future." – even then, generator inlining would be restricted to cases where the generator isn't exposed to other functions, like [x for x in (i**2 for i in l) if x < 1000]. Passing the generator to sum would prevent inlining.

    – user2357112

    8 hours ago

1

I’ll add my debugging here as well. After running some tests, I found that the traceback module was giving me a call stack that was significantly (~333) less than the recursion limit. I noticed that this difference was always close to the number of <genexpr> invocations on the call stack:

import sys
import traceback
from collections import Counter
from pprint import pprint

sum_every_n = int(sys.argv[1])

def f(n=0):
    try:
        if n % sum_every_n == 0:
            return sum(f(n+i) for i in [1,2])
        else:
            return f(n+1) + f(n+2)
    except RecursionError:
        stack = traceback.extract_stack()
        counts = Counter([frame.name for frame in stack])

        stack_size = len(stack)
        stack_size_plus = stack_size + counts['<genexpr>']

        pprint(counts)
        print(f'rec limit:      {sys.getrecursionlimit()}')
        print(f'stack size:     {stack_size}')
        print(f'adj stack size: {stack_size_plus}')
        sys.exit()

f()

Here are the results for some values of sum_every_n:

$ ./rec3.py 1
Counter({'f': 333, '<genexpr>': 332, '<module>': 1})
rec limit:      1000
stack size:     666
adj stack size: 998

$ ./rec3.py 2
Counter({'f': 499, '<genexpr>': 249, '<module>': 1})
rec limit:      1000
stack size:     749
adj stack size: 998

$ ./rec3.py 3
Counter({'f': 599, '<genexpr>': 200, '<module>': 1})
rec limit:      1000
stack size:     800
adj stack size: 1000

$ ./rec3.py 4
Counter({'f': 665, '<genexpr>': 166, '<module>': 1})
rec limit:      1000
stack size:     832
adj stack size: 998

It seems that the generator is indeed adding a function call to the stack, but also that each generator expression counts as two functions on the call stack. This explains the ratio in your original question. Not sure why this is the case though, and I welcome any possible explanations!

Share
Improve this answer

0

A fascinating behavior, I did not get to the root of the issue but I still would like to share some logging code that I wrote that might help others find the issue:

m = {0:0, 1:1}
def f(n, depth=0):
 print(f"{'  '*depth}f({n})")
 if n not in m:
  print(f"{'  '*depth}Computing f({n-1}) + f({n-2})")
  m[n] = sum(f(n - i, depth+1) for i in [1, 2])
 else:
  print(f"{'  '*depth}Already computed f({n})")
 return m[n]

print("Testing the version with the iterator, call stack is:")
f(10)

m = {0:0, 1:1}
def g(n, depth=0):
 print(f"{'  '*depth}g({n})")
 if n not in m:
  print(f"{'  '*depth}Computing g({n-1}) + g({n-2})")
  m[n] = g(n - 1, depth+1) + g(n - 2, depth+1)
 else:
  print(f"{'  '*depth}Already computed g({n})")
 return m[n]

print("Testing the version with the normal sum, call stack is:")
g(10)

The output is:

Testing the version with the iterator, call stack is:
f(10)
Computing f(9) + f(8)
  f(9)
  Computing f(8) + f(7)
    f(8)
    Computing f(7) + f(6)
      f(7)
      Computing f(6) + f(5)
        f(6)
        Computing f(5) + f(4)
          f(5)
          Computing f(4) + f(3)
            f(4)
            Computing f(3) + f(2)
              f(3)
              Computing f(2) + f(1)
                f(2)
                Computing f(1) + f(0)
                  f(1)
                  Already computed f(1)
                  f(0)
                  Already computed f(0)
                f(1)
                Already computed f(1)
              f(2)
              Already computed f(2)
            f(3)
            Already computed f(3)
          f(4)
          Already computed f(4)
        f(5)
        Already computed f(5)
      f(6)
      Already computed f(6)
    f(7)
    Already computed f(7)
  f(8)
  Already computed f(8)
Testing the version with the normal sum, call stack is:
g(10)
Computing g(9) + g(8)
  g(9)
  Computing g(8) + g(7)
    g(8)
    Computing g(7) + g(6)
      g(7)
      Computing g(6) + g(5)
        g(6)
        Computing g(5) + g(4)
          g(5)
          Computing g(4) + g(3)
            g(4)
            Computing g(3) + g(2)
              g(3)
              Computing g(2) + g(1)
                g(2)
                Computing g(1) + g(0)
                  g(1)
                  Already computed g(1)
                  g(0)
                  Already computed g(0)
                g(1)
                Already computed g(1)
              g(2)
              Already computed g(2)
            g(3)
            Already computed g(3)
          g(4)
          Already computed g(4)
        g(5)
        Already computed g(5)
      g(6)
      Already computed g(6)
    g(7)
    Already computed g(7)
  g(8)
  Already computed g(8)

So they look to be doing the exact same thing… my Python version is 3.8.15, You should try running this code to see if the output is the same also with your Python version.

On my Python version, I hit the recursion limit for f(333) and g(997)


Edit: thanks to John Kugelman I am getting closer to the answer, with the code:

import time
import inspect

m = {0:0, 1:1}
def f(n, depth=0):
    # fraction part of time
 print(f"{'  '*depth}f({n})")
 if n not in m:
  print(f"{'  '*depth}Computing f({n-1}) + f({n-2})")
  m[n] = sum(f(n - i, depth+1) for i in [1, 2])
 else:
  print(f"{'  '*depth}Already computed f({n})")
 print(f"{'  '*depth}Returning {m[n]}")
 print(f"{'  '*depth}Call stack has length {len(inspect.stack())} and is: {[x.function for x in inspect.stack()]}")
 return m[n]

print("Testing the version with the iterator, call stack is:")
start = time.time()
f(10)

m = {0:0, 1:1}
def g(n, depth=0):
 print(f"{'  '*depth}g({n})")
 if n not in m:
  print(f"{'  '*depth}Computing g({n-1}) + g({n-2})")
  m[n] = g(n - 1, depth+1) + g(n - 2, depth+1)
 else:
  print(f"{'  '*depth}Already computed g({n})")
 print(f"{'  '*depth}Returning {m[n]}")
 print(f"{'  '*depth}Call stack has length {len(inspect.stack())} and is: {[x.function for x in inspect.stack()]}")
 return m[n]

print("Testing the version with the normal sum, call stack is:")
g(10)

import sys
print(sys.version)

The output is:

Testing the version with the iterator, call stack is:
f(10)
Computing f(9) + f(8)
  f(9)
  Computing f(8) + f(7)
    f(8)
    Computing f(7) + f(6)
      f(7)
      Computing f(6) + f(5)
        f(6)
        Computing f(5) + f(4)
          f(5)
          Computing f(4) + f(3)
            f(4)
            Computing f(3) + f(2)
              f(3)
              Computing f(2) + f(1)
                f(2)
                Computing f(1) + f(0)
                  f(1)
                  Already computed f(1)
                  Returning 1
                  Call stack has length 20 and is: ['f', '<genexpr>', 'f', '<genexpr>', 'f', '<genexpr>', 'f', '<genexpr>', 'f', '<genexpr>', 'f', '<genexpr>', 'f', '<genexpr>', 'f', '<genexpr>', 'f', '<genexpr>', 'f', '<module>']
                  f(0)
                  Already computed f(0)
                  Returning 0
                  Call stack has length 20 and is: ['f', '<genexpr>', 'f', '<genexpr>', 'f', '<genexpr>', 'f', '<genexpr>', 'f', '<genexpr>', 'f', '<genexpr>', 'f', '<genexpr>', 'f', '<genexpr>', 'f', '<genexpr>', 'f', '<module>']
                Returning 1
                Call stack has length 18 and is: ['f', '<genexpr>', 'f', '<genexpr>', 'f', '<genexpr>', 'f', '<genexpr>', 'f', '<genexpr>', 'f', '<genexpr>', 'f', '<genexpr>', 'f', '<genexpr>', 'f', '<module>']
                f(1)
                Already computed f(1)
                Returning 1
                Call stack has length 18 and is: ['f', '<genexpr>', 'f', '<genexpr>', 'f', '<genexpr>', 'f', '<genexpr>', 'f', '<genexpr>', 'f', '<genexpr>', 'f', '<genexpr>', 'f', '<genexpr>', 'f', '<module>']
              Returning 2
              Call stack has length 16 and is: ['f', '<genexpr>', 'f', '<genexpr>', 'f', '<genexpr>', 'f', '<genexpr>', 'f', '<genexpr>', 'f', '<genexpr>', 'f', '<genexpr>', 'f', '<module>']
              f(2)
              Already computed f(2)
              Returning 1
              Call stack has length 16 and is: ['f', '<genexpr>', 'f', '<genexpr>', 'f', '<genexpr>', 'f', '<genexpr>', 'f', '<genexpr>', 'f', '<genexpr>', 'f', '<genexpr>', 'f', '<module>']
            Returning 3
            Call stack has length 14 and is: ['f', '<genexpr>', 'f', '<genexpr>', 'f', '<genexpr>', 'f', '<genexpr>', 'f', '<genexpr>', 'f', '<genexpr>', 'f', '<module>']
            f(3)
            Already computed f(3)
            Returning 2
            Call stack has length 14 and is: ['f', '<genexpr>', 'f', '<genexpr>', 'f', '<genexpr>', 'f', '<genexpr>', 'f', '<genexpr>', 'f', '<genexpr>', 'f', '<module>']
          Returning 5
          Call stack has length 12 and is: ['f', '<genexpr>', 'f', '<genexpr>', 'f', '<genexpr>', 'f', '<genexpr>', 'f', '<genexpr>', 'f', '<module>']
          f(4)
          Already computed f(4)
          Returning 3
          Call stack has length 12 and is: ['f', '<genexpr>', 'f', '<genexpr>', 'f', '<genexpr>', 'f', '<genexpr>', 'f', '<genexpr>', 'f', '<module>']
        Returning 8
        Call stack has length 10 and is: ['f', '<genexpr>', 'f', '<genexpr>', 'f', '<genexpr>', 'f', '<genexpr>', 'f', '<module>']
        f(5)
        Already computed f(5)
        Returning 5
        Call stack has length 10 and is: ['f', '<genexpr>', 'f', '<genexpr>', 'f', '<genexpr>', 'f', '<genexpr>', 'f', '<module>']
      Returning 13
      Call stack has length 8 and is: ['f', '<genexpr>', 'f', '<genexpr>', 'f', '<genexpr>', 'f', '<module>']
      f(6)
      Already computed f(6)
      Returning 8
      Call stack has length 8 and is: ['f', '<genexpr>', 'f', '<genexpr>', 'f', '<genexpr>', 'f', '<module>']
    Returning 21
    Call stack has length 6 and is: ['f', '<genexpr>', 'f', '<genexpr>', 'f', '<module>']
    f(7)
    Already computed f(7)
    Returning 13
    Call stack has length 6 and is: ['f', '<genexpr>', 'f', '<genexpr>', 'f', '<module>']
  Returning 34
  Call stack has length 4 and is: ['f', '<genexpr>', 'f', '<module>']
  f(8)
  Already computed f(8)
  Returning 21
  Call stack has length 4 and is: ['f', '<genexpr>', 'f', '<module>']
Returning 55
Call stack has length 2 and is: ['f', '<module>']
Testing the version with the normal sum, call stack is:
g(10)
Computing g(9) + g(8)
  g(9)
  Computing g(8) + g(7)
    g(8)
    Computing g(7) + g(6)
      g(7)
      Computing g(6) + g(5)
        g(6)
        Computing g(5) + g(4)
          g(5)
          Computing g(4) + g(3)
            g(4)
            Computing g(3) + g(2)
              g(3)
              Computing g(2) + g(1)
                g(2)
                Computing g(1) + g(0)
                  g(1)
                  Already computed g(1)
                  Returning 1
                  Call stack has length 11 and is: ['g', 'g', 'g', 'g', 'g', 'g', 'g', 'g', 'g', 'g', '<module>']
                  g(0)
                  Already computed g(0)
                  Returning 0
                  Call stack has length 11 and is: ['g', 'g', 'g', 'g', 'g', 'g', 'g', 'g', 'g', 'g', '<module>']
                Returning 1
                Call stack has length 10 and is: ['g', 'g', 'g', 'g', 'g', 'g', 'g', 'g', 'g', '<module>']
                g(1)
                Already computed g(1)
                Returning 1
                Call stack has length 10 and is: ['g', 'g', 'g', 'g', 'g', 'g', 'g', 'g', 'g', '<module>']
              Returning 2
              Call stack has length 9 and is: ['g', 'g', 'g', 'g', 'g', 'g', 'g', 'g', '<module>']
              g(2)
              Already computed g(2)
              Returning 1
              Call stack has length 9 and is: ['g', 'g', 'g', 'g', 'g', 'g', 'g', 'g', '<module>']
            Returning 3
            Call stack has length 8 and is: ['g', 'g', 'g', 'g', 'g', 'g', 'g', '<module>']
            g(3)
            Already computed g(3)
            Returning 2
            Call stack has length 8 and is: ['g', 'g', 'g', 'g', 'g', 'g', 'g', '<module>']
          Returning 5
          Call stack has length 7 and is: ['g', 'g', 'g', 'g', 'g', 'g', '<module>']
          g(4)
          Already computed g(4)
          Returning 3
          Call stack has length 7 and is: ['g', 'g', 'g', 'g', 'g', 'g', '<module>']
        Returning 8
        Call stack has length 6 and is: ['g', 'g', 'g', 'g', 'g', '<module>']
        g(5)
        Already computed g(5)
        Returning 5
        Call stack has length 6 and is: ['g', 'g', 'g', 'g', 'g', '<module>']
      Returning 13
      Call stack has length 5 and is: ['g', 'g', 'g', 'g', '<module>']
      g(6)
      Already computed g(6)
      Returning 8
      Call stack has length 5 and is: ['g', 'g', 'g', 'g', '<module>']
    Returning 21
    Call stack has length 4 and is: ['g', 'g', 'g', '<module>']
    g(7)
    Already computed g(7)
    Returning 13
    Call stack has length 4 and is: ['g', 'g', 'g', '<module>']
  Returning 34
  Call stack has length 3 and is: ['g', 'g', '<module>']
  g(8)
  Already computed g(8)
  Returning 21
  Call stack has length 3 and is: ['g', 'g', '<module>']
Returning 55
Call stack has length 2 and is: ['g', '<module>']
3.8.15 (default, Nov 24 2022, 15:19:38) 
[GCC 11.2.0]

So the problem is that the generator expression is getting put on the call stack, taking up space that could be used by the actual recursive function. Now it would be interesting to investigate why the generator expression is getting put on the stack as if it were a function.

Share
Improve this answer

1

  • "Now it would be interesting to investigate why the generator expression is getting put on the stack as if it were a function" A generator expression evaluates to a generator object. This is done by creating a generator function, calling it, and returing the generator object. Generators work has coroutines. Iteration over the generator object requires the generator function execution frame.

    – juanpa.arrivillaga

    9 hours ago

Your Answer

dormeur20092010 is a new contributor. Be nice, and check out our Code of Conduct.

Draft saved
Draft discarded

Post as a guest

Required, but never shown


By clicking “Post Your Answer”, you agree to our terms of service, privacy policy and cookie policy

Not the answer you're looking for? Browse other questions tagged

or ask your own question.

Leave a Reply

Your email address will not be published. Required fields are marked *