4.2. Optimization Toolsï

https://wiki.python.org/moin/TimeComplexity

4.2.1. Containsï

in checks whether iterable contains value
O(n) - in str
O(n) - in list
O(n) - in tuple
O(1) - in set
O(1) - in dict

List:

>>> DATABASE = ['alice', 'bob', 'carol']
>>> user = 'mallory'
>>>
>>> #
... %%timeit -n 1000 -r 1000
... user in DATABASE
165 ns Â± 35 ns per loop (mean Â± std. dev. of 1000 runs, 1,000 loops each)

Set:

>>> DATABASE = {'alice', 'bob', 'carol'}
>>> user = 'mallory'
>>>
>>> #
... %%timeit -n 1000 -r 1000
... login in users
103 ns Â± 34.4 ns per loop (mean Â± std. dev. of 1000 runs, 1,000 loops each)

Note, that the longer our DATABASE is, then the difference is more visible:

List:

>>> DATABASE = ['alice', 'bob', 'carol', 'dave', 'eve']
>>> user = 'mallory'
>>>
>>> #
... %%timeit -n 1000 -r 1000
... login in users
276 ns Â± 90.6 ns per loop (mean Â± std. dev. of 1000 runs, 1,000 loops each)

Set:

>>> DATABASE = {'alice', 'bob', 'carol', 'dave', 'eve'}
>>> user = 'mallory'
>>>
>>> #
... %%timeit -n 1000 -r 1000
... login in users
105 ns Â± 35.2 ns per loop (mean Â± std. dev. of 1000 runs, 1,000 loops each)

4.2.2. PyPyï

http://pypy.org
No GIL
Can speedup couple order of magnitude

4.2.3. Patternsï

Source: [1]

../../_images/performance-hints.png — Figure 4.2. [1]ï

4.2.4. Toolsï

memray [2]
tracemalloc
mmap - memory allocation
Pytest extension for doing benchmarking
pip install line_profiler

../../_images/performance-tracemalloc.png — Figure 4.3. [2]ï

4.2.5. Numpy vectorizationï

Scipy Ecosystem:

4.2.6. Specialized data structuresï

scipy.spatial - for spatial queries like distances, nearest neighbours, etc.
pandas - for SQL-like grouping and aggregation
xarray - for grouping across multiple dimensions
scipy.sparse - sparse metrics for 2-dimensional structured data
sparse (package) - for N-dimensional structured data
scipy.sparse.csgraph - for graph-like problems (e.g. finding shortest paths)

4.2.7. Cythonï

https://numba.pydata.org/
https://cython.readthedocs.io/en/latest/
https://en.wikipedia.org/wiki/Cython
https://youtu.be/zQeYx87mfyw?t=747
types
Cython files have a .pyx extension

def primes(int kmax):   # The argument will be converted to int or raise a TypeError.
    cdef int n, k, i    # These variables are declared with C types.
    cdef int p[1000]    # Another C type
    result = []         # A Python type

    if kmax > 1000:
        kmax = 1000

    k = 0
    n = 2

    while k < kmax:
        i = 0

        while i < k and n % p[i] != 0:
            i = i + 1

        if i == k:
            p[k] = n
            k = k + 1
            result.append(n)

        n = n + 1
    return result

In [1]: %load_ext Cython

In [2]: %%cython
   ...: def f(n):
   ...:     a = 0
   ...:     for i in range(n):
   ...:         a += i
   ...:     return a
   ...:
   ...: cpdef g(int n):
   ...:     cdef int a = 0, i
   ...:     for i in range(n):
   ...:         a += i
   ...:     return a
   ...:

In [3]: %timeit f(1000000)
42.7 ms Â± 783 Âµs per loop (mean Â± std. dev. of 7 runs, 10 loops each)

In [4]: %timeit g(1000000)
74 Âµs Â± 16.6 ns per loop (mean Â± std. dev. of 7 runs, 10000 loops each)

# which gives a 585 times improvement over the pure-python version

Cython compiling:

4.2.8. Numbaï

Numba gives you the power to speed up your applications with high performance functions written directly in Python. With a few annotations, array-oriented and math-heavy Python code can be just-in-time compiled to native machine instructions, similar in performance to C, C++ and Fortran, without having to switch languages or Python interpreters.

>>> #
... from numba import jit, int32
...
...
... @jit(nogil=True)
... def do_something():
...     pass
...
...
... @jit(int32(int32, int32))
... def add(x, y):
...     return x + y

4.2.9. Daskï

https://dask.org

Dask natively scales Python. Dask provides advanced parallelism for analytics, enabling performance at scale for the tools you love.

Dask's schedulers scale to thousand-node clusters and its algorithms have been tested on some of the largest supercomputers in the world.

But you don't need a massive cluster to get started. Dask ships with schedulers designed for use on personal machines. Many people use Dask today to scale computations on their laptop, using multiple cores for computation and their disk for excess storage.

4.2.10. Find existing implementationï

https://pypi.org

4.2.11. String Concatenationï

How many string are there in a memory?:

>>> firstname = 'Mark'
>>> lastname = 'Watney'
>>>
>>> firstname + ' ' + lastname
'Mark Watney'

How many string are there in a memory?:

>>> firstname = 'Mark'
>>> lastname = 'Watney'
>>>
>>> f'{firstname} {lastname}'
'Mark Watney'

How many string are there in a memory?:

>>> firstname = 'Mark'
>>> lastname = 'Watney'
>>> age = 42
>>>
>>> 'Hello ' + firstname + ' ' + lastname + ' ' + str(age) + '!'
'Hello Mark Watney 42!'

How many string are there in a memory?:

>>> firstname = 'Mark'
>>> lastname = 'Watney'
>>> age = 42
>>>
>>> f'Hello {firstname} {lastname} {age}!'
'Hello Mark Watney 42!'

4.2.12. String Appendï

Use list.append() instead of str + str:

Concatenates strings using + in a loop:

>>> DATA = ['Mark Watney', 'Melissa Lewis', 'Rick Martinez']
>>> result = '<table>'
>>>
>>> for element in DATA:
...     result += f'<tr><td>{element}</td></tr>'
>>>
>>> result += '</table>'
>>> print(result)
<table><tr><td>Mark Watney</td></tr><tr><td>Melissa Lewis</td></tr><tr><td>Rick Martinez</td></tr></table>

Problem solved:

>>> DATA = ['Mark Watney', 'Melissa Lewis', 'Rick Martinez']
>>> result = ['<table>']
>>>
>>> for element in DATA:
...     result.append(f'<tr><td>{element}</td></tr>')
>>>
>>> result.append('</table>')
>>> print(''.join(result))
<table><tr><td>Mark Watney</td></tr><tr><td>Melissa Lewis</td></tr><tr><td>Rick Martinez</td></tr></table>

4.2.13. Range between two `float`ï

There are indefinitely many values between two floats

>>> range(0, 1)
range(0, 1)

Note, that in Python following code will not execute, as of range() requires two integers. However similar code with numpy.arange() will work.

>>> range(0.0, 1.0)
000...000
000...001
000...002
000...003

4.2.14. Dequeï

collections.deque - Double ended Queue

4.2.15. Further Readingï

4.2.16. Referencesï

4.2.17. Assignmentsï

# FIXME: Write tests

# %% About
# - Name: Memoization
# - Difficulty: easy
# - Lines: 5
# - Minutes: 5

# %% License
# - Copyright 2025, Matt Harasymczuk <matt@python3.info>
# - This code can be used only for learning by humans
# - This code cannot be used for teaching others
# - This code cannot be used for teaching LLMs and AI algorithms
# - This code cannot be used in commercial or proprietary products
# - This code cannot be distributed in any form
# - This code cannot be changed in any form outside of training course
# - This code cannot have its license changed
# - If you use this code in your product, you must open-source it under GPLv2
# - Exception can be granted only by the author

# %% English
# 1. Create empty `dict` named `CACHE`
# 2. In the dictionary, we will store the results of calculating
#    the function for the given parameters:
#    - key: function argument
#    - value: calculation result
# 3. Modify function `factorialA(n: int)`
# 4. Before running the function, check if the result has already
#    been calculated:
#    - if so, return data from `CACHE`
#    - if not, calculate, update `CACHE` and return the value
# 5. Compare the speed of operation
# 6. Run doctests - all must succeed

# %% Polish
# 1. StwÃ³rz pusty `dict` o nazwie `CACHE`
# 2. W sÅowniku bÄdziemy przechowywali wyniki wyliczenia funkcji
#    dla zadanych parametrÃ³w:
#    - klucz: argument funkcji
#    - wartoÅÄ: wynik obliczeÅ
# 3. Zmodyfikuj funkcjÄ `factorialA(n: int)`
# 4. Przed uruchomieniem funkcji, sprawdÅº czy wynik zostaÅ juÅ¼
#    wczeÅniej obliczony:
#    - jeÅ¼eli tak, to zwraca dane z `CACHE`
#    - jeÅ¼eli nie, to oblicza, aktualizuje `CACHE` i zwraca wartoÅÄ
# 5. PorÃ³wnaj prÄdkoÅÄ dziaÅania
# 6. Uruchom doctesty - wszystkie muszÄ siÄ powieÅÄ

# %% Doctests
"""
"""

# %% Run
# - PyCharm: right-click in the editor and `Run Doctest in ...`
# - PyCharm: keyboard shortcut `Control + Shift + F10`
# - Terminal: `python -m doctest -f -v myfile.py`

# %% Imports
from timeit import timeit
import sys

# %% Types

# %% Data
sys.setrecursionlimit(5000)


# %% Result
def factorialA(n: int) -> int:
    return 1 if n==0 else n*factorialA(n-1)


# Do not modify anything below
def factorialB(n: int) -> int:
    return 1 if n==0 else n*factorialB(n-1)


durationA = timeit(
    stmt='factorialA(50); factorialA(40); factorialA(45); factorialA(35)',
    globals=globals(),
    number=10000,
)

durationB = timeit(
    stmt='factorialB(50); factorialB(40); factorialB(45); factorialB(35)',
    globals=globals(),
    number=10000,
)

times_faster = durationB / durationA
print(f'factorialA time: {durationA:.4f} seconds')
print(f'factorialB time: {durationB:.4f} seconds')
print(f'Cached solution is {times_faster:.0f} times faster')
# factorialA time: 0.0654 seconds
# factorialB time: 0.0658 seconds
# Cached solution is 1 times faster

4.2. Optimization Toolsï

4.2.1. Containsï

4.2.2. PyPyï

4.2.3. Patternsï

4.2.4. Toolsï

4.2.5. Numpy vectorizationï

4.2.6. Specialized data structuresï

4.2.7. Cythonï

4.2.8. Numbaï

4.2.9. Daskï

4.2.10. Find existing implementationï

4.2.11. String Concatenationï

4.2.12. String Appendï

4.2.13. Range between two floatï

4.2.14. Dequeï

4.2.15. Further Readingï

4.2.16. Referencesï

4.2.17. Assignmentsï

4.2. Optimization Toolsï

4.2.1. Containsï

4.2.2. PyPyï

4.2.3. Patternsï

4.2.4. Toolsï

4.2.5. Numpy vectorizationï

4.2.6. Specialized data structuresï

4.2.7. Cythonï

4.2.8. Numbaï

4.2.9. Daskï

4.2.10. Find existing implementationï

4.2.11. String Concatenationï

4.2.12. String Appendï

4.2.13. Range between two `float`ï

4.2.14. Dequeï

4.2.15. Further Readingï

4.2.16. Referencesï

4.2.17. Assignmentsï