Fix 2 bugs and improve performance from O(N²+NxM) to O(N+M) in _assign_requests_to_connections #1035
+830
−118
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Summary
Fixes 2 logic bugs in
_assign_requests_to_connectionsand improves complexity from O(N²+NxM) to O(N+M) where N=connections, M=requests.Although I attempted a smaller PR with an atomic change, I ultimately decided for a redesign of the method because the algorithm design of the method was fundamentally flawed.
Motivation
_assign_requests_to_connectionsis in the hot path of this library and the bugs and complexity issue were severly slowing down my application.Bug 1:
max_keepalive_connectionsis not respectedThis causes idle connections to be evicted unnecessarily. This bug is also addressed in this PR (currently approved but not merged).
The bug is on lines 294-295. The expression on line 294 simply returns
len(self._connections), ignoring the number of idle connections, and compares this number toself._max_keepalive_connections.httpcore/httpcore/_async/connection_pool.py
Lines 294 to 295 in 5974b03
This test demonstrates the bug. This test fails with the current code on master, but passes with the new implementation of this PR.
Bug 2: multiple requests are assigned to the same HTTP/1.1 connection, leading to
ConnectionNotAvailableexceptionsThe bug is on lines 322-323. An available connection is assigned to a request, but the status of that connection is not actually changed, resulting in that connection also being assigned to the next request in the queue.
httpcore/httpcore/_async/connection_pool.py
Lines 322 to 323 in 5974b03
This test demonstrates the bug. This test fails with the current code on master, but passes with the new implementation of this PR.
Perf: Reduce complexity from quadratic to linear
This PR reduces time complexity from O(N²+NxM) to O(N+M) where N=connections, M=requests.
In a typical use case, with for instance 100 connections in the pool and 500 requests in the queue this reduces complexity dramatically, especially since
_assign_requests_to_connectionsis in the hot path of this library because it will be called twice for every request (on start and on finish).In my use case, this PR reduces the CPU utilization of my application by half. Profiling shows that
_assign_requests_to_connectionspreviously accounted for 59% of all samples; after applying this PR, its share dropped to 4%.The problem: unnecesary nested loops
This snippet contains a nested loop over all connections (O(N²)). This is unnecessary and can be done in a single pass (O(N)).
This snippet contains a nested loops over all requests and connections (O(NxM)). This is unnecessary and can be done in a single pass (O(M)) with early exit if all connections are occupied.
Checklist