This repository was archived by the owner on May 17, 2024. It is now read-only.
Do not detect MD5s as UUIDs, and preserve UUID casing for UUID PKs #813
Merged
This was referenced Dec 27, 2023
FIXED. In an unrelated discussion, this popped up: the two sides should be lower-/upper-cased independently, based on each side's samples. However, we now slice by the PK ranges of one side and propagate that side's values to the other one. The casing of the "other" side must be preserved.
dagadbm
approved these changes
Dec 28, 2023
dagadbm
left a comment
to unblock if needed but needs proper review
dlawin
approved these changes
Dec 29, 2023
136e605 to
2114ede
Compare
added 4 commits
December 30, 2023 19:49
It fails the comparison anyway — because of casing & dashes not fitting into alphanumeric ranges/slices.
…e when slicing Otherwise, it uses the same PK values, e.g. `ArithUUID` from the side A, and then pushes them to side B, where improper rendering can lead to improper slicing.
2114ede to
9a99030
Compare
Comparing MD5s as UUIDs does not work anyway: it improperly slices and then compares the values, since our code always renders UUIDs as `abcdabcd-abcd-abcd-abcd-abcdabcdabcd`, always dashed and lower-cased, while the actual value stored in MD5 (i.e. string) PKs can be uppercased and is typically non-dashed (e.g. `ABCDABCDABCDABCDABCDABCDABCDABCD`). As a result, all such MD5 PKs go into one pseudo-UUID range, usually the first one (because in ASCII & UTF-8, uppercase letters sort before lowercase letters).

The root cause is that Python's `UUID` can parse even such values.
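A minimal illustration of the parsing behaviour described above, using only the standard-library `uuid` module. The sample value is the illustrative one from this description, not a real key:

```python
import uuid

# Illustrative MD5-style PK: 32 hex digits, uppercase, no dashes.
md5_pk = "ABCDABCDABCDABCDABCDABCDABCDABCD"

# uuid.UUID accepts the undashed form without raising.
parsed = uuid.UUID(md5_pk)

# str() always renders the dashed, lower-cased form.
rendered = str(parsed)

print(rendered)            # abcdabcd-abcd-abcd-abcd-abcdabcdabcd
print(rendered == md5_pk)  # False: the rendering differs from the stored text
print(md5_pk < rendered)   # True: 'A' (0x41) sorts before 'a' (0x61)
```

Because the stored text and the rendered text differ, any string comparison against rendered range bounds operates on values the database never actually holds.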
This PR excludes MD5s and other UUID-like textual PKs from UUID detection.
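One way such an exclusion could look is sketched below. This is not the code from this PR: the names `is_canonical_uuid` and `all_samples_are_uuids` are hypothetical, and the rule (accept only the dashed 8-4-4-4-12 form in a single consistent casing) is an assumption made for illustration.

```python
import re
import uuid

# Dashed 8-4-4-4-12 layout only; undashed MD5-style strings do not match.
_DASHED_UUID = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def is_canonical_uuid(value: str) -> bool:
    """Hypothetical check: True only for dashed, consistently cased UUID text."""
    if not _DASHED_UUID.fullmatch(value):
        return False
    rendered = str(uuid.UUID(value))  # dashed, lower-cased rendering
    # Accept the value only if it round-trips in all-lower or all-upper case.
    return value in (rendered, rendered.upper())


def all_samples_are_uuids(samples: list[str]) -> bool:
    """Hypothetical detection: every sampled PK must pass the strict check."""
    return bool(samples) and all(is_canonical_uuid(s) for s in samples)


print(is_canonical_uuid("abcdabcd-abcd-abcd-abcd-abcdabcdabcd"))  # True
print(is_canonical_uuid("ABCDABCDABCDABCDABCDABCDABCDABCD"))      # False
```

The round-trip comparison is the design point: a value counts as a UUID only when rendering it back reproduces the stored text, so slicing never introduces strings the column could not contain.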
As an extra change (separate commits), this PR also preserves the information on how the database presents the UUIDs — either lowercased or uppercased, and renders the actual sliced UUID values accordingly. This does not matter for native UUIDs (stored & compared as numbers), but does matter for UUIDs stored and/or compared as strings (at least from one side of the diff).
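The casing-preserving idea can be sketched roughly as follows. `CasedUUID` and its methods are hypothetical names for illustration and do not reflect the actual classes changed in this PR; the sketch assumes casing is inferred from a sampled value and carried along when new slice boundaries are computed.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class CasedUUID:
    """Hypothetical wrapper: a UUID plus the casing the database used for it."""

    value: uuid.UUID
    uppercase: bool = False

    @classmethod
    def from_sample(cls, text: str) -> "CasedUUID":
        # Treat the sample as uppercase if it contains any uppercase letter.
        return cls(uuid.UUID(text), uppercase=(text != text.lower()))

    def midpoint(self, other: "CasedUUID") -> "CasedUUID":
        # Slice numerically, but keep this side's casing for the new bound.
        mid = (self.value.int + other.value.int) // 2
        return CasedUUID(uuid.UUID(int=mid), self.uppercase)

    def render(self) -> str:
        text = str(self.value)  # always dashed and lower-cased
        return text.upper() if self.uppercase else text


lo = CasedUUID.from_sample("00000000-0000-0000-0000-00000000000A")
hi = CasedUUID.from_sample("00000000-0000-0000-0000-00000000000C")
print(lo.midpoint(hi).render())  # 00000000-0000-0000-0000-00000000000B
```

With a lower-cased rendering the computed bound would be `...00000000000b`, which sorts after every uppercase key in a string comparison and would skew the slices.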