Standardize on pathlib.Path internally with proper type hints#3170
Standardize on pathlib.Path internally with proper type hints#3170juan-cobos wants to merge 18 commits intoDeepLabCut:mainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR standardizes path handling across the DeepLabCut codebase by migrating from mixed string/os.path operations to consistent pathlib.Path usage with proper type hints. The changes align with modern Python practices and aim to prevent subtle bugs related to string-based path manipulation.
Key changes:
- Migrated from
""toNoneforinit_weightsdefault parameter inproc_videofunction - Replaced
os.pathand string concatenation with pathlib.Path operations - Updated type hints to use modern Python syntax (
str | Pathinstead ofUnion[str, Path]) - Refactored file searching logic in
find_analyzed_datato use pathlib instead ofgrab_files_in_folder
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 10 comments.
| File | Description |
|---|---|
deeplabcut/utils/make_labeled_video.py |
Converted path handling to pathlib.Path, changed init_weights default from "" to None, updated video output path construction |
deeplabcut/utils/auxiliaryfunctions.py |
Refactored find_analyzed_data to use Path.iterdir() instead of grab_files_in_folder, added type hints, updated error messages |
deeplabcut/pose_estimation_pytorch/modelzoo/inference.py |
Modernized type hints, simplified dest_folder path handling with pathlib |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
|
||
| candidates = [] | ||
| for file in grab_files_in_folder(folder, "h5"): | ||
| stem = Path(file).stem.replace("_filtered", "") | ||
| starts_by_scorer = file.startswith(videoname + scorer) or file.startswith( | ||
| videoname + scorer_legacy | ||
| ) | ||
| if tracker: | ||
| matches_tracker = stem.endswith(tracker) | ||
| else: | ||
| matches_tracker = not any(stem.endswith(s) for s in TRACK_METHODS.values()) | ||
| if all( | ||
| ( | ||
| starts_by_scorer, | ||
| "skeleton" not in file, | ||
| matches_tracker, | ||
| (filtered and "filtered" in file) | ||
| or (not filtered and "filtered" not in file), | ||
| ) | ||
| ): | ||
| candidates.append(file) | ||
| candidates = [] | ||
| folder = Path(folder) | ||
| for p in folder.iterdir(): |
There was a problem hiding this comment.
one of the advantages of a pathlib.Path, is that it has the glob method. Could we make use of that here perhaps?
There was a problem hiding this comment.
I wrote an attempt below. It becomes a bit more readable for my taste.
|
@juan-cobos, great proposal, thanks for opening this PR and addressing this. Please see my comments above. |
|
@juan-cobos thanks for incorporating the suggested changes, looks good! Please re-request another review once you feel everything is done or let me know if you'd like me to join in making adjustments. |
|
@deruyter92 Feel free to make any changes! Merge when you think it's fine (not sure how all those builds actually work). I'll keep making PRs as I continue refactoring :) |
|
Replacing this PR with #3199 because of inactivity and potentially breaking changes. The replacement PR contains the original adjustments, but reverts changes that are unrelated to the pathlib migration. @juan-cobos thanks you for your contribution! |
Path handling is currently inconsistent (strings vs Path objects, mixed os.path and pathlib), making code hard to follow. This also causes subtle bugs (e.g., scorer.replace() on a Path attempts filesystem operations instead of string replacement).
Proposal: Standardize on pathlib.Path internally with proper type hints. This is consistent with newer parts of the codebase, eliminates helper functions like grab_files_in_folder(), improves readability, and catches type errors earlier.
Open to feedback on approach and scope.