Improve performance of Group-Object#7410
Conversation
Refactoring only - no functional changes.
This reducing alloctions and object copying.
…hen comparing. Much faster when the number of unique values grows large.
|
Appveyor error seem unrelated to my changes. |
| { | ||
| if (propValue != null && propValue.PropertyValue != null) | ||
| var propValuePropertyValue = propValue?.PropertyValue; | ||
| if (propValuePropertyValue != null) |
There was a problem hiding this comment.
Is the check really needed?
There was a problem hiding this comment.
yes, it is dereferenced in the else clause. #Closed.
There was a problem hiding this comment.
Should it be currentObjectOrderValues?
There was a problem hiding this comment.
Please address the comment and fix case.
There was a problem hiding this comment.
Please add space after //.
There was a problem hiding this comment.
Make sense to move the cycle into the if (AsString) ?
There was a problem hiding this comment.
Not sure i understand what you suggest here.
There was a problem hiding this comment.
I meant
if (AsString)
{
foreach (GroupInfo grp in _groups)
{
table.Add(grp.Name, grp.Group);
}
}
else
{
foreach (GroupInfo grp in _groups)
{
if (grp.Values.Count == 1)
{
...
}
}
}There was a problem hiding this comment.
Yes, I think that is clearer.
There was a problem hiding this comment.
I thought more about performancethough :-) From this point of view it seems for() better than foreach - less allocations. Is the _groups so much that we can get a win?
Also some additional CodeFactor complaints
|
I would like to fix the parameter sets too. |
Moving parameter checking earlier. Adding back the slow quadratic code path for the cases where we have more than one set of input types.
|
I believe we shouldn't mess performance changes with other and it is better fix parameter sets in new PR. |
There was a problem hiding this comment.
We'd move this line on previous line.
There was a problem hiding this comment.
Please move to previous line 115.
There was a problem hiding this comment.
Please remove the empty XML comment.
There was a problem hiding this comment.
Please address the comment and fix case.
iSazonov
left a comment
There was a problem hiding this comment.
LGTM with two minor comments.
There was a problem hiding this comment.
Seems string.Format was optimized in .Net Core 2 but we could still use AppendFormat() here and in line 160 to get more readable code and exclude extra method calls.
There was a problem hiding this comment.
This definitely should use AppendFormat()
There was a problem hiding this comment.
currentObjectorderValues should be currentObjectOrderValues.
There was a problem hiding this comment.
This assert is redundant. ArrayToTuple(IList, int> has the exact same code. Suggest removing both Assert statements.
| { | ||
| Diagnostics.Assert(inputObjects != null, "inputObjects is null"); | ||
| Diagnostics.Assert(inputObjects.Length > 0, "inputObjects is empty"); | ||
| Diagnostics.Assert(inputObjects.Count > 0, "inputObjects is empty"); |
There was a problem hiding this comment.
These asserts should be backed up with runtime checks. Either document it as returning NULL when constraints are not met for throw the appropriate exceptions.
There was a problem hiding this comment.
This is a fairly common pattern in the engine, that there are asserts, but no backing runtime checks. I have always wondered about it...
There was a problem hiding this comment.
I guess the thinking is that since it's an internal class, the usage can be demonstrated to be safe and the error checking skipped.
There was a problem hiding this comment.
Since we don't have an official policy around Assert, I'll not hold up the PR for it and you can ignore the rest. :)
I've been involved in three different threads recently where Dbg.Assert in internal code is hit in debug builds. In two cases, crashes were occurring the release build and the third the lack of the check was masking a bug in another component.
IMHO: Since we don't test debug builds, Dbg.Assert is a placebo at best without a release check.
There was a problem hiding this comment.
This definitely should use AppendFormat()
There was a problem hiding this comment.
Shouldn't this be private
There was a problem hiding this comment.
It was internal, and I just kept it so, but I'm happy to make it private.
|
|
||
| GroupInfo currentGroupInfo = null; | ||
| if (groupInfoDictionary.TryGetValue(currentTupleObject, out currentGroupInfo)) | ||
| if (groupInfoDictionary.TryGetValue(currentTupleObject, out var currentGroupInfo)) |
There was a problem hiding this comment.
What's the use case for currentGroupInfo being null in the groupInfoDictionary? If it is expected, a comment is needed; otherwise, it looks like currentObjectEntry.inputObject is dropped on the floor.
There was a problem hiding this comment.
I don't think that is a case that will ever occur. Just defensive programming.
Both cases are behaving as the original implementation.
One could argue for removing the null check as it is a condition that shouldn't occur, and if it does, it is a bug that should be fixed.
There was a problem hiding this comment.
same as above, when is currentTupleObject present but currentGroupInfo is null.
There was a problem hiding this comment.
Yeah, I'm removing the null checks.
Removing unnecessary call to .ToArray(). Making internal methods private.
|
@dantraMSFT Can you please take another look? |
|
@powercode Many thanks for the great perf improvement! |
Fixes #7409.
PR Summary
code cleanup
object[] -> IList in PSType.ArrayToTuple signature to reduce allocations.
Sort input before grouping. Use sorting to only consider last group when comparing.
adding DebuggerDisplay for GroupInfo.
PR Checklist
.h,.cpp,.cs,.ps1and.psm1files have the correct copyright headerWIP:to the beginning of the title and remove the prefix when the PR is ready.[feature]if the change is significant or affects feature tests