Additional GetBytes/GetStream tests#5934
Conversation
|
Thanks!
The reason we don't allow getting the length in sequential mode is because this would leak an implementation detail of the current protocol (it being length prefixed), which may or (likely) may not change. There is a related scenario that comes to mind: if everything is buffered we could allow seeking on the returned stream under sequential access. We don't do that for similar reasons, consistency and robustness around implementation details (for example, our default buffer size)
I would argue Npgsql is correct here, 'DBNullness' is separate from the data itself, as modeled in the api surface of the reader. It should always be checked for before trying to read any data. It would be strange if we throw from GetValue, GetInt and so on but allow it for GetStream. What do you think? |
|
GetStream feels a little different in that it's a reader of a value rather than the value itself. If an application just wants a stream of that column it's likely that The empty stream does no reading of the value because its own length is zero so could be considered correct api surface wise. Not a big issue but avoiding multiple calls into an async state machine leads to a little simpler application code. |
|
Yes I could be convinced that Stream can be returned on a NULL column, just as we successfully return nullable value types on NULL. @roji what do you think? |
|
Note I also logged dotnet/SqlClient#3057 over on SqlClient as I couldn't have common async code between the two. |
Hmm, I remember we briefly discussed this in the past, though I don't remember the exact pros/cons etc. I think there's still something odd and inconsistent about GetStream() not throwing for null, while all other Get* methods do (GetString, GetFieldValue...). I get that GetStream could be argued to be different (it's not the value but a reader of the value), but I'm not sure why that should impact the behavior around NULL specifically... I also generally prefer APIs that do make the distinction between null and empty, rather than collapsing the difference (by returning a Stream with length=0 for both cases) and telling users to do IsDBNull() if they care. So I'm overall leaning towards the current behavior being better (but could be convinced otherwise)... What exactly would be the argument for changing the behavior here, aside from "SqlClient does it"? |
|
No value to read and an empty array are the same for Stream in that the result is "zero bytes read". Mainly pointing out the inconsistency with SqlClient here and I don't think the throw on null is protecting anything. But this PR doesn't propose any changes. |
Do you think it's protecting something with GetString(), but specifically not with GetStream()? If so, what? |
|
It depends how you take the api. As the library has nullability annotations the api has stated that it only returns string objects, not null. So anything that can't be converted to that is an error. Getting more into the philosophical discussion of given modern language features if the api was redesigned today would the DBNull object exist. I would say no, everything would be strongly typed as much as possible. ORMs have mostly removed that null/dbnull distinction. An application typically knows its database and whether the column is nullable. If I'm creating objects from a reader manually currently it's a IsDBNull/GetString combo and if the property is not set it remains null. Would have been easier if GetString just returned null. So I would have voted for A GetStream change would be a lot less controversial though as there is precedent ;) |
It's kind of the other way around - the API was annotated in the way it was because of the behavior it already had (nullability annotations were introduced a few years ago, whereas the behavior of throwing for NULL is much older).
Couldn't one say the exact same thing about GetString? i.e. it could return an empty string for database NULL instead of throwing? Regarding the rest, I don't think this is the place to discuss the general null-related behavior in ADO.NET - FWIW I agree that we'd have done it differently today. The question is more, given the behavior that is there (e.g. GetString() throws for NULL), what would be the justification for making GetStream() an exception, and thus introducing an inconsistency? If the only argument here is that SqlClient behaves in that way, that seems a bit weak to me - Npgsql behaves differently from SqlClient in various other places already, etc. |
I think SqlClient has it the right way because it is never in error to return an empty stream. All the reading properties and functions on the stream can be made to work. So the library doesn't need to protect against it. In any case this PR is just documenting the current behavior. |
Again, I'm not sure why Stream is any different from String here (i.e. GetString() returning an empty string on NULL). All the properties on an empty String work too, it's a valid object etc. They're also conceptually very similar: an empty Stream is very similar to an empty String (both represent a non-null thing with no data). In other words, I can't see any reason for GetStream() to behave differently from GetString(); either we want the DbDataReader API to make the distinction between NULL and empty (by throwing for NULL), or we don't (and an empty value can be returned) - but whatever we decide should be consistent across both GetStream() and GetString(). |
|
Consistency within the library is good goal |
|
I'm not trying to break open the debate again, but there is another terrible reason for GetStream's weirdness. DbDataReader's base implementation is a MemoryStream over GetBytes, ignoring any nullness. I was almost able to find some consistency in the broader design. In this imaginary world GetStream would be the usable variant of GetBytes (which afaik does not throw on NULL in SqlClient), and GetTextReader would represent the usable variant of GetChars. Alas GetTextReader does throw on NULL, and its implementation instead calls GetString :') Add it to the pile of sins I suppose... |
|
@NinoFloris yeah, I wouldn't go too far in looking for grand design consistency in that particular API area ;) |
|
Just found this from the original issue that I'd missed the notification on in SqlClient, apologies. If SqlClient is being inconsistent with regard how I think the original decision to return empty streams for null values instead of langauge null or a throw supported a pattern that is long forgotten. I'm not sure if it's good or bad. Whatever design pattern you choose for this driver I'm sure will be well considered and I wouldn't suggest trying to be consistent with SqlClient in this area unless you have good reason to do so. |
NinoFloris
left a comment
There was a problem hiding this comment.
I don't think we really want to take the doc changes, thanks for the added tests.
Adding a couple of differences from SqlClient into the tests.
GetStreamafterGetBytesthrows on SqlClient with sequentially access. Npgsql doesn't which is good. Should be able to get the length first with a null buffer even though the Npgsql stream implements the length property.GetStreamon null column returns an empty stream on SqlClient, throws on Npgsql. I think the no throw is better in this situation but added it to the tests.Filled in the doco on what is supported in
GetStreamusing SqlClient as a template.