Skip to content

Conversation

@spaceone
Copy link
Contributor

Add a utility functions ldap.filter.is_filter which can be used to check if the ldap filter has valid syntax.
This uses an unbound ldap connection, so the values are ensured to be correct via the C library.
I added test cases for the function.

@spaceone spaceone force-pushed the add-is-filter branch 2 times, most recently from 3f0a7e6 to 084b622 Compare March 30, 2019 08:59
@codecov
Copy link

codecov bot commented Mar 30, 2019

Codecov Report

Merging #272 (084b622) into master (39ea8e5) will increase coverage by 0.41%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #272      +/-   ##
==========================================
+ Coverage   71.30%   71.72%   +0.41%     
==========================================
  Files          50       49       -1     
  Lines        4796     5033     +237     
  Branches      802      894      +92     
==========================================
+ Hits         3420     3610     +190     
- Misses       1045     1067      +22     
- Partials      331      356      +25     
Impacted Files Coverage Δ
Lib/ldap/dn.py 98.07% <ø> (ø)
Lib/ldap/filter.py 76.74% <100.00%> (+11.11%) ⬆️
Lib/ldap/cidict.py 69.23% <0.00%> (-27.27%) ⬇️
Lib/ldap/extop/__init__.py 55.55% <0.00%> (-12.87%) ⬇️
Modules/constants.c 45.83% <0.00%> (-7.68%) ⬇️
Lib/ldap/ldapobject.py 76.38% <0.00%> (-3.85%) ⬇️
Lib/ldap/compat.py 66.03% <0.00%> (-1.24%) ⬇️
Modules/constants_generated.h 97.92% <0.00%> (-0.67%) ⬇️
Modules/LDAPObject.c 67.43% <0.00%> (-0.31%) ⬇️
Lib/ldap/schema/tokenizer.py 97.87% <0.00%> (-0.05%) ⬇️
... and 10 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2647f59...42b87b0. Read the comment docs.

Copy link
Member

@tiran tiran left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm -1 on this approach.

It seems rather wasteful and slow to create a LDAP connection object to verify that a string is a valid filter. The call will also connect to the default ldap server from /etc/openldap/ldap.conf on success. This will spam the error log of the default LDAP server.

ldap_pvt_put_filter might work, but the function is not exposed in ldap.h.

@spaceone
Copy link
Contributor Author

@tiran
On my notebook, no LDAP connection is done with this and not required. It works even without any network connection. And it is very fast.

Would it create a ldap connection if I had a /etc/openldap/ldap.conf? Then I will see if I can find a way to prevent this.

@encukou
Copy link
Member

encukou commented Apr 1, 2019

If there is no connection involved, why catch SERVER_DOWN?
This logic looks fragile: if the server is down, all filters are considered valid.

@spaceone
Copy link
Contributor Author

spaceone commented Apr 2, 2019

If the ldap object is uninitialized and unconnected it raises ldap.SERVER_DOWN without any connection attempt.
Using strace I cannot see any connection attempt:
strace python -c 'import ldap; lo = ldap.initialize(""); lo.search_ext_s("", ldap.SCOPE_BASE, "foo=bar")' 2>&1 | less

What I can see is that it reads my "/etc/ldap/ldap.conf" and tries to read ~/.ldaprc.
My ldap.conf contains URI and BASE. Do I need HOST so that it would actually try to connect even if I pass "" as URI?

@encukou
Copy link
Member

encukou commented May 24, 2019

The main problem is that, however you connect, on SERVER_DOWN, all filters would be considered valid – the function would return a wrong result with no indication of failure.

encukou pushed a commit to encukou/python-ldap that referenced this pull request Sep 20, 2019
@encukou
Copy link
Member

encukou commented Sep 20, 2019

I don't want to merge code that can silently hide a failure, so I'll reject this approach. It might be suitable for a library that builds on top of python-ldap, rather than for python-ldap itself.
Please re-open if you want to discuss further.

@encukou encukou closed this Sep 20, 2019
@spaceone
Copy link
Contributor Author

There is no REOPEN button.
As far as I inspected this, your assumptions that there will be a ldap connection established, are not correct. Can you proove me wrong, please, and explain what I need to do so that this behavior would occurr?

@quanah
Copy link
Contributor

quanah commented Sep 20, 2019

If you set LDAPNOINIT the client libraries will bypass reading ldap.conf/.ldaprc. This is what the OpenLDAP test suite does to ensure that local client configurations don't interfere.

@tiran
Copy link
Member

tiran commented Sep 20, 2019

In my opinion the patch is not a clean solution but a hack. I don't like to add such hacks to python-ldap. You can easily ship the short function in your application.

I wouldn't mind to add this feature iff OpenLDAP adds a public API to perform the check.

@encukou encukou reopened this Sep 20, 2019
@encukou
Copy link
Member

encukou commented Sep 20, 2019

There is no REOPEN button.

Apologies, I had wrong assumptions of the GitHub UI.


My main objection is that is_filter returns different results based on whether the server is reachable or not.
If that is not the case, why does the code handle ldap.SERVER_DOWN?

@spaceone
Copy link
Contributor Author

spaceone commented Sep 20, 2019

I wouldn't mind to add this feature iff OpenLDAP adds a public API to perform the check.

Yes, that would be the best. You already mentioned the not exposed function ldap_pvt_put_filter.

My main objection is that is_filter returns different results based on whether the server is reachable or not.
If that is not the case, why does the code handle ldap.SERVER_DOWN?

The ldap connection there is not connected and will never be. At least I could not get any connection attempt with that code. ldap.SERVER_DOWN therefore does only indicate that the filter is valid, no matter if there is a server which can be connected to or not. The filter validity is checked before any connection attempts, it there are any at all.

If that code really does a connection, how would I set this up to reproduce this? We are using the above code since years and it works. We have a /etc/openldap/ldap.conf and a /etc/ldap/ldap.conf with valid entries. I don't see where it's evaluated. and we did not set LDAPNOINIT.

@spaceone
Copy link
Contributor Author

I took the name is_filter() for consistency, because there is also a is_dn() already.

encukou added a commit that referenced this pull request Jan 29, 2020
#303
Suggested in #272

Co-authored-by: Florian Best <spaceone@users.noreply.github.com>
@encukou
Copy link
Member

encukou commented Jan 29, 2020

I see. I misunderstood the code.

Still, this does rely on undocumented behavior, so @tiran's comment holds:

In my opinion the patch is not a clean solution but a hack. I don't like to add such hacks to python-ldap. You can easily ship the short function in your application.

I wouldn't mind to add this feature iff OpenLDAP adds a public API to perform the check.

IOW, if this is the easiest way to check filter validity using OpenLDAP, it's an issue in OpenLDAP. python-ldap is the wrong place for a workaround.

@tiran
Copy link
Member

tiran commented May 27, 2020

Did you raise the issue with OpenLDAP upstream to request a filter validation function? I'm hesitant to use an internal, private function ldap_pvt_put_filter to validate filters.

@spaceone
Copy link
Contributor Author

Sorry, I did not see your latest answer. I created a feature request at openldap: https://bugs.openldap.org/show_bug.cgi?id=9393
Let's see what they are saying.

I also updated the pull request, made it more clear that ldap.SERVER_DOWN is expected to be raised and raise a RuntimeError (which can't happen) in all other cases. Maybe you are fine with this implementation and could provide it as fallback until there is something available in Open-LDAP? The function has detailed tests.

@spaceone
Copy link
Contributor Author

spaceone commented Oct 1, 2021

They aren't replying at all.

@quanah
Copy link
Contributor

quanah commented Oct 14, 2021

They aren't replying at all.

What you filed is a feature request. The earliest it would be done is OpenLDAP 2.7.

@spaceone
Copy link
Contributor Author

Meanwhile openldap exposed ldap_pvt_put_filter: https://git.openldap.org/openldap/openldap/-/merge_requests/730/diffs
git:5acbc6e994095538b85bb6a927bcb691be7134d1

Target Milestone of the bug says OpenLDAP 2.7, it's not released yet (no git tag, just in branch master).

But I wrote my own Filter parsing library which should be identical to the OpenLDAP one. I have over 500 test cases and hope to have catched all possible combinations. OpenLDAP isn't strictly following the RFC, especially regarding whitespace.

https://docs.freeiam.org/en/latest/examples/ldap_filter.html
https://github.com/Free-IAM/freeiam/blob/main/src/freeiam/ldap/filter.py

@mistotebe
Copy link
Contributor

Meanwhile openldap exposed ldap_pvt_put_filter: https://git.openldap.org/openldap/openldap/-/merge_requests/730/diffs git:5acbc6e994095538b85bb6a927bcb691be7134d1

Target Milestone of the bug says OpenLDAP 2.7, it's not released yet (no git tag, just in branch master).

Yes, I guess we could do what we used to do with ldap_init_fd, expect it to be available by defining it ourselves under a version check (and for platforms that don't have it, use this fallback implementation instead?)

But I wrote my own Filter parsing library which should be identical to the OpenLDAP one. I have over 500 test cases and hope to have catched all possible combinations. OpenLDAP isn't strictly following the RFC, especially regarding whitespace.

https://docs.freeiam.org/en/latest/examples/ldap_filter.html https://github.com/Free-IAM/freeiam/blob/main/src/freeiam/ldap/filter.py

I really like the interface in this! What exactly are the differences between libldap's parsing and the RFC? I think I knew it was slightly more lenient on extra whitespace in a position or two, anything else? Was there a python parsing library that could work at build-time? I think there were people who wanted to keep the dependencies of python-ldap to a minimum.

Only gripe is that attr == '' for a presence filter looks wrong, but in some ways so would attr != None. And the typo comparision -> comparison (and while ldap uses the term 'assertion' I'd use something like 'predicate')

@spaceone
Copy link
Contributor Author

What exactly are the differences between libldap's parsing and the RFC? I think I knew it was slightly more lenient on extra whitespace in a position or two, anything else?

I don't remember exactly every case anymore but:

  • OpenLDAP allows newlines after most brackets (pretty filters).
  • OpenLDAP allows whitespace before brackets in a comparision. This adds some ambiguity as filters which want to explicitly search for strings ending with whitespace must be escaped.
(! ( a= b ))
  ^ ^  ^ ^·
  • Irritating is: (a= *b) and (a= *) - I don't remember anymore, if OpenLDAP treats it as substring match or as presence filter
  • OpenLDAP allows filters which are not in brackets: a=b
  • Unsure about the behavior for the following: OpenLDAP treats (a=foo bar) equal to (a=foo bar).
  • And I didn't find any special test cases in the openldap source code for any parsing validity. There are only basic tests. It probably has to do with that the filter is just transformed into a BER structure encoding.

Was there a python parsing library that could work at build-time? I think there were people who wanted to keep the dependencies of python-ldap to a minimum.

I don't understand the question?

Only gripe is that attr == '' for a presence filter looks wrong, but in some ways so would attr != None.

You mean value == ''? Because attr is correctly set:

>>> x = freeiam.ldap.filter.Filter('foo=*')
>>> x.root.comparisons[0].attr
'foo'
>>> x.root.comparisons[0].value
''

In theory a presence filter has no attribute value, but it's inherited from the base class and easier for composing the string again.
One could see the * as the value, but it's actual only the =* operator.

And the typo comparision -> comparison

Thanks, I fixed it, will be released soon. I stored it wrong in my brain, not the first time for this word.

(and while ldap uses the term 'assertion' I'd use something like 'predicate')

You mean attribute not assertion, which I use.
LDAP uses AVA is Attribute value assertion.
In general, I would like to use the original terms used in the specifications (RFCs) or BNF, but for some things they aren't defined. Then I use the openldap terminology.
I don't like new inventions of terms, which might sound nice but you have to apply context knowledge. This is heavily done in the HTTP area already :-(.
And in LDAP we have LDAP attributes, which is also the part in the filter expression. Therefore I would not use predicate as a replacement.

@mistotebe
Copy link
Contributor

What exactly are the differences between libldap's parsing and the RFC? I think I knew it was slightly more lenient on extra whitespace in a position or two, anything else?

I don't remember exactly every case anymore but:

Sorry for nerd sniping both of us I guess. I wrapped ldap_pvt_put_filter to do a bunch of tests to check. My expectation was that libldap might accept (some) whitespace but only when it's clearly not a valid filter otherwise. Sort of seems like things work that way already:

  • OpenLDAP allows newlines after most brackets (pretty filters).

That's consistent with the above, maybe not great for a test_filter() function, sure.

  • OpenLDAP allows whitespace before brackets in a comparision. This adds some ambiguity as filters which want to explicitly search for strings ending with whitespace must be escaped.

(! ( a= b ))

As expected this is equal to (!(a= b )) as in NOT(a equals \20b\20).

  • Irritating is: (a= *b) and (a= *) - I don't remember anymore, if OpenLDAP treats it as substring match or as presence filter

They actually behave exactly according to the RFC (as substring filters)?

  • OpenLDAP allows filters which are not in brackets: a=b

Yes, again for good reason (noone would use the command line tools otherwise) but not ideal for test_filter()

  • Unsure about the behavior for the following: OpenLDAP treats (a=foo bar) equal to (a=foo bar).

That one encodes correctly (preserving everything), what you're probably seeing is matching rule processing on the server (e.g. caseIgnoreMatch normalising the assertion to foo bar by stripping leading/trailing/consecutive whitespace)?

  • And I didn't find any special test cases in the openldap source code for any parsing validity. There are only basic tests. It probably has to do with that the filter is just transformed into a BER structure encoding.

True, the project never had a way to test these library level things. There is a start on a Python test suite which would be able to now, but also a limit on people's attention so writing a test suite for something that is probably working fine is not high on people's list... It's a shame when someone wants to write a parallel implementation, sure.

Was there a python parsing library that could work at build-time? I think there were people who wanted to keep the dependencies of python-ldap to a minimum.

I don't understand the question?

I thought you were suggesting we adopt something similar here.

Only gripe is that attr == '' for a presence filter looks wrong, but in some ways so would attr != None.

You mean value == ''? Because attr is correctly set:

No, what I mean is that if you set attr = Filter.attr('attr') these are different filters

  • (attr=) which is what I'd expect attr == '' to produce
  • (attr=*) a negation of which attr == '' currently produces
  • also (attr=\2A) as in attr == '*', we agree there

Which is why I'm saying that using something like attr != None is a slightly better way of expressing a presence filter.

In theory a presence filter has no attribute value, but it's inherited from the base class and easier for composing the string again. One could see the * as the value, but it's actual only the =* operator.

Exactly

(and while ldap uses the term 'assertion' I'd use something like 'predicate')

You mean attribute not assertion, which I use. LDAP uses AVA is Attribute value assertion. In general, I would like to use the original terms used in the specifications (RFCs) or BNF, but for some things they aren't defined. Then I use the openldap terminology. I don't like new inventions of terms, which might sound nice but you have to apply context knowledge. This is heavily done in the HTTP area already :-(. And in LDAP we have LDAP attributes, which is also the part in the filter expression. Therefore I would not use predicate as a replacement.

Yeah, it was bikeshedding and I apologise.

@spaceone
Copy link
Contributor Author

That's consistent with the above, maybe not great for a test_filter() function, sure.

I only need to validate whether the filter will be accepted by OpenLDAP - so the function should also just accept it and shouldn't be stricter than what is actually possible.
I didn't check Samba-LDAP, Active-Directory or RedHat-LDAP how they treat it.

That one encodes correctly (preserving everything), what you're probably seeing is matching rule processing on the server (e.g. caseIgnoreMatch normalising the assertion to foo bar by stripping leading/trailing/consecutive whitespace)?

Okay, nice that the matching rule is responsible for it. I didn't had a look yet, whether those spaces were transmitted along with on protocol level.

I thought you were suggesting we adopt something similar here.

Well no, as I wrote it for myself I have something - purely python.
I don't want python-ldap to have to maintain something like this (for the sugar, one could use freeiam :-) ).
I want python-ldap only to maintain a version of libldap for the validation.

Which is why I'm saying that using something like attr != None is a slightly better way of expressing a presence filter.

Ah, now I understand :-)
yes, Attribute('foo') == '' should not be a presence filter but a equality filter with empty value. Thanks for that!
I'll rework those cases.

@mistotebe
Copy link
Contributor

That's consistent with the above, maybe not great for a test_filter() function, sure.

I only need to validate whether the filter will be accepted by OpenLDAP - so the function should also just accept it and shouldn't be stricter than what is actually possible.

In here then, you could do what I suggested, use ldap_pvt_put_fitler if available, the ldap_search hack otherwise?

I didn't check Samba-LDAP, Active-Directory or RedHat-LDAP how they treat it.

Do they provide command-line tools (that don't use libldap) or other interfaces that process string representations? Because over the wire, the filter structure is completely unambiguous and value-agnostic.

That one encodes correctly (preserving everything), what you're probably seeing is matching rule processing on the server (e.g. caseIgnoreMatch normalising the assertion to foo bar by stripping leading/trailing/consecutive whitespace)?

Okay, nice that the matching rule is responsible for it. I didn't had a look yet, whether those spaces were transmitted along with on protocol level.

I checked the output of ldap_pvt_put_fitler which is how things get embedded in the search request. That was enough to sate my curiosity, might not be enough for you :)

I thought you were suggesting we adopt something similar here.

Well no, as I wrote it for myself I have something - purely python. I don't want python-ldap to have to maintain something like this (for the sugar, one could use freeiam :-) ). I want python-ldap only to maintain a version of libldap for the validation.

Given I'm planning on providing much more pythonic APIs, a filter abstraction (creation only, so an algebraic structure) has always been on the list that would be useful and I think you hit the nail on the head in terms of a desirable shape for one. Sure, that means we probably don't need to parse string representations. And that would be definitely be a whole new PR, targeting 4.0+.

Which is why I'm saying that using something like attr != None is a slightly better way of expressing a presence filter.

Ah, now I understand :-) yes, Attribute('foo') == '' should not be a presence filter but a equality filter with empty value. Thanks for that! I'll rework those cases.

Saw those updates, nice. Just a note, very few attributes allow an empty value, so that's probably why you wrote what you wrote originally, but they exist.

@spaceone
Copy link
Contributor Author

In here then, you could do what I suggested, use ldap_pvt_put_fitler if available, the ldap_search hack otherwise?

done.

@spaceone spaceone requested a review from tiran November 17, 2025 15:52
@spaceone
Copy link
Contributor Author

btw does pvt in ldap_pvt_put_filter stands for private?
openldap exposed this name now.

@spaceone
Copy link
Contributor Author

That was enough to sate my curiosity, might not be enough for you :)

691bb4ff.1f8ee5cd 0x7fee8600f6c0 ldap_read: want=34, got=34
691bb4ff.1f8efd36 0x7fee8600f6c0   0000:  00 0a 01 02 0a 01 00 02  01 00 02 01 00 01 01 00   ................  
691bb4ff.1f8f1563 0x7fee8600f6c0   0010:  a3 0e 04 02 63 6e 04 08  66 6f 6f 20 20 62 61 72   ....cn..foo  bar  
691bb4ff.1f8f273d 0x7fee8600f6c0   0020:  30 00                                              0.   

→ the two spaces are transmitted correctly: 66 6f 6f 20 20 62 61 72

691bb4ff.1f935137 0x7fee8600f6c0 conn=1011 op=4 SRCH base="" scope=2 deref=0 filter="(cn=foo bar)"

→ and then the filter is normalized inside of slapd.

@mistotebe
Copy link
Contributor

btw does pvt in ldap_pvt_put_filter stands for private? openldap exposed this name now.

Yes, it was originally a private API, normally private APIs are only available to other code in the project, but the project decided to expose it as is.

@mistotebe
Copy link
Contributor

I'm ok with this. @tiran you put this on hold, what do you think of this version?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants