[MNT]: Implement Scale.val_in_range and refactor _point_in_data_domain#31306
[MNT]: Implement Scale.val_in_range and refactor _point_in_data_domain#31306QuLogic merged 13 commits intomatplotlib:mainfrom
Scale.val_in_range and refactor _point_in_data_domain#31306Conversation
Introduce a val_in_range method in ScaleBase to explicitly check whether values lie within the valid domain of a scale. The default implementation falls back to limit_range_for_scale for compatibility with existing scales. Specific scales (e.g., LogScale, LogitScale) override this method with more efficient checks.
|
Thank you for opening your first PR into Matplotlib! If you have not heard from us in a week or so, please leave a new comment below and that should bring it to our attention. Most of our reviewers are volunteers and sometimes things fall through the cracks. You can also join us on gitter for real-time discussion. For details on testing, writing docs, and our review process, please see the developer guide. We strive to be a welcoming and open project. Please follow our Code of Conduct. |
|
I would like some advice regarding implementing tests for the added function |
Scale.val_in_range and refactor _point_in_data_domain
There was a problem hiding this comment.
Your type usage is mixed up. In _pint_in_data_domain we just need a bool. You implementations return arrays. Your type stub claims bool | array return type. Since we only need single values, let's keep it simple and only implement that. We can always expand later.
Concerning tests: Write tests for all implemented subclasses. Consider edge cases like Inf and NaN
lib/matplotlib/axis.py
Outdated
| """ | ||
| return self._scale.limit_range_for_scale(vmin, vmax, self.get_minpos()) | ||
|
|
||
| def val_in_range(self, val): |
There was a problem hiding this comment.
What is the motivation to expose this on the Axis?
There was a problem hiding this comment.
I noticed limit_range_in_scale being available in Axis class, so exposing the val_in_range seemed consistent to me. But I later forgot to implement the wrapper in _point_in_data_domain.
I have two possible approaches here:
- Keep the wrapper exposed in Axis and use it in the
_point_in_data_domain. - Remove the wrapper from Axis and keep the current
_point_in_data_domain, using the_scaleattribute, unchanged.
After going through other exposed apis, I feel removing the wrapper here would be fine as currently there is no other use of val_in_range except in _point_in_data_domain.
| Return whether the value(s) are within the valid range for this scale. | ||
| """ |
There was a problem hiding this comment.
| Return whether the value(s) are within the valid range for this scale. | |
| """ | |
| Return whether the value(s) are within the valid range for this scale. | |
| This method is a generic implementation. Subclasses may implement | |
| more efficent solutions for their domain. | |
| """ |
| """ | ||
| Return `True` for all values. | ||
| """ |
There was a problem hiding this comment.
The docstring should always descripe the concept, not just the imlementation. A docstring
def some_func():
"""Return True."""
return Truehas zero information conent.
Instead do:
| """ | |
| Return `True` for all values. | |
| """ | |
| """ | |
| Return whether the value(s) are within the valid range for this scale. | |
| This is True for all inputs. | |
| """ |
The first sentence should be in all val_in_range docs.
There was a problem hiding this comment.
This was actually pretty enlightening. Kind of changed the way I thought about docstrings.
I'm making the needed changes. Thank you very much for the suggestion.
|
I've made the changes which I think should fix the type ambiguity I caused earlier. I will need a little time to implement the tests as I have never written tests before and will need to study the current tests before I can implement mine and fix all the linting error with it as well. |
|
While working on this I noticed a couple of behaviors that I wanted to clarify. For Example code: from matplotlib.scale import LinearScale
s = LinearScale(axis=None)
vmin, vmax = s.limit_range_for_scale(float('nan'), float('nan'), minpos=1e-100)
print(vmin, vmax)which results in Similarly for |
|
This is a difficult question. Without having looked into this, I suspect, one of the two is correct:
Which one it is needs to be found out by looking at the current usage contexts. In either case, the behavior should be specified. |
|
Thinking about it, the primary use for |
|
I have added the finite bound check for the |
lib/matplotlib/scale.py
Outdated
| if not math.isfinite(val): | ||
| return False | ||
| else: | ||
| return val > 0 |
There was a problem hiding this comment.
Same logic, but more compact.
| if not math.isfinite(val): | |
| return False | |
| else: | |
| return val > 0 | |
| return math.isfinite(val) and val > 0 |
|
Thanks @null-dreams! Congratulations on your first PR to Matplotlib 🎉 We hope to hear from you again. |
Closes #31286
PR summary
Following the suggestion by @timhoffm in #31061, this PR introduces a
val_in_rangemethod to theScalehierarchy.Why is this change necessary?
The current implementation of
_point_in_data_domainrelies on a "clipping side-effect." It callslimit_range_for_scale, which is intended for setting axis limits, and then compares if the output value differs from the input value to determine validity. This approach is semantically indirect and makes the logic harder to maintain.What problem does it solve?
Axeslogic to handle scale-specific details likeminposor equality checks against clipped values.What is the reasoning for this implementation?
ScaleBaseimplementsval_in_rangeusing the existinglimit_range_for_scalelogic to ensure third-party or custom scales continue to work without modification.Vectorization: Although the original issue focused on the API shift, I identified a small gap in possible performance. I designed the newval_in_rangeimplementations to be vectorized using NumPy.In common scales likeLogScaleandLogitScale, this replaces iterative logic/clipping with direct boolean math (e.g.,val > 0).This allows for high-performance domain checking on large arrays, whereas the previous approach was primarily scalar-focused.Update: Vectorization has been removed after review since it was not deemed necessary for the current use case and added unneeded complexity without clear benefit.
Example
AI Disclosure
I used an AI assistant to understand the codebase structure only.
PR checklist