gh-144995: Optimize memoryview == memoryview#144996
Conversation
|
Results of the benchmark from the issue: memoryview comparison complexity is no longer O(n) but O(1): values are no longer compared. |
| } | ||
|
|
||
| static int | ||
| is_float_format(const char *format) |
There was a problem hiding this comment.
Does this cover the complex types?
import numpy as np
a = np.array([1+2j, 3+4j, float('nan')], dtype=np.complex128)
mv = memoryview(a)
mv == mv # False
There was a problem hiding this comment.
This memory format is Zd. Oh, my change doesn't work for this memoryview. I should replace the blocklist with an allowlist. I'm not a memoryview/buffer expert. I didn't know that 3rd party projects can have their own format.
|
@eendebakpt: I updated the PR to allow formats known to be safe for pointer comparison (integer types), instead of blocking formats known to use floats. I excluded the format |
I think adding the |
Co-authored-by: Pieter Eendebak <pieter.eendebak@gmail.com>
| # A memoryview is equal to itself: there is no need to compare | ||
| # individual values. This is not true for float values since they can | ||
| # be NaN, and NaN is not equal to itself. | ||
| for int_format in 'bBhHiIlLqQ': |
There was a problem hiding this comment.
Can "?" be tested? Can format starting with "@" be tested? Can the null format be tested?
There was a problem hiding this comment.
I don't know how to test these formats. array.array doesn't support "P" and "?" formats and it doesn't support "@" byte order. Do you have an idea how to test these cases?
There was a problem hiding this comment.
memoryview.cast() supports them.
There was a problem hiding this comment.
Surprisingly:
>>> memoryview(b'\0\1').cast('?') == memoryview(b'\0\2').cast('?')
False
even if
>>> list(memoryview(b'\0\1').cast('?')) == list(memoryview(b'\0\2').cast('?'))
True
But this may be platform depending, so I would not test values different than 0 and 1. Or 1 is also not safe?
It may be undefined behavior to interpret random values except 0 as void* (even if it works on x86). Maybe there is a way to create an array of pointers in ctypes? Or it is not worth to bother?
* Optimize also "P" format * Test also "m != m" * Handle native formats such as "@b"
|
I updated the PR to address @serhiy-storchaka's review:
|
|
I added tests on 4 more formats: |
serhiy-storchaka
left a comment
There was a problem hiding this comment.
I have some doubts about the 'P' test. It may be an operation with undefined behavior (although CPython may be never run on platforms were this does not work, but I am not sure). It would be safer to omit that test. There are no other tests for 'P' format. But the optimization should work for it (if we exclude undefined behavior).
|
I modified the tests to check that the result with optimization is the same as the result without optimization: def check_equal(view, is_equal):
self.assertEqual(view == view, is_equal)
self.assertEqual(view != view, not is_equal)
# Comparison with a different memoryview doesn't use
# the optimization and should give the same result.
view2 = memoryview(view)
self.assertEqual(view2 == view, is_equal)
self.assertEqual(view2 != view2, not is_equal)
For boolean ( If you are not confident with my |
Disable also the optimization if the format string is NULL.
Ok, I disabled the optimization for the "P" format. I also disabled the optimization if the format string is @serhiy-storchaka: Would you mind to review the updated PR? |
|
I added tests on "c", "n" and "N" formats. Now all optimized formats have tests. |
| // "d" (double), "f" (float) and "e" (16-bit float). | ||
| // Do not optimize "P" format. | ||
| can_compare_ptr = (format[0] != 0 | ||
| && strchr("bBchHiIlLnNqQ?", format[0]) != NULL |
There was a problem hiding this comment.
For very small memoryviews, what is faster, strchr() or standard path?
* Skip "n" and "N" test if there is no struct format, instead of failing. * Remove can_compare_ptr variable.
|
I updated the PR to skip "n" and "N" tests if there is no struct format, instead of failing. I also removed the can_compare_ptr variable (in the C code). |
I don't know. I ran a benchmark. It seems like even for an empty memoryview, it's faster with the optimization. Benchmark run on Linux (Fedora 43) with CPU isolation:
Benchmark: import pyperf
import array
runner = pyperf.Runner()
for length in (0, 1, 5, 10, 25, 100):
runner.timeit(
f'{length}-byte bytes view',
setup=f'import array; view=memoryview(array.array("B", [0] * {length}))',
stmt='view == view') |
| m = memoryview(a.tobytes()).cast('n') | ||
| check_equal(m, True) | ||
| int_format = None | ||
| if int_format: |
There was a problem hiding this comment.
And since we already use subTest() in this test you can wrap this in a subtest and use skipTest() which will only skip a subtest and report this.
|
Merged. Thanks for the review @eendebakpt and @serhiy-storchaka. |
Optimize memoryview comparison: a memoryview is equal to itself, there is no
need to compare values, except if it uses float format.
Benchmark comparing 1 MiB:
from timeit import timeit
with open("/dev/random", 'br') as fp:
data = fp.read(2**20)
view = memoryview(data)
LOOPS = 1_000
b = timeit('x == x', number=LOOPS, globals={'x': data})
m = timeit('x == x', number=LOOPS, globals={'x': view})
print("bytes %f seconds" % b)
print("mview %f seconds" % m)
print("=> %f time slower" % (m / b))
Result before the change:
bytes 0.000026 seconds
mview 1.445791 seconds
=> 55660.873940 time slower
Result after the change:
bytes 0.000026 seconds
mview 0.000028 seconds
=> 1.104382 time slower
This missed optimization was discovered by Pierre-Yves David
while working on Mercurial.
Co-authored-by: Pieter Eendebak <pieter.eendebak@gmail.com>
Uh oh!
There was an error while loading. Please reload this page.