gh-142037: Improve error messages for printf-style formatting#142081
gh-142037: Improve error messages for printf-style formatting#142081serhiy-storchaka merged 9 commits intopython:mainfrom
Conversation
This affects string formatting as well as bytes and bytearray formatting. * For errors in the format string, always include the position of the start of the format unit. * For errors related to the formatted arguments, always include the number or the name of the argument. * Suggest more probable causes of errors in the format string (stray %, unsupported format, unexpected character). * Provide more information when the number of arguments does not match the number of format units. * Raise more specific errors when access of arguments by name is mixed with sequential access and if * is used with a mapping. * Add tests for some uncovered cases.
| test_exc_common("abc %Id", 1, ValueError, | ||
| "unsupported format %I at position 4") | ||
| test_exc_common("abc %'d", 1, ValueError, | ||
| "stray % at position 4 or unexpected format character ''' at position 5") |
There was a problem hiding this comment.
''' is not very readable. Would it be possible to format it as U+HHHH or 0xHH?
| test_exc_common('%(x)r', 1, TypeError, | ||
| "format requires a mapping, not int") | ||
| test_exc_common('%*r', 3.14, TypeError, | ||
| "* requires int, not float") |
There was a problem hiding this comment.
Without context, it's uneasy to understand that the error comes from a format string. Maybe write %* instead?
| "* requires int, not float") | |
| "%* requires int, not float") |
There was a problem hiding this comment.
It is perhaps better to remove this test, because the example is incorrect in more than one way. If * is used, then we need more than one argument, and in that case we will get "format argument N" in the error message, like in the following line..
| "format argument 1: too big for precision") | ||
| test_exc_common('%d', '1', TypeError, | ||
| "%d format: a real number is required, not str") | ||
| "a real number is required for format %d, not str") |
There was a problem hiding this comment.
is it really a real number which is expected, or an integer?
There was a problem hiding this comment.
Yes. It will be truncated to integer.
| test_exc('%c', 2**128, OverflowError, | ||
| "%c argument not in range(0x110000)") | ||
| test_exc('%c', 3.14, TypeError, | ||
| "%c requires an integer or a unicode character, not float") |
There was a problem hiding this comment.
| "%c requires an integer or a unicode character, not float") | |
| "%c requires an integer or a Unicode character, not float") |
There was a problem hiding this comment.
This is used in many other places. We should do all such changes at once -- either capitalize "unicode", or remove it.
| (int)arg->ch, arg->fmtstart); | ||
| } | ||
| else if (arg->ch >= 32 && arg->ch < 127) { | ||
| else if (arg->ch >= 32 && arg->ch < 127 && arg->ch != '\'') { |
There was a problem hiding this comment.
Maybe ignore also the quote character in Py_UNICODE_ISPRINTABLE() test below?
0e43000 to
c015d6a
Compare
…on into str-format-errors
serhiy-storchaka
left a comment
There was a problem hiding this comment.
I currently consider an idea of using "%x expected an integer, got float" instead of "%x requires an integer, not float". What do you think?
vstinner
left a comment
There was a problem hiding this comment.
LGTM. I left minor comments.
| "format argument: %c requires an integer or a unicode character, not float") | ||
| test_exc('%c', (3.14,), TypeError, | ||
| "format argument 1: %c requires an integer or a unicode character, not float") | ||
| test_exc('%(x)c', {'x': 3.14}, TypeError, | ||
| "format argument 'x': %c requires an integer or a unicode character, not float") | ||
| test_exc('%c', 'ab', TypeError, | ||
| "format argument: %c requires an integer or a unicode character, not a string of length 2") | ||
| test_exc('%c', ('ab',), TypeError, | ||
| "format argument 1: %c requires an integer or a unicode character, not a string of length 2") | ||
| test_exc('%(x)c', {'x': 'ab'}, TypeError, | ||
| "format argument 'x': %c requires an integer or a unicode character, not a string of length 2") | ||
| test_exc('%c', b'x', TypeError, | ||
| "format argument: %c requires an integer or a unicode character, not bytes") |
There was a problem hiding this comment.
Since we are changing error messages, I suggest to write Unicode with an uppercase U.
There was a problem hiding this comment.
This is a different issue. "unicode character" is used in other error messages which this error message was based on.
| self.assertRaisesRegex(TypeError, '%i format: a real number is required, not complex', operator.mod, '%i', 2j) | ||
| self.assertRaisesRegex(TypeError, '%d format: a real number is required, not complex', operator.mod, '%d', 1j) | ||
| self.assertRaisesRegex(TypeError, r'%c requires an int or a unicode character, not .*\.PseudoFloat', operator.mod, '%c', pi) | ||
| self.assertRaisesRegex(TypeError, '%x requires an integer, not float', operator.mod, '%x', 3.14) |
There was a problem hiding this comment.
I would prefer to check also the format argument: prefix.
There was a problem hiding this comment.
I am not sure that it is related to the purpose of this test, but I'll do this. Although it make the lines obscenely long.
…ythonGH-142081) This affects string formatting as well as bytes and bytearray formatting. * For errors in the format string, always include the position of the start of the format unit. * For errors related to the formatted arguments, always include the number or the name of the formatted argument. * Suggest more probable causes of errors in the format string (stray %, unsupported format, unexpected character). * Provide more information when the number of arguments does not match the number of format units. * Raise more specific errors when access of arguments by name is mixed with sequential access and when * is used with a mapping. * Add tests for some uncovered cases.
This affects string formatting as well as bytes and bytearray formatting.
%-formats #142037