ElementTree.parse() fails with "not well-formed" for declared encoding "utf8"

# Bug report

### Bug description:

`ElementTree.parse()` fails with "not well-formed (invalid token)" instead of "unknown encoding" when the XML declares the wrong encoding `utf8`, but only if the document contains non-ASCII characters. More details after the repro.

Repro:

```python
import io, xml.etree.ElementTree as ET

s = """\
<?xml version='1.0' encoding='utf8'?>
<outline text="Comentário" />
"""

def parse(s):
    return ET.parse(io.BytesIO(s.encode()))
    
parse(s)
```

Output:

```python
>>> parse(s)
Traceback (most recent call last):
  File "<python-input-3>", line 1, in <module>
    parse(s)
  File "<python-input-2>", line 2, in parse
    return ET.parse(io.BytesIO(s.encode()))
  File "/Library/Frameworks/Python.framework/Versions/3.14/lib/python3.14/xml/etree/ElementTree.py", line 1214, in parse
    tree.parse(source, parser)
  File "/Library/Frameworks/Python.framework/Versions/3.14/lib/python3.14/xml/etree/ElementTree.py", line 577, in parse
    self._root = parser._parse_whole(source)
xml.etree.ElementTree.ParseError: not well-formed (invalid token): line 2, column 21
>>>
>>> # utf-8 works fine
>>> parse(s.replace('utf8', 'utf-8')).getroot().get('text')
'Comentário'
>>>
>>> # ascii-only characters work fine, despite the wrong utf8 declared encoding
>>> parse(s.replace('á', 'a')).getroot().get('text')
'Comentario'
>>>
>>> # a truly unknown encoding fails with the correct message
>>> parse(s.replace('utf8', 'xyz'))
Traceback (most recent call last):
  ...
LookupError: unknown encoding: xyz
>>>
>>> # ascii encoding fails with the same message as utf8
>>> # (perhaps utf8 silently falls back to ascii?)
>>> parse(s.replace('utf8', 'ascii'))
Traceback (most recent call last):
  ...
xml.etree.ElementTree.ParseError: not well-formed (invalid token): line 2, column 21
```
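
One possible reason `utf8` and `xyz` fail differently is that Python's own codec registry treats `utf8` as an alias of `utf-8`, while `xyz` is unknown to it. The sketch below only checks the codec registry. That the XML parser consults this registry for encoding names it does not recognize natively is an assumption here, not something verified in this report.

```python
import codecs

# Python's codec registry accepts "utf8" as an alias of "utf-8".
utf8_name = codecs.lookup("utf8").name
print(utf8_name)

# "xyz" is not a registered codec, which matches the LookupError above.
try:
    codecs.lookup("xyz")
    xyz_known = True
except LookupError:
    xyz_known = False
print(xyz_known)
```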


Per the [XML spec] and [IANA character sets list], the correct encoding name is `utf-8` (matched case-insensitively), which works fine with etree; `utf8` is not a registered name.

[XML spec]: https://www.w3.org/TR/REC-xml/
[IANA character sets list]: https://www.iana.org/assignments/character-sets/character-sets.xml

Whether to accept `utf8` was discussed previously in https://github.com/python/cpython/issues/46531, which was closed as won't fix. In that issue, however, the error message was "unknown encoding", so the current message is a regression. FWIW, lxml does accept `utf8` as a valid encoding.

Expected behavior:

* `utf8` encoding fails with "unknown encoding", regardless of whether the input contains non-ASCII characters ("in the face of ambiguity, refuse the temptation to guess"), *or*
* `utf8` is treated as `utf-8`, even though it is not strictly correct (`str.encode()` and lxml both accept it, which suggests it is a common (mis)spelling)
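
Until one of these behaviors is adopted, a possible workaround is to pass an explicit parser, since the `encoding` argument of `XMLParser` is documented to override the encoding declared in the XML file. This is a minimal sketch, not a proposed fix; `parse_as_utf8` is a hypothetical helper name, and it assumes the input really is UTF-8.

```python
import io
import xml.etree.ElementTree as ET

# Same document as in the repro, with the misspelled "utf8" declaration.
data = (
    "<?xml version='1.0' encoding='utf8'?>\n"
    '<outline text="Comentário" />\n'
).encode("utf-8")

def parse_as_utf8(raw):
    # Hypothetical helper: the explicit encoding overrides the declared one,
    # so the "utf8" in the XML declaration is never looked up.
    parser = ET.XMLParser(encoding="utf-8")
    return ET.parse(io.BytesIO(raw), parser=parser)

text = parse_as_utf8(data).getroot().get("text")
print(text)
```

The trade-off is that the declared encoding is ignored entirely, so this is only appropriate when the caller already knows the bytes are UTF-8.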


LXML behavior, for reference:

```python
>>> import lxml.etree as ET
>>>
>>> # parse() from the repro looks up ET at call time, so it now uses lxml
>>> # lxml accepts wrong encoding utf8
>>> parse(s).getroot().get('text')
'Comentário'
>>>
>>> # unknown encoding fails as expected
>>> parse(s.replace('utf8', 'xyz'))
Traceback (most recent call last):
  ...
lxml.etree.XMLSyntaxError: Unsupported encoding: xyz, line 1, column 35
```

### CPython versions tested on:

3.14, 3.13, 3.12

### Operating systems tested on:

macOS, Linux

ElementTree.parse() fails with "not well-formed" for declared encoding "utf8" #148821
