@@ -403,31 +403,54 @@ and :keyword:`raise` statement in section :ref:`raise`.
403403Runtime Components
404404==================
405405
406- Python's execution model does not operate in a vacuum. It runs on a
407- computer. When a program runs, the conceptual layers of how it runs
408- on the computer look something like this::
409-
410- host machine and operating system (OS)
411- process
412- OS thread (runs machine code)
413-
414- Hosts and processes are isolated and independent from one another.
415- However, threads are not.
416-
417- A program always starts with exactly one thread, known as the "main"
418- thread, it may grow to run in multiple. Not all platforms support
419- threads, but most do. For those that do, all threads in a process
420- share all the process' resources, including memory.
421-
422- The fundamental point of threads is that each thread does *run*
406+ General Computing Model
407+ -----------------------
408+
409+ Python's execution model does not operate in a vacuum. It runs on
410+ a host machine and through that host's runtime environment, including
411+ its operating system (OS), if there is one. When a program runs,
412+ the conceptual layers of how it runs on the host look something
413+ like this::
414+
415+ **host machine**
416+ **process** (global resources)
417+ **thread** (runs machine code)
418+
419+ Each process represents a program running on the host. Think of each
420+ process itself as the data part of its program. Think of the process'
421+ threads as the execution part of the program. This distinction will
422+ be important for understanding the conceptual Python runtime.
423+
424+ The process, as the data part, is the execution context in which the
425+ program runs. It mostly consists of the set of resources assigned to
426+ the program by the host, including memory, signals, file handles,
427+ sockets, and environment variables.
428+
429+ Processes are isolated and independent from one another. (The same
430+ is true for hosts.) The host manages the process' access to its
431+ assigned resources, in addition to coordinating between processes.
432+
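As a rough illustration of process isolation, the sketch below starts a child process and lets it change its own copy of an environment variable. The variable name ``DEMO_VAR`` is invented for the example, and it assumes ``sys.executable`` points at a usable Python interpreter.

```python
import os
import subprocess
import sys

# Hypothetical variable name, used only for this demonstration.
os.environ["DEMO_VAR"] = "parent"

# The child inherits a copy of the environment and changes only that copy.
child_code = (
    "import os\n"
    "os.environ['DEMO_VAR'] = 'child'\n"
    "print(os.environ['DEMO_VAR'])\n"
)
completed = subprocess.run(
    [sys.executable, "-c", child_code],
    capture_output=True,
    text=True,
    check=True,
)

child_value = completed.stdout.strip()   # what the child process saw
parent_value = os.environ["DEMO_VAR"]    # what this process still sees
print(child_value, parent_value)
```

The child's change never reaches the parent, because each process has its own set of resources assigned by the host.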
433+ Each thread represents the actual execution of the program's machine
434+ code, running relative to the resources assigned to the program's
435+ process. It's strictly up to the host how and when that execution
436+ takes place.
437+
438+ From the point of view of Python, a program always starts with exactly
439+ one thread. However, the program may grow to run in multiple
440+ simultaneous threads. Not all hosts support multiple threads per
441+ process, but most do. Unlike processes, threads in a process are not
442+ isolated and independent from one another. Specifically, all threads
443+ in a process share all of the process' resources.
444+
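A minimal sketch of what "threads share the process' resources" means in practice: several threads append to one list object that lives in the process' memory. The names ``shared`` and ``worker`` are illustrative only.

```python
import threading

# One list object in the process' memory, visible to every thread.
shared = []

def worker(name):
    # Each thread mutates the very same list object.
    shared.append(name)

threads = [
    threading.Thread(target=worker, args=(f"worker-{i}",))
    for i in range(3)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

# The order of arrival is up to the host, so sort before inspecting.
seen = sorted(shared)
print(seen)
```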
445+ The fundamental point of threads is that each one does *run*
423446independently, at the same time as the others. That may be only
424447conceptually at the same time ("concurrently") or physically
425448("in parallel"). Either way, the threads effectively run
426449at a non-synchronized rate.
427450
428451.. note::
429452
430- That non-synchronized rate means none of the global state is
453+ That non-synchronized rate means none of the process' memory is
431454 guaranteed to stay consistent for the code running in any given
432455 thread. Thus multi-threaded programs must take care to coordinate
433456 access to intentionally shared resources. Likewise, they must take
@@ -438,70 +461,152 @@ at a non-synchronized rate.
438461 Python runtime.
439462
440463 The cost of this broad, unstructured requirement is the tradeoff for
441- the concurrency and, especially, parallelism that threads provide.
442- The alternative generally means dealing with non-deterministic bugs
443- and data corruption.
444-
445- The same layers apply to each Python program, with some extra layers
446- specific to Python::
447-
448- host
449- process
450- Python runtime
451- interpreter
452- Python thread (runs bytecode)
453-
454- When a Python program starts, it looks exactly like that, with one
455- of each. The process has a single global runtime to manage Python's
456- process-global resources. The runtime may grow to include multiple
457- interpreters and each interpreter may grow to include multiple Python
458- threads. The initial interpreter is known as the "main" interpreter,
459- and the initial thread, where the runtime was initialized, is known
460- as the "main" thread.
461-
462- An interpreter completely encapsulates all of the non-process-global
463- runtime state that the interpreter's Python threads share. For example,
464- all its threads share :data:`sys.modules`, but each interpreter has its
465- own :data:`sys.modules`.
464+ the kind of raw concurrency that threads provide. The alternative
465+ to the required discipline generally means dealing with
466+ non-deterministic bugs and data corruption.
467+
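One common way to coordinate access to an intentionally shared resource is a lock. The following is a sketch, not the only possible discipline; the names ``state``, ``state_lock``, and ``add_many`` are made up for the example.

```python
import threading

# Intentionally shared state, guarded by a lock.
state = {"count": 0}
state_lock = threading.Lock()

def add_many(n):
    for _ in range(n):
        # Only one thread at a time may run the read-modify-write step.
        with state_lock:
            state["count"] += 1

threads = [threading.Thread(target=add_many, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

total = state["count"]
print(total)
```

With the lock held around every update, the final count does not depend on how the host schedules the threads.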
468+ Python Runtime Model
469+ --------------------
470+
471+ The same conceptual layers apply to each Python program, with some
472+ extra data layers specific to Python::
473+
474+ **host machine**
475+ **process** (global resources)
476+ global runtime (*state*)
477+ interpreter (*state*)
478+ **thread** (runs "C-API" and Python bytecode)
479+ thread *state*
480+
481+ At the conceptual level: when a Python program starts, it looks exactly
482+ like that diagram, with one of each. The runtime may grow to include
483+ multiple interpreters, and each interpreter may grow to include
484+ multiple thread states.
466485
467486.. note::
468487
469- The interpreter here is not the same as the "bytecode interpreter",
470- which is what regularly runs in threads, executing compiled Python code.
488+ A Python implementation won't necessarily implement the runtime
489+ layers distinctly or even concretely. The only exception is places
490+ where distinct layers are directly specified or exposed to users,
491+ like through the :mod:`threading` module.
471492
472- A Python thread represents the state necessary for the Python runtime
473- to *run* in an OS thread. It also represents the execution of Python
474- code (or any supported C-API) in that OS thread. Depending on the
475- implementation, this probably includes the current exception and
476- the Python call stack. The Python thread always identifies the
477- interpreter it belongs to, meaning the state it shares
478- with other threads.
493+ .. note::
494+
495+ The initial interpreter is typically called the "main" interpreter.
496+ Some Python implementations, like CPython, assign special roles
497+ to the main interpreter.
498+
499+ Likewise, the host thread where the runtime was initialized is known
500+ as the "main" thread. It may be different from the process' initial
501+ thread, though they are often the same. In some cases "main thread"
502+ may be even more specific and refer to the initial thread state.
503+ A Python runtime might assign specific responsibilities
504+ to the main thread, such as handling signals.
505+
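CPython is one implementation that gives the main thread a special role for signals: ``signal.signal()`` may only be called from the main thread of the main interpreter. The sketch below, which assumes CPython, tries to install a handler from a worker thread and records what happens. The names ``outcome`` and ``try_install_handler`` are illustrative.

```python
import signal
import threading

outcome = {}

def try_install_handler():
    # A thread started through the threading module is never the main thread.
    outcome["is_main"] = threading.current_thread() is threading.main_thread()
    try:
        signal.signal(signal.SIGINT, signal.default_int_handler)
    except ValueError as exc:
        # CPython refuses to install handlers outside the main thread.
        outcome["error"] = str(exc)

t = threading.Thread(target=try_install_handler)
t.start()
t.join()
print(outcome)
```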
506+ As a whole, the Python runtime consists of the global runtime state,
507+ interpreters, and thread states. The runtime ensures all that state
508+ stays consistent over its lifetime, particularly when used with
509+ multiple host threads. The runtime also exposes a way for host threads
510+ to "call into Python", which will be covered in the next subsection.
511+
512+ The global runtime, at the conceptual level, is just a set of
513+ interpreters. While they are otherwise isolated and independent from
514+ one another, they may share some data or other resources. The runtime
515+ is responsible for managing these global resources safely. The actual
516+ nature and management of these resources is implementation-specific.
517+ Ultimately, the external utility of the global runtime is limited
518+ to managing interpreters.
519+
520+ In contrast, an "interpreter" is conceptually what we would normally
521+ think of as the (full-featured) "Python runtime". When machine code
522+ executing in a host thread interacts with the Python runtime, it calls
523+ into Python in the context of a specific interpreter.
479524
480525.. note::
481526
482- Here "Python thread" does not necessarily refer to a thread created
483- using the :mod:`threading` module.
527+ The term "interpreter" here is not the same as the "bytecode
528+ interpreter", which is what regularly runs in threads, executing
529+ compiled Python code.
530+
531+ In an ideal world, "Python runtime" would refer to what we currently
532+ call "interpreter". However, it's been called "interpreter" at least
533+ since it was introduced in 1997 (a027efa5b).
534+
535+ Each interpreter completely encapsulates all of the non-process-global,
536+ non-thread-specific state needed for the Python runtime to work.
537+ Notably, the interpreter's state persists between uses. It includes
538+ fundamental data like :data:`sys.modules`. The runtime ensures
539+ multiple threads using the same interpreter will safely
540+ share it between them.
541+
542+ A Python implementation may support using multiple interpreters at the
543+ same time in the same process. They are independent and isolated from
544+ one another. For example, each interpreter has its own
545+ :data:`sys.modules`.
546+
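The separation of :data:`sys.modules` between interpreters can be observed with the ``concurrent.interpreters`` module, where it is available. This is a hedged sketch: it assumes a Python version that ships that module (3.14 or later) and quietly does nothing otherwise. The marker module name is invented for the demonstration.

```python
import sys

try:
    from concurrent import interpreters  # available in Python 3.14+
except ImportError:
    interpreters = None  # older Python: the demonstration is skipped

# Hypothetical module name, used only as a marker.
MARKER = "_demo_marker_module"

def marker_leaks_into_main():
    """Register a module in a second interpreter, then look for it here."""
    if interpreters is None:
        return None
    interp = interpreters.create()
    try:
        interp.exec(
            "import sys, types\n"
            f"sys.modules[{MARKER!r}] = types.ModuleType({MARKER!r})\n"
        )
    finally:
        interp.close()
    # This checks the sys.modules of the interpreter running this code.
    return MARKER in sys.modules

leaked = marker_leaks_into_main()
print(leaked)
```

When the module is available, the entry added inside the second interpreter is not visible from the interpreter that created it.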
547+ For thread-specific runtime state, each interpreter has a set of thread
548+ states, which it manages, in the same way the global runtime contains
549+ a set of interpreters. It can have thread states for as many host
550+ threads as it needs. It may even have multiple thread states for
551+ the same host thread, though that isn't as common.
552+
553+ Each thread state, conceptually, has all the thread-specific runtime
554+ data an interpreter needs to operate in one host thread. The thread
555+ state includes the current raised exception and the thread's Python
556+ call stack. It may include other thread-specific resources.
557+
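The per-thread nature of the current exception can be seen from Python code. In this sketch the main thread is handling a ``ValueError`` while a worker thread looks at its own exception state through ``sys.exc_info()``. The helper names are illustrative.

```python
import sys
import threading

results = {}

def current_exception_type():
    # Reports the exception being handled in the calling thread only.
    return sys.exc_info()[0]

def worker():
    results["worker"] = current_exception_type()

try:
    raise ValueError("raised in the starting thread only")
except ValueError:
    # The worker runs while this thread is still handling the exception.
    t = threading.Thread(target=worker)
    t.start()
    t.join()
    results["starter"] = current_exception_type()

print(results)
```

The worker sees no current exception, because that piece of state belongs to the other thread.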
558+ .. note::
484559
485- Each Python thread is associated with a single OS thread, which is where
486- it can run. In the opposite direction, a single OS thread can have many
487- Python threads associated with it. However, only one of those Python
488- threads is "active" in the OS thread at time. The runtime will operate
489- in the OS thread relative to the active Python thread.
560+ The term "Python thread" can sometimes refer to a thread state, but
561+ normally it means a thread created using the :mod:`threading` module.
490562
491- For an interpreter to be used in an OS thread, it must have a
492- corresponding active Python thread. Thus switching between interpreters
493- means changing the active Python thread. An interpreter can have Python
494- threads, active or inactive, for as many OS threads as it needs. It may
495- even have multiple Python threads for the same OS thread, though at most
496- one can be active at a time.
563+ Each thread state, over its lifetime, is always tied to exactly one
564+ interpreter and exactly one host thread. It will only ever be used in
565+ that thread. In the other direction, a host thread may have many
566+ Python thread states tied to it, for different interpreters.
497567
498568Once a program is running, new Python threads can be created using the
499569:mod:`threading` module (on platforms and Python implementations that
500570support threads). Additional processes can be created using the
501571:mod:`os`, :mod:`subprocess`, and :mod:`multiprocessing` modules.
502572You can run coroutines (async) in the main thread using :mod:`asyncio`.
503573Interpreters can be created and used with the
504- :mod:`concurrent.interpreters` module.
574+ :mod:`~concurrent.interpreters` module.
575+
576+ Calls into Python
577+ -----------------
578+
579+ A "call into Python" is an abstraction of "ask the Python runtime
580+ to do something". It necessarily involves targeting a single runtime
581+ context, whether global, interpreter, or thread. The layer depends
582+ on the desired operation. Most operations require a thread state.
583+
584+ When a running host thread calls into Python, the actual mechanism
585+ is implementation-specific. For example, CPython provides a C-API and
586+ the thread will literally call into Python through a C-API function.
587+
588+ .. drop paragraph?
589+
590+ Some thread-specific operations must only target a new thread state,
591+ while others may target any thread state, including one with a Python
592+ call already on its stack or a current exception set.
593+
594+ A thread-specific call into Python can target only one thread state.
595+ That means, when there are multiple Python thread states tied to the
596+ current host thread, only one of them can be in use at a time. It
597+ doesn't matter if the thread states belong to different interpreters
598+ or the same interpreter.
599+
600+ Calls into Python can be nested. Even if a thread has already called
601+ into Python, that operation could be interrupted by another call into
602+ Python targeting a different runtime context. For example, the
603+ implementation of the outer call might make the inner call directly.
604+ Alternatively, the host or Python runtime might trigger some
605+ asynchronous callback that calls into Python.
606+
607+ Regardless, at the point of the inner call, the target is swapped.
608+ When the inner call finishes, the target is swapped back and the outer
609+ call resumes.
505610
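The swap-and-restore behavior of nested calls can be pictured with a toy model. This is purely conceptual and does not reflect how any implementation tracks its targets; ``call_into_python``, ``target_stack``, and the target labels are all invented for illustration.

```python
import contextlib

# Toy model: the targets currently in play for ONE host thread.
target_stack = []

@contextlib.contextmanager
def call_into_python(target):
    target_stack.append(target)   # the target is swapped in
    try:
        yield
    finally:
        target_stack.pop()        # the previous target becomes current again

with call_into_python("thread state A"):
    # The outer call is interrupted by an inner call with another target.
    with call_into_python("thread state B"):
        inner_target = target_stack[-1]
    # The inner call finished, so the outer target is in use again.
    outer_target = target_stack[-1]

print(inner_target, outer_target, target_stack)
```

At any moment only the top of the stack is in use, which matches the rule that a host thread uses one thread state at a time.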
506611
507612.. rubric:: Footnotes