## PR Summary
Closes #20504
This is a massive PR, and I realize that I can't fully expect anyone to read every single line of the newly added stub files. That said, these files have zero effect at run time for the library, so even if some things in them are wrong, they won't cause running code to fail. I'd like to get this in early in the 3.8 cycle to allow downstream libraries to test out the type hints with sufficient time to fix them before final release.
When reviewing, read enough of the stub files to get a sense of how they work, and to understand common patterns.
I attempted to be as complete and correct as I could be, but that is mostly reliant on existing docstrings being complete and correct (and some amount of fallback to variable names).
Highlighted changes to library code:
- `__all__` added to `matplotlib.__init__`
  - `mypy` was not happy about some things being implicitly exported (names that are imported into the mpl top-level namespace and that other places then use from there)
  - This makes explicit that all mpl-defined things are publicly accessible from the mpl namespace
  - Happy to add/subtract from the list
- Pyplot has type hints inline
  - `boilerplate.py` updated to gather annotations
  - Because of the syntax rules of the return annotation (cannot line break on ` -> `), the `textwrap`-based line wrapping has been removed and replaced with a clean-up step that runs `black` after the file is written
    - This makes some of the other behaviors (such as putting null bytes to avoid line wrapping the data kwarg) obsolete, but at the cost of a new dev (and testing) dependency on `black`
    - The handwritten portions of pyplot are wrapped with `# fmt: off/on` comments, so only the autogenerated portion of the file is actually run through `black`
- Some type aliases for commonly used types:
  - Names are negotiable; I mostly just tacked `Type` onto the idea I was representing to avoid occluding a name we might want to use
  - These are included in the `.py` files so that downstream users can use them as type hints
  - In the earlier days of this work, they were in a `_typing.pyi` stub file that only worked while type checking, but they have since been redistributed to more natural homes, which avoids that complexity
  - `colors.Color` (str or 3/4-tuple of floats)
    - We may wish to rename this, especially if we want to be able to create a real `Color` class to handle things like comparisons, which has been proposed recently
  - `markers.MarkerType` (str, Path, MarkerStyle), `markers.FillStyleType` (Literal of strings)
  - `lines.LineStyleType` (str or tuple for dashes), `lines.DrawStyleType` (Literal of strings), `lines.MarkEveryType` (many types possible)
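
As a rough illustration of how a downstream library might annotate its own functions with aliases like these, here is a minimal sketch. The aliases are defined locally from the descriptions above and are only an approximation, not the definitions shipped in this PR; `describe_marker` is a hypothetical function.

```python
from typing import Literal, Tuple, Union

# Stand-in aliases written from the descriptions above. They approximate the
# idea only and are NOT the definitions added by this PR.
Color = Union[str, Tuple[float, float, float], Tuple[float, float, float, float]]
FillStyleType = Literal["full", "left", "right", "bottom", "top", "none"]


def describe_marker(color: Color, fillstyle: FillStyleType = "full") -> str:
    """Hypothetical downstream function annotated with the aliases."""
    if isinstance(color, str):
        kind = "named"
    else:
        # A 4-tuple carries an alpha channel, a 3-tuple does not.
        kind = "RGBA" if len(color) == 4 else "RGB"
    return f"{kind} color, fillstyle={fillstyle}"
```

A type checker would flag a call such as `describe_marker("red", "sideways")`, since `"sideways"` is not one of the literal values, while the code itself still runs unchanged.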
For most of the library, stub files are used rather than inline type hints.
This has the advantage of not touching literally every public signature line in one PR, but it does carry a tradeoff in the direct utility of `mypy` and some other tools in the typing space.
`mypy` will trust the pyi files and disregard the actual implementation, meaning that if a parameter is added/removed, but the pyi file is not updated, mypy will use the outdated signature to type check.
The two exceptions where inline type hints are used are tests and pyplot.
Tests use inline type hints, but do so relatively minimally, only when mypy complains explicitly. It doesn't make a lot of sense to write stub files for tests, which are not called by downstream libraries (who the stub files are mostly for), but running type checking on tests makes a fair amount of sense, as it is more representative of what downstream users will see.
Pyplot uses inline type hints because it is autogenerated, and it was easier to inject the type hints into the existing autogenerated code than it would be to do parallel lookup and write two files. The hand written portions of pyplot also do have type hints.
Since pyplot has inline type hints, mypy will actually check the implementation of the functions for type consistency, which can find certain errors.
There are some tools to help discover stub files that have drifted out of sync with the implementation, but I have found that they are too noisy and not configurable enough to be useful to us (yet).
See [this comment](https://github.com/ksunden/matplotlib/pull/1#issuecomment-1374282552) for a round up of some of the other type checking tools that are available and why I don't fully support them yet.
I may end up writing a script to find the most commonly anticipated discrepancies: most specifically, added/removed functions/methods that are not reflected in the pyi, and signature changes.
For getting started, though, the heuristic is "if a signature is touched, make sure to update the pyi file" (such changes already get a higher bar of review, as anything that changes a signature is an API change, which should carry a release note).
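
To make the idea of such a script concrete, here is a minimal sketch (not something included in this PR). It only compares top-level functions and their parameter names, using the standard-library `ast` module; `signatures` and `find_drift` are hypothetical names, and a real tool would also need to handle classes, methods, defaults, and overloads.

```python
import ast
from typing import Dict, List


def signatures(source: str) -> Dict[str, List[str]]:
    """Map each top-level function name to its list of parameter names."""
    tree = ast.parse(source)
    result = {}
    for node in tree.body:
        if isinstance(node, ast.FunctionDef):
            args = node.args
            params = args.posonlyargs + args.args + args.kwonlyargs
            result[node.name] = [p.arg for p in params]
    return result


def find_drift(impl_source: str, stub_source: str) -> List[str]:
    """Report functions whose presence or parameters differ between files."""
    impl = signatures(impl_source)
    stub = signatures(stub_source)
    problems = []
    for name in sorted(impl.keys() - stub.keys()):
        problems.append(f"{name}: missing from stub")
    for name in sorted(stub.keys() - impl.keys()):
        problems.append(f"{name}: in stub but not in implementation")
    for name in sorted(impl.keys() & stub.keys()):
        if impl[name] != stub[name]:
            problems.append(f"{name}: parameters differ")
    return problems


# Toy inputs standing in for a .py file and its .pyi stub.
IMPL = "def plot(x, y, fmt=None): pass\ndef scatter(x, y): pass\n"
STUB = "def plot(x: float, y: float) -> None: ...\ndef hist(x: float) -> None: ...\n"

report = find_drift(IMPL, STUB)
```

Here `report` flags `scatter` as missing from the stub, `hist` as stale, and `plot` as having a changed parameter list, which are the discrepancies the heuristic above is meant to catch by hand.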
There are some areas left for followup PRs as they are a) relatively complicated ideas that should get particular attention on review that would be lost in this massive PR or b) ideas that can be added incrementally
- data arg preprocessor
- typing shapes and dtypes of arrays (most everything is [ArrayLike](https://numpy.org/devdocs/reference/typing.html#numpy.typing.ArrayLike) or `np.ndarray`, but these types do support being more specific). Using the full power that seems to be available is not as well documented as I would have liked, so we may wish to reach out to somebody who knows more about this (and encourage/possibly write the docs that I think should exist for typing with numpy)
- Some methods (e.g. `tri*`, `streamplot`, etc) are defined outside of the Axes class, and so the pyplot generation code does not currently get their type hints. The code could be updated to be smarter about looking at attributes and getting their type hints, but that would make it more complicated, so this PR covers the 90% case first.
Additionally, we may wish to be more relaxed or more specific in certain areas.
[Dask](https://github.com/dask/community/issues/255) has an interesting set of guidelines that perhaps we should adopt some or all of.
In particular, for this pass I was being as complete as I could, but they recommend things like "if the return type is a union of things, don't actually type it, because that can be more burdensome to downstream users than just saying `Any`"
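
To illustrate the burden that guideline is describing, here is a small sketch with two hypothetical functions, `parse_union` and `parse_any`, that behave identically at run time and differ only in their return annotation.

```python
from typing import Any, Union


def parse_union(text: str) -> Union[int, str]:
    """Precisely typed: callers must narrow the result before using it."""
    return int(text) if text.isdigit() else text


def parse_any(text: str) -> Any:
    """Loosely typed, in the spirit of the guideline quoted above."""
    return int(text) if text.isdigit() else text


# With the union annotation, a type checker rejects `parse_union("41") + 1`
# until the caller narrows the type, even if the caller knows it is an int.
value = parse_union("41")
if isinstance(value, int):
    total = value + 1

# With Any, the same arithmetic type checks without narrowing, at the cost of
# the checker no longer catching a genuine mistake at this call site.
total_any = parse_any("41") + 1
```

Which side of that tradeoff is right likely depends on the function, so this is something to decide case by case rather than a blanket rule.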
Also some places where we specify callback functions may be overly specified or hard to follow with the full specification, so relaxing the type hint may aid readability.
I certainly don't follow every one of those guidelines yet, but I think they are mostly well reasoned, and perhaps we should adopt them and go through and clean up (things like the return annotation on `__init__` or importing `collections.abc` over `typing`), though that may be better done in followup PRs.
I make fairly heavy use of [`typing.Literal`](https://docs.python.org/3/library/typing.html#typing.Literal). There may be some places where that is more unwieldy than we'd like and it should be `str`, but for the most part it has a reasonable correspondence to where we use `_api.check_in_list` (and its similar siblings). There may also be places where I went more generic but we can be more specific by using Literal.
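
The correspondence between a `Literal` annotation and a runtime membership check can be sketched as follows. `check_in_list_sketch` is a simplified, hypothetical stand-in written for this example, not Matplotlib's `_api.check_in_list`, and `Alignment` and `set_alignment` are likewise made up.

```python
from typing import Literal, get_args

# Static side: the checker only accepts these three strings.
Alignment = Literal["left", "center", "right"]


def check_in_list_sketch(allowed, **kwargs):
    """Simplified stand-in for a runtime membership check (hypothetical)."""
    for name, value in kwargs.items():
        if value not in allowed:
            raise ValueError(
                f"{value!r} is not a valid value for {name}; "
                f"supported values are {allowed}"
            )


def set_alignment(align: Alignment) -> str:
    # Runtime side: the same set of values, read back from the Literal so the
    # two cannot drift apart in this sketch.
    check_in_list_sketch(get_args(Alignment), align=align)
    return align
```

A call like `set_alignment("middle")` is rejected twice: once by the type checker before the code runs, and once by the `ValueError` for callers who do not type check.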
Some types, particularly things like `tuple[float, float, float, float]`, are a little unwieldy and may be easier to understand if replaced by a type alias (e.g. `RGBA` or `RectangleSpec` (names negotiable), because despite being the same type, their usage is logically different).
Also, tuple types are sometimes used when what we really mean is "unpackable thing with N elements" but the type system doesn't handle that _super_ well at this time (though perhaps ArrayLike can get us there...). I also think it is fair in those cases that we say "you _should_ give us a tuple, and for the best guarantee of forward compatibility, maybe cast it to one just to be sure".
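
A small sketch of what that looks like from the caller's side, assuming a hypothetical function `set_limits` that is annotated as taking a 2-tuple:

```python
from typing import Tuple


def set_limits(limits: Tuple[float, float]) -> float:
    """Hypothetical function that expects exactly two values."""
    low, high = limits
    return high - low


bounds = [0.0, 2.5]  # a list unpacks fine at run time, but is not a tuple

# tuple(bounds) makes the runtime type a tuple, but a checker such as mypy
# will typically still see a variable-length tuple and complain.
span_cast = set_limits(tuple(bounds))  # type: ignore[arg-type]

# Spelling out both elements lets the checker verify the length statically.
span_explicit = set_limits((bounds[0], bounds[1]))
```

Both calls behave the same at run time; the difference is only in how much the type checker can verify about the number of elements.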
The sphinx build is _mostly_ unaffected, as sphinx does not read pyi stub files (yet)... though pyplot does get all of the info in the signature lines, and required some things to be added to the list of ignored warnings (for now, at least... perhaps we could improve their ability to link at some point)
Todos before merge:
- [x] Move items from `_typing.pyi` to their more natural homes, and document them. This was a bit of a catchall, that ended up not having all that many things in it, but I wanted to be able to use the types and move them later. It is part of why sphinx is unhappy, since these are not documented.
- [x] Fix sphinx build
- [ ] Add documentation, especially of expectations moving forward (e.g. keeping these up to date)
- [ ] Write release note[s]
## PR Checklist
**Documentation and Tests**
- [ ] Has pytest style unit tests (and `pytest` passes)
- [ ] Documentation is sphinx and numpydoc compliant (the docs should [build](https://matplotlib.org/devel/documenting_mpl.html#building-the-docs) without error).
- [ ] New plotting related features are documented with examples.
**Release Notes**
- [ ] New features are marked with a `.. versionadded::` directive in the docstring and documented in `doc/users/next_whats_new/`
- [ ] API changes are marked with a `.. versionchanged::` directive in the docstring and documented in `doc/api/next_api_changes/`
- [ ] Release notes conform with instructions in `next_whats_new/README.rst` or `next_api_changes/README.rst`