Title: Issue and test deprecation warnings by EliahKagan · Pull Request #1886 · gitpython-developers/GitPython · GitHub
Open Graph Title: Issue and test deprecation warnings by EliahKagan · Pull Request #1886 · gitpython-developers/GitPython
X Title: Issue and test deprecation warnings by EliahKagan · Pull Request #1886 · gitpython-developers/GitPython
Description: Scope Some deprecations in GitPython had no associated DeprecationWarning, and none had unit tests to verify that the warnings were issued under the intended circumstances. This adds... warnings, tests of the warnings, and where applicable, tests for subtle problems that can arise from adding the warnings as new runtime behavior on attribute access (see below for details) ...for everything in GitPython that is documented as deprecated in comments and/or docstrings, except where a DeprecationWarning is intentionally not issued either because some other kind of warning is emitted or because there is a specific reason not to do so: Deprecated attributes of the top-level git module that are listed in __all__ for compatibility intentionally do not warn because it would lead to developers silencing warnings if they don't want to see them when exploring in a REPL and using a wildcard import. (Deprecated attributes of git that are not listed in __all__ do warn.) Setting the HIDE_WINDOWS_KNOWN_ERRORS and HIDE_WINDOWS_FREEZE_ERRORS environment variables is deprecated, as noted in git/util.py, but this is expected to be entirely a user behavior rather than a behavior of code calling GitPython, and as such it already emits a warning with the logging framework. It may be worthwhile to test for this warning, but hopefully that code can simply be removed soon (and its existing tests thinned out further) by making the test suite entirely responsible for any special test-related rmtree behavior (#790). Passing $var, ${var}, and or Windows %var% notation in paths to expand environment variables is deprecated since #662. The current behavior is to try to issue UserWarning (which is seen in more situations than DeprecationWarning) except when expand_vars is available and False has been passed for it (suppressing the expansion). Besides that these are not DeprecationWarnings, I consider this to be its own issue, as there are design decisions to be made about multiple aspects of it, which should probably be made together. This also intentionally causes three previously accepted usages to be treated as static type errors: The way the deprecation warning is added for git.types.Lit_commit_ish is selected to also make it always a type error to appear in any annotation. Due to #1858 and #1859, it is implausible that any annotation with Lit_commit_ish has been, or is currently, correct. (It is nonetheless not removed, since doing so could cause new exceptions at runtime, including in working code.) Note that this does not affect anything else in git.types; other types there, including Commit_ish, can be used correctly, are not deprecated, and continue to work, including in annotations. Although mypy does not consider nonpublic attribute access a type error or attempt to detect it, pyright does, and the util attribute of the top-level git module is now identified as nonpublic. The git.util module (from git.util import X) is of course public, but due to the effect of a past wildcard import and the continued need to avoid breaking code outside GitPython that my have ended up depending on it, the util attribute of git actually aliases the git.index.util module. Because util has never been listed in git.__all__ (including when it had been dynamically generated), it is effectively nonpublic, but pyright treated it as public based on being temporarily set as the git.util module before being set to git.index.util. It is retained in dir() and its deprecation warning notes how it is a special case, but it is now seen as nonpublic. Some other aliases to modules that are not direct Python submodules of the top-level git module had in the past been created by wildcard imports. These are still accessible, for now, but since there has never been a reason for anyone to consider those aliases part of GitPython's public interface, they are now regarded as absent (and an error to access) by static type checkers. They are also now omitted from dir(), though this is to avoid confusion with actual direct Python submodules of git much more than to discourage their use. Other typing-related efforts here are of the opposite kind: making sure things don't break or change significantly. Two other notable aspects of the scope: The changes made to provide the runtime behavior of issuing DeprecationWarning when deprecated module-level attributes are accessed is, by its nature, immediately ready to also cover the case of a dynamically computed git.__version__ attribute, as discussed in #1716 (comment). See also this exploration of approaches to that. But to avoid bloating the scope, that change is not included in this PR; instead, a comment notes where that logic would go. Building further on #1782 and #1787, I expanded the Git.USE_SHELL docstring to better explain the deprecation and to be more helpful in presenting the issues and what to do instead of setting it to True. However, I suspect it may benefit from some further expansion soon to clarify the risks (though maybe I can figure out a way to remove or abridge some parts so it doesn't keep growing). I regard the USE_SHELL deprecation to be core to this whole PR, because I think of everything deprecated in GitPython that it is where the greatest risk of usage that is unaware of the disadvantages may be. It is also a class attribute, which is naturally capable of being read and written on the Git class as well as read (but not written) through Git instances. Making class attribute accesses issue warnings without breaking anything else is nontrivial, and while the approach I have taken avoids most possible pitfalls, some potentially remain. But I believe the value of surfacing this deprecation and its rationale is sufficient to justify it, both in and of itself, and when considering the effect that issuing other deprecation warnings but not this one would have in creating the false appearance that this deprecation is less important (when really it is more). Further details on this are below, and also in the tests. Avoiding subtle problems In many places, issuing a warning if it was not already issued, and testing it, were straightforward, because the warning should occur exactly when code in the body of some function runs, so the code to issue the warning can simply be added there, usually at the top of the function. The case of git.util.Iterable is less basic, involving a metaclass for the deprecated class, but that was already implemented, so all it needed were unit tests where derived classes were locally defined while capturing and verifying the expected DeprecationWarning. Furthermore, some deprecated attribute accesses were basic cases, because they were already properties, so where warnings were absent it was sufficient to warn in the already existing property accessor. In contrast, on attribute accesses that are not already properties or otherwise dynamically customized, adding new runtime behavior, even just to issue a warning, should be done carefully--if at all, but see below on rationale--because there are three kinds of things it can break, that can easily go undetected unless tested for explicitly: Access to the attribute in less common ways that ought to work for the sake of correctness or practicality. Access that should not work, such as writing a read-only attribute, should fail with the same exception as before and the same or a similar message, and with no state change. Runtime metadata. For example, some ways of customizing attribute access cause the attribute not to appear in dir(), which is often unintentional, and some ways of attempting to fix that cause other conceptually unrelated attributes not to appear in dir(). Another example is generated documentation, which is sometimes okay to change but should remain useful: Sphinx must find and use docstrings, and generated help from the help builtin (or pydoc) should at minimum not stop indicating that an attribute is deprecated as a consequence of changes made to issue a warning. When possible, attributes with an important default value that is shown by help() should continue to show it. Static typing. Static type checkers such as mypy and pyright treat __getattr__ and __getattribute__ methods to mean that arbitrary attributes are permitted. This is often good, since for example it makes it so code accessing dynamic callable attributes of Git instances is not analyzed as having type errors. But when such a method is added where fully dynamic attribute lookup is not conceptually correct, if done in view of the type checker (i.e., not hidden behind if not TYPE_CHECKING), it causes attempts to use misspelled or otherwise bogus attributes to be analyzed as correct and returning the Any type, on which subsequent operations also are treated as Any and not warned about, and so on. The other subtlety about static typing is the distinction between analyzing GitPython and analyzing another codebase that uses GitPython. For example, #1656 was not observed by GitPython's own static analysis because the project only uses mypy internally but both mypy and pyright are popular static type checkers; and #1349, which happens with mypy, nonetheless was not internally observed because of differences between running mypy on GitPython and running it on another project that uses GitPython (which partly have to do with configuration). The first and second of those points are covered in detail by substantial regression tests that engage with a number of their ramifications. The tests have docstrings and I believe they present the considerations, as well as why those considerations led to the specific implementation approaches used, clearly. But there are three exceptions to this: The effect on Sphinx-generated documentation and documentation displayed by help() I checked manually instead. Likewise, I manually checked exception messages, but this was based on a judgment that these were low risk in the design approach I took, where either the messages where written explicitly (in module-level __getattr__) or were delegated to Python through super() in overridden special methods. Because the implementation approach might change in the future, which is something the new tests usually try to accommodate, I do not consider the omission of exception-message assertions to be ideal. But the messages are not guaranteed to be the same in every version of Python and do change in some Python releases. The changes required to issue warnings on all accesses to Git.USE_SHELL are significant, and probably only justified because of the special value of issuing those particular warnings. (Their rationale is examined below.) In view of the importance of maintaining robust and accurate static type checking at the boundary of GitPython and code that uses it, I took the somewhat unusual approach of developing many of the tests in a separate project first while experimenting with and implementing some changes on my feature branch of GitPython. Then I used git-filter-repo and a couple of rebases to include those commits in my GitPython feature branch, broken up into a few chunks, and interleaved with the changes they test, in what seemed like the most honest order. Although CI results seem to be interesting on each commit and could be viewed in my fork in case interest arises, for those parts I'm pushing only the tips of each chunk, because there are many commits. Doing it that way also seems like it should make it easier to see where the chunks are. The new tests should be type-checked within the GitPython project in the same way the code of the git module is type-checked (whatever that is, i.e., I am not advocating that continue-on-error be removed from the mypy step of pythonpackage.yml at this time). To facilitate this, I have put them in their own directory within test/, fully type-annotated them, configured mypy to be runnable with no arguments and scan both git/ and test/deprecation/, and and modified the pythonpackage.yml CI workflow, tox.ini, and README.md with that simpler yet now slightly broader command. This has the potential eventual disadvantage that splitting up tests of the same units of code into multiple test modules could be unwieldy, if many more new test subpackages alongside test.performance and test.deprecation end up popping up for separate unrelated reasons in the future. However, it has the advantage that all deprecations except those for which no DeprecationWarning is issued for some special reason (see above) can now be efficiently discerned by examining those tests. For this reason I think this is justified at the moment and potentially even permanently. Running mypy in GitPython shows the same five errors as it did as of the reduction in #1859, none of which are in areas sufficiently related to anything touched here that they would stand to obscure problems introduced here. Rationale and design considerations of USE_SHELL warnings I believe that what I have done for Git.USE_SHELL requires a specific rationale. The reason is that, to issue the deprecation warnings, I have caused the Git class to have a custom metaclass. On the one hand, this should probably not break anything. As noted in the test_git_cmd.py module docstring, this could potentially break existing code by causing a metaclass conflict, if code that uses GitPython is not only inheriting from the Git class, and not only using it in multiple inheritance, but using it in multiple inheritance with at least one sibling class that has its own unrelated custom metaclass. The narrow considerations are: This is arguably plausible, since abc.ABC uses the custom metaclass abc.ABCMeta and therefore all classes derived from abstract base classes have it, including concrete leaves in such a hierarchy. Also, explicitly naming a protocol as a base class also entails having a custom metaclass. (Satisfying a protocol in the usual way by having all the appropriate members of course does not entail gaining a custom metaclass.) However, this seems like a fairly low risk, especially considering the propensity for interference between the effectively unlimited number of dynamic "methods" of Git instances, and the functionality of other classes in multiple inheritance. I would question what successful applications there are of multiple inheritance with a custom metaclass and Git as a sibling. To be clear, the issue is not strictly that it breaks multiple inheritance with other unrelated custom metaclasses, but rather that it would be a breaking change to other such code that already exists, because that code would have to be modified to define and use a new metaclass that inherits from both metaclasses. It seems to me that it may be justified based on the idea you have expressed there which, if I understand it correctly, entails at least that it is not of paramount importance to support unusual and inherently brittle forms of inheritance from Git that are not known or believed to be in current use. But the broader consideration is... how can this possibly be? How can it be necessary to use a metaclass just to add a side effect to class attribute access? Intuitively it feels like a descriptor, in the Git class, could do it. But a descriptor can only customize: All forms of access (getting, setting, and deleting) on an instance of the class it is placed in. Only reading on the class itself that it is placed in. This is to say that: __get__ (not to be confused with __getattr__ or __getattribute__) is called for both instance and class getattr operations (its instance parameter may be None). __set__ (not to be confused with __setattr__) is called only for instance setattr operations; getting the attribute from the class does not call __set__ but simply replaces the descriptor object. __delete__ (not to be confused with __delattr__ or __del__) is analogous to __set__, being called only for instance delattr operations; deleting the attribute from the class does not call __delete__ but simply removes the attribute (dropping the reference to the descriptor object). Therefore, neither custom descriptors, nor properties (due to property objects being descriptors) will do this for us in the Git class itself. Likewise, __getattr__, __getattribute__, and __setattr__ won't do it, because they only ever customize attribute access on instances of the class they are defined in. (This differs from __getattr__'s use in a module, where it customizes access in that module.) More precisely, none of these things will do it by themselves. They can warn on access to the USE_SHELL class attribute through instances (which I have done, with __getattribute__ in Git). They could even warn on accesses by reading it from the class; but not about the most dangerous operation most meriting a warning, which is setting it on the class. While the unit tests provide protection against technical pitfalls, as well as regression protection if the implementation strategy is changed in the future without accounting for all subtleties, they fundamentally cannot answer the underlying design question of whether the importance of warning about USE_SHELL is sufficient to justify the implementation. As I note in the test_cmd_git.py docstring: GitPython/test/deprecation/test_cmd_git.py Lines 54 to 56 in 10e0fc2 The tests in this module do not establish whether the danger of setting Git.USE_SHELL to True is high enough, and applications of deriving from Git and using an unrelated custom metaclass marginal enough, to justify introducing a metaclass to issue the warnings. As mentioned above and elsewhere, I believe it is justified. However, I am willing to make changes, including if necessary removing _GitMeta altogether. Assuming it is to be kept, a remaining question is whether it is better or worse to provide a public alias to the metaclass. It can of course be gotten with type(Git), but static type checkers consider such an expression anywhere in any type annotation to be an error. An alias does not commit Git to continue to have a custom metaclass, because it can be documented as aliasing its metaclass whether that metaclass is type or a custom metaclass. I've done that. I've made the tests for that alias, and the code of the alias and its docstring itself (which cautions against writing code that would need it as well as against using it in a type annotation when Type[Git] is meant), be the last two commits here. That is so they can easily be dropped or reverted if you think having that public alias is not an improvement.
Open Graph Description: Scope Some deprecations in GitPython had no associated DeprecationWarning, and none had unit tests to verify that the warnings were issued under the intended circumstances. This adds... warnings, ...
X Description: Scope Some deprecations in GitPython had no associated DeprecationWarning, and none had unit tests to verify that the warnings were issued under the intended circumstances. This adds... warnings, ...
Opengraph URL: https://github.com/gitpython-developers/GitPython/pull/1886
X: @github
Domain: github.com
| route-pattern | /:user_id/:repository/pull/:id/checks(.:format) |
| route-controller | pull_requests |
| route-action | checks |
| fetch-nonce | v2:7beabcc2-6d4f-0ab0-64b8-420a70f9cd6a |
| current-catalog-service-hash | 87dc3bc62d9b466312751bfd5f889726f4f1337bdff4e8be7da7c93d6c00a25a |
| request-id | 9996:C7A41:109ED07:1715F88:6968A911 |
| html-safe-nonce | f112d08ae28669ae1484b220c99159c6b9d49ea4ff092f47bd1173b28455236c |
| visitor-payload | eyJyZWZlcnJlciI6IiIsInJlcXVlc3RfaWQiOiI5OTk2OkM3QTQxOjEwOUVEMDc6MTcxNUY4ODo2OTY4QTkxMSIsInZpc2l0b3JfaWQiOiI5MTY1NTExMzEyOTUzMzU0NTEzIiwicmVnaW9uX2VkZ2UiOiJpYWQiLCJyZWdpb25fcmVuZGVyIjoiaWFkIn0= |
| visitor-hmac | 1b2c1cffd9bc1075c54bee9a5b1ecb270de3f6ccae7b9e62a04a38f8bb919375 |
| hovercard-subject-tag | pull_request:1798818369 |
| github-keyboard-shortcuts | repository,pull-request-list,pull-request-conversation,pull-request-files-changed,checks,copilot |
| google-site-verification | Apib7-x98H0j5cPqHWwSMm6dNU4GmODRoqxLiDzdx9I |
| octolytics-url | https://collector.github.com/github/collect |
| analytics-location | / |
| fb:app_id | 1401488693436528 |
| apple-itunes-app | app-id=1477376905, app-argument=https://github.com/gitpython-developers/GitPython/pull/1886/checks |
| twitter:image | https://avatars.githubusercontent.com/u/1771172?s=400&v=4 |
| twitter:card | summary_large_image |
| og:image | https://avatars.githubusercontent.com/u/1771172?s=400&v=4 |
| og:image:alt | Scope Some deprecations in GitPython had no associated DeprecationWarning, and none had unit tests to verify that the warnings were issued under the intended circumstances. This adds... warnings, ... |
| og:site_name | GitHub |
| og:type | object |
| hostname | github.com |
| expected-hostname | github.com |
| None | fdc7c66bd36a6c12eb8e771e806db863266e573fc299e77f27505a768d4f8a98 |
| turbo-cache-control | no-preview |
| go-import | github.com/gitpython-developers/GitPython git https://github.com/gitpython-developers/GitPython.git |
| octolytics-dimension-user_id | 503709 |
| octolytics-dimension-user_login | gitpython-developers |
| octolytics-dimension-repository_id | 1126087 |
| octolytics-dimension-repository_nwo | gitpython-developers/GitPython |
| octolytics-dimension-repository_public | true |
| octolytics-dimension-repository_is_fork | false |
| octolytics-dimension-repository_network_root_id | 1126087 |
| octolytics-dimension-repository_network_root_nwo | gitpython-developers/GitPython |
| turbo-body-classes | logged-out env-production page-responsive full-width full-width-p-0 |
| disable-turbo | false |
| browser-stats-url | https://api.github.com/_private/browser/stats |
| browser-errors-url | https://api.github.com/_private/browser/errors |
| release | 3223a6503d318917691422cdadfbe16cd8fb21e5 |
| ui-target | full |
| theme-color | #1e2327 |
| color-scheme | light dark |
Links:
Viewport: width=device-width