-
-
Notifications
You must be signed in to change notification settings - Fork 25.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MNT Applies black formatting to most of the code base #18948
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @thomasjpfan !
Here is a reference on how we would setup git blame with black.
Aww that's interesting, it's unfortunate that Github still doesn't support that.
A few comments below. Could you please check that we are running black on the externals dir?
Also please add it to the .pre-commit-config.yaml
- repo: https://github.com/psf/black
rev: 19.3b0
hooks:
- id: black
types: [file, python]
That's the recommended way of not forgetting to apply black on each commit..
pyproject.toml
Outdated
\.eggs # exclude a few common directories in the | ||
| \.git # root of the project | ||
| \.mypy_cache | ||
| examples |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we use black for code, it might make sense to use it for examples as well maybe? Because if we don't use consistent formatting for all python it can be confusing for contributors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In yesterday's meeting we discussed that applying black to examples could potentially lead to big /scary code sections. No strong opinion on my side, although it's true that we've been following slightly different conventions for examples. Typically in the notebook-style ones, we write the import at the beginning of each cell instead of at the top of the file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For me it comes down to a UX thing. When reading an example and the code block uses up a significant portion of the vertical space, does this make the example harder to read?
pyproject.toml
Outdated
| build | ||
| dist | ||
| doc | ||
| build_tools |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there python files in there, that we should avoid to format?
pyproject.toml
Outdated
| examples | ||
| build | ||
| dist | ||
| doc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The config.py can be formatted at least, I think.
setup.cfg
Outdated
dist, | ||
examples, | ||
sklearn/externals, | ||
doc, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Docs and examples should still be linted by flake8 I think.
Do we plan to then apply black automatically on PRs or do we ask contributors to do that manually? |
Thanks @thomasjpfan Other stuff we'll need to do
Should we be doing all this (including this PR) in another branch, and merge into master in one shot?
Thing is, applying black automatically to PRs will likely force contributors to git pull and git merge each time. If they're not tech savy enough to read our docs and apply black themselves, they'll likely be confused by that process as well. Also @thomasjpfan do you know if we can use black on Cython files as well? |
Given the size of this PR, I wouldn't want to do anymore than apply black in this PR. I can get those PRs which can quickly be merged after this one.
I think the best way to do this is with a script they can use locally. This script would:
Not supported yet: psf/black#359 |
I agree. I'm suggesting to merge this PR and the other black-related ones into a new branch rather than on master. This way, we can then merge all black-related changes in one shot into master, and avoid any asynchronicity.
This is quite complex and we'd need to support 2 of these (unix and windows). It's probably simpler for now to have a |
66be950
to
5ea7d64
Compare
Regarding the timing, I think that just after the release of 0.24 looks like a good idea. Regarding the pre-commit, we could use a documentation likewise pandas: https://pandas.pydata.org/pandas-docs/stable/development/contributing.html#python-pep8-black It shows how to do it manually and refers to pre-commit if people want to. Since we have a section pre-commit, we can add additional stuff there. |
No opinion from my side but if we do that after 0.24, we'll need to make sure the Regardless, I still think we should do all this in a separate branch. And this PR should actually be the last one to be merged: it will introduce tons of merge conflicts and we need to point contributors to docs indicating how to apply black. So we should focus on the docs PRs first. |
Good point. It could avoid updating the documentation for that matter.
Looks like a good plan. I can create a PR and we can always change the targeted branch at the last minute. |
I opened #19031 to apply style changes that would make it so this PR can just be "apply black". I think this would simplify things because this PR would just be for applying black.
I think I am missing something. What kind of work do you see us doing on this separate branch? |
Basically what I listed in #18948 (comment). I find it safer and simpler to have everything ready in one branch and just merge the whole branch in one shot than do incremental changes. When this PR is merged, we need to have the docs already, the CI check, and ideally a sticky issue pointing to conflict resolution instructions. Otherwise, current PR authors will be lost in what to do. |
Any way we could move forward on this PR? #19031 is waiting for a second review. It would be good to have documentation on how to deal with existing PRs, but just a pinned issue could do too. If contributors encounter difficulties in the PRs they will say so/open issues and we can move forward from there. Which is still better than being stuck here discussing how many PRs we need for this :) |
A pinned issue could do without the docs, as long as this situation doesn't last too long. That is, we should release soon, with up-to-date docs. We will need to have these docs at some point anyway, so I don't see why we can't write them now. At the very very least we'll need a CI check, otherwise we'll have to submit re-formatting PRs every now and then and this will keep creating more conflicts. We can do all that here, or in a separate branch as suggested above, which would be IMHO much smoother |
Was there a discussion on moving to the 88 line length limit somewhere? (it is black's default) I'll begin work on the separate branch this weekend (now 😅). |
Yes, but I think people ended up being fine with it. IMHO using black with a smaller line length would cause even more vertical space being used, so I would stick to the default |
160f7c8
to
1db386c
Compare
I updated this PR to build on top of #20260 This PR does two things:
|
I prefer the way we have it now because it allows us to write the human readable for each ignored violation. Secondly, the flake8 default may change with different versions of flake8. By explicitly setting them, we have more control over ignored violations. |
285f2d6
to
bc2d15d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assuming the only commit here is applying black + enabling CI once #20260 is merged LTGM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as @rth. @thomasjpfan feel free to update the PR and merge them in the right order.
bc2d15d
to
b9478e0
Compare
b9478e0
to
bea7802
Compare
Thanks @thomasjpfan! |
Glad that this happened! |
ensure SVDD passes numpydoc validation (scikit-learn#20463) check for svdd in `test_sparse.py:check_svm_model_equal` to avoid calling `.predict_proba`
ensure SVDD passes numpydoc validation (scikit-learn#20463) check for svdd in `test_sparse.py:check_svm_model_equal` to avoid calling `.predict_proba`
ensure SVDD passes numpydoc validation (scikit-learn#20463) check for svdd in `test_sparse.py:check_svm_model_equal` to avoid calling `.predict_proba`
ensure SVDD passes numpydoc validation (scikit-learn#20463) check for svdd in `test_sparse.py:check_svm_model_equal` to avoid calling `.predict_proba`
ensure SVDD passes numpydoc validation (scikit-learn#20463) check for svdd in `test_sparse.py:check_svm_model_equal` to avoid calling `.predict_proba`
ensure SVDD passes numpydoc validation (scikit-learn#20463) check for svdd in `test_sparse.py:check_svm_model_equal` to avoid calling `.predict_proba`
Reference Issues/PRs
Related to #11336
What does this implement/fix? Explain your changes.
setup.cfg
.pyproject.toml
.git blame
with black. It requires merging this PR and then having another PR that adds a.git-blame-ignore-revs
. I would say thatgit blame
is an advanced feature ofgit
, which is often used to track down the PR that created the diff.black
.Any other comments?
For reference, django created a enhancement proposal for why they introduced using
black
for formatting.