DOC add LaTeX to various linear models #30322
Conversation
also wonder what @lucyleeow thinks here.
.. math::
    \\frac{1}{2n_{\\operatorname{samples}}}
    \\vert \\vert Y - XW \\vert \\vert^2_F +
    \\alpha \\vert \\vert W \\vert \\vert_{2,1}

where :math:`\\vert\\vert W \\vert\\vert_F` is the Frobenius norm of :math:`W`,
and::
I don't love that these are basically rendered twice. Ideally we'd have only one version of them, not both, but I do see that it's less readable in LaTeX format when unrendered.
Maybe a good solution would be to rst-comment out (with ``..``) the "code form" and only have the LaTeX form rendered.
I recall that this is a long-standing debate. I was always more on the side of no LaTeX because I don't find it readable when looking at the docstring in my IDE. I don't know whether modern IDEs actually translate LaTeX to an HTML view nowadays?
Yeah, I agree that it doesn't need to be here twice. Less opinionated on whether it's the code or the LaTeX one that we should keep.
As @glemaitre said, I guess one view is for no LaTeX in docstrings, and have this stuff in the user guide instead?
As for consistency, we do have math notations in a few places already in the codebase: $ git grep -A5 -B5 -p '\.\. math' "sklearn/*.py" > /tmp/log.txt
sklearn/cluster/_agglomerative.py=def ward_tree(X, *, connectivity=None, n_clusters=None, return_distance=False):
--
sklearn/cluster/_agglomerative.py- distance. Distances are updated in the following way
sklearn/cluster/_agglomerative.py- (from scipy.hierarchy.linkage):
sklearn/cluster/_agglomerative.py-
sklearn/cluster/_agglomerative.py- The new entry :math:`d(u,v)` is computed as follows,
sklearn/cluster/_agglomerative.py-
sklearn/cluster/_agglomerative.py: .. math::
sklearn/cluster/_agglomerative.py-
sklearn/cluster/_agglomerative.py- d(u,v) = \\sqrt{\\frac{|v|+|s|}
sklearn/cluster/_agglomerative.py- {T}d(v,s)^2
sklearn/cluster/_agglomerative.py- + \\frac{|v|+|t|}
sklearn/cluster/_agglomerative.py- {T}d(v,t)^2
--
sklearn/decomposition/_nmf.py=def non_negative_factorization(
--
sklearn/decomposition/_nmf.py- negative matrix X. This factorization can be used for example for
sklearn/decomposition/_nmf.py- dimensionality reduction, source separation or topic extraction.
sklearn/decomposition/_nmf.py-
sklearn/decomposition/_nmf.py- The objective function is:
sklearn/decomposition/_nmf.py-
sklearn/decomposition/_nmf.py: .. math::
sklearn/decomposition/_nmf.py-
sklearn/decomposition/_nmf.py- L(W, H) &= 0.5 * ||X - WH||_{loss}^2
sklearn/decomposition/_nmf.py-
sklearn/decomposition/_nmf.py- &+ alpha\\_W * l1\\_ratio * n\\_features * ||vec(W)||_1
sklearn/decomposition/_nmf.py-
--
sklearn/decomposition/_nmf.py=class NMF(_BaseNMF):
--
sklearn/decomposition/_nmf.py- whose product approximates the non-negative matrix X. This factorization can be used
sklearn/decomposition/_nmf.py- for example for dimensionality reduction, source separation or topic extraction.
sklearn/decomposition/_nmf.py-
sklearn/decomposition/_nmf.py- The objective function is:
sklearn/decomposition/_nmf.py-
sklearn/decomposition/_nmf.py: .. math::
sklearn/decomposition/_nmf.py-
sklearn/decomposition/_nmf.py- L(W, H) &= 0.5 * ||X - WH||_{loss}^2
sklearn/decomposition/_nmf.py-
sklearn/decomposition/_nmf.py- &+ alpha\\_W * l1\\_ratio * n\\_features * ||vec(W)||_1
sklearn/decomposition/_nmf.py-
--
sklearn/decomposition/_nmf.py=class MiniBatchNMF(_BaseNMF):
--
sklearn/decomposition/_nmf.py- factorization can be used for example for dimensionality reduction, source
sklearn/decomposition/_nmf.py- separation or topic extraction.
sklearn/decomposition/_nmf.py-
sklearn/decomposition/_nmf.py- The objective function is:
sklearn/decomposition/_nmf.py-
sklearn/decomposition/_nmf.py: .. math::
sklearn/decomposition/_nmf.py-
sklearn/decomposition/_nmf.py- L(W, H) &= 0.5 * ||X - WH||_{loss}^2
sklearn/decomposition/_nmf.py-
sklearn/decomposition/_nmf.py- &+ alpha\\_W * l1\\_ratio * n\\_features * ||vec(W)||_1
sklearn/decomposition/_nmf.py-
--
sklearn/gaussian_process/kernels.py-
sklearn/gaussian_process/kernels.py=class Sum(KernelOperator):
sklearn/gaussian_process/kernels.py- """The `Sum` kernel takes two kernels :math:`k_1` and :math:`k_2`
sklearn/gaussian_process/kernels.py- and combines them via
sklearn/gaussian_process/kernels.py-
sklearn/gaussian_process/kernels.py: .. math::
sklearn/gaussian_process/kernels.py- k_{sum}(X, Y) = k_1(X, Y) + k_2(X, Y)
sklearn/gaussian_process/kernels.py-
sklearn/gaussian_process/kernels.py- Note that the `__add__` magic method is overridden, so
sklearn/gaussian_process/kernels.py- `Sum(RBF(), RBF())` is equivalent to using the + operator
sklearn/gaussian_process/kernels.py- with `RBF() + RBF()`.
--
sklearn/gaussian_process/kernels.py-
sklearn/gaussian_process/kernels.py=class Product(KernelOperator):
sklearn/gaussian_process/kernels.py- """The `Product` kernel takes two kernels :math:`k_1` and :math:`k_2`
sklearn/gaussian_process/kernels.py- and combines them via
sklearn/gaussian_process/kernels.py-
sklearn/gaussian_process/kernels.py: .. math::
sklearn/gaussian_process/kernels.py- k_{prod}(X, Y) = k_1(X, Y) * k_2(X, Y)
sklearn/gaussian_process/kernels.py-
sklearn/gaussian_process/kernels.py- Note that the `__mul__` magic method is overridden, so
sklearn/gaussian_process/kernels.py- `Product(RBF(), RBF())` is equivalent to using the * operator
sklearn/gaussian_process/kernels.py- with `RBF() * RBF()`.
--
sklearn/gaussian_process/kernels.py-
sklearn/gaussian_process/kernels.py=class Exponentiation(Kernel):
sklearn/gaussian_process/kernels.py- """The Exponentiation kernel takes one base kernel and a scalar parameter
sklearn/gaussian_process/kernels.py- :math:`p` and combines them via
sklearn/gaussian_process/kernels.py-
sklearn/gaussian_process/kernels.py: .. math::
sklearn/gaussian_process/kernels.py- k_{exp}(X, Y) = k(X, Y) ^p
sklearn/gaussian_process/kernels.py-
sklearn/gaussian_process/kernels.py- Note that the `__pow__` magic method is overridden, so
sklearn/gaussian_process/kernels.py- `Exponentiation(RBF(), 2)` is equivalent to using the ** operator
sklearn/gaussian_process/kernels.py- with `RBF() ** 2`.
--
sklearn/gaussian_process/kernels.py=class ConstantKernel(StationaryKernelMixin, GenericKernelMixin, Kernel):
--
sklearn/gaussian_process/kernels.py-
sklearn/gaussian_process/kernels.py- Can be used as part of a product-kernel where it scales the magnitude of
sklearn/gaussian_process/kernels.py- the other factor (kernel) or as part of a sum-kernel, where it modifies
sklearn/gaussian_process/kernels.py- the mean of the Gaussian process.
sklearn/gaussian_process/kernels.py-
sklearn/gaussian_process/kernels.py: .. math::
sklearn/gaussian_process/kernels.py- k(x_1, x_2) = constant\\_value \\;\\forall\\; x_1, x_2
sklearn/gaussian_process/kernels.py-
sklearn/gaussian_process/kernels.py- Adding a constant kernel is equivalent to adding a constant::
sklearn/gaussian_process/kernels.py-
sklearn/gaussian_process/kernels.py- kernel = RBF() + ConstantKernel(constant_value=2)
--
sklearn/gaussian_process/kernels.py=class WhiteKernel(StationaryKernelMixin, GenericKernelMixin, Kernel):
--
sklearn/gaussian_process/kernels.py- The main use-case of this kernel is as part of a sum-kernel where it
sklearn/gaussian_process/kernels.py- explains the noise of the signal as independently and identically
sklearn/gaussian_process/kernels.py- normally-distributed. The parameter noise_level equals the variance of this
sklearn/gaussian_process/kernels.py- noise.
sklearn/gaussian_process/kernels.py-
sklearn/gaussian_process/kernels.py: .. math::
sklearn/gaussian_process/kernels.py- k(x_1, x_2) = noise\\_level \\text{ if } x_i == x_j \\text{ else } 0
sklearn/gaussian_process/kernels.py-
sklearn/gaussian_process/kernels.py-
sklearn/gaussian_process/kernels.py- Read more in the :ref:`User Guide <gp_kernels>`.
sklearn/gaussian_process/kernels.py-
--
sklearn/gaussian_process/kernels.py=class RBF(StationaryKernelMixin, NormalizedKernelMixin, Kernel):
--
sklearn/gaussian_process/kernels.py- "squared exponential" kernel. It is parameterized by a length scale
sklearn/gaussian_process/kernels.py- parameter :math:`l>0`, which can either be a scalar (isotropic variant
sklearn/gaussian_process/kernels.py- of the kernel) or a vector with the same number of dimensions as the inputs
sklearn/gaussian_process/kernels.py- X (anisotropic variant of the kernel). The kernel is given by:
sklearn/gaussian_process/kernels.py-
sklearn/gaussian_process/kernels.py: .. math::
sklearn/gaussian_process/kernels.py- k(x_i, x_j) = \\exp\\left(- \\frac{d(x_i, x_j)^2}{2l^2} \\right)
sklearn/gaussian_process/kernels.py-
sklearn/gaussian_process/kernels.py- where :math:`l` is the length scale of the kernel and
sklearn/gaussian_process/kernels.py- :math:`d(\\cdot,\\cdot)` is the Euclidean distance.
sklearn/gaussian_process/kernels.py- For advice on how to set the length scale parameter, see e.g. [1]_.
--
sklearn/gaussian_process/kernels.py=class Matern(RBF):
--
sklearn/gaussian_process/kernels.py- :math:`\\nu=1.5` (once differentiable functions)
sklearn/gaussian_process/kernels.py- and :math:`\\nu=2.5` (twice differentiable functions).
sklearn/gaussian_process/kernels.py-
sklearn/gaussian_process/kernels.py- The kernel is given by:
sklearn/gaussian_process/kernels.py-
sklearn/gaussian_process/kernels.py: .. math::
sklearn/gaussian_process/kernels.py- k(x_i, x_j) = \\frac{1}{\\Gamma(\\nu)2^{\\nu-1}}\\Bigg(
sklearn/gaussian_process/kernels.py- \\frac{\\sqrt{2\\nu}}{l} d(x_i , x_j )
sklearn/gaussian_process/kernels.py- \\Bigg)^\\nu K_\\nu\\Bigg(
sklearn/gaussian_process/kernels.py- \\frac{\\sqrt{2\\nu}}{l} d(x_i , x_j )\\Bigg)
sklearn/gaussian_process/kernels.py-
--
sklearn/gaussian_process/kernels.py=class RationalQuadratic(StationaryKernelMixin, NormalizedKernelMixin, Kernel):
--
sklearn/gaussian_process/kernels.py- parameterized by a length scale parameter :math:`l>0` and a scale
sklearn/gaussian_process/kernels.py- mixture parameter :math:`\\alpha>0`. Only the isotropic variant
sklearn/gaussian_process/kernels.py- where length_scale :math:`l` is a scalar is supported at the moment.
sklearn/gaussian_process/kernels.py- The kernel is given by:
sklearn/gaussian_process/kernels.py-
sklearn/gaussian_process/kernels.py: .. math::
sklearn/gaussian_process/kernels.py- k(x_i, x_j) = \\left(
sklearn/gaussian_process/kernels.py- 1 + \\frac{d(x_i, x_j)^2 }{ 2\\alpha l^2}\\right)^{-\\alpha}
sklearn/gaussian_process/kernels.py-
sklearn/gaussian_process/kernels.py- where :math:`\\alpha` is the scale mixture parameter, :math:`l` is
sklearn/gaussian_process/kernels.py- the length scale of the kernel and :math:`d(\\cdot,\\cdot)` is the
--
sklearn/gaussian_process/kernels.py=class ExpSineSquared(StationaryKernelMixin, NormalizedKernelMixin, Kernel):
--
sklearn/gaussian_process/kernels.py- themselves exactly. It is parameterized by a length scale
sklearn/gaussian_process/kernels.py- parameter :math:`l>0` and a periodicity parameter :math:`p>0`.
sklearn/gaussian_process/kernels.py- Only the isotropic variant where :math:`l` is a scalar is
sklearn/gaussian_process/kernels.py- supported at the moment. The kernel is given by:
sklearn/gaussian_process/kernels.py-
sklearn/gaussian_process/kernels.py: .. math::
sklearn/gaussian_process/kernels.py- k(x_i, x_j) = \text{exp}\left(-
sklearn/gaussian_process/kernels.py- \frac{ 2\sin^2(\pi d(x_i, x_j)/p) }{ l^ 2} \right)
sklearn/gaussian_process/kernels.py-
sklearn/gaussian_process/kernels.py- where :math:`l` is the length scale of the kernel, :math:`p` the
sklearn/gaussian_process/kernels.py- periodicity of the kernel and :math:`d(\cdot,\cdot)` is the
--
sklearn/gaussian_process/kernels.py=class DotProduct(Kernel):
--
sklearn/gaussian_process/kernels.py- It is parameterized by a parameter sigma_0 :math:`\sigma`
sklearn/gaussian_process/kernels.py- which controls the inhomogenity of the kernel. For :math:`\sigma_0^2 =0`,
sklearn/gaussian_process/kernels.py- the kernel is called the homogeneous linear kernel, otherwise
sklearn/gaussian_process/kernels.py- it is inhomogeneous. The kernel is given by
sklearn/gaussian_process/kernels.py-
sklearn/gaussian_process/kernels.py: .. math::
sklearn/gaussian_process/kernels.py- k(x_i, x_j) = \sigma_0 ^ 2 + x_i \cdot x_j
sklearn/gaussian_process/kernels.py-
sklearn/gaussian_process/kernels.py- The DotProduct kernel is commonly combined with exponentiation.
sklearn/gaussian_process/kernels.py-
sklearn/gaussian_process/kernels.py- See [1]_, Chapter 4, Section 4.2, for further details regarding the
--
sklearn/manifold/_t_sne.py=def trustworthiness(X, X_embedded, *, n_neighbors=5, metric="euclidean"):
sklearn/manifold/_t_sne.py- r"""Indicate to what extent the local structure is retained.
sklearn/manifold/_t_sne.py-
sklearn/manifold/_t_sne.py- The trustworthiness is within [0, 1]. It is defined as
sklearn/manifold/_t_sne.py-
sklearn/manifold/_t_sne.py: .. math::
sklearn/manifold/_t_sne.py-
sklearn/manifold/_t_sne.py- T(k) = 1 - \frac{2}{nk (2n - 3k - 1)} \sum^n_{i=1}
sklearn/manifold/_t_sne.py- \sum_{j \in \mathcal{N}_{i}^{k}} \max(0, (r(i, j) - k))
sklearn/manifold/_t_sne.py-
sklearn/manifold/_t_sne.py- where for each sample i, :math:`\mathcal{N}_{i}^{k}` are its k nearest
--
sklearn/metrics/_classification.py=def cohen_kappa_score(y1, y2, *, labels=None, weights=None, sample_weight=None):
--
sklearn/metrics/_classification.py-
sklearn/metrics/_classification.py- This function computes Cohen's kappa [1]_, a score that expresses the level
sklearn/metrics/_classification.py- of agreement between two annotators on a classification problem. It is
sklearn/metrics/_classification.py- defined as
sklearn/metrics/_classification.py-
sklearn/metrics/_classification.py: .. math::
sklearn/metrics/_classification.py- \kappa = (p_o - p_e) / (1 - p_e)
sklearn/metrics/_classification.py-
sklearn/metrics/_classification.py- where :math:`p_o` is the empirical probability of agreement on the label
sklearn/metrics/_classification.py- assigned to any sample (the observed agreement ratio), and :math:`p_e` is
sklearn/metrics/_classification.py- the expected agreement when both annotators assign labels randomly.
--
sklearn/metrics/_classification.py=def f1_score(
--
sklearn/metrics/_classification.py- The F1 score can be interpreted as a harmonic mean of the precision and
sklearn/metrics/_classification.py- recall, where an F1 score reaches its best value at 1 and worst score at 0.
sklearn/metrics/_classification.py- The relative contribution of precision and recall to the F1 score are
sklearn/metrics/_classification.py- equal. The formula for the F1 score is:
sklearn/metrics/_classification.py-
sklearn/metrics/_classification.py: .. math::
sklearn/metrics/_classification.py- \\text{F1} = \\frac{2 * \\text{TP}}{2 * \\text{TP} + \\text{FP} + \\text{FN}}
sklearn/metrics/_classification.py-
sklearn/metrics/_classification.py- Where :math:`\\text{TP}` is the number of true positives, :math:`\\text{FN}` is the
sklearn/metrics/_classification.py- number of false negatives, and :math:`\\text{FP}` is the number of false positives.
sklearn/metrics/_classification.py- F1 is by default
--
sklearn/metrics/_classification.py=def fbeta_score(
--
sklearn/metrics/_classification.py- Asymptotically, `beta -> +inf` considers only recall, and `beta -> 0`
sklearn/metrics/_classification.py- only precision.
sklearn/metrics/_classification.py-
sklearn/metrics/_classification.py- The formula for F-beta score is:
sklearn/metrics/_classification.py-
sklearn/metrics/_classification.py: .. math::
sklearn/metrics/_classification.py-
sklearn/metrics/_classification.py- F_\\beta = \\frac{(1 + \\beta^2) \\text{tp}}
sklearn/metrics/_classification.py- {(1 + \\beta^2) \\text{tp} + \\text{fp} + \\beta^2 \\text{fn}}
sklearn/metrics/_classification.py-
sklearn/metrics/_classification.py- Where :math:`\\text{tp}` is the number of true positives, :math:`\\text{fp}` is the
--
sklearn/metrics/_classification.py=def log_loss(y_true, y_pred, *, normalize=True, sample_weight=None, labels=None):
--
sklearn/metrics/_classification.py- The log loss is only defined for two or more labels.
sklearn/metrics/_classification.py- For a single sample with true label :math:`y \in \{0,1\}` and
sklearn/metrics/_classification.py- a probability estimate :math:`p = \operatorname{Pr}(y = 1)`, the log
sklearn/metrics/_classification.py- loss is:
sklearn/metrics/_classification.py-
sklearn/metrics/_classification.py: .. math::
sklearn/metrics/_classification.py- L_{\log}(y, p) = -(y \log (p) + (1 - y) \log (1 - p))
sklearn/metrics/_classification.py-
sklearn/metrics/_classification.py- Read more in the :ref:`User Guide <log_loss>`.
sklearn/metrics/_classification.py-
sklearn/metrics/_classification.py- Parameters
--
sklearn/metrics/_ranking.py=def average_precision_score(
--
sklearn/metrics/_ranking.py-
sklearn/metrics/_ranking.py- AP summarizes a precision-recall curve as the weighted mean of precisions
sklearn/metrics/_ranking.py- achieved at each threshold, with the increase in recall from the previous
sklearn/metrics/_ranking.py- threshold used as the weight:
sklearn/metrics/_ranking.py-
sklearn/metrics/_ranking.py: .. math::
sklearn/metrics/_ranking.py- \\text{AP} = \\sum_n (R_n - R_{n-1}) P_n
sklearn/metrics/_ranking.py-
sklearn/metrics/_ranking.py- where :math:`P_n` and :math:`R_n` are the precision and recall at the nth
sklearn/metrics/_ranking.py- threshold [1]_. This implementation is not interpolated and is different
sklearn/metrics/_ranking.py- from computing the area under the precision-recall curve with the
--
sklearn/metrics/cluster/_supervised.py=def mutual_info_score(labels_true, labels_pred, *, contingency=None):
--
sklearn/metrics/cluster/_supervised.py- of the same data. Where :math:`|U_i|` is the number of the samples
sklearn/metrics/cluster/_supervised.py- in cluster :math:`U_i` and :math:`|V_j|` is the number of the
sklearn/metrics/cluster/_supervised.py- samples in cluster :math:`V_j`, the Mutual Information
sklearn/metrics/cluster/_supervised.py- between clusterings :math:`U` and :math:`V` is given as:
sklearn/metrics/cluster/_supervised.py-
sklearn/metrics/cluster/_supervised.py: .. math::
sklearn/metrics/cluster/_supervised.py-
sklearn/metrics/cluster/_supervised.py- MI(U,V)=\\sum_{i=1}^{|U|} \\sum_{j=1}^{|V|} \\frac{|U_i\\cap V_j|}{N}
sklearn/metrics/cluster/_supervised.py- \\log\\frac{N|U_i \\cap V_j|}{|U_i||V_j|}
sklearn/metrics/cluster/_supervised.py-
sklearn/metrics/cluster/_supervised.py- This metric is independent of the absolute values of the labels:
--
sklearn/metrics/pairwise.py=def nan_euclidean_distances(
--
sklearn/metrics/pairwise.py-
sklearn/metrics/pairwise.py- weight = Total # of coordinates / # of present coordinates
sklearn/metrics/pairwise.py-
sklearn/metrics/pairwise.py- For example, the distance between ``[3, na, na, 6]`` and ``[1, na, 4, 5]`` is:
sklearn/metrics/pairwise.py-
sklearn/metrics/pairwise.py: .. math::
sklearn/metrics/pairwise.py- \\sqrt{\\frac{4}{2}((3-1)^2 + (6-5)^2)}
sklearn/metrics/pairwise.py-
sklearn/metrics/pairwise.py- If all the coordinates are missing or if there are no common present
sklearn/metrics/pairwise.py- coordinates then NaN is returned for that pair.
sklearn/metrics/pairwise.py-
--
sklearn/metrics/pairwise.py=def haversine_distances(X, Y=None):
--
sklearn/metrics/pairwise.py- The Haversine (or great circle) distance is the angular distance between
sklearn/metrics/pairwise.py- two points on the surface of a sphere. The first coordinate of each point
sklearn/metrics/pairwise.py- is assumed to be the latitude, the second is the longitude, given
sklearn/metrics/pairwise.py- in radians. The dimension of the data must be 2.
sklearn/metrics/pairwise.py-
sklearn/metrics/pairwise.py: .. math::
sklearn/metrics/pairwise.py- D(x, y) = 2\\arcsin[\\sqrt{\\sin^2((x_{lat} - y_{lat}) / 2)
sklearn/metrics/pairwise.py- + \\cos(x_{lat})\\cos(y_{lat})\\
sklearn/metrics/pairwise.py- sin^2((x_{lon} - y_{lon}) / 2)}]
sklearn/metrics/pairwise.py-
sklearn/metrics/pairwise.py- Parameters
--
sklearn/preprocessing/_data.py=class KernelCenterer(ClassNamePrefixFeaturesOutMixin, TransformerMixin, BaseEstimator):
sklearn/preprocessing/_data.py- r"""Center an arbitrary kernel matrix :math:`K`.
sklearn/preprocessing/_data.py-
sklearn/preprocessing/_data.py- Let define a kernel :math:`K` such that:
sklearn/preprocessing/_data.py-
sklearn/preprocessing/_data.py: .. math::
sklearn/preprocessing/_data.py- K(X, Y) = \phi(X) . \phi(Y)^{T}
sklearn/preprocessing/_data.py-
sklearn/preprocessing/_data.py- :math:`\phi(X)` is a function mapping of rows of :math:`X` to a
sklearn/preprocessing/_data.py- Hilbert space and :math:`K` is of shape `(n_samples, n_samples)`.
sklearn/preprocessing/_data.py-
sklearn/preprocessing/_data.py- This class allows to compute :math:`\tilde{K}(X, Y)` such that:
sklearn/preprocessing/_data.py-
sklearn/preprocessing/_data.py: .. math::
sklearn/preprocessing/_data.py- \tilde{K(X, Y)} = \tilde{\phi}(X) . \tilde{\phi}(Y)^{T}
sklearn/preprocessing/_data.py-
sklearn/preprocessing/_data.py- :math:`\tilde{\phi}(X)` is the centered mapped data in the Hilbert
sklearn/preprocessing/_data.py- space.
sklearn/preprocessing/_data.py-        space.

And when it comes to the rendered version, I quite like them when encountering them in the rendered API pages. So I'd be okay with rst-commenting-out the python version and including the math notations here. Not sure what others think though.
@@ -224,14 +224,29 @@ def lasso_path(

        (1 / (2 * n_samples)) * ||y - Xw||^2_2 + alpha * ||w||_1

    .. math::
        \\frac{1}{2n_{\\operatorname{samples}}}
We could probably avoid the double backslash by using "raw" docstrings.
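For illustration, a minimal sketch of the same block as an r-string, so each backslash only needs to be written once (assuming the rest of the docstring is unchanged):

    def lasso_path(X, y):
        r"""Compute the Lasso path with coordinate descent.

        The optimization objective is::

            (1 / (2 * n_samples)) * ||y - Xw||^2_2 + alpha * ||w||_1

        .. math::
            \frac{1}{2n_{\operatorname{samples}}}
            \vert \vert y - Xw \vert \vert^2_2 + \alpha \vert \vert w \vert \vert_1
        """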
@@ -224,14 +224,29 @@ def lasso_path(

        (1 / (2 * n_samples)) * ||y - Xw||^2_2 + alpha * ||w||_1

    .. math::
        \\frac{1}{2n_{\\operatorname{samples}}}
        \\vert \\vert y - Xw \\vert \\vert^2_2 +
Isn't there a less verbose version of the Euclidean norm that is understood by sphinx?
``\Vert`` (with a capital "V") should display a double bar, no?
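For reference, here is roughly how the lasso objective from this diff would read with \Vert (a sketch, assuming the default Sphinx math rendering):

    \frac{1}{2 n_{\operatorname{samples}}} \Vert y - Xw \Vert_2^2 + \alpha \Vert w \Vert_1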
Alternatively, we could just use ``||``, no? That might make such expressions a lot more readable when reading the docstring in an IDE (even if slightly less correct from a typographical point of view).
In a meeting, we discussed this, and the conclusion is to use ``||``.
TL;DR: I updated the enet_path documentation using the raw doc-string format and discovered some trade-offs compared to the current implementation.
With the updated doc-string, only the LaTeX-formatted equations appear in the rendered HTML, while the code-formatted ones are rst-commented out.
I noticed (at least) two issues with this formatting approach:
- Indentation Adjustments: We have to modify the indentation in various parts of the doc-string to accommodate the sphinx extension numpydoc. Even then, the HTML may not render correctly in all cases; for example, see my comment on the precompute parameter section.
- IDE Compatibility: The LaTeX-formatted equations don't render correctly in IDEs, for example in PyCharm on my local machine, where no third-party LaTeX rendering extensions are installed.
This aligns with @glemaitre's concern:
"I recall that this is a long-standing debate. I was always more on the side of no LaTeX because I don't find it readable when looking at the docstring in my IDE. I don't know whether modern IDEs actually translate LaTeX to an HTML view nowadays?"
To conclude, using the raw doc-string approach to include LaTeX introduces a different set of issues tied to numpydoc. Specifically, we'd need to verify the spacing and indentation manually and wait for numpydoc updates to ensure proper HTML rendering.
In light of this, I agree more with @lucyleeow's suggestion:
"I guess one view is for no LaTeX in docstrings, and have this stuff in the user guide instead?"
That is, keep the current doc-string format and move the LaTeX equations to the user guide instead.
precompute : 'auto', bool or array-like of shape (n_features, n_features),
        default='auto'
When using the raw doc-string, the description for the precompute parameter renders incorrectly.
I've tried several "obvious fixes", but it seems we have to violate the PEP-8 character limit to get it right. In particular, it appears this is an ongoing issue with numpydoc.
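For context, a hedged sketch of the kind of workaround involved (illustrative; the actual change in the PR may differ): with a raw doc-string, keeping the whole type spec of precompute on one line renders correctly under numpydoc, at the cost of exceeding the PEP-8 line length:

    def enet_path(X, y):
        r"""Compute elastic net path with coordinate descent.

        Parameters
        ----------
        precompute : 'auto', bool or array-like of shape (n_features, n_features), default='auto'
            Whether to use a precomputed Gram matrix to speed up calculations.
        """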
Naive question: does using ``\\`` work?
@@ -516,7 +552,7 @@ def enet_path(
     See Also
     --------
     MultiTaskElasticNet : Multi-task ElasticNet model trained with L1/L2 mixed-norm \
-        as regularizer.
+            as regularizer.
This indentation seems unavoidable: the sphinx extension numpydoc raises an error message otherwise, and sphinx would not output the HTML files.
Reference Issues/PRs
Towards the Documentation Improvement Project.
What does this implement/fix? Explain your changes.
This PR enhances the doc-strings of several linear model estimators by adding LaTeX-formatted equations. For example, once merged, the HTML documentation for ElasticNet and the enet_path function would render their objective functions as typeset equations.
These improvements aim to make the documentation more user-friendly and accessible, whether viewed in the HTML documentation or directly in the source code.
Any other comments?
Cc @adrinjalali, @glemaitre in advance.