DOC add LaTeX to various linear models #30322
Conversation
also wonder what @lucyleeow thinks here.
.. math::
    \\frac{1}{2n_{\\operatorname{samples}}}
    \\vert \\vert Y - XW \\vert \\vert^2_F +
    \\alpha \\vert \\vert W \\vert \\vert_{2,1}

where :math:`\\vert\\vert W \\vert\\vert_F` is the Frobenius norm of :math:`W`,
and::
I don't love that these are basically rendered twice. Ideally we'd have only one version of them, not both, but I do see that it's less readable in LaTeX format when unrendered.
Maybe a good solution would be to rst-comment out (with ``..``) the "code form" and only have the LaTeX form rendered.
I recall that this is a long-standing debate. I was always more on the side of no LaTeX because I don't find it readable when looking at the docstring in my IDE. I don't know whether modern IDEs actually translate LaTeX to an HTML view nowadays?
Yeah, I agree that it doesn't need to be here twice. Less opinionated on whether it's the code or the LaTeX one that we should keep.
As @glemaitre said, I guess one view is for no LaTeX in docstrings, and have this stuff in the user guide instead?
As for consistency, we do have math notations in a few places already in the codebase: $ git grep -A5 -B5 -p '\.\. math' "sklearn/*.py" > /tmp/log.txt
sklearn/cluster/_agglomerative.py=def ward_tree(X, *, connectivity=None, n_clusters=None, return_distance=False):
--
sklearn/cluster/_agglomerative.py- distance. Distances are updated in the following way
sklearn/cluster/_agglomerative.py- (from scipy.hierarchy.linkage):
sklearn/cluster/_agglomerative.py-
sklearn/cluster/_agglomerative.py- The new entry :math:`d(u,v)` is computed as follows,
sklearn/cluster/_agglomerative.py-
sklearn/cluster/_agglomerative.py: .. math::
sklearn/cluster/_agglomerative.py-
sklearn/cluster/_agglomerative.py- d(u,v) = \\sqrt{\\frac{|v|+|s|}
sklearn/cluster/_agglomerative.py- {T}d(v,s)^2
sklearn/cluster/_agglomerative.py- + \\frac{|v|+|t|}
sklearn/cluster/_agglomerative.py- {T}d(v,t)^2
--
sklearn/decomposition/_nmf.py=def non_negative_factorization(
--
sklearn/decomposition/_nmf.py- negative matrix X. This factorization can be used for example for
sklearn/decomposition/_nmf.py- dimensionality reduction, source separation or topic extraction.
sklearn/decomposition/_nmf.py-
sklearn/decomposition/_nmf.py- The objective function is:
sklearn/decomposition/_nmf.py-
sklearn/decomposition/_nmf.py: .. math::
sklearn/decomposition/_nmf.py-
sklearn/decomposition/_nmf.py- L(W, H) &= 0.5 * ||X - WH||_{loss}^2
sklearn/decomposition/_nmf.py-
sklearn/decomposition/_nmf.py- &+ alpha\\_W * l1\\_ratio * n\\_features * ||vec(W)||_1
sklearn/decomposition/_nmf.py-
--
sklearn/decomposition/_nmf.py=class NMF(_BaseNMF):
--
sklearn/decomposition/_nmf.py- whose product approximates the non-negative matrix X. This factorization can be used
sklearn/decomposition/_nmf.py- for example for dimensionality reduction, source separation or topic extraction.
sklearn/decomposition/_nmf.py-
sklearn/decomposition/_nmf.py- The objective function is:
sklearn/decomposition/_nmf.py-
sklearn/decomposition/_nmf.py: .. math::
sklearn/decomposition/_nmf.py-
sklearn/decomposition/_nmf.py- L(W, H) &= 0.5 * ||X - WH||_{loss}^2
sklearn/decomposition/_nmf.py-
sklearn/decomposition/_nmf.py- &+ alpha\\_W * l1\\_ratio * n\\_features * ||vec(W)||_1
sklearn/decomposition/_nmf.py-
--
sklearn/decomposition/_nmf.py=class MiniBatchNMF(_BaseNMF):
--
sklearn/decomposition/_nmf.py- factorization can be used for example for dimensionality reduction, source
sklearn/decomposition/_nmf.py- separation or topic extraction.
sklearn/decomposition/_nmf.py-
sklearn/decomposition/_nmf.py- The objective function is:
sklearn/decomposition/_nmf.py-
sklearn/decomposition/_nmf.py: .. math::
sklearn/decomposition/_nmf.py-
sklearn/decomposition/_nmf.py- L(W, H) &= 0.5 * ||X - WH||_{loss}^2
sklearn/decomposition/_nmf.py-
sklearn/decomposition/_nmf.py- &+ alpha\\_W * l1\\_ratio * n\\_features * ||vec(W)||_1
sklearn/decomposition/_nmf.py-
--
sklearn/gaussian_process/kernels.py-
sklearn/gaussian_process/kernels.py=class Sum(KernelOperator):
sklearn/gaussian_process/kernels.py- """The `Sum` kernel takes two kernels :math:`k_1` and :math:`k_2`
sklearn/gaussian_process/kernels.py- and combines them via
sklearn/gaussian_process/kernels.py-
sklearn/gaussian_process/kernels.py: .. math::
sklearn/gaussian_process/kernels.py- k_{sum}(X, Y) = k_1(X, Y) + k_2(X, Y)
sklearn/gaussian_process/kernels.py-
sklearn/gaussian_process/kernels.py- Note that the `__add__` magic method is overridden, so
sklearn/gaussian_process/kernels.py- `Sum(RBF(), RBF())` is equivalent to using the + operator
sklearn/gaussian_process/kernels.py- with `RBF() + RBF()`.
--
sklearn/gaussian_process/kernels.py-
sklearn/gaussian_process/kernels.py=class Product(KernelOperator):
sklearn/gaussian_process/kernels.py- """The `Product` kernel takes two kernels :math:`k_1` and :math:`k_2`
sklearn/gaussian_process/kernels.py- and combines them via
sklearn/gaussian_process/kernels.py-
sklearn/gaussian_process/kernels.py: .. math::
sklearn/gaussian_process/kernels.py- k_{prod}(X, Y) = k_1(X, Y) * k_2(X, Y)
sklearn/gaussian_process/kernels.py-
sklearn/gaussian_process/kernels.py- Note that the `__mul__` magic method is overridden, so
sklearn/gaussian_process/kernels.py- `Product(RBF(), RBF())` is equivalent to using the * operator
sklearn/gaussian_process/kernels.py- with `RBF() * RBF()`.
--
sklearn/gaussian_process/kernels.py-
sklearn/gaussian_process/kernels.py=class Exponentiation(Kernel):
sklearn/gaussian_process/kernels.py- """The Exponentiation kernel takes one base kernel and a scalar parameter
sklearn/gaussian_process/kernels.py- :math:`p` and combines them via
sklearn/gaussian_process/kernels.py-
sklearn/gaussian_process/kernels.py: .. math::
sklearn/gaussian_process/kernels.py- k_{exp}(X, Y) = k(X, Y) ^p
sklearn/gaussian_process/kernels.py-
sklearn/gaussian_process/kernels.py- Note that the `__pow__` magic method is overridden, so
sklearn/gaussian_process/kernels.py- `Exponentiation(RBF(), 2)` is equivalent to using the ** operator
sklearn/gaussian_process/kernels.py- with `RBF() ** 2`.
--
sklearn/gaussian_process/kernels.py=class ConstantKernel(StationaryKernelMixin, GenericKernelMixin, Kernel):
--
sklearn/gaussian_process/kernels.py-
sklearn/gaussian_process/kernels.py- Can be used as part of a product-kernel where it scales the magnitude of
sklearn/gaussian_process/kernels.py- the other factor (kernel) or as part of a sum-kernel, where it modifies
sklearn/gaussian_process/kernels.py- the mean of the Gaussian process.
sklearn/gaussian_process/kernels.py-
sklearn/gaussian_process/kernels.py: .. math::
sklearn/gaussian_process/kernels.py- k(x_1, x_2) = constant\\_value \\;\\forall\\; x_1, x_2
sklearn/gaussian_process/kernels.py-
sklearn/gaussian_process/kernels.py- Adding a constant kernel is equivalent to adding a constant::
sklearn/gaussian_process/kernels.py-
sklearn/gaussian_process/kernels.py- kernel = RBF() + ConstantKernel(constant_value=2)
--
sklearn/gaussian_process/kernels.py=class WhiteKernel(StationaryKernelMixin, GenericKernelMixin, Kernel):
--
sklearn/gaussian_process/kernels.py- The main use-case of this kernel is as part of a sum-kernel where it
sklearn/gaussian_process/kernels.py- explains the noise of the signal as independently and identically
sklearn/gaussian_process/kernels.py- normally-distributed. The parameter noise_level equals the variance of this
sklearn/gaussian_process/kernels.py- noise.
sklearn/gaussian_process/kernels.py-
sklearn/gaussian_process/kernels.py: .. math::
sklearn/gaussian_process/kernels.py- k(x_1, x_2) = noise\\_level \\text{ if } x_i == x_j \\text{ else } 0
sklearn/gaussian_process/kernels.py-
sklearn/gaussian_process/kernels.py-
sklearn/gaussian_process/kernels.py- Read more in the :ref:`User Guide <gp_kernels>`.
sklearn/gaussian_process/kernels.py-
--
sklearn/gaussian_process/kernels.py=class RBF(StationaryKernelMixin, NormalizedKernelMixin, Kernel):
--
sklearn/gaussian_process/kernels.py- "squared exponential" kernel. It is parameterized by a length scale
sklearn/gaussian_process/kernels.py- parameter :math:`l>0`, which can either be a scalar (isotropic variant
sklearn/gaussian_process/kernels.py- of the kernel) or a vector with the same number of dimensions as the inputs
sklearn/gaussian_process/kernels.py- X (anisotropic variant of the kernel). The kernel is given by:
sklearn/gaussian_process/kernels.py-
sklearn/gaussian_process/kernels.py: .. math::
sklearn/gaussian_process/kernels.py- k(x_i, x_j) = \\exp\\left(- \\frac{d(x_i, x_j)^2}{2l^2} \\right)
sklearn/gaussian_process/kernels.py-
sklearn/gaussian_process/kernels.py- where :math:`l` is the length scale of the kernel and
sklearn/gaussian_process/kernels.py- :math:`d(\\cdot,\\cdot)` is the Euclidean distance.
sklearn/gaussian_process/kernels.py- For advice on how to set the length scale parameter, see e.g. [1]_.
--
sklearn/gaussian_process/kernels.py=class Matern(RBF):
--
sklearn/gaussian_process/kernels.py- :math:`\\nu=1.5` (once differentiable functions)
sklearn/gaussian_process/kernels.py- and :math:`\\nu=2.5` (twice differentiable functions).
sklearn/gaussian_process/kernels.py-
sklearn/gaussian_process/kernels.py- The kernel is given by:
sklearn/gaussian_process/kernels.py-
sklearn/gaussian_process/kernels.py: .. math::
sklearn/gaussian_process/kernels.py- k(x_i, x_j) = \\frac{1}{\\Gamma(\\nu)2^{\\nu-1}}\\Bigg(
sklearn/gaussian_process/kernels.py- \\frac{\\sqrt{2\\nu}}{l} d(x_i , x_j )
sklearn/gaussian_process/kernels.py- \\Bigg)^\\nu K_\\nu\\Bigg(
sklearn/gaussian_process/kernels.py- \\frac{\\sqrt{2\\nu}}{l} d(x_i , x_j )\\Bigg)
sklearn/gaussian_process/kernels.py-
--
sklearn/gaussian_process/kernels.py=class RationalQuadratic(StationaryKernelMixin, NormalizedKernelMixin, Kernel):
--
sklearn/gaussian_process/kernels.py- parameterized by a length scale parameter :math:`l>0` and a scale
sklearn/gaussian_process/kernels.py- mixture parameter :math:`\\alpha>0`. Only the isotropic variant
sklearn/gaussian_process/kernels.py- where length_scale :math:`l` is a scalar is supported at the moment.
sklearn/gaussian_process/kernels.py- The kernel is given by:
sklearn/gaussian_process/kernels.py-
sklearn/gaussian_process/kernels.py: .. math::
sklearn/gaussian_process/kernels.py- k(x_i, x_j) = \\left(
sklearn/gaussian_process/kernels.py- 1 + \\frac{d(x_i, x_j)^2 }{ 2\\alpha l^2}\\right)^{-\\alpha}
sklearn/gaussian_process/kernels.py-
sklearn/gaussian_process/kernels.py- where :math:`\\alpha` is the scale mixture parameter, :math:`l` is
sklearn/gaussian_process/kernels.py- the length scale of the kernel and :math:`d(\\cdot,\\cdot)` is the
--
sklearn/gaussian_process/kernels.py=class ExpSineSquared(StationaryKernelMixin, NormalizedKernelMixin, Kernel):
--
sklearn/gaussian_process/kernels.py- themselves exactly. It is parameterized by a length scale
sklearn/gaussian_process/kernels.py- parameter :math:`l>0` and a periodicity parameter :math:`p>0`.
sklearn/gaussian_process/kernels.py- Only the isotropic variant where :math:`l` is a scalar is
sklearn/gaussian_process/kernels.py- supported at the moment. The kernel is given by:
sklearn/gaussian_process/kernels.py-
sklearn/gaussian_process/kernels.py: .. math::
sklearn/gaussian_process/kernels.py- k(x_i, x_j) = \text{exp}\left(-
sklearn/gaussian_process/kernels.py- \frac{ 2\sin^2(\pi d(x_i, x_j)/p) }{ l^ 2} \right)
sklearn/gaussian_process/kernels.py-
sklearn/gaussian_process/kernels.py- where :math:`l` is the length scale of the kernel, :math:`p` the
sklearn/gaussian_process/kernels.py- periodicity of the kernel and :math:`d(\cdot,\cdot)` is the
--
sklearn/gaussian_process/kernels.py=class DotProduct(Kernel):
--
sklearn/gaussian_process/kernels.py- It is parameterized by a parameter sigma_0 :math:`\sigma`
sklearn/gaussian_process/kernels.py- which controls the inhomogenity of the kernel. For :math:`\sigma_0^2 =0`,
sklearn/gaussian_process/kernels.py- the kernel is called the homogeneous linear kernel, otherwise
sklearn/gaussian_process/kernels.py- it is inhomogeneous. The kernel is given by
sklearn/gaussian_process/kernels.py-
sklearn/gaussian_process/kernels.py: .. math::
sklearn/gaussian_process/kernels.py- k(x_i, x_j) = \sigma_0 ^ 2 + x_i \cdot x_j
sklearn/gaussian_process/kernels.py-
sklearn/gaussian_process/kernels.py- The DotProduct kernel is commonly combined with exponentiation.
sklearn/gaussian_process/kernels.py-
sklearn/gaussian_process/kernels.py- See [1]_, Chapter 4, Section 4.2, for further details regarding the
--
sklearn/manifold/_t_sne.py=def trustworthiness(X, X_embedded, *, n_neighbors=5, metric="euclidean"):
sklearn/manifold/_t_sne.py- r"""Indicate to what extent the local structure is retained.
sklearn/manifold/_t_sne.py-
sklearn/manifold/_t_sne.py- The trustworthiness is within [0, 1]. It is defined as
sklearn/manifold/_t_sne.py-
sklearn/manifold/_t_sne.py: .. math::
sklearn/manifold/_t_sne.py-
sklearn/manifold/_t_sne.py- T(k) = 1 - \frac{2}{nk (2n - 3k - 1)} \sum^n_{i=1}
sklearn/manifold/_t_sne.py- \sum_{j \in \mathcal{N}_{i}^{k}} \max(0, (r(i, j) - k))
sklearn/manifold/_t_sne.py-
sklearn/manifold/_t_sne.py- where for each sample i, :math:`\mathcal{N}_{i}^{k}` are its k nearest
--
sklearn/metrics/_classification.py=def cohen_kappa_score(y1, y2, *, labels=None, weights=None, sample_weight=None):
--
sklearn/metrics/_classification.py-
sklearn/metrics/_classification.py- This function computes Cohen's kappa [1]_, a score that expresses the level
sklearn/metrics/_classification.py- of agreement between two annotators on a classification problem. It is
sklearn/metrics/_classification.py- defined as
sklearn/metrics/_classification.py-
sklearn/metrics/_classification.py: .. math::
sklearn/metrics/_classification.py- \kappa = (p_o - p_e) / (1 - p_e)
sklearn/metrics/_classification.py-
sklearn/metrics/_classification.py- where :math:`p_o` is the empirical probability of agreement on the label
sklearn/metrics/_classification.py- assigned to any sample (the observed agreement ratio), and :math:`p_e` is
sklearn/metrics/_classification.py- the expected agreement when both annotators assign labels randomly.
--
sklearn/metrics/_classification.py=def f1_score(
--
sklearn/metrics/_classification.py- The F1 score can be interpreted as a harmonic mean of the precision and
sklearn/metrics/_classification.py- recall, where an F1 score reaches its best value at 1 and worst score at 0.
sklearn/metrics/_classification.py- The relative contribution of precision and recall to the F1 score are
sklearn/metrics/_classification.py- equal. The formula for the F1 score is:
sklearn/metrics/_classification.py-
sklearn/metrics/_classification.py: .. math::
sklearn/metrics/_classification.py- \\text{F1} = \\frac{2 * \\text{TP}}{2 * \\text{TP} + \\text{FP} + \\text{FN}}
sklearn/metrics/_classification.py-
sklearn/metrics/_classification.py- Where :math:`\\text{TP}` is the number of true positives, :math:`\\text{FN}` is the
sklearn/metrics/_classification.py- number of false negatives, and :math:`\\text{FP}` is the number of false positives.
sklearn/metrics/_classification.py- F1 is by default
--
sklearn/metrics/_classification.py=def fbeta_score(
--
sklearn/metrics/_classification.py- Asymptotically, `beta -> +inf` considers only recall, and `beta -> 0`
sklearn/metrics/_classification.py- only precision.
sklearn/metrics/_classification.py-
sklearn/metrics/_classification.py- The formula for F-beta score is:
sklearn/metrics/_classification.py-
sklearn/metrics/_classification.py: .. math::
sklearn/metrics/_classification.py-
sklearn/metrics/_classification.py- F_\\beta = \\frac{(1 + \\beta^2) \\text{tp}}
sklearn/metrics/_classification.py- {(1 + \\beta^2) \\text{tp} + \\text{fp} + \\beta^2 \\text{fn}}
sklearn/metrics/_classification.py-
sklearn/metrics/_classification.py- Where :math:`\\text{tp}` is the number of true positives, :math:`\\text{fp}` is the
--
sklearn/metrics/_classification.py=def log_loss(y_true, y_pred, *, normalize=True, sample_weight=None, labels=None):
--
sklearn/metrics/_classification.py- The log loss is only defined for two or more labels.
sklearn/metrics/_classification.py- For a single sample with true label :math:`y \in \{0,1\}` and
sklearn/metrics/_classification.py- a probability estimate :math:`p = \operatorname{Pr}(y = 1)`, the log
sklearn/metrics/_classification.py- loss is:
sklearn/metrics/_classification.py-
sklearn/metrics/_classification.py: .. math::
sklearn/metrics/_classification.py- L_{\log}(y, p) = -(y \log (p) + (1 - y) \log (1 - p))
sklearn/metrics/_classification.py-
sklearn/metrics/_classification.py- Read more in the :ref:`User Guide <log_loss>`.
sklearn/metrics/_classification.py-
sklearn/metrics/_classification.py- Parameters
--
sklearn/metrics/_ranking.py=def average_precision_score(
--
sklearn/metrics/_ranking.py-
sklearn/metrics/_ranking.py- AP summarizes a precision-recall curve as the weighted mean of precisions
sklearn/metrics/_ranking.py- achieved at each threshold, with the increase in recall from the previous
sklearn/metrics/_ranking.py- threshold used as the weight:
sklearn/metrics/_ranking.py-
sklearn/metrics/_ranking.py: .. math::
sklearn/metrics/_ranking.py- \\text{AP} = \\sum_n (R_n - R_{n-1}) P_n
sklearn/metrics/_ranking.py-
sklearn/metrics/_ranking.py- where :math:`P_n` and :math:`R_n` are the precision and recall at the nth
sklearn/metrics/_ranking.py- threshold [1]_. This implementation is not interpolated and is different
sklearn/metrics/_ranking.py- from computing the area under the precision-recall curve with the
--
sklearn/metrics/cluster/_supervised.py=def mutual_info_score(labels_true, labels_pred, *, contingency=None):
--
sklearn/metrics/cluster/_supervised.py- of the same data. Where :math:`|U_i|` is the number of the samples
sklearn/metrics/cluster/_supervised.py- in cluster :math:`U_i` and :math:`|V_j|` is the number of the
sklearn/metrics/cluster/_supervised.py- samples in cluster :math:`V_j`, the Mutual Information
sklearn/metrics/cluster/_supervised.py- between clusterings :math:`U` and :math:`V` is given as:
sklearn/metrics/cluster/_supervised.py-
sklearn/metrics/cluster/_supervised.py: .. math::
sklearn/metrics/cluster/_supervised.py-
sklearn/metrics/cluster/_supervised.py- MI(U,V)=\\sum_{i=1}^{|U|} \\sum_{j=1}^{|V|} \\frac{|U_i\\cap V_j|}{N}
sklearn/metrics/cluster/_supervised.py- \\log\\frac{N|U_i \\cap V_j|}{|U_i||V_j|}
sklearn/metrics/cluster/_supervised.py-
sklearn/metrics/cluster/_supervised.py- This metric is independent of the absolute values of the labels:
--
sklearn/metrics/pairwise.py=def nan_euclidean_distances(
--
sklearn/metrics/pairwise.py-
sklearn/metrics/pairwise.py- weight = Total # of coordinates / # of present coordinates
sklearn/metrics/pairwise.py-
sklearn/metrics/pairwise.py- For example, the distance between ``[3, na, na, 6]`` and ``[1, na, 4, 5]`` is:
sklearn/metrics/pairwise.py-
sklearn/metrics/pairwise.py: .. math::
sklearn/metrics/pairwise.py- \\sqrt{\\frac{4}{2}((3-1)^2 + (6-5)^2)}
sklearn/metrics/pairwise.py-
sklearn/metrics/pairwise.py- If all the coordinates are missing or if there are no common present
sklearn/metrics/pairwise.py- coordinates then NaN is returned for that pair.
sklearn/metrics/pairwise.py-
--
sklearn/metrics/pairwise.py=def haversine_distances(X, Y=None):
--
sklearn/metrics/pairwise.py- The Haversine (or great circle) distance is the angular distance between
sklearn/metrics/pairwise.py- two points on the surface of a sphere. The first coordinate of each point
sklearn/metrics/pairwise.py- is assumed to be the latitude, the second is the longitude, given
sklearn/metrics/pairwise.py- in radians. The dimension of the data must be 2.
sklearn/metrics/pairwise.py-
sklearn/metrics/pairwise.py: .. math::
sklearn/metrics/pairwise.py- D(x, y) = 2\\arcsin[\\sqrt{\\sin^2((x_{lat} - y_{lat}) / 2)
sklearn/metrics/pairwise.py- + \\cos(x_{lat})\\cos(y_{lat})\\
sklearn/metrics/pairwise.py- sin^2((x_{lon} - y_{lon}) / 2)}]
sklearn/metrics/pairwise.py-
sklearn/metrics/pairwise.py- Parameters
--
sklearn/preprocessing/_data.py=class KernelCenterer(ClassNamePrefixFeaturesOutMixin, TransformerMixin, BaseEstimator):
sklearn/preprocessing/_data.py- r"""Center an arbitrary kernel matrix :math:`K`.
sklearn/preprocessing/_data.py-
sklearn/preprocessing/_data.py- Let define a kernel :math:`K` such that:
sklearn/preprocessing/_data.py-
sklearn/preprocessing/_data.py: .. math::
sklearn/preprocessing/_data.py- K(X, Y) = \phi(X) . \phi(Y)^{T}
sklearn/preprocessing/_data.py-
sklearn/preprocessing/_data.py- :math:`\phi(X)` is a function mapping of rows of :math:`X` to a
sklearn/preprocessing/_data.py- Hilbert space and :math:`K` is of shape `(n_samples, n_samples)`.
sklearn/preprocessing/_data.py-
sklearn/preprocessing/_data.py- This class allows to compute :math:`\tilde{K}(X, Y)` such that:
sklearn/preprocessing/_data.py-
sklearn/preprocessing/_data.py: .. math::
sklearn/preprocessing/_data.py- \tilde{K(X, Y)} = \tilde{\phi}(X) . \tilde{\phi}(Y)^{T}
sklearn/preprocessing/_data.py-
sklearn/preprocessing/_data.py- :math:`\tilde{\phi}(X)` is the centered mapped data in the Hilbert
sklearn/preprocessing/_data.py- space.
sklearn/preprocessing/_data.py-        space.

And when it comes to the rendered version, I quite like them when encountering them in the rendered API pages. So I'd be okay with rst-commenting-out the python version and including the math notations here. Not sure what others think though.
@@ -224,14 +224,29 @@ def lasso_path(

        (1 / (2 * n_samples)) * ||y - Xw||^2_2 + alpha * ||w||_1

    .. math::
        \\frac{1}{2n_{\\operatorname{samples}}}
We could probably avoid the double backslash by using "raw" docstrings.
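For illustration, a minimal sketch of the same block as an r-string, so each backslash only needs to be written once (assuming the rest of the docstring is unchanged):

    def lasso_path(X, y):
        r"""Compute the Lasso path with coordinate descent.

        The optimization objective is::

            (1 / (2 * n_samples)) * ||y - Xw||^2_2 + alpha * ||w||_1

        .. math::
            \frac{1}{2n_{\operatorname{samples}}}
            \vert \vert y - Xw \vert \vert^2_2 + \alpha \vert \vert w \vert \vert_1
        """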
@@ -224,14 +224,29 @@ def lasso_path(

        (1 / (2 * n_samples)) * ||y - Xw||^2_2 + alpha * ||w||_1

    .. math::
        \\frac{1}{2n_{\\operatorname{samples}}}
        \\vert \\vert y - Xw \\vert \\vert^2_2 +
Isn't there a less verbose version of the Euclidean norm that is understood by sphinx?
``\Vert`` (with a capital "V") should display a double bar, no?
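For reference, here is roughly how the lasso objective from this diff would read with \Vert (a sketch, assuming the default Sphinx math rendering):

    \frac{1}{2 n_{\operatorname{samples}}} \Vert y - Xw \Vert_2^2 + \alpha \Vert w \Vert_1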
Alternatively, we could just use ``||``, no? That might make such expressions a lot more readable when reading the docstring in an IDE (even if slightly less correct from a typographical point of view).
In a meeting, we discussed this, and the conclusion is to use ``||``.
TL;DR: I updated the enet_path documentation using the raw doc-string format and discovered some trade-offs compared to the current implementation.
With the updated doc-string, only the LaTeX-formatted equations appear in the rendered HTML, while the code-formatted ones are rst-commented out.
I noticed (at least) two issues with this formatting approach:
- Indentation Adjustments: We have to modify the indentation in various parts of the doc-string to accommodate the sphinx extension numpydoc. Even then, the HTML may not render correctly in all cases; for example, see my comment on the precompute parameter section.
- IDE Compatibility: The LaTeX-formatted equations don't render correctly in IDEs, for example in PyCharm on my local machine, where no third-party LaTeX rendering extensions are installed.
This aligns with @glemaitre's concern:
"I recall that this is a long-standing debate. I was always more on the side of no LaTeX because I don't find it readable when looking at the docstring in my IDE. I don't know whether modern IDEs actually translate LaTeX to an HTML view nowadays?"
To conclude, using the raw doc-string approach to include LaTeX introduces a different set of issues tied to numpydoc. Specifically, we'd need to verify the spacing and indentation manually and wait for numpydoc updates to ensure proper HTML rendering.
In light of this, I agree more with @lucyleeow's suggestion:
"I guess one view is for no LaTeX in docstrings, and have this stuff in the user guide instead?"
That is, keep the current doc-string format and move the LaTeX equations to the user guide instead.
precompute : 'auto', bool or array-like of shape (n_features, n_features),
        default='auto'
When using the raw doc-string, the description for the precompute parameter renders incorrectly.
I've tried several "obvious fixes", but it seems we have to violate the PEP-8 character limit to get it right. In particular, it appears this is an ongoing issue with numpydoc.
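For context, a hedged sketch of the kind of workaround involved (illustrative; the actual change in the PR may differ): with a raw doc-string, keeping the whole type spec of precompute on one line renders correctly under numpydoc, at the cost of exceeding the PEP-8 line length:

    def enet_path(X, y):
        r"""Compute elastic net path with coordinate descent.

        Parameters
        ----------
        precompute : 'auto', bool or array-like of shape (n_features, n_features), default='auto'
            Whether to use a precomputed Gram matrix to speed up calculations.
        """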
Naive question: does using ``\\`` work?
@@ -516,7 +552,7 @@ def enet_path(
     See Also
     --------
     MultiTaskElasticNet : Multi-task ElasticNet model trained with L1/L2 mixed-norm \
-        as regularizer.
+            as regularizer.
This indentation seems unavoidable: the sphinx extension numpydoc raises an error message otherwise, and sphinx would not output the HTML files.
Reference Issues/PRs
Towards the Documentation Improvement Project.
What does this implement/fix? Explain your changes.
This PR enhances the doc-strings of several linear model estimators by adding LaTeX-formatted equations. For example, once merged, the HTML documentation for ElasticNet and the enet_path function would render their objective functions as typeset equations.
These improvements aim to make the documentation more user-friendly and accessible, whether viewed in the HTML documentation or directly in the source code.
Any other comments?
Cc @adrinjalali, @glemaitre in advance.