
Improving the runtime efficiency of the SVC for a precomputed kernel during inference #22363

@vbelis

Description


Describe the workflow you want to enable

When one wants to use a custom kernel with the svm.SVC class (docs), there are two options: either define a custom function and pass it as a callable to svm.SVC, or pre-compute the kernel matrix and use the kernel='precomputed' argument of svm.SVC. From my understanding of the class inheritance of svm.SVC (source), these two options are computationally equivalent. Please correct me if this assumption is wrong, because all the following points stem from it.
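To make the two options concrete, here is a minimal sketch with a plain linear kernel (the dataset and the final equality check are only illustrative):

from sklearn.datasets import load_breast_cancer
from sklearn.svm import SVC
import numpy as np

X, y = load_breast_cancer(return_X_y=True)

def linear_kernel(A, B):
    # A custom kernel as a callable: returns the (n_A, n_B) Gram matrix.
    return np.dot(A, B.T)

# Option 1: pass the kernel function itself; SVC evaluates it internally.
svc_callable = SVC(kernel=linear_kernel).fit(X, y)

# Option 2: pre-compute the (n_train, n_train) Gram matrix up front.
svc_precomputed = SVC(kernel='precomputed').fit(linear_kernel(X, X), y)

# Both routes should yield the same fitted model.
assert np.allclose(svc_callable.dual_coef_, svc_precomputed.dual_coef_)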

At inference time, i.e. when calling SVC.predict(X_test) with kernel='precomputed', the expected input matrix has shape (n_test, n_train), where n_test and n_train are the number of test and train samples, respectively. I believe there is an inefficiency here: the pre-computed matrix X_test contains more dot products than are needed to classify the test points, since the SVM model uses only the support vectors (a subset of the training vectors) to predict a label. To elaborate, a trained (fitted) SVM model classifies a test data point x in the following way (omitting the bias term):

$$\hat{y}(x) = \operatorname{sign}\left( \sum_{i=1}^{n_{SV}} \alpha_i y_i K(x_i, x) \right),$$

where $\alpha_i$ are the dual coefficients, $y_i$ are the labels of the support vectors, $n_{SV}$ is the number of support vectors identified during training, and $K$ is our kernel. The equation can be rewritten in matrix form to accommodate classification of a set of test data points X_test, as mentioned above.
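Spelled out (the notation here is mine), the matrix form would be

$$\hat{\mathbf{y}} = \operatorname{sign}\left( K_{\text{test}} \, (\boldsymbol{\alpha} \odot \mathbf{y}_{SV}) \right), \qquad (K_{\text{test}})_{ji} = K(x_i, x^{\text{test}}_j),$$

where $K_{\text{test}}$ only needs shape $(n_{\text{test}}, n_{SV})$ and $\odot$ denotes the element-wise product.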

All the dual coefficients that correspond to non-support vectors are zero; hence, the sum above runs only over the support vectors. However, X_test has some columns that, in general, don't correspond to support vectors.
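This can be verified numerically: the decision function reconstructed from the fitted attributes touches only the support-vector columns of the kernel matrix. A minimal check with a linear kernel (the variable names are mine):

from sklearn.datasets import load_breast_cancer
from sklearn.svm import SVC
import numpy as np

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train = X[:100], X[100:], y[:100]

svc = SVC(kernel='linear').fit(X_train, y_train)

# dual_coef_ holds alpha_i * y_i for the support vectors only, so the
# decision function needs kernel values against the support vectors alone.
K_sv = np.dot(X_test, X_train[svc.support_, :].T)  # shape (n_test, n_SV)
manual = K_sv @ svc.dual_coef_.ravel() + svc.intercept_
assert np.allclose(manual, svc.decision_function(X_test))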

Wouldn't it be more efficient to compute X_test with a shape of (n_test, n_SV), where n_SV is the total number of support vectors (the sum of svm.SVC.n_support_)? The only references to this issue that I was able to find are quite old (2012), by @amueller and @mblondel in #790 and #789 (comment). Nevertheless, no possible solution was discussed or implemented.

Can such an improvement be implemented? If not, I would really appreciate any information regarding the design choices that restrict such an implementation :)

Describe your proposed solution

I wasn't able to pinpoint which parts of the source code would need to change to accommodate the above proposal. Instead, I provide a simple example of how the svm.SVC class could make use of the enhancement if it were implemented.

from sklearn.datasets import load_breast_cancer
from sklearn.svm import SVC
from sklearn.utils import shuffle
import numpy as np

dataset = load_breast_cancer()
X, y = shuffle(dataset.data, dataset.target)
X_train, X_test = X[:100, :], X[100:, :]
y_train, y_test = y[:100], y[100:]

svc = SVC(kernel='precomputed')

kernel_train = np.dot(X_train, X_train.T)  # simple example of a linear pre-computed kernel
svc.fit(kernel_train, y_train)

# What the current implementation requires: kernel values against all training points.
kernel_test = np.dot(X_test, X_train.T)
# What would be more efficient: kernel values against the support vectors only.
kernel_test_ef = np.dot(X_test, X_train[svc.support_, :].T)

y_pred = svc.predict(kernel_test)  # predict() raises an error if kernel_test_ef is passed instead.

An example of the shape difference between kernel_test_ef and kernel_test:

svc.n_support_: [7 5]
kernel_test_ef.shape: (469, 12)
kernel_test.shape: (469, 100)

This shape difference can translate into a significant runtime gap when scaling to large datasets.
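For completeness, a possible stop-gap under the current API, assuming libsvm indeed reads only the support-vector columns at prediction time (the stored support_ indices suggest so): compute the expensive kernel entries only against the support vectors, then scatter them into a zero matrix of the full shape that predict() expects. This would save the kernel-evaluation cost, though not the memory; the variables continue the example above.

# Fill only the support-vector columns; if the remaining columns are never
# read at prediction time (an assumption about the libsvm internals),
# zeros act as safe placeholders.
kernel_test_padded = np.zeros((X_test.shape[0], X_train.shape[0]))
kernel_test_padded[:, svc.support_] = np.dot(X_test, X_train[svc.support_, :].T)
assert np.array_equal(svc.predict(kernel_test_padded), svc.predict(kernel_test))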

Describe alternatives you've considered, if relevant

No response

Additional context

No response
