Kernel Approximation

Examples concerning the sklearn.kernel_approximation module.

Scalable learning with polynomial kernel approximation


© Copyright 2007 - 2025, scikit-learn developers (BSD License).
