How to use Feature selection in MATLAB

Hugo on 20 Jan 2022
Answered: Akanksha on 28 Apr 2025
Hi,
I have a CSV file with 10 columns and 1000 lines. Apart from the headers (1st line), all the values are numeric. I would like to run feature selection on the first 9 columns, using the last column (column 10) as the target.
I have three questions about my problem:
- Which feature selection method, from those shown here: https://www.mathworks.com/discovery/feature-selection.html
is most suitable?
- Which function should be used for feature selection?
- How can I set up my code to obtain results from feature selection?
Thank you,

Answers (1)

Akanksha on 28 Apr 2025
Hey @Hugo,
Answering your queries:
1. Which feature selection method should be used? The best method depends on your data and your goal. For regression problems (i.e. when the target is numeric), common options are:
  • ReliefF (for regression),
  • F-test (filter method),
  • LASSO regression (embedded method) and
  • Sequential Feature Selection (wrapper method).
For classification problems (i.e. when the target is categorical), common options are:
  • ReliefF (for classification),
  • F-test/ANOVA,
  • Sequential Feature Selection.
Sequential Feature Selection (using sequentialfs) is a good general-purpose choice: it works for both regression and classification, and it can be paired with any model through a criterion function that you supply. Since your target column is numeric, the regression options apply unless those numbers are really class labels.
2. Which function should be used for feature selection?
MATLAB R2021a provides several functions for feature selection in Statistics and Machine Learning Toolbox. The most general and commonly used are:
  • sequentialfs: Sequential feature selection for regression or classification.
  • relieff: Ranks features using the ReliefF algorithm.
  • lasso: Performs LASSO regression and selects features by shrinking coefficients to zero.
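To make the filter and embedded options more concrete, here is a minimal sketch that ranks features with relieff and with fsrftest (the univariate F-test function for regression, which I believe has been available since R2020a) and selects features with cross-validated lasso. The data is synthetic and only stands in for the CSV file; the variable names, the choice of k = 10 neighbors, and the use of the one-standard-error lambda are assumptions for illustration, not recommendations from the thread.

```matlab
% Synthetic stand-in for the CSV data (assumption: 1000 rows, 9 features).
rng(0);                                        % reproducible random numbers
X = randn(1000, 9);
Y = 3*X(:,2) - 2*X(:,5) + 0.1*randn(1000, 1);  % only columns 2 and 5 matter

% Filter method: ReliefF ranking with k = 10 nearest neighbors.
% relieffRank lists column indices from most to least important.
[relieffRank, relieffWeights] = relieff(X, Y, 10);

% Filter method: univariate F-tests. Scores are -log(p), so larger is better.
[ftestRank, ftestScores] = fsrftest(X, Y);

% Embedded method: LASSO with 5-fold cross-validation. Features whose
% coefficient is nonzero at the one-standard-error lambda are kept.
[B, FitInfo] = lasso(X, Y, 'CV', 5);
lassoSelected = find(B(:, FitInfo.Index1SE) ~= 0)';

disp('ReliefF ranking:');         disp(relieffRank);
disp('F-test ranking:');          disp(ftestRank);
disp('LASSO selected columns:');  disp(lassoSelected);
```

With real data, replace the synthetic X and Y with the first nine columns and the tenth column of the file. The three methods will not always agree, so comparing their rankings is a reasonable sanity check before committing to a subset.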
3. How can the code be set up? Below is sample code for MATLAB R2021a that runs sequential feature selection on your data.
% Feature selection example for regression (MATLAB R2021a)
% 1. Load data from CSV
data = readmatrix('yourfile.csv'); % Replace with your actual filename
X = data(:, 1:9); % Features (first 9 columns)
Y = data(:, 10); % Target (last column)
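% 2. Define the criterion function for sequentialfs.
%    It must return the SUM of squared errors on the test fold: sequentialfs
%    adds these values over all folds and divides by the total number of
%    test observations, which gives the cross-validated MSE.
fun = @(Xtrain, Ytrain, Xtest, Ytest) ...
    sum((Ytest - predict(fitlm(Xtrain, Ytrain), Xtest)).^2);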
% 3. Run sequential feature selection
opts = statset('display','iter'); % Show progress
[fs, history] = sequentialfs(fun, X, Y, 'cv', 5, 'options', opts);
% 4. Display selected features
disp('Selected feature columns:');
disp(find(fs));
% 5. Plot feature selection history
figure;
plot(history.Crit, 'o-');
xlabel('Number of features');
ylabel('Cross-validated MSE');
title('Feature selection history');
grid on;
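If the target column holds class labels rather than a continuous quantity, the same sequentialfs workflow can be adapted for classification. The sketch below is one possible setup, not the only one: it assumes numeric 0/1 labels, uses a linear discriminant (fitcdiscr) as the wrapped model, and counts misclassified test observations as the criterion. The synthetic data and the name critFun are hypothetical.

```matlab
% Synthetic classification data (assumption: 300 rows, 6 features).
rng(1);                                    % reproducible random numbers
X = randn(300, 6);
Y = double(X(:,1) + X(:,3) > 0);           % class depends on columns 1 and 3

% Criterion: number of misclassified test observations in the fold.
% sequentialfs sums these counts over the folds and divides by the total
% number of test observations, giving the cross-validated error rate.
critFun = @(Xtrain, Ytrain, Xtest, Ytest) ...
    sum(Ytest ~= predict(fitcdiscr(Xtrain, Ytrain), Xtest));

% Stratified 5-fold partition so each fold keeps the class proportions.
cvp = cvpartition(Y, 'KFold', 5);

[fs, history] = sequentialfs(critFun, X, Y, 'cv', cvp);

disp('Selected feature columns:');
disp(find(fs));
```

Any classifier with a predict method could replace fitcdiscr here; the criterion function is the only part that needs to change.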
Hope this helps!
