Using PCA with high-dimensional, low-sample-size data
I am using PCA to detect abnormalities in time-series data. My dataset is high-dimensional with few samples (a 15-by-530 data matrix). Can I use PCA to obtain statistics such as T^2 and SPE? I noticed that some articles state that it is improper to use PCA to obtain these statistics in such a case.
Answers (1)
Aditya
on 27 Jun 2024
Using PCA to detect abnormalities in time-series data, especially with a high-dimensional and low-sample dataset, can be challenging. The primary concern is that PCA may not provide reliable results when the number of features (dimensions) significantly exceeds the number of samples. This is because PCA relies on the sample covariance matrix, which is poorly estimated when there are far fewer samples than variables, so statistics built on it, such as T² and SPE, should be interpreted with caution.
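To make the estimation concern concrete, here is a minimal sketch that assumes the same 15-by-530 shape as in the question and uses random placeholder data (not the asker's data). The sample covariance of n observations has rank at most n - 1, so only a handful of the 530 directions carry any estimated variance. The variable names are illustrative.

```matlab
% Illustrative check with placeholder random data (assumption: 15 samples, 530 variables)
rng(0);
X = rand(15, 530);
[n, p] = size(X);
Xc = X - mean(X, 1);              % center each variable (column)
s = svd(Xc);                      % singular values of the centered data
tol = max(n, p) * eps(max(s));    % tolerance for treating a singular value as zero
numNonzero = sum(s > tol);        % numerical rank of the centered data
fprintf('Directions with nonzero estimated variance: %d of %d\n', numNonzero, p);
```

For generic data the count is expected to be n - 1 = 14, which is also the maximum number of components `pca` can return for a matrix of this shape.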
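% Simulate high-dimensional, low-sample data (15 samples, 530 variables)
rng(0);
DATASET = rand(15, 530);
% Apply PCA; pca centers the data and returns the column means in mu
[coeff, score, latent, ~, ~, mu] = pca(DATASET);
% Retain k components (k = 3 is an illustrative choice). With all 14
% nonzero components retained, T² is the same for every sample and SPE is zero.
k = 3;
% Calculate T² statistic from the retained scores
T2 = sum((score(:, 1:k) ./ sqrt(latent(1:k)')).^2, 2);
% Calculate SPE (Q-statistic) from the rank-k reconstruction, adding the mean back
reconstructed = score(:, 1:k) * coeff(:, 1:k)' + mu;
SPE = sum((DATASET - reconstructed).^2, 2);
% Set threshold for T² and SPE (e.g., 95% confidence level)
% Note: the chi-square limit is a large-sample approximation
alpha = 0.05;
T2_threshold = chi2inv(1 - alpha, k);
SPE_threshold = prctile(SPE, 95);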
% Detect abnormalities
abnormal_T2 = T2 > T2_threshold;
abnormal_SPE = SPE > SPE_threshold;
disp('Abnormalities detected by T²:');
disp(abnormal_T2);
disp('Abnormalities detected by SPE:');
disp(abnormal_SPE);
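With only 15 samples, a chi-square limit for T² is a large-sample approximation. A commonly used alternative for small n is an F-distribution-based limit for new observations scored against the fitted model. The sketch below compares the two; it assumes k retained components (k = 3 is a hypothetical choice for illustration only) and requires Statistics and Machine Learning Toolbox for `finv` and `chi2inv`.

```matlab
n = 15;         % number of training samples, as in the question
k = 3;          % retained components; hypothetical value for illustration
alpha = 0.05;   % significance level (95% confidence)
% F-based T² limit for a new observation scored against an n-sample model
T2_limit_F = k * (n^2 - 1) / (n * (n - k)) * finv(1 - alpha, k, n - k);
% Large-sample chi-square limit, for comparison
T2_limit_chi2 = chi2inv(1 - alpha, k);
fprintf('F-based limit: %.2f, chi-square limit: %.2f\n', T2_limit_F, T2_limit_chi2);
```

For small n the F-based limit is the larger of the two, so the chi-square limit tends to flag more points. Note that this form of the limit is intended for new observations rather than for the samples used to fit the PCA model.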