Do predictive variables need to be standardised before applying PCA in classification learner?

Question

Impala on 16 Apr 2024 at 12:47

Direct link to this question

https://es.mathworks.com/matlabcentral/answers/2107486-do-predictive-variables-need-to-be-standardised-before-applying-pca-in-classification-learner

Commented: Taylor on 17 Apr 2024 at 13:25

Hi,

I have a feature table and I'm looking to apply PCA in Classification Learner. Do I need to standardise my feature table before I import the data into Classification Learner to apply PCA and train models?

The reason I ask is that when I perform PCA outside the app, e.g. in the Live Editor, it is usually recommended to standardise the data with the zscore function first and then apply PCA to generate Pareto charts, biplots, etc. However, Classification Learner doesn't seem to do this. Additionally, in the fisheriris examples in the help files, the data does not appear to be standardised before being imported into Classification Learner (https://uk.mathworks.com/help/stats/feature-selection-and-feature-transformation.html).

Any guidance will be really appreciated!

Answer 1

Taylor on 16 Apr 2024 at 16:04

Direct link to this answer

https://es.mathworks.com/matlabcentral/answers/2107486-do-predictive-variables-need-to-be-standardised-before-applying-pca-in-classification-learner#answer_1442386

Yes, it is generally recommended to standardize data before applying PCA, especially when the variables in your dataset are measured on different scales or have different units of measurement.

Standardization transforms the data to have a mean of zero and a standard deviation of one. This process ensures that each variable contributes equally to the analysis and prevents variables with larger scales from dominating the variance explained by the principal components.
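To make the effect of scale concrete, here is a minimal sketch using synthetic data (the matrix `X`, the seed, and the factor of 1000 are illustrative assumptions, not from the question). It compares the percentage of variance explained by each principal component before and after applying zscore.

```matlab
% Hypothetical data: two independent features on very different scales
rng(0);                                   % fixed seed so the run is repeatable
X = [randn(200,1), 1000*randn(200,1)];

% PCA on the raw data: the large-scale column dominates the first component
[~, ~, ~, ~, explainedRaw] = pca(X);

% zscore gives every column a mean of 0 and a standard deviation of 1
Z = zscore(X);
[~, ~, ~, ~, explainedStd] = pca(Z);

disp(explainedRaw.')   % percent variance per component, raw data
disp(explainedStd.')   % percent variance per component, standardized data
```

On the raw data the first component should capture almost all of the variance simply because the second column has a much larger scale; after standardization the split should be far more even.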

Skipping standardization before PCA can be appropriate in some cases. The fisheriris example does so because its variables share the same units and have similar scales. If all variables in your dataset are measured in the same units and have approximately the same scale and variance, standardization might not be necessary. In such cases the original scales are meaningful, and preserving them can matter for interpretation. For example, if all variables are lengths in centimeters, standardizing would remove this common scale and could make the results harder to interpret.
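One way to check whether your own data falls into this category is to look at the per-variable spread and compare PCA results with and without standardization. The sketch below does this for fisheriris; whether the spreads are "close enough" is a judgment call, and the variable names `colStd`, `explainedRaw`, and `explainedStd` are only illustrative.

```matlab
load fisheriris                           % provides meas (150x4) and species

% Per-variable spread: all four columns are lengths in centimeters
colStd = std(meas);
disp(colStd)

% Percent variance explained, without and with standardization
[~, ~, ~, ~, explainedRaw] = pca(meas);
[~, ~, ~, ~, explainedStd] = pca(zscore(meas));
disp([explainedRaw, explainedStd])        % columns: raw, standardized
```

If the two columns of the final output tell a similar story, standardization is unlikely to change your conclusions much; if they differ sharply, scale is driving the components.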

2 comments

Impala on 16 Apr 2024 at 21:07

Hi Taylor,

Thank you for your explanation. So if I standardise my feature table using zscore and create a new feature table, say featTable_stand, is that the table I then use in Classification Learner to train models?

Thank you!

Taylor on 17 Apr 2024 at 13:25

Correct. If your data requires standardizing, that is an appropriate workflow.
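A minimal sketch of that workflow, assuming a table with numeric predictors and a single response column. The response name `Label` and the table built from fisheriris are hypothetical stand-ins for your own data; `featTable_stand` follows the name used in the comment above.

```matlab
load fisheriris
% Hypothetical feature table: four numeric predictors plus a response column
featTable = array2table(meas, 'VariableNames', ...
    {'SepalLength','SepalWidth','PetalLength','PetalWidth'});
featTable.Label = categorical(species);

% Standardize only the predictor columns; leave the response untouched
predictorNames = setdiff(featTable.Properties.VariableNames, {'Label'}, 'stable');
[Z, mu, sigma] = zscore(featTable{:, predictorNames});
featTable_stand = featTable;
featTable_stand{:, predictorNames} = Z;

% Uncomment to open the app with the standardized table and the response column
% classificationLearner(featTable_stand, 'Label')
```

Keeping `mu` and `sigma` is useful if you later predict on new data: the new predictors would need the same scaling, for example `(Xnew - mu) ./ sigma`, before being passed to a model trained on `featTable_stand`.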
