How to assign numbers to categorical values in a dataset?

pp on 18 May 2020
Commented: Adam Danz on 18 May 2020
I'm preparing a dataset for machine learning. The dataset contains a column named "Holiday" with more than a million rows. It is categorical and contains 4 unique values: 0 (as a string), a, b and c.
I want to assign the value 0 to 0 and the value 1 to the rest of them (a, b and c). How do I do that? Is there a ready-made function?

Accepted Answer

Adam Danz on 18 May 2020
Edited: Adam Danz on 18 May 2020
If you want to return logical values,
dummyVars = Holiday ~= '0'; % Holiday is categorical
If you want to return numeric (double) values,
dummyVars = double(Holiday ~= '0'); % Holiday is categorical
Note that any value of Holiday that doesn't equal '0' will be assigned a value of 1.
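As a quick check of the comparison approach, here is a minimal sketch on a five-row stand-in for the real column. The sample data and the variable names isHoliday and holidayFlag are made up for illustration; only the `~=` comparison and the double() conversion come from the answer.

```matlab
% Hypothetical sample data standing in for the real Holiday column.
Holiday = categorical({'0'; 'a'; '0'; 'b'; 'c'});

% Compare every element against the category '0'.
isHoliday = Holiday ~= '0';        % logical column vector

% Convert to double if the learning code expects numeric input.
holidayFlag = double(isHoliday);

disp(holidayFlag.')                % 0 1 0 1 1
```

If the column may contain <undefined> entries, it is worth checking how they should be encoded: as far as the documented behaviour of categorical comparisons goes, `~=` treats an undefined element as "not equal", so it would also be flagged as 1.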
4 Comments
pp on 18 May 2020
Thanks! That did the job. Is it possible to extend this so that we can assign other numbers to a, b and c? Let's say 1, 2 and 3?
Adam Danz on 18 May 2020
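In that case, you can use
[groupNum, groupID] = findgroups(Holiday); % groupNum: group number of each row, groupID: the unique values
or
[groupNum, groupID] = grp2idx(Holiday); % requires the Statistics and Machine Learning Toolbox
Both functions number the groups starting at 1, so subtract 1 from groupNum if '0' should map to 0.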
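To get exactly 0 for '0' and 1, 2, 3 for a, b, c, the group numbers need a shift, because findgroups counts from 1. A minimal sketch, assuming the categories are in their default sorted order ('0', a, b, c); the sample data and the names codes, vals and codes2 are hypothetical:

```matlab
% Hypothetical sample data standing in for the real Holiday column.
Holiday = categorical({'0'; 'a'; '0'; 'b'; 'c'});

% findgroups numbers the groups 1..4 in category order: '0'->1, a->2, b->3, c->4.
[groupNum, groupVals] = findgroups(Holiday);

% Shift by one so that '0'->0, a->1, b->2, c->3.
codes = groupNum - 1;

% Alternative: an explicit lookup table, one entry per category, in the
% order returned by categories(Holiday). Change vals to use other numbers.
vals = [0; 1; 2; 3];
codes2 = vals(double(Holiday));    % double() returns each element's category index

disp(codes.')                      % 0 1 0 2 3
```

The lookup-table variant is handy when the target numbers do not follow the category order. It assumes there are no <undefined> entries, since those have no valid category index.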

Release

R2019b
