removing specified data from variable
I have a 100x2 dataset I am working with. I also have 2 random distributions of data.
I want to modify my original dataset in the following way:
- randomly generate a number from random distribution 1 and keep this many rows of the data.
- randomly generate a number from random distribution 2 and remove this many rows of the data
I want to do this for the full length of the dataset.
Can anybody help me define this?
time = [1:1:100];
var = rand(100,1);
data = [time' var]; %dataset
dist1 = 1 + (20-1).*rand(100,1); %random distribution 1
dist2 = 10 + (30-10).*rand(100,1); %random distribution 2
position1 = randi(length(dist1));
card1 = dist1(position1); % value drawn from distribution 1
position2 = randi(length(dist2));
card2 = dist2(position2); % value drawn from distribution 2
8 comments
Davide Masiello
7 Nov 2022
Is there a maximum amount of rows that are kept or removed each time?
Answers (2)
Davide Masiello
7 Nov 2022
Edited: Davide Masiello on 7 Nov 2022
I think the following code is a simpler way of achieving your task, but it does not implement the "pulling a number from a random distribution", because honestly I still do not understand what that would be for.
Instead, at each iteration it generates a random integer (capped at the number of rows not yet processed), which becomes the number of rows to either keep or remove.
See below the code with printed text describing the action at each iteration.
data = [(1:100)' rand(100,1)] % Dataset
datanew = [];
distribution1 = randi(100,100,1); % Array of random integers (to be replaced with gaussian distribution later)
distribution2 = randi(100,100,1); % Array of random integers (to be replaced with gaussian distribution later)
index = 0;
iter = 1;
while index < size(data,1)
    fprintf('This is iteration number %d.\n',iter)
    if isequal(mod(iter,2),1)
        increment = min(distribution1(randi(length(distribution1),1,1)),size(data,1)-index);
        fprintf('The random number is %d.\n',increment)
        fprintf('We keep the rows between %d and %d.\n',[index+1,index+increment])
        datanew = [datanew;data(index+1:index+increment,:)];
    else
        increment = min(distribution2(randi(length(distribution2),1,1)),size(data,1)-index);
        fprintf('The random number is %d.\n',increment)
        fprintf('The rows between %d and %d do not get added to the new dataset.\n',[index+1,index+increment])
    end
    iter = iter+1;
    index = index+increment;
end
size(data)
size(datanew)
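As a variation on the loop above, the same keep/remove alternation can be recorded in a logical mask instead of growing datanew at every iteration. This is only a minimal sketch, not the code from the answer: the names keepMask, keepTurn, pool and removed are introduced here for illustration, the pools are assumed to hold integers up to 20, and the fixed seed is there only to make the run repeatable.

```matlab
rng(0);                                  % fixed seed, only so the sketch is repeatable
data = [(1:100)' rand(100,1)];           % same 100x2 dataset as above
distribution1 = randi(20,100,1);         % assumed pool of "keep" run lengths
distribution2 = randi(20,100,1);         % assumed pool of "remove" run lengths

nRows = size(data,1);
keepMask = false(nRows,1);               % true marks a row that is kept
index = 0;                               % last row already processed
keepTurn = true;                         % start with a "keep" run, as in the loop above

while index < nRows
    if keepTurn
        pool = distribution1;
    else
        pool = distribution2;
    end
    % draw a run length from the pool and cap it at the rows remaining
    increment = min(pool(randi(numel(pool))), nRows - index);
    if keepTurn
        keepMask(index+1:index+increment) = true;
    end
    index = index + increment;
    keepTurn = ~keepTurn;                % alternate between keep and remove
end

datanew = data(keepMask,:);              % kept rows, in their original order
removed = data(~keepMask,:);             % rows that were dropped
```

Keeping the mask has the side benefit that the removed rows stay available (data(~keepMask,:)), which can help when checking that the gaps look the way you expect.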
5 comments
Davide Masiello
7 Nov 2022
But why do you first generate a random distribution and then randomly take a value from it?
How is this different from just generating a random number directly?
That is, how is this
distribution1 = randi(10,100,1); % array of 100 random integers (max value 10)
a = distribution1(randi(100,1,1)) % integer randomly pulled from distribution 1
different from this
a = randi(10,1,1) % random integer between 1 and 10
Davide Masiello
7 Nov 2022
Ok I see now, sorry I must have skipped that part.
I have modified my answer so that the number of rows to keep/remove is pulled randomly from the vectors which I called distribution1 and distribution2.
These are just random vectors; you can replace them with Gaussian distributions at your discretion.
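If the vectors are swapped for Gaussian ones, the drawn values still have to be positive whole numbers, because they are used as row counts for indexing. A minimal sketch of one way to do that is shown here; the means and standard deviations (mu1, sigma1, mu2, sigma2) are hypothetical placeholders, not values from this thread, and the fixed seed is only for repeatability.

```matlab
rng(1);                                   % fixed seed, only for repeatability
mu1 = 10; sigma1 = 3;                     % hypothetical mean/std for "keep" run lengths
mu2 = 20; sigma2 = 5;                     % hypothetical mean/std for "remove" run lengths

% randn draws from N(0,1); scale and shift it, round to whole rows,
% and clamp at 1 so every value is a valid, non-empty run length
distribution1 = max(1, round(mu1 + sigma1*randn(100,1)));
distribution2 = max(1, round(mu2 + sigma2*randn(100,1)));

% a single draw, in the same form used inside the loop
increment = distribution1(randi(numel(distribution1)));
```

Note that rounding and clamping at 1 distort the lower tail of the Gaussian slightly; whether that matters depends on how close the mean is to zero relative to the standard deviation.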