# Distribute a sum using binomial distribution

Vorticella aequilata on 2 Dec 2021
Edited: David Goodmanson on 2 Dec 2021
I have a matrix, say 'a' (3x6). I want to split the sum of each row into a new matrix with two columns (e.g. 'distribute1') according to the binomial distribution. For this I use binornd, but I want the two generated values to add up to the row sum. For example, if a row sums to 5, the values generated by binornd should be 0 and 5, 1 and 4, or 2 and 3.
Is it correct to use binornd? Should I use binopdf?
If it helps, I am trying to write a script based on a paper. This is what it says: "the total RNAs in them [in my case it is equal to the sum of each row of 'a'] were distributed into two compartments [in my case 'distribute1', 'distribute2' or 'distribute3'] according to the binomial distribution" 10.1038/s41559-018-0650-z
What is the logic for choosing the values of the input arguments n (number of trials) and p (probability of success in each trial)? For now, I use the row sum as n and 0.5 as p, but I'm not sure that is correct (these parameters are not mentioned in the paper).
Here is my code:
a=[1 2 3 4 5 6; 7 8 9 10 11 12; 13 14 15 16 17 18]
suma=sum(a,2)
distribute1=binornd(suma(1,:),0.5,[1 2])
distribute2=binornd(suma(2,:),0.5,[1 2])
distribute3=binornd(suma(3,:),0.5,[1 2])
Thank you!

David Goodmanson on 2 Dec 2021
Edited: David Goodmanson on 2 Dec 2021
Hi Vorticella,
In the code you are using, the two numbers are chosen independently and usually don't add up to the row sum. If a given row sums to n, then I believe the required result is
a = binornd(n,.5,1)
result = [a n-a]
Here binornd picks the number of times the left column is chosen, and the remainder are put into the right column. You still have to decide on the probability p, and in the absence of any other information, p = .5 is more or less implied.
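The same idea can be applied to all rows of the matrix from the question at once. This is only a sketch: it assumes p = 0.5 (the paper does not state it), and the names `first` and `distribute` are made up for illustration. Like the original code, it needs the Statistics and Machine Learning Toolbox for binornd.

```matlab
a = [1 2 3 4 5 6; 7 8 9 10 11 12; 13 14 15 16 17 18];
n = sum(a, 2);                     % row sums, a 3x1 column: [21; 57; 93]
p = 0.5;                           % assumed; not given in the paper

% One binomial draw per row: how many of the n items land in column 1.
first = binornd(n, p);             % same size as n (3x1)

% The remainder goes to column 2, so every row adds up to its n.
distribute = [first, n - first];   % 3x2 matrix

% Sanity check: each row of distribute sums to the matching row sum of a.
assert(isequal(sum(distribute, 2), n))
```

Row k of `distribute` then plays the role of 'distribute1', 'distribute2' and 'distribute3' in the question.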

Jeff Miller on 2 Dec 2021
If I understand correctly,
• it is fine to use binornd with n = row total. It sounds like that is what was done in the paper.
• just use binornd to get the number in the first compartment for each row; the number in the second compartment is the row n minus the number in the first compartment.
• it makes sense to use p = 0.5 if you expect both compartments to get the same number in the long run. If that doesn't make sense in your situation, you might make p higher/lower to get more/less in the first compartment.
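To see what changing p does, here is a rough sketch that repeats the split many times for a single row total and looks at the averages. The values n = 21 and p = 0.3 and the variable names are only for illustration; nothing here comes from the paper.

```matlab
n = 21;          % one row total (first row sum of 'a' in the question)
p = 0.3;         % illustrative value: less than half goes to compartment 1
nDraws = 1e5;    % number of repeated splits

first  = binornd(n, p, nDraws, 1);   % count in the first compartment
second = n - first;                  % remainder in the second compartment

% Every single split still adds up to n.
assert(all(first + second == n))

% Long-run averages are close to n*p and n*(1-p), i.e. 6.3 and 14.7 here.
avgFirst  = mean(first)
avgSecond = mean(second)
```

With p = 0.5 both averages come out near n/2, which is the "same number in the long run" case.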
Hope that helps