function with multiple datasets

8 visualizaciones (últimos 30 días)
Chriss
Chriss el 29 de Mayo de 2017
Editada: dpb el 29 de Mayo de 2017
I have been asked to make af function which calculates the total and average tax payed in 15 different states. My data is imported from excel and looks like:
1 (5000x4 cell)
'Serialnumber' 'Name' 'Gender' 'State'
'4654194314624' 'Sara ' 'M' [ 14.00]
'4565196234864' 'Michael Sims ' 'M' [ 4.00]
'6548197914565' 'Mazon Watt   ' 'M' [ 7.00]
2 (5000x2 cell)
'Serialnumber' 'Income'
'4567456766023' [ 76500.00]
'2567467775130' [ 90750.00]
3 (16x4 cell):
'State' 'lower tax' 'tax limit' 'higher tax'
'7' [27.00] [ 55900.00] [51.00]
'8' [25.00] [ 55100.00] [45.00]
'9' [26.00] [ 56000.00] [45.00]
'10' [28.00] [ 54000.00] [43.00]
To calculate the average and total tax i need all three files, but i haven't been able to see through it yet. I thought about creating a new cell containing all relevant data. I also tried the following, but that would only give me the total for a single state. Any simple ways to go about it?
for i=2:length(Income)
if strcmp(Income(i,1),Citizens(i,1))
personalIncome = Income{i,2};
region = Citizens{i,4};
end
if Citizens{i,4}==1
personalincome=Income{i,2};
incomesum=incomesum+personalincome;
people=people+Citizens{i,4};
end
for v=2:length(Taxes)
if strcmp(Taxes(v,1),num2str(region))
tax1=Taxes{v,2};
tax2lim=Taxes{v,3};
tax2=Taxes{v,4};
end
end
if incomesum<=Taxes{v,3}
taxesPaid = incomesum*tax1/100;
else
taxesPaid = taxesPaid+((tax2lim*tax1/100)+((personalincome-tax2lim)*tax2/100))
end
tax=incomesum/people*taxesPaid;
totaltax=totaltax+taxesPaid;
  6 comentarios
dpb
dpb el 29 de Mayo de 2017
Ewww...guess should've realized it would have been given the subject and data. :(
Well, didn't provide the total solution in "one swell foop" but tried to make sure Chriss did enough to be able to reproduce on own from some sample (similar to yours albeit taken a step further).
Seems great reluctance here to actually dive into Matlab itself rather than continue to poke at it from around the edges... :)
Chriss
Chriss el 29 de Mayo de 2017
spot on Guillaume. guess i am not the only one having problem with that part of it. And sorry if you are offended dpb, but i only wrote the question after i had tried a dosen of different things (not table though). We have not been taught about table and as far as we have heard, use global for almost everything :-)

Iniciar sesión para comentar.

Respuestas (1)

dpb
dpb el 29 de Mayo de 2017
Editada: dpb el 29 de Mayo de 2017
Repeating with only some minor changes from the previous to get you started...begin with the script to read the files and create the tables--this is what I showed earlier excepting I'd just created some local copies of text files that copied some of your data. The names in the spreadsheet may not be very handy to use from your earlier posting as some are pretty long and included spaces that'll get munged on by readtable to make acceptable variable names so you'll probably want to clean those up some to make easier to use.
CIT=readtable('Citizens.xlsx');
INC=readtable('Income.xlsx');
TAX=read('Taxes.xlsx');
"readtable creates one variable in T for each column in the file and reads variable names from the first row of the file. By default, the variables created are double if the entire column is numeric, or cell arrays of strings if any element in a column is not numeric."
Hence, the SSN, state/region and the tax data are all going to be numeric while the remaining will be cell strings. But that should be almost all it takes to make the table. You may find the SSN is displayed as floating point value given its magnitude; that can be solved by storing as int32 or categorical as chose in the earlier example.
Now, we've already discussed the desirability of merging the income data with that for the indivudals...
CIT_INC=join(CIT,INC); % that was easy, wasn't it???!!!
Now what do we need to compute? There's the tax for each individual based on income and region. Showed a function to do that on an individual basis, how to do it for all and then by groups is where the other functions come in.
You can either use those builtin features or use the routines as I wrote earlier that work on a given SSN and vectorize them to work over an array of SSN and then pass that based on the state. Either is pretty straightforward and requires only a minimal amount of code and the desired subsets can be selected by the '==' operator as I demonstrated before with either the table or the use of the array as long as you don't try to mix string and numeric representation of the same values (primarily the state/region was the culprit).
  8 comentarios
dpb
dpb el 29 de Mayo de 2017
I'd wager without you can rewrite the functionality of all your code in about 10-15 lines at most. You've got cells; that's where your problems start because you haven't normalized them to be consistent.
All it takes to eliminate the globals is to pass the tables as I've shown already.
Chriss
Chriss el 29 de Mayo de 2017
i'd wish i could rewrite 10 function files in 10 minutes - and even to table! I have never really used the table format before, and am still quite new to MATLAB so to rewrite it would probably take me all night. and i have already not slept for 40 hours.

Iniciar sesión para comentar.

Categorías

Más información sobre Data Type Identification en Help Center y File Exchange.

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by