How can I restrict data arrays to do a linear regression between 2 points?
Lexie Wilson
on 19 Jul 2016
Commented: Star Strider
on 20 Jul 2016
I would like to do a linear regression using polyfit, but only on part of the dataset. I have 2 arrays, Wavelength (x axis) and Flux (y axis). I would like to regress the data in the range Wavelength > 1515 & Wavelength < 1750, and then find the slope of the trend line through the fluxes (y values) in this range. I do not know how to restrict my data set in this way (without importing the data again!). I tried rescaling my axes, but the polyfit function still considered all values in my dataset.
Here is what I have so far:
if true
% code
%%Initialize variables.
filename = '/Users/lexiwilson/Documents/SURF/DataIrradiance/DEC/WSD_26DEC/WAIS1226201500166.asd.irr.pco.txt';
delimiter = {'\t',' '};
startRow = 39;
datetime = strcat('/Users/lexiwilson/Documents/SURF/DataIrradiance/DEC/WSD_26DEC/','122615_','00:33:56');
%%Read columns of data as strings:
% For more information, see the TEXTSCAN documentation.
formatSpec = '%s%s%[^\n\r]';
%%Open the text file.
fileID = fopen(filename,'r');
%%Read columns of data according to format string.
% This call is based on the structure of the file used to generate this
% code. If an error occurs for a different file, try regenerating the code
% from the Import Tool.
textscan(fileID, '%[^\n\r]', startRow-1, 'ReturnOnError', false);
dataArray = textscan(fileID, formatSpec, 'Delimiter', delimiter, 'MultipleDelimsAsOne', true, 'ReturnOnError', false);
%%Close the text file.
fclose(fileID);
%%Convert the contents of columns containing numeric strings to numbers.
% Replace non-numeric strings with NaN.
raw = [dataArray{:,1:end-1}];
numericData = NaN(size(dataArray{1},1),size(dataArray,2));
for col=[1,2]
% Converts strings in the input cell array to numbers. Replaced non-numeric
% strings with NaN.
rawData = dataArray{col};
for row=1:size(rawData, 1);
% Create a regular expression to detect and remove non-numeric prefixes and
% suffixes.
regexstr = '(?<prefix>.*?)(?<numbers>([-]*(\d+[\,]*)+[\.]{0,1}\d*[eEdD]{0,1}[-+]*\d*[i]{0,1})|([-]*(\d+[\,]*)*[\.]{1,1}\d+[eEdD]{0,1}[-+]*\d*[i]{0,1}))(?<suffix>.*)';
try
result = regexp(rawData{row}, regexstr, 'names');
numbers = result.numbers;
% Detected commas in non-thousand locations.
invalidThousandsSeparator = false;
if any(numbers==',');
thousandsRegExp = '^\d+?(\,\d{3})*\.{0,1}\d*$';
if isempty(regexp(numbers, thousandsRegExp, 'once'));
numbers = NaN;
invalidThousandsSeparator = true;
end
end
% Convert numeric strings to numbers.
if ~invalidThousandsSeparator;
numbers = textscan(strrep(numbers, ',', ''), '%f');
numericData(row, col) = numbers{1};
raw{row, col} = numbers{1};
end
catch me
end
end
end
%%Replace non-numeric cells with NaN
R = cellfun(@(x) ~isnumeric(x) && ~islogical(x),raw); % Find non-numeric cells
raw(R) = {NaN}; % Replace non-numeric cells
%%Allocate imported array to column variable names
Wavelength = cell2mat(raw(:, 1));
Flux = cell2mat(raw(:, 2));
%%Plot wavelength vs irradiance
figure()
plot(Wavelength, Flux);
title(filename);
xlabel('Wavelength (nm)');
ylabel('Irradiance (W/m^2)');
axis([350,2200,-0.5,2]);
%zoom to 1.6 micron window
figure()
plot(Wavelength, Flux);
title(filename);
xlabel('Wavelength (nm)');
ylabel('Irradiance (W/m^2)');
axis([1374,1838,-0.05,0.15]);
Ystartindx = find(Wavelength == 1515); %index of wavelength = 1515nm
Ystart = Flux(Ystartindx); %corresponding flux
Yendindx = find(Wavelength == 1750); %index of wavelength = 1750nm
Yend = Flux(Yendindx);%corresponding flux
hold on;
%make linear fit and print slope to console
waverange = find(Wavelength > 1515 & Wavelength < 1750);
fluxrange = find(Flux > Ystart & Flux < Yend);
P = polyfit(waverange,fluxrange,1);
fit = P(1)*waverange + P(2);
plot(waverange,fit,'k');
disp(P(1)); %print slope to console
%save plot in directory as jpeg
%saveas(gcf,datetime,'jpeg');
%%Clear temporary variables
clearvars filename delimiter startRow formatSpec fileID dataArray ans raw numericData col rawData row regexstr result numbers invalidThousandsSeparator thousandsRegExp me R;
end
The errors I get say that my arrays waverange & fluxrange are not the same size (which they aren't). How can I make them the same size, and restrict the X & Y values to a range in the middle of my data set?
Accepted Answer
Star Strider
on 20 Jul 2016
The waverange seems to be defining your data range, so use it for both, and use polyval to evaluate the fit:
%make linear fit and print slope to console
waverange = find(Wavelength > 1515 & Wavelength < 1750);
P = polyfit(Wavelength(waverange),Flux(waverange),1);
fit = polyval(P, Wavelength(waverange));
See if that works.
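For reference, here is a minimal self-contained sketch of the same idea using a logical mask instead of find. The synthetic Wavelength and Flux values, and the names inRange, slope and fitLine, are assumptions made for illustration and are not from the original post. The isfinite test is an optional extra, in case the import left NaN rows in Flux.

```matlab
% Synthetic stand-in data (hypothetical): replace with the imported columns.
Wavelength = (350:2200)';
Flux = 0.1 - 2e-5*(Wavelength - 1515);   % known slope of -2e-5, for checking

% One logical mask selects the fitting window for BOTH arrays, so the
% x and y samples passed to polyfit always have the same length.
% isfinite drops any NaN rows that the text import may have produced.
inRange = Wavelength > 1515 & Wavelength < 1750 & isfinite(Flux);

% First-degree fit on the restricted data: P(1) is the slope, P(2) the intercept.
P = polyfit(Wavelength(inRange), Flux(inRange), 1);
slope = P(1);

% Evaluate the fitted line at the same wavelengths.
fitLine = polyval(P, Wavelength(inRange));

disp(slope);   % print slope to console
```

To overlay the line on the zoomed figure, plot it against the restricted x values, for example plot(Wavelength(inRange), fitLine, 'k') after hold on. The original attempt failed because fluxrange was built from a separate condition on Flux, so it selected a different number of points than waverange, and both held indices rather than data values.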