Hi All, I appreciate your help with this. Sorry for the tag if you are busy please ignore it.
Best match location data - two tables
    5 visualizaciones (últimos 30 días)
  
       Mostrar comentarios más antiguos
    
    AHMED FAKHRI
 el 9 de Jun. de 2021
  
Hi 
I have location data (longitude and latitude) for certain sites and I want to compare these two with a national databse to know the local authority of these sites + their postcodes. 
I have two tables: 
Sitename	                Latitude	       Longitude
14182-Pembroke Refinery	    51.686081	       -5.0271217
8059-New Rugby	            52.376283	       -1.2860374
8037-Ketton	                52.636537	      -0.54737666
And
Postcode	Local authority	  Longitude	   Latitude
CV21 2RY	Rugby	          -1.288555	52.376856
TR26 2JQ	Cornwall	      -5.490944	50.207428
SY10 7RN	Shropshire	     -3.067703	52.917641
SA71 5SJ	Pembrokeshire	  -5.02713	51.686093
PE9 3SX	    Rutland          -0.5462	52.636828
By best matching both the latitude and longitude data from the first table with the second, the postcode and local authority will be known. 
please note that the longitude and latitude are NOT exactly matched but will differ slightly hence I want the best match possible based on both info. 
Please see the Excel sheet which contain all the data.
I appreciate your help with this. 
Many thanks 
Respuesta aceptada
  Image Analyst
      
      
 el 10 de Jun. de 2021
        I'm not one of those 3 talented individuals you asked for, but maybe you'll like my answer.  See if it's what you want.  If so, please accept the answer.  If not, tell me what needs to be different.
%---------------------------------------------------------------------------------
% Create reference Look Up Table
% Postcode	Local authority	  Longitude	   Latitude
% CV21 2RY	Rugby	          -1.288555	52.376856
% TR26 2JQ	Cornwall	      -5.490944	50.207428
% SY10 7RN	Shropshire	     -3.067703	52.917641
% SA71 5SJ	Pembrokeshire	  -5.02713	51.686093
% PE9 3SX   Rutland          -0.5462	52.636828
lookUpTable = table('Size', [5, 4], ...
	'Variabletypes', {'string', 'string', 'double', 'double'}, ...
	'VariableNames', {'Postcode', 'LocalAuthority', 'Longitude', 'Latitude'});
lookUpTable.Postcode = ["CV21 2RY"; "TR26 2JQ"; "SY10 7RN"; "SA71 5SJ"; "PE9 3SX"];
lookUpTable.LocalAuthority = ["Rugby"; "Cornwall"; "Shropshire"; "Pembrokeshire"; "Rutland"];
lookUpTable.Longitude = [-1.288555; -5.490944; -3.067703; -5.02713; -0.5462];
lookUpTable.Latitude = [52.376856; 50.207428; 52.917641; 51.686093; 52.636828]
%---------------------------------------------------------------------------------
% Create a table of unknown sites that we will try to find in the reference table.
% Sitename	                Latitude	       Longitude
% 14182-Pembroke Refinery	    51.686081	       -5.0271217
% 8059-New Rugby	            52.376283	       -1.2860374
% 8037-Ketton	                52.636537	      -0.54737666
tableOfUnknowns = table('Size', [3, 3], ...
	'Variabletypes', {'string', 'double', 'double'}, ...
	'VariableNames', {'Sitename', 'Longitude', 'Latitude'});
tableOfUnknowns.Sitename = ["14182-Pembroke Refinery"; "8059-New Rugby"; "8037-Ketton"];
tableOfUnknowns.Longitude = [-5.0271217; -1.2860374; -0.54737666];
tableOfUnknowns.Latitude = [51.686081; 52.376283; 52.636537]
%---------------------------------------------------------------------------------
% Run down the table of unknowns seeing if it's reasonably close
% to any row of the reference look up table.
rows = height(tableOfUnknowns);
for row = 1 : rows
	fprintf('Processing row #%d of %d.\n', row, rows);
	% Find the distance (radius) of this row to all the reference rows.
	deltaLat = lookUpTable.Latitude - tableOfUnknowns.Latitude(row);
	deltaLon = lookUpTable.Longitude - tableOfUnknowns.Longitude(row);
	distances = sqrt(deltaLat .^ 2 + deltaLon .^ 2);
	% Find the min distance
	[minDistance, rowOfMin] = min(distances);
	closestPostcode = lookUpTable.Postcode(rowOfMin);
	closestLocalAuthority = lookUpTable.LocalAuthority(rowOfMin);
	fprintf('    The closest Local Authority to %s is %s, which has Postcode %s.\n    They are separated by %f degrees.\n', ...
		tableOfUnknowns.Sitename(row), closestLocalAuthority, closestPostcode, minDistance);
end
You'll see
lookUpTable =
  5×4 table
     Postcode     LocalAuthority     Longitude    Latitude 
    __________    _______________    _________    _________
    "CV21 2RY"    "Rugby"            -1.288555    52.376856
    "TR26 2JQ"    "Cornwall"         -5.490944    50.207428
    "SY10 7RN"    "Shropshire"       -3.067703    52.917641
    "SA71 5SJ"    "Pembrokeshire"     -5.02713    51.686093
    "PE9 3SX"     "Rutland"            -0.5462    52.636828
tableOfUnknowns =
  3×3 table
            Sitename              Longitude     Latitude 
    _________________________    ___________    _________
    "14182-Pembroke Refinery"     -5.0271217    51.686081
    "8059-New Rugby"              -1.2860374    52.376283
    "8037-Ketton"                -0.54737666    52.636537
Processing row #1 of 3.
    The closest Local Authority to 14182-Pembroke Refinery is Pembrokeshire, which has Postcode SA71 5SJ.
    They are separated by 0.000015 degrees.
Processing row #2 of 3.
    The closest Local Authority to 8059-New Rugby is Rugby, which has Postcode CV21 2RY.
    They are separated by 0.002582 degrees.
Processing row #3 of 3.
    The closest Local Authority to 8037-Ketton is Rutland, which has Postcode PE9 3SX.
    They are separated by 0.001212 degrees.
4 comentarios
  Adam Danz
    
      
 el 10 de Jun. de 2021
				
      Editada: Adam Danz
    
      
 el 10 de Jun. de 2021
  
			My solution using pdist2 won't work with 1.7M rows.
However, it's technically incorrect to use euclidean distance (in ImageAnalyst's answer) with lat/lon which are spherical coordinates. The coordinates should be converted to Cartesian before computing euclidean distance or the distance should be computed in spherical units (Haversine formula, for example). However since you're using relative distance and not absolute distance you may not run into any problems unless a group of coordinates are close enough to each other to be affected by the Cartesian assumption.
  Image Analyst
      
      
 el 10 de Jun. de 2021
				I agree with Adam.  I know computing distance that way is not the most precise way (perhaps the Mapping Toolbox has a more accurate formula).  However I think the distances are small enough (close together on the globe), and actually I assumed the main need was not to find the distances super accurately, but to find the closest Local Authority.  Since the distances will likely not be close to one another, accuracy doesn't matter for simply finding the closest location.  It will find the closest one regardless if the distances are somewhat inaccurate.
Más respuestas (1)
  Adam Danz
    
      
 el 10 de Jun. de 2021
        I'd use pdist2 to compute the pair-wise distances between your list of coordinates and the list in the database to form a matrix "D" of distanaces (they will all be positive). For each row (or column, depending on how you enter the inputs to pdist2), you'll look for the index of the minimum using [~,minidx]=min(D) or min(D,2).
However, from what I understand of map coordinates, they are not linear so you'll need to convert them to cartesian coordinates using latlon2local i think.  Alternatively, instead of using pdist2 you could use distance to compute distance on the sphere.  
I do not have experience with either method other than using pdist2 with cartesian coordinates.  
If you give it a shot and run into a specific problem I might  be able to give further advice. 
Ver también
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!


