Extracting data from pdf files
joseph Frank on 19 Apr 2014
Answered: Christopher Creutzig on 27 Apr 2021
Hi,
I have around 300 PDF files with 19 pages each. I want to extract part of a table on page 4 of each file in order to build a research data set. Is it possible to do so using MATLAB? If so, which toolboxes and functions do I need? I have MATLAB R2013a.
Accepted Answer
Kristian Gennaci on 21 Apr 2014
Hi Joseph,
Have you tried using this File Exchange submission?
This seems like the most promising solution. Alternatively, if you could convert the tables to an Excel spreadsheet or CSV format, they can then easily be parsed using MATLAB's Excel/CSV functions.
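As a rough sketch of the CSV route, assuming the tables have already been exported to CSV by some external tool: csvread can skip a header row and return the numeric block. The file contents and column names below are made up so that the snippet runs on its own.

```matlab
% Create a small stand-in CSV (hypothetical contents) so the sketch is self-contained.
csvFile = [tempname '.csv'];
fid = fopen(csvFile, 'w');
fprintf(fid, 'rate,count\n1.5,20\n2.25,30\n');
fclose(fid);

% Skip the header line (row offset 1, column offset 0) and read the numeric block.
data = csvread(csvFile, 1, 0);

% Clean up the temporary file.
delete(csvFile);
```

csvread is used here because it is available in R2013a; newer releases recommend readmatrix or readtable instead.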
I'll let you know if I find any other solutions.
Best,
Kristian
More Answers (1)
Christopher Creutzig on 27 Apr 2021
JFTR, since R2017b, extractFileText('filename.pdf','Pages',4) from Text Analytics Toolbox gives you the text on ("physical") page 4 of the PDF, from which you can then extract the parts you need with string operations (extractBetween, regexp, etc.).
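A minimal sketch of what those string operations might look like. The sample text, the marker strings "Table 2" and "Total", and the two-column layout are all assumptions for illustration; a real page will need its own markers and pattern. The sample string stands in for the output of extractFileText so the snippet runs without a PDF.

```matlab
% Hypothetical sample standing in for the text of page 4. For a real file use:
%   pageText = extractFileText("filename.pdf", "Pages", 4);  % Text Analytics Toolbox, R2017b+
pageText = "Header text" + newline + "Table 2" + newline + ...
    "Alpha 1.5 20" + newline + "Beta 2.25 30" + newline + "Total 3.75 50";

% Keep only the text between two marker strings (the markers are assumptions).
tableText = extractBetween(pageText, "Table 2", "Total");

% Split into lines and drop the empty ones.
rows = strtrim(splitlines(tableText));
rows = rows(strlength(rows) > 0);

% Pull the numeric tokens out of each row (assumes two numbers per row).
values = zeros(numel(rows), 2);
for k = 1:numel(rows)
    tokens = regexp(rows(k), '-?\d+(\.\d+)?', 'match');
    values(k, :) = str2double(tokens);
end
```

For the 300 files, the same steps could be wrapped in a loop over the result of dir('*.pdf'), appending each file's values to one data set.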