How to read a text or a .c file without comments into a cell array?

Question

0 votes

I have a C file named 'function.c'.

It contains code as well as comments. Comments are marked with '/*' and '*/', with '//', or sometimes with '%'.

I only want to read the code, not the comments.

I have tried:

1)
 fid = fopen('function.c');
 code = textscan(fid, '%s', 'CommentStyle', {'/*', '*/'});

but this only excludes the comments between '/*' and '*/'.

2)
 fid = fopen('function.c');
 code = textscan(fid, '%s', 'CommentStyle', '//');

This excludes only the comments starting with '//'.

Is there a way to exclude both types of comments?

1 comment

dpb on 18 Sep 2013

Edited: dpb on 18 Sep 2013

I don't think so in one go. The best I can think of is to read it with the first option, write the result to a temporary file, and then process that file with the second.

I think textscan will only handle '/*' '*/' pairs if they come first on the line, however, so perhaps you could get the same effect by reading the whole file as a char() array with fread, doing a global substitution with strrep (replacing '/*' with '//'), and then processing that from memory?

Failing that, since one would presume C source files aren't terribly long, read line by line with fgetl and search each line for either marker. This will also let you catch the end-of-line comments, as a side benefit to make up for the extra complexity...

ADDENDUM:

I reviewed the docs for textscan; it looks like it would catch the /* */ sequences correctly. What I recalled as "beginning of line" is actually documented as "beginning of field".
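The line-by-line fgetl idea from the comment above could look roughly like the sketch below. It is only an illustration, not dpb's actual code: the scratch file, the variable names (`inBlock`, `code`) and the demo content are made up here, and it assumes that no comment marker ever appears inside a string literal.

```matlab
% Sketch only: line-by-line comment stripping with fgetl.
% Assumption: comment markers never occur inside string literals.
fname = [tempname '.c'];                 % hypothetical scratch file for the demo
fid = fopen(fname, 'w');
fprintf(fid, '%s\n', 'int a; /* one', 'two */ int b; // tail', 'int c;');
fclose(fid);

code = {};                               % result: one cell per non-empty code line
inBlock = false;                         % true while inside an open /* ... */
fid = fopen(fname, 'r');
tline = fgetl(fid);
while ischar(tline)                      % fgetl returns -1 at end of file
    out = '';
    while ~isempty(tline)
        if inBlock
            k = strfind(tline, '*/');
            if isempty(k), tline = ''; break; end
            tline = tline(k(1)+2:end);   % resume after the closing */
            inBlock = false;
        else
            kb = strfind(tline, '/*');
            kl = strfind(tline, '//');
            if isempty(kb), kb = Inf; end
            if isempty(kl), kl = Inf; end
            if kl(1) < kb(1)             % a // comes first: drop the rest of the line
                out = [out tline(1:kl(1)-1)];
                tline = '';
            elseif kb(1) < Inf           % a /* comes first: keep the text before it
                out = [out tline(1:kb(1)-1)];
                tline = tline(kb(1)+2:end);
                inBlock = true;
            else                         % no comment marker left on this line
                out = [out tline];
                tline = '';
            end
        end
    end
    out = strtrim(out);
    if ~isempty(out), code{end+1, 1} = out; end
    tline = fgetl(fid);
end
fclose(fid);
delete(fname);
```

Because the earliest marker on each line wins, a '/*' hidden behind a '//' is ignored, and a '//' inside an open block comment is skipped together with the block.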


Answer 1

Jan on 19 Sep 2013

Edited: Jan on 19 Sep 2013

0 votes

The problem is not uniquely defined. How can you decide which comment style is used? What will happen for:

this is the contents /* and an intermediate // C++ comment */

Removing the C++ comment first would invalidate the file. The same happens when % comments appear as well. Another example:

this is contents  // Here a C comment starts /*
this is contents  // Here the C comment ends */

Does textscan consider quoted strings?

fprintf("This is not a comment: // %s\n", "hello");

By the way, if you mean Matlab comments, also consider %{ %} and ..., because both start comments just as % does.

Parsing comments is not trivial. The most reliable technique is to let Matlab's editor display the text with syntax highlighting, capture the screen, delete green pixels, use an OCR tool to parse the rest. ;-)
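For the quoted-string case raised above, a small character-by-character scanner is one (more serious) option. The following is a minimal sketch under stated assumptions: only the C/C++ styles '/* */' and '//' are treated as comments, trigraphs and backslash line continuations are ignored, and the names `src`, `out`, `state` and `q` are invented for this example.

```matlab
% Sketch only: a scanner that leaves string and char literals untouched.
% Assumptions: C/C++ comments only, no trigraphs, no line continuations.
src = ['printf("not a comment: // %s\n", "hi"); // real' char(10) ...
       'x = 1; /* gone */ y = 2;' char(10)];

out = '';                 % comment-free text is collected here
state = 'code';           % one of: code, str, line, block
q = '"';                  % quote character that opened the current literal
i = 1;
n = numel(src);
while i <= n
    c = src(i);
    if i < n, nxt = src(i+1); else, nxt = char(0); end
    switch state
        case 'code'
            if c == '/' && nxt == '/'
                state = 'line';  i = i + 1;      % skip the second slash
            elseif c == '/' && nxt == '*'
                state = 'block'; i = i + 1;      % skip the asterisk
            else
                if c == '"' || c == ''''
                    state = 'str'; q = c;        % a literal starts here
                end
                out(end+1) = c;
            end
        case 'str'
            out(end+1) = c;
            if c == '\' && i < n
                out(end+1) = nxt; i = i + 1;     % copy the escaped character verbatim
            elseif c == q
                state = 'code';                  % closing quote reached
            end
        case 'line'
            if c == char(10)                     % a // comment ends at the newline
                state = 'code';
                out(end+1) = c;
            end
        case 'block'
            if c == '*' && nxt == '/'
                state = 'code'; i = i + 1;       % skip the closing slash
            end
    end
    i = i + 1;
end
```

In this demo the '//' inside the format string survives, while the trailing '// real' and the '/* gone */' block are dropped. The result could then be split into a cell array with something like regexp(out, '\n', 'split').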

2 comments

dpb on 19 Sep 2013

Is the Matlab editor reliable for comment parsing in C/C++ code?

Jan on 19 Sep 2013

Edited: Jan on 19 Sep 2013

Yes. This was added between 6.5 and R2008a. The editor even creates C comments for Ctrl-R and removes C and C++ comments for Ctrl-T.

Perhaps even the undocumented SyntaxTextPane can display this, if the code type is set correctly.

[EDITED] Yes, see http://undocumentedmatlab.com/blog/syntax-highlighted-labels-panels/


Answer 2

Cedric on 19 Sep 2013

Edited: Cedric on 19 Sep 2013

1 vote

What about a simple..

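 buffer = fileread( 'function.c' ) ;                      % whole file as one char vector
 buffer = regexprep( buffer, '/\*.*?\*/', '' ) ;          % drop /* ... */ blocks (lazy match, may span lines)
 buffer = regexprep( buffer, '//.*?(\r)?\n', '$1\n' ) ;   % drop // comments, keep the line ending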

which seems to work in standard situations. Of course, handling every situation, including pathological ones, would require more work, but this has the advantage of being simple, and you might not need to handle pathological cases, especially if you wrote these C files yourself, consistently and with rigor.

PS: I tested it with the following content

 NotComment 1 /* Comment */
 // Comment
 /* Comment */ NotComment 2
 NotComment 3 // Comment
 /* Comment // Comment */ NotComment 4
 /* Comment
    Comment // Comment
    Comment */
 NotComment 5
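Since the question asks for a cell array, here is a hedged sketch that runs the two patterns on the test content above and then splits the result into one cell per remaining line. The variable names `lines` and `code` and the final trimming and empty-line filtering are additions made for this illustration, not part of the original answer.

```matlab
% Sketch: apply the two patterns to the test content, then split into cells.
lines = { 'NotComment 1 /* Comment */'
          '// Comment'
          '/* Comment */ NotComment 2'
          'NotComment 3 // Comment'
          '/* Comment // Comment */ NotComment 4'
          '/* Comment'
          '   Comment // Comment'
          '   Comment */'
          'NotComment 5' };
buffer = sprintf('%s\n', lines{:});                    % stands in for fileread('function.c')

buffer = regexprep(buffer, '/\*.*?\*/', '');           % remove /* ... */ blocks first
buffer = regexprep(buffer, '//.*?(\r)?\n', '$1\n');    % then remove // comments

code = strtrim(regexp(buffer, '\r?\n', 'split'));      % one cell per line, trimmed
code = code(~cellfun(@isempty, code))';                % drop lines that are now empty
```

With this input, `code` should end up as a 5-by-1 cell array holding 'NotComment 1' through 'NotComment 5'.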

2 comments

Jan on 20 Sep 2013

@Cedric: Providing your test data is a rock solid definition of the capabilities of your code. While regexp expressions always look like somebody has rolled an angry armadillo over the keyboard, the test data are clear due to their simplicity. +1

Cedric on 20 Sep 2013

Thank you, Jan. Well, providing the test data shows that I am not handling this situation that you listed:

 this is contents  // Here a C comment starts /*
 this is contents  // Here the C comment ends */

We could actually, but I'll see what the OP answers before making patterns more complex.
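One possible way to cover that situation, offered only as a sketch and not as what Cedric had in mind, is to merge both patterns into a single alternation. The regex engine then scans left to right, so whichever comment opener comes first in the text decides how the comment ends. The variable names and the third demo line are made up for the example, and string literals are still not handled.

```matlab
% Sketch: one pass with an alternation, so the first opener in the text wins.
% Assumption: comment markers never occur inside string literals.
src = sprintf('%s\n', ...
    'this is contents  // Here a C comment starts /*', ...
    'this is contents  // Here the C comment ends */', ...
    'a = 1; /* block // inside */ b = 2;');

pat = '//[^\r\n]*|/\*.*?\*/';              % a // comment up to end of line, or a /* ... */ block
clean = regexprep(src, pat, '');

code = strtrim(regexp(clean, '\r?\n', 'split'));
code = code(~cellfun(@isempty, code))';    % column cell array of the remaining lines
```

Here the '/*' and '*/' that sit behind '//' are swallowed by the line comments, and the '//' inside the block on the third line is removed together with the block.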


Answer 3

Simon on 19 Sep 2013

0 votes

Hi!

I usually do it this way:

* read full file with textscan
* define cell array of comment strings
* use regexp to find all lines with a comment string in it (or in front of)
* delete the lines found
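The steps above can be sketched as follows. This is a guess at what such a script might look like, not Simon's actual code: the scratch file, the marker list and all variable names are assumptions. Note that it deletes whole lines, so code that shares a line with a comment is lost, the inner lines of a multi-line comment are not detected, and a '%' inside a printf format string would also count as a hit.

```matlab
% Sketch of the list above: delete every line that contains a comment marker.
fname = [tempname '.c'];                   % hypothetical scratch file for the demo
fid = fopen(fname, 'w');
fprintf(fid, '%s\n', 'int a;', '// note', 'int b; /* inline */', 'int c;');
fclose(fid);

% 1) read the full file with textscan, one cell per line
fid = fopen(fname, 'r');
C = textscan(fid, '%s', 'Delimiter', '\n', 'Whitespace', '');
fclose(fid);
delete(fname);
lines = C{1};

% 2) define a cell array of comment strings
markers = {'//', '/*', '%'};

% 3) use regexp to find all lines with a comment string in them
esc = cellfun(@(m) regexptranslate('escape', m), markers, 'UniformOutput', false);
pat = strjoin(esc, '|');
hasComment = ~cellfun(@isempty, regexp(lines, pat, 'once'));

% 4) delete the lines found
code = lines(~hasComment);
```

In this demo only 'int a;' and 'int c;' remain; 'int b;' is dropped because it shares its line with a comment.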

2 comments

Jan on 19 Sep 2013

* ignore comment keys in quoted strings
* consider multi-line comments

Simon on 19 Sep 2013

Yes, my list was not complete. I'm sure it can be enlarged with many more entries! It's easier if you have a fixed-format source code like I usually have with Fortran 77. You just search for "C" in column 1 ...
