Find count of repeated letters (sequence)

Sir,
How can I find the number of repeated sequences (letters) in a given sentence?
For example, a='I want THAAAAAT APPPPPLE ):):): totally unprepared';
The number of repeated sequences is 3, i.e.,
1. THAAAAAT
2. APPPPPLE
3. ):):):
Thanks

4 comments

Walter Roberson on 8 Oct 2013
'THAAAAAT' is not a repeated sequence. It contains a repeated sequence.
Could you confirm that the sequences can be more than just adjacent letters, such as the A's in THAAAAAT? Is the '):' repeating as a unit intended to be noticed?
Jothi on 9 Oct 2013
Sir,
A repeated sequence is not just an adjacent letter. It can be any letter or special character repeated consecutively more than two times.
That is, in the word THAAAAAT the letter A is repeated consecutively more than two times.
Similarly, in the word APPPPPLE the letter P is repeated consecutively more than two times.
How can I find this?
Thank you.
Walter Roberson on 9 Oct 2013
Your #3, ):):):, does not have continuously repeated symbols.
If the repeated sequences are to be identified, then why would all of THAAAAAT be output, and not just AAAAA?
Jothi on 9 Oct 2013
The symbol :) indicates one type of emotion symbol (positive emotion).
I don't want the output as a string; I just want the number of repeated sequences that appear in the given sentence, i.e.,
input is,
a='I want THAAAAAT APPPPPLE ):):): totally unprepared';
output is,
No. of repeated sequences are: 3


Answers (2)

Cedric on 8 Oct 2013
Edited: Cedric on 8 Oct 2013
Try to understand the following and fine-tune it to your needs:
n = sum( diff([0, diff(a)==0]) == 1 )
In particular, evaluate
diff(a)==0
and see how your problem actually translates into counting clusters of the outcome of diff(a)==0.
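
As a hedged illustration of the hint above, the following sketch evaluates `diff(a)==0` on the example sentence, reproduces the cluster count, and then adds a variant that keeps only runs of three or more identical characters (the "more than two" threshold stated in the question's comments). The variable names `same`, `starts`, `runLen`, and the threshold variant are illustrative assumptions, not part of the original answer.

```matlab
a = 'I want THAAAAAT APPPPPLE ):):): totally unprepared';

% true wherever a character equals the one before it
same = diff(double(a)) == 0;

% Count from the answer: each rising edge in 'same' starts a cluster,
% i.e. a run of at least 2 identical characters
nPairsOrLonger = sum(diff([0, same]) == 1);   % counts AAAAA, PPPPP and ll

% Variant (assumption): keep only runs of at least 3 identical characters
starts = find([true, ~same]);                 % index where each run begins
runLen = diff([starts, numel(a) + 1]);        % length of each run
nLongRuns = sum(runLen >= 3);                 % counts AAAAA and PPPPP only
```

Both counts look only at single repeated characters, so a multi-character unit such as '):' is not detected by either one.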

4 comments

Cedric on 9 Oct 2013
Edited: Cedric on 9 Oct 2013
Just to be sure, the repeated sequences are:
AAAAA
PPPPP
ll (in 'totally')
Is that right? If so, then my answer works.
Jothi on 10 Oct 2013
Sir,
A repeated sequence must have more than two repetitions.
AAAAA
PPPPP
ll (in 'totally') does not count, since it is not more than two
Thank you, sir.
Cedric on 10 Oct 2013
Edited: Cedric on 10 Oct 2013
You seem to indicate that one repeated sequence is '):'. As far as I am concerned, there is no simple generic solution if you want to detect repeated, arbitrary patterns. To illustrate, consider
'AABBCCDDEEFFAABBCCDDEEFF'
Here, the repeated patterns are
'AA', 'BB', .., 'FF', 'AABB', 'BBCC', .., 'AABBCC', 'BBCCDD', ..,
'AABBCCDD', 'BBCCDDEE', .., 'AABBCCDDEEFF'
Using regular expressions, we can probably get some solution, but it will be prohibitively time-consuming.
Sean de Wolski on 10 Oct 2013
Yeah, every emoticon would have to be predefined. In the chatroom we use here, there are even word emoticons like (b), which inserts a frosty beer mug, or (ply), which inserts an image of a playing card.
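
One way the predefined-emoticon idea could be sketched is shown below: each emoticon in a fixed list is escaped with `regexptranslate` and then searched for as three or more consecutive copies. The list `emoticons` is hypothetical (only '(b)' and '(ply)' come from the comment above), and the threshold of three copies is an assumption taken from the "more than two" requirement.

```matlab
a = 'I want THAAAAAT APPPPPLE ):):): totally unprepared';

% Hypothetical emoticon list; a real application would supply its own
emoticons = {'):', ':)', '(b)', '(ply)'};

counts = zeros(1, numel(emoticons));
for k = 1:numel(emoticons)
    unit = regexptranslate('escape', emoticons{k});   % escape regex metacharacters
    pat  = ['(?:' unit '){3,}'];                      % three or more copies in a row
    counts(k) = numel(regexp(a, pat, 'match'));       % number of repeated runs found
end
```

For this sentence only '):' occurs three times in a row, so `counts` is `[1 0 0 0]`.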


Walter Roberson on 1 Dec 2016
a='I want THAAAAAT APPPPPLE ):):): totally unprepared';
regexp(a, '(.+)\1{2,}', 'match')
ans =
1×3 cell array
'AAAAA' 'PPPPP' '):):):'
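
To turn that list of matches into the count the question asks for, one option is to wrap the call in `numel`. This is a small sketch; the variable names `tokens` and `n` are illustrative and not part of the original answer.

```matlab
a = 'I want THAAAAAT APPPPPLE ):):): totally unprepared';

% (.+) captures a unit of one or more characters; \1{2,} requires that
% same unit to follow at least twice more, i.e. three or more copies in a row
tokens = regexp(a, '(.+)\1{2,}', 'match');

n = numel(tokens);                                  % 3 for this input
fprintf('No. of repeated sequences are: %d\n', n);
```

The 'll' in 'totally' is not counted here because it has only two copies of the unit.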


Asked: 8 Oct 2013
Answered: 1 Dec 2016
