Looking for a regular expression to match some strings
Thread poster: Hans Lenting
Hans Lenting
Hans Lenting
Netherlands
Member (2006)
German to Dutch
Nov 5, 2023

I have imported an XML file in my CAT tool and some segments look like this:

b059b0d9ce604312bb590d30e7193de7
90e2822792dd4d2e831bb0456b0101c8

(These strings refer to GUIDs and my CAT tool doesn't exclude them.) What regular expression would match segments like these?

\A[your suggestion here]\z

The regular expression should only match lowercase letters and numbers (no spaces, uppercase letters etc.)


[Edited at 2023-11-05 09:57 GMT]


 
Philippe Locquet
Philippe Locquet  Identity Verified
Portugal
Local time: 16:02
English to French
Maybe Nov 5, 2023

Hans Lenting wrote:

\A[your suggestion here]\z

The regular expression should only match lowercase letters and numbers (no spaces, uppercase letters etc.)


[Edited at 2023-11-05 09:57 GMT]


Give that a try:
^(?!.*[a-fA-F0-9]{32}).*$


 
Hans Lenting
Hans Lenting
Netherlands
Member (2006)
German to Dutch
TOPIC STARTER
Regex Nov 5, 2023

\A([a-z]*\d*)+\z

 
Philip Lees
Philip Lees  Identity Verified
Greece
Local time: 18:02
Greek to English
Inverse Nov 6, 2023

Inverting the condition seems to work in Perl.

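# Report "Match" only when the string contains no character outside a-z and 0-9.
for ( 'b059b0d9ce604312bb590d30e7193de7',
      '90e2822792dd4d2e831bb0456b0101c8',
      'HELLO_world',
      '90e2822792dd4d2e831bb04&^%b0101c8' ) {
    print "$_ => ";
    if ( /[^a-z0-9]+/ ) {
        # At least one forbidden character was found.
        print 'No match';
    } else {
        print 'Match';
    }
    print "\n";
}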

b059b0d9ce604312bb590d30e7193de7 => Match
90e2822792dd4d2e831bb0456b0101c8 => Match
HELLO_world => No match
90e2822792dd4d2e831bb04&^%b0101c8 => No match

If these are hex codes, as seems likely, you might want to change the a-z to a-f.


 
Dan Lucas
Dan Lucas  Identity Verified
United Kingdom
Local time: 16:02
Member (2014)
Japanese to English
Back to 1993 Nov 6, 2023

Philip Lees wrote:
Inverting the condition seems to work in Perl.

Old school. Respect!

Dan


 
Hans Lenting
Hans Lenting
Netherlands
Member (2006)
German to Dutch
TOPIC STARTER
No Nov 6, 2023

Philippe Locquet wrote:

Give that a try:
^(?!.*[a-fA-F0-9]{32}).*$


Thanks, but this doesn't do the job.
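
A likely reason, sketched in Java (the dialect named later in this thread): the negative lookahead `(?!...)` makes the pattern succeed only on segments that do NOT contain a run of 32 hex characters, which is the opposite of what was asked. The class and method names below are hypothetical and exist only for illustration.

```java
import java.util.regex.Pattern;

class LookaheadDemo {
    // The suggested pattern, written as a Java string literal.
    static final Pattern SUGGESTED =
            Pattern.compile("^(?!.*[a-fA-F0-9]{32}).*$");

    // True if the whole segment matches the suggested pattern.
    static boolean matchesSuggested(String segment) {
        return SUGGESTED.matcher(segment).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "b059b0d9ce604312bb590d30e7193de7", // GUID-like segment: false
            "Translate this sentence."          // ordinary segment: true
        };
        for (String s : samples) {
            System.out.println(s + " => " + matchesSuggested(s));
        }
    }
}
```

In other words, this pattern selects the ordinary segments rather than the GUID-like ones.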


 
Hans Lenting
Hans Lenting
Netherlands
Member (2006)
German to Dutch
TOPIC STARTER
Almost :) Nov 6, 2023

Hans Lenting wrote:

\A([a-z]*\d*)+\z


This almost does the job. But of course, "almost" isn't good enough.

[Screenshot: Screen Shot 2023-11-06 at 10.00.34]

EDIT: I could restrict the number of letters to two.
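
One plausible reason for the "almost" (the screenshot is not reproduced here, so this is an assumption): `\A([a-z]*\d*)+\z` puts no limit on length or on the mix of letters and digits, so it accepts any segment made of lowercase letters and digits, including ordinary lowercase words and the empty string. A minimal Java sketch, with hypothetical class and method names:

```java
import java.util.regex.Pattern;

class LoosePatternDemo {
    // The pattern from the post above; \A and \z anchor the whole input.
    static final Pattern LOOSE = Pattern.compile("\\A([a-z]*\\d*)+\\z");

    // True if the segment as a whole is accepted by the loose pattern.
    static boolean matchesLoose(String segment) {
        return LOOSE.matcher(segment).find();
    }

    public static void main(String[] args) {
        String[] samples = {
            "b059b0d9ce604312bb590d30e7193de7", // intended target
            "hello",                            // plain lowercase word
            "abc123",                           // short letter/digit mix
            "",                                 // empty segment
            "Hello"                             // contains an uppercase letter
        };
        for (String s : samples) {
            System.out.println("\"" + s + "\" => " + matchesLoose(s));
        }
    }
}
```

Only the last sample is rejected. Nested quantifiers of this shape can also backtrack heavily on long segments that fail near the end, which is another reason to prefer a single character class with a fixed repeat count.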


[Edited at 2023-11-06 09:27 GMT]


 
Stepan Konev
Stepan Konev  Identity Verified
Russian Federation
Local time: 18:02
English to Russian
Try this Nov 8, 2023

\b\w{32}\b

If you only want strings like this:
90e2822792dd4d2e831bb0456b0101c8
(nothing but just 32 chars)

but not like this:
Blah-blah-blah 90e2822792dd4d2e831bb0456b0101c8 blah-blah-blah
(32 chars in a sentence)

replace \b with ^ and $ to read as ^\w{32}$
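
A caveat, sketched in Java under the assumption that the original requirement (lowercase letters and digits only) still holds: `\w` also accepts uppercase letters and the underscore, so `^\w{32}$` is broader than requested. The comparison pattern `\A[a-z0-9]{32}\z` and all names below are illustrative, not taken from the thread.

```java
import java.util.regex.Pattern;

class WordClassDemo {
    // The pattern suggested above.
    static final Pattern WORD_32 = Pattern.compile("^\\w{32}$");
    // Illustrative alternative: exactly 32 lowercase letters or digits.
    static final Pattern LOWER_ALNUM_32 = Pattern.compile("\\A[a-z0-9]{32}\\z");

    static boolean matchesWord32(String s) {
        return WORD_32.matcher(s).find();
    }

    static boolean matchesLowerAlnum32(String s) {
        return LOWER_ALNUM_32.matcher(s).find();
    }

    public static void main(String[] args) {
        String guid = "90e2822792dd4d2e831bb0456b0101c8";
        String[] samples = {
            guid,               // accepted by both patterns
            guid.toUpperCase(), // accepted by \w only
            "_".repeat(32)      // 32 underscores: accepted by \w only
        };
        for (String s : samples) {
            System.out.println(s + " => \\w: " + matchesWord32(s)
                    + ", [a-z0-9]: " + matchesLowerAlnum32(s));
        }
    }
}
```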

[Edited at 2023-11-08 14:37 GMT]


 
Philip Lees
Philip Lees  Identity Verified
Greece
Local time: 18:02
Greek to English
Wrong Nov 9, 2023

Dan Lucas wrote:

Philip Lees wrote:
Inverting the condition seems to work in Perl.

Old school. Respect!

Dan

Except that it doesn't work.

Mine matches all kinds of irrelevant stuff like 'foo', 'bar', and '123'. What was I thinking?
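
The inverted test can probably be rescued by adding a length check. The sketch below ports the idea to Java and assumes the segments are 32-character hex strings, as the two samples in the question suggest; the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

class InverseCheckDemo {
    // Any character that is NOT a lowercase hex digit.
    static final Pattern FORBIDDEN = Pattern.compile("[^a-f0-9]");

    // Accept only segments of exactly 32 characters with no forbidden character.
    static boolean looksLikeGuid(String segment) {
        return segment.length() == 32 && !FORBIDDEN.matcher(segment).find();
    }

    public static void main(String[] args) {
        String[] samples = {
            "b059b0d9ce604312bb590d30e7193de7",
            "90e2822792dd4d2e831bb0456b0101c8",
            "HELLO_world",
            "90e2822792dd4d2e831bb04&^%b0101c8",
            "foo", "bar", "123"
        };
        for (String s : samples) {
            System.out.println(s + " => "
                    + (looksLikeGuid(s) ? "Match" : "No match"));
        }
    }
}
```

With the length check in place, 'foo', 'bar' and '123' are no longer reported as matches.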


 
Samuel Murray
Samuel Murray  Identity Verified
Netherlands
Local time: 17:02
Member (2006)
English to Afrikaans
Google for how to validate GUID Nov 9, 2023

I think if you google for how to "validate" a GUID you'd find more clues. This page comes close, although not quite cigar (e.g. it matches brackets and hyphens, too).

ChatGPT's solution is similar to Philippe's (although it seems that you won't need the uppercase letters).

[Screenshot: chat guid regex]

According to https://regex101.com, ChatGPT's solution matches:

[Screenshot: regex101 guid]

[Edited at 2023-11-09 08:20 GMT]


 
Hans Lenting
Hans Lenting
Netherlands
Member (2006)
German to Dutch
TOPIC STARTER
Like always Nov 9, 2023

Samuel Murray wrote:

Which dialect of regex are we talking about again?


Java
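
For the Java dialect, a pattern that fits the `\A...\z` frame from the first post could be `\A[0-9a-f]{32}\z`. This is a sketch under the assumption that every unwanted segment is exactly 32 lowercase hex characters, as in the two samples; how a given CAT tool applies the pattern is not known here. In a tool's filter field the pattern is typed with single backslashes, while in Java source each backslash is doubled. The class and method names are hypothetical.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class GuidSegmentFilter {
    // Exactly 32 lowercase hex characters and nothing else.
    static final Pattern GUID_ONLY = Pattern.compile("\\A[0-9a-f]{32}\\z");

    // True if the segment consists solely of a GUID-like string.
    static boolean isGuidOnly(String segment) {
        return GUID_ONLY.matcher(segment).matches();
    }

    // Keep only the segments that are not GUID-like.
    static List<String> translatable(List<String> segments) {
        return segments.stream()
                .filter(s -> !isGuidOnly(s))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> segments = List.of(
                "b059b0d9ce604312bb590d30e7193de7",
                "A sentence that needs translating.",
                "90e2822792dd4d2e831bb0456b0101c8");
        // Prints only the ordinary sentence.
        translatable(segments).forEach(System.out::println);
    }
}
```

If the tool's segments can contain longer lowercase identifiers that are not hex, widening the class to `[a-z0-9]` while keeping the `{32}` count stays within the original requirement.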


 

