Looking for a regular expression to match some strings Thread poster: Hans Lenting
I have imported an XML file in my CAT tool and some segments look like this:
b059b0d9ce604312bb590d30e7193de7
90e2822792dd4d2e831bb0456b0101c8
(These strings refer to GUIDs and my CAT tool doesn't exclude them.)
What regular expression would match segments like these?
\A[your suggestion here]\z
The regular expression should only match lowercase letters and numbers (no spaces, uppercase letters etc.)
[Edited at 2023-11-05 09:57 GMT]
Hans Lenting wrote:
\A[your suggestion here]\z The regular expression should only match lowercase letters and numbers (no spaces, uppercase letters etc.)
Give that a try:
^(?!.*[a-fA-F0-9]{32}).*$
Hans Lenting Netherlands Member (2006) German to Dutch TOPIC STARTER |
Inverting the condition seems to work in Perl.

for ( 'b059b0d9ce604312bb590d30e7193de7', '90e2822792dd4d2e831bb0456b0101c8', 'HELLO_world', '90e2822792dd4d2e831bb04&^%b0101c8' ) {
    print "$_ => ";
    if ( /[^a-z0-9]+/ ) {
        print 'No match';
    } else {
        print 'Match';
    }
    print "\n";
}

b059b0d9ce604312bb590d30e7193de7 => Match
90e2822792dd4d2e831bb0456b0101c8 => Match
HELLO_world => No match
90e2822792dd4d2e831bb04&^%b0101c8 => No match

If these are hex codes, as seems likely, you might want to change the a-z to a-f.
Dan Lucas United Kingdom Local time: 16:02 Member (2014) Japanese to English
Philip Lees wrote:
Inverting the condition seems to work in Perl.
Old school. Respect!
Dan
Hans Lenting Netherlands Member (2006) German to Dutch TOPIC STARTER
Philippe Locquet wrote:
Give that a try: ^(?!.*[a-fA-F0-9]{32}).*$
Thanks, but this doesn't do the job.
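For readers wondering why the lookahead version misses, here is a small Java sketch. It assumes the pattern is applied to a whole segment; the class and method names are made up for illustration, and the "positive" pattern is only one possible rewrite, not something posted in the thread. The negative lookahead (?!...) rejects segments that contain 32 hex characters, which is the opposite of what was asked.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative and not from the thread.
final class LookaheadDemo {
    // The suggested pattern: the negative lookahead rejects any segment
    // that contains a run of 32 hex characters.
    static final Pattern NEGATED = Pattern.compile("^(?!.*[a-fA-F0-9]{32}).*$");

    // Assumed rewrite: same character class, anchored, without the negation.
    static final Pattern POSITIVE = Pattern.compile("\\A[a-fA-F0-9]{32}\\z");

    static boolean negatedMatches(String segment) {
        return NEGATED.matcher(segment).find();
    }

    static boolean positiveMatches(String segment) {
        return POSITIVE.matcher(segment).find();
    }

    public static void main(String[] args) {
        String guid = "b059b0d9ce604312bb590d30e7193de7";
        String text = "Hello world";
        System.out.println(negatedMatches(guid));  // false: the GUID is excluded
        System.out.println(negatedMatches(text));  // true: ordinary text is matched
        System.out.println(positiveMatches(guid)); // true
        System.out.println(positiveMatches(text)); // false
    }
}
```

The positive variant still accepts uppercase A-F; dropping "A-F" from the class would restrict it to lowercase, as the original question requires.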
Hans Lenting Netherlands Member (2006) German to Dutch TOPIC STARTER
Hans Lenting wrote:
\A([a-z]*\d*)+\z
This almost does the job. But of course, "almost" isn't good enough.
EDIT: I could restrict the number of letters to two.
[Edited at 2023-11-06 09:27 GMT]
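The post does not say which segments were wrongly matched, so the following is only a guess at what "almost" means here. Every part of \A([a-z]*\d*)+\z is optional, so an ordinary lowercase word satisfies it just as well as a GUID does. A minimal Java sketch (hypothetical class name, sample strings chosen for illustration):

```java
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative and not from the thread.
final class AlmostDemo {
    // The pattern from the post, written as a Java string literal.
    static final Pattern ALMOST = Pattern.compile("\\A([a-z]*\\d*)+\\z");

    static boolean matches(String segment) {
        return ALMOST.matcher(segment).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("b059b0d9ce604312bb590d30e7193de7")); // true
        System.out.println(matches("hello"));       // true: a plain word also matches
        System.out.println(matches("Hello"));       // false: uppercase letter
        System.out.println(matches("hello world")); // false: contains a space
    }
}
```

So the pattern does exclude spaces and uppercase letters, but it cannot tell a GUID from a one-word lowercase segment.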
Stepan Konev Russian Federation Local time: 18:02 English to Russian
\b\w{32}\b
If you only want strings like this:
90e2822792dd4d2e831bb0456b0101c8 (nothing but just 32 chars)
but not like this:
Blah-blah-blah 90e2822792dd4d2e831bb0456b0101c8 blah-blah-blah (32 chars in a sentence)
replace \b with ^ and $ to read as
^\w{32}$
[Edited at 2023-11-08 14:37 GMT]
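One caveat, sketched in Java below: \w also covers uppercase letters and the underscore, so ^\w{32}$ accepts any 32-character "word", not just lowercase hex. The tighter pattern shown next to it is an assumption based on the two sample strings (32 characters, digits and a-f only); the class and method names are invented for the example.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative and not from the thread.
final class WordCharDemo {
    // The pattern from the post above.
    static final Pattern WORD32 = Pattern.compile("^\\w{32}$");

    // Assumed tighter variant: exactly 32 lowercase hex characters.
    static final Pattern HEX32 = Pattern.compile("\\A[0-9a-f]{32}\\z");

    static boolean word32(String segment) {
        return WORD32.matcher(segment).find();
    }

    static boolean hex32(String segment) {
        return HEX32.matcher(segment).find();
    }

    public static void main(String[] args) {
        String guid = "90e2822792dd4d2e831bb0456b0101c8";
        String other = "this_segment_has_exactly_32_char"; // 32 word characters
        System.out.println(word32(guid));        // true
        System.out.println(hex32(guid));         // true
        System.out.println(word32(other));       // true: \w allows letters and _
        System.out.println(hex32(other));        // false
        // In Java, $ also matches before a final line terminator; \z does not.
        System.out.println(word32(guid + "\n")); // true
        System.out.println(hex32(guid + "\n"));  // false
    }
}
```

If the segments can contain letters beyond f, the class [0-9a-f] would need to be widened to [0-9a-z].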
Dan Lucas wrote:
Philip Lees wrote: Inverting the condition seems to work in Perl.
Old school. Respect! Dan
Except that it doesn't work. Mine matches all kinds of irrelevant stuff like 'foo', 'bar', and '123'. What was I thinking?
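A rough Java port of the inverted check shows the problem, along with one way to patch it. The length guard of 32 is an assumption taken from the two sample strings in the first post; it is not something the thread settles on, and the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative and not from the thread.
final class InvertedCheckDemo {
    // Port of the Perl test /[^a-z0-9]+/: finds any character outside a-z, 0-9.
    static final Pattern FORBIDDEN = Pattern.compile("[^a-z0-9]");

    // Inverted check only: "match" means no forbidden character was found.
    static boolean invertedOnly(String segment) {
        return !FORBIDDEN.matcher(segment).find();
    }

    // Assumed patch: additionally require exactly 32 characters.
    static boolean invertedWithLength(String segment) {
        return segment.length() == 32 && invertedOnly(segment);
    }

    public static void main(String[] args) {
        String guid = "b059b0d9ce604312bb590d30e7193de7";
        System.out.println(invertedOnly(guid));                // true
        System.out.println(invertedOnly("foo"));               // true: the false positive
        System.out.println(invertedWithLength(guid));          // true
        System.out.println(invertedWithLength("foo"));         // false
        System.out.println(invertedWithLength("HELLO_world")); // false
    }
}
```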
Samuel Murray Netherlands Local time: 17:02 Member (2006) English to Afrikaans + ... Google for how to validate GUID | Nov 9, 2023 |
I think if you google for how to "validate" a GUID you'd find more clues. This page comes close, although not quite cigar (e.g. it matches brackets and hyphens, too). ChatGPT's solution is similar to Philippe's (although it seems that you won't need the uppercase letters). According to https://regex101.com, ChatGPT's solution matches:
[Edited at 2023-11-09 08:20 GMT]
Hans Lenting Netherlands Member (2006) German to Dutch TOPIC STARTER
Samuel Murray wrote:
Which dialect of regex are we talking about again?
Java