19690874? 4 points (+4|-0)

Using a regex for a simple pattern you know at compile time is a lousy way to do it. Checking whether a char is an ASCII letter takes 3 instructions in C (https://godbolt.org/z/HMvLLC). A regex would take hundreds.

ELS_BrigadeWarning (edited)

And even that has a convenience function: isalpha() in C, and its rough equivalent Char.IsLetter() in C#. Half the stuff people use regex for boggles my mind. They embrace a more complex solution when hand lexing is so often easier and more performant.

ELS_BrigadeWarning

Don't use Regex for that shit, use Char.IsLetter()

berne (edited)

So? What were you actually trying to do, and how did you solve it? :-)

argosciv [S] 1 point (+1|-0)

I was trying to split a user-entered string into its characters and run ops on each character (if that character is a letter).

Turns out the regex was fine; I just had a logic error further down that at first looked like a regex problem. Fixing the logic error showed that the regex had been working all along.

The error was that I wasn't converting uppercase letters to lowercase before checking them against a dictionary that only contains lowercase letters. That made me think the regex was skipping uppercase letters, but it wasn't; I just wasn't handling them correctly. (That's also why Example C 'worked': it lowercased the character at that particular point, although I didn't want to do so there.)
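
A minimal sketch of that fix, for anyone hitting the same thing. The original code isn't shown in this thread, so the dictionary contents and the names LetterLookup, Table and Lookup are made up for illustration; the only point is that the character is lowercased for the lookup key and nowhere else.

```csharp
using System.Collections.Generic;

public static class LetterLookup
{
    // Hypothetical lowercase-only dictionary; the real one isn't shown in the thread.
    private static readonly Dictionary<char, int> Table = new Dictionary<char, int>
    {
        { 'a', 1 }, { 'b', 2 }, { 'c', 3 }
    };

    // Walks the input one character at a time and looks up each letter.
    public static List<int> Lookup(string input)
    {
        var results = new List<int>();
        foreach (char c in input)
        {
            if (!char.IsLetter(c))
            {
                continue; // skip digits, punctuation, whitespace
            }

            // Lowercase only the key used for the lookup; c itself is unchanged.
            char key = char.ToLowerInvariant(c);
            int value;
            if (Table.TryGetValue(key, out value))
            {
                results.Add(value);
            }
        }
        return results;
    }
}
```

With this sketch, LetterLookup.Lookup("aB-c") finds all three letters, where a lookup on the raw character would miss the 'B'.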

berne 2 points (+2|-0) (edited)

Okay. I was just curious, because it looked so overly elaborate just for detecting letters. I figured you had a reason beyond the purely utilitarian for using regexes, such as the input char array being outside your control, or "performance considerations".

There is an IsLetter(...) method in the Char class that detects whether a character is a Unicode 'letter', but that includes non-English letters. The simplest (and fastest?) way to detect whether a char is an English letter is simply "if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')) ...". This has none of the overhead of string conversions and regexes, and it should also be friendly to the processor's cache and speculative execution, if that is ever a consideration.
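
Wrapped up as a helper, the range check might look like the sketch below. The names LetterCheck and IsAsciiLetter are made up for illustration and are not part of .NET.

```csharp
public static class LetterCheck
{
    // Hypothetical helper: true only for the 52 unaccented letters a-z and A-Z.
    public static bool IsAsciiLetter(char c)
    {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }
}
```

The difference from the built-in shows up on accented input: LetterCheck.IsAsciiLetter('é') is false, while char.IsLetter('é') is true.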

Using regexes can be very efficient but mostly for complex syntaxes or longer 'strings'. Rolling your 'own' parser in such a case is likely a waste of time.

Good luck!

folgeyharry

Put a * after [a-zA-Z]; otherwise you're only going to match 1-character strings.

argosciv [S]

They are indeed only 1-character strings. The original string was converted to a CharArray, which is then iterated through.

folgeyharry 1 point (+1|-0) (edited)

I tried all 3 examples using Mono on Linux and they all worked fine. Also, why use regex at all? You can just compare x to 'a', etc. I included a 4th example in the pastebin, which shows the complete program I tried.

https://pastebin.com/k1HQR5hu

neogag (edited)

I don't have time to look into this now, but know that there is a /i regex flag that means case-insensitive, which might be what you want.

edit: Also use this to iterate very quickly and find out what works/doesn't work: https://regex101.com/

argosciv [S] (edited)

Tried (?-i) in example A. Is /i proper syntax?

EDIT: Note that the programming language is C#.

lemon11 1 point (+1|-0)

I don't know C#, but if its engine is PCRE, the syntax for flags inside the pattern should be like (?i)pattern or (?i:pattern).
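
For reference, a small sketch of how those forms behave in C#. As far as I know, .NET uses its own regex engine rather than PCRE, but it accepts the same inline (?i) and (?i:...) forms and also offers RegexOptions.IgnoreCase. A C# pattern string has no /.../i delimiter syntax, and (?-i) switches case-insensitivity off rather than on. The class and method names here are made up for illustration.

```csharp
using System.Text.RegularExpressions;

public static class CaseFlags
{
    // Inline flag: everything after (?i) is matched case-insensitively.
    public static bool InlineFlag(string s)
    {
        return Regex.IsMatch(s, "(?i)^[a-z]$");
    }

    // Scoped inline flag: only the group is case-insensitive.
    public static bool ScopedFlag(string s)
    {
        return Regex.IsMatch(s, "^(?i:[a-z])$");
    }

    // Same effect through the options argument instead of the pattern.
    public static bool OptionFlag(string s)
    {
        return Regex.IsMatch(s, "^[a-z]$", RegexOptions.IgnoreCase);
    }

    // (?-i) explicitly disables case-insensitivity, so uppercase does not match [a-z].
    public static bool FlagOff(string s)
    {
        return Regex.IsMatch(s, "(?-i)^[a-z]$");
    }
}
```

With these, "Q" matches in the first three methods and fails in FlagOff, which is consistent with (?-i) not helping in example A.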