wstribizew
My Regex "Road-So-Far"

Nov 28, 2020

Welcome, regex User!

I am regularly asked on StackOverflow how I learnt to use regular expressions. You might be curious about that, too, so I decided to share my regex "road so far".

I learnt about regular expressions in 2009, when I was working for a localization company that used a CAR (Computer-Aided Revision) tool to check translation quality. Most of the checks were implemented as regular expression rules that needed maintenance and regular enhancement. The app was written in C#, which was my primary programming language back then. I was totally unaware of the various regex flavors at that time and thought that .NET regular expressions could be used anywhere. I quickly learnt otherwise when I had to start writing scripts in Python using the re library (there was no chance of using the PyPI regex module) and building PCRE rules for a machine translation tool to handle specific text entities.

In 2014, I joined StackOverflow and began following regex-tagged questions. My first answer was more like a comment, just like many beginners' answers. It took me a month or two to start giving more helpful answers, and much more time to learn the various regex flavors. The most important things I learnt about regex were:

  • Knowing regular expressions also means knowing when to use them and when not to

  • A regular expression alone is not enough; you must know how to use it in the target environment

People often build a regular expression using online regex testers, paste the pattern into their code, and are surprised that the regex does not return the expected results. The most common reasons are 1) the wrong regex flavor selected in the site options, 2) improper escaping of special characters, 3) omission of a regex flag, 4) failure to account for the various line ending types and 5) using a string literal as plain text in the sample input field of an online regex editor. What you should do is:

  • Know your plain text input (use print in Python, console.log in JavaScript, etc.)

  • Learn about string escape sequences in your language

  • Always refer to your regex library implementation reference to understand what set of features it supports (e.g. in Bash, you can't use \d to match digits, you have to use [0-9] or [[:digit:]])

  • Understand the importance of flags (especially the difference between the s and m flags in Perl-compatible regex flavors) and mind that the g flag is often implemented as separate methods/functions in a specific language (cf. preg_match and preg_match_all in PHP)

  • Use the regular expression with the right regex method in your programming language.
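
To make the points about string escapes and flags concrete, here is a minimal sketch using Python's built-in re module. The sample strings and variable names are made up for illustration, and other flavors may behave differently, so treat it as a demonstration rather than a rule for every engine.

```python
import re

# Hypothetical sample text, used only for illustration.
text = "cat concat"

# In a regular string literal, "\b" is a single backspace character,
# so the regex engine never sees a word boundary.
literal_hits = re.findall("\bcat\b", text)
# In a raw string literal, r"\b" reaches the engine as backslash + b,
# which the re module treats as a word boundary.
raw_hits = re.findall(r"\bcat\b", text)
print(literal_hits, raw_hits)

sample = "first line\nsecond line"

# Without re.M, ^ matches only at the very start of the string.
starts_default = re.findall(r"^\w+", sample)
# With re.M (multiline), ^ also matches right after each newline.
starts_multiline = re.findall(r"^\w+", sample, flags=re.M)

# Without re.S, the dot does not match a newline, so this finds nothing.
dot_default = re.search(r"first.*second", sample)
# With re.S (dotall), the dot matches newlines as well.
dot_dotall = re.search(r"first.*second", sample, flags=re.S)

print(starts_default, starts_multiline)
print(dot_default is None, dot_dotall is not None)
```

Printing the pattern itself (for example, print("\bcat\b") next to print(r"\bcat\b")) is a quick way to see what the regex engine actually receives.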

See my answer to "Regular expression works on regex101.com, but not on prod" on StackOverflow for more details.
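
Two of the pitfalls listed above, choosing the right regex method and accounting for line endings, can be sketched in Python as follows. The input strings are invented for the example, and the CRLF-tolerant pattern is just one possible workaround, not the only one.

```python
import re

# Hypothetical input, used only for illustration.
pattern = re.compile(r"\d+")
text = "order 12, items 345"

first = pattern.search(text)      # first match anywhere in the string
at_start = pattern.match(text)    # only matches at the start, so None here
whole = pattern.fullmatch("345")  # the entire string must match
all_hits = pattern.findall(text)  # all matches, similar to a "g" flag elsewhere

print(first.group(0), at_start, whole.group(0), all_hits)

# Text with Windows (CRLF) line endings.
crlf_text = "alpha\r\nbeta\r\n"

# With re.M, $ matches right before "\n", but "\r" sits between the
# word and "\n", so this pattern finds nothing.
naive = re.findall(r"^\w+$", crlf_text, flags=re.M)
# Allowing an optional "\r" before the end of line handles both
# LF and CRLF input.
tolerant = re.findall(r"^\w+(?=\r?$)", crlf_text, flags=re.M)

print(naive, tolerant)
```

An online tester usually normalizes pasted text to LF line endings, which is one reason a pattern like the naive one can work there and fail on a real file.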

I have been checking regex-tagged SO questions every day since 2015. Since answering a question takes the effort of reading the relevant reference and testing the regular expression in online regex testers and online IDEs, I have kept learning over time.

Now, I mainly use regular expressions in Notepad++, .NET and Python, but thanks to StackOverflow, I can also write regex solutions in many more languages, including PHP, JavaScript, Java, R, Perl, Bash (sed, grep, awk), Ruby, Kotlin, Dart, Go, Groovy, Scala, C++, Swift, Tcl, C and Delphi. I have also tried my hand at Lua patterns (they are not regular expressions).

So, if you ask me now what you can do to master regular expressions, I'd say:

I am planning to write an e-book and an online regex course in the future, so stay tuned!
