Using Regular Expressions (RegEx) in Ruby

You may have heard of RegEx, which is short for regular expressions. It can come off as intimidating, but it’s not too bad once you get used to what the patterns mean and how to construct an actual expression and use it. Once you get used to thinking about strings and text in a more abstract way, it can be a useful tool for solving problems where you are looking for common patterns in a set of data.

RegEx is a method of pattern matching: a way to filter strings or text based on a pattern, usually to extract or modify the desired text. In this article, we will discuss how to write regular expressions and how to test them with Ruby methods so you can use them in the logic of your own project.

One tool that is extremely helpful when it comes to visualizing and understanding RegEx is a site called Rubular. Open it to test RegEx using the block of text that’s already been populated. You’ll notice that in between the two forward slashes is a string with the word ‘neighbor’ in it.

Believe it or not, this is a regular expression! Whole words, sentences, paragraphs even can technically be called regular expressions (as long as they are in between two forward slashes). The Rubular environment highlights for us every single instance of the pattern ‘neighbor’ in our block of text – even instances where neighbor is part of a bigger word, too! That being said, you might want to find something more abstract than an exact word match. This is where metacharacters come in.
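The same literal match can be tried in Ruby itself. The sketch below is only an illustration: the sample text is made up (it is not the text Rubular provides), and the \b word-boundary metacharacter in the second pattern is an addition that is not covered in this article's table.

```ruby
# Made-up sample text for illustration.
text = "Hi neighbor! Our neighborhood has friendly neighbors."

# A literal pattern matches every occurrence, including those
# embedded in longer words such as "neighborhood".
literal_matches = text.scan(/neighbor/)
puts literal_matches.length      # 3

# \b (word boundary) limits the match to the standalone word.
whole_word_matches = text.scan(/\bneighbor\b/)
puts whole_word_matches.length   # 1
```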

Metacharacters

Just as atoms are the building blocks of pretty much everything we see around us, metacharacters are the building blocks of regular expressions. As you add on to your regular expression, the overall pattern changes. And when the overall pattern changes, the results you get back from the methods you use can be different.

Listed below are several ways to modify your regular expression so you can find a pattern that works for you. There is no one absolutely right way to write a regular expression for phone numbers or emails, etc. – it’s all about what your needs are for your project.

Metacharacter	Matches	Example
[abc]	A character class that matches a single character in the string that could be a, b or c	/[eig]/ can match portions of neighbor, apple, or gate
[^abc]	A negated character class that matches every single character in the string *but* a, b or c	/[^eig]/ can match portions of neighbor, apple, or gate
[a-z]	A character class that matches any single character in the range a-z	/[e-i]/ can match single characters in portions of neighbor, apple, or gate
[a-zA-Z]	A character class that matches any single character in the range a-z or A-Z	/[a-zA-Z]/ can match any single letter in “Hi neighbor!”, Grapple, or gate, but not the space or the exclamation point
^	Start of line	/^Hello/ matches lines that start with ‘Hello’
$	End of line	/Goodbye$/ matches lines that end in ‘Goodbye’
\A	Start of string. Similar to ‘^’, but it matches only at the very start of the string, never after a line break inside it	/\Aa/ matches the ‘a’ in apple, but not the ‘a’ in apricot, in the two-line string "apple\napricot", since apricot is not at the beginning of the string
\z	End of string. Similar to ‘$’, but it matches only at the very end of the string, never before a line break inside it	/a\z/ matches the final ‘a’ in zebra, but not the ‘a’ in libra, in the two-line string "libra\nzebra", since libra is not at the end of the string
.	Wild card. Dot matches any single character except a line break (unless the m option is set)	/./ will match any single character in apple
+	Matches one or more of the previous element	/aa+/ will match ‘aa’ and ‘aaaaaaa’ but will not match ‘a’ on its own, since + requires one or more of the previous element (which in this instance is the second a)
*	Matches zero or more of the previous element	/ab*/ will match ‘a’, ‘ab’, ‘abbbbbb’
\s	Any whitespace character	/^The\s.+s$/ will match The Beatles, The Rolling Stones, The Cranberries, etc.
\S	Any non-whitespace character	/\S+/ will match each word in The Beatles, The Rolling Stones, The Cranberries, etc., one run of non-whitespace characters at a time
\d	Any digit	/\d+/ will match 22, 33333, 0, etc
\D	Any non-digit	/\D+/ will match ‘Hello, goodbye’
\w	Any word character	/ny\w*/ will match ‘ny_152’, ‘nypost39’, etc
\W	Any non-word character	/\W+/ will match ‘)(*&^%$’
a{3}	Exactly 3 of ‘a’	/\d{3}-\d{3}-\d{4}/ will match 555-555-5555
a{3,}	Three or more of ‘a’	/[a-zA-Z0-9!#$^&)(]{8,}/ will match ‘xEBqRx14B7TAQp’, which looks like it could be used as a password!
a{3,6}	Three to six of ‘a’ (no space after the comma)	/[a-zA-Z0-9!#$^&*)(]{8,32}/ will match ‘0XX!pC3Odpu30Qc’ because it’s at least 8 and at most 32 characters in length
a?	0 or 1 of ‘a’	/\d?-?\d{3}-\d{3}-\d{4}/ will match a phone number with a one-digit international code attached to the front (1-555-555-5555) and one without an international code (555-555-5555).
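A few rows of the table can be checked directly in Ruby. The following is a small sketch rather than a complete tour: the sample strings are made up for illustration, and the phone pattern is just one possible way to make the leading code optional.

```ruby
# Made-up two-line sample strings for illustration.
fruits = "apple\napricot"
signs  = "libra\nzebra"

# ^ matches at the start of every line; \A only at the start of the string.
line_starts   = fruits.scan(/^a/)    # ["a", "a"]
string_starts = fruits.scan(/\Aa/)   # ["a"]

# $ matches at the end of every line; \z only at the very end of the string.
line_ends  = signs.scan(/a$/)    # ["a", "a"]
string_end = signs.scan(/a\z/)   # ["a"]

# Quantifiers: + needs at least one repetition, * allows zero.
plus_matches = "a aa aaaa".scan(/aa+/)   # ["aa", "aaaa"]
star_matches = "a ab abbb".scan(/ab*/)   # ["a", "ab", "abbb"]

# {n} and ? combined: an optional leading digit and dash.
phone = /\d?-?\d{3}-\d{3}-\d{4}/
with_code    = "1-555-555-5555"[phone]   # "1-555-555-5555"
without_code = "555-555-5555"[phone]     # "555-555-5555"
```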

Metacharacters are especially useful for validating what users enter into forms on websites. A regular expression can check that an address, an email, or a phone number follows the expected format before it is saved. This leads to better organized databases with fewer user errors when registering new accounts.
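As a rough sketch of that kind of validation, the example below checks a single phone format. The constant PHONE_FORMAT and the method valid_phone? are hypothetical names chosen for illustration, and the pattern only accepts the 555-555-5555 layout; a real project would likely need something more permissive.

```ruby
# Hypothetical validator: accepts only the 555-555-5555 layout.
# \A and \z anchor the pattern to the whole string, so extra text
# before or after the number causes the check to fail.
PHONE_FORMAT = /\A\d{3}-\d{3}-\d{4}\z/

def valid_phone?(input)
  # match? returns true or false without building a MatchData object.
  PHONE_FORMAT.match?(input)
end

puts valid_phone?("555-555-5555")          # true
puts valid_phone?("call 555-555-5555")     # false
puts valid_phone?("555-555-5555\nextra")   # false
```

Anchoring with \A and \z rather than ^ and $ matters here, because ^ and $ would accept a valid number on one line of a multi-line input.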

Methods to Test RegEx in Ruby

Here’s the code we are going to use to differentiate between scan and match:

#!/usr/bin/ruby

# Runs a regular expression against a string and prints the result.
class RegexTest
  def initialize(str, regex)
    @str = str
    @regex = regex
    # This is the line we will change to try other matching methods.
    @result = str.scan(regex)
  end

  # Prints the original string, the pattern, and the stored result.
  def display_details
    puts "String =  #{@str}"
    puts "regex =  #{@regex}"
    puts "result = #{@result}"
  end
end

# Create objects
str1 = RegexTest.new("The rain in Spain stays mainly on the plain", /\w+ain/)
str2 = RegexTest.new("In Hertford, Hereford, and Hampshire, hurricanes hardly ever happen", /H\w+/)

# Call methods
str1.display_details
str2.display_details

Scan

The scan method in Ruby returns an array of all strings that match your regular expression:

str1: result = ["rain", "Spain", "main", "plain"]

str2: result = ["Hertford", "Hereford", "Hampshire"]

Because the result is an ordinary array, you can count it, iterate over it, or transform it like any other array.
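For instance, here is a short sketch that works with the array scan returns, using the same sentence and pattern as str1. The variable names are chosen only for illustration.

```ruby
sentence = "The rain in Spain stays mainly on the plain"

matches = sentence.scan(/\w+ain/)

# matches is a plain Array, so the usual Array methods apply.
puts matches.length                  # 4
puts matches.map(&:upcase).inspect   # ["RAIN", "SPAIN", "MAIN", "PLAIN"]
puts matches.include?("Spain")       # true
```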

RegExp Match

The regular expression Match method is very, VERY similar to scan, but finds the first instance of a match instead of all matches. Change @result = str.scan(regex) to @result = str.match(regex) to take a look at the difference:

str1: result = rain

str2: result = Hertford

Match, however, returns a MatchData object rather than a string or an array, and it returns nil when nothing matches. MatchData has methods of its own that you can use in your logic when you work with the result. Take a look at the Ruby docs for more information on what you could use there.
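A few of those MatchData methods are sketched below, using the same sentence and pattern as str1. This is only a sample of what the class offers.

```ruby
sentence = "The rain in Spain stays mainly on the plain"

result = sentence.match(/\w+ain/)

puts result.class        # MatchData
puts result[0]           # rain (the matched text)
puts result.begin(0)     # 4 (index where the match starts)
puts result.pre_match    # text before the match: "The "
puts result.post_match   # text after the match

# match returns nil when nothing matches, so check before using it.
no_result = sentence.match(/\d+/)
puts no_result.nil?      # true
```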

Grep

Grep is an enumerable method for finding matching strings in arrays. It will return an array of all strings that match your regular expression. With the code we have, we have to make sure that the string we passed in is split up into an array.

To do this, replace this line of code:

@result = str.match(regex)

with:

@result = str.split(/\s|,/).grep(regex)

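You will then get a result similar to the first result. Note that grep returns each whole array element that contains a match, so str1 now includes "mainly" rather than "main":

str1: result = ["rain", "Spain", "mainly", "plain"]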

str2: result = ["Hertford", "Hereford", "Hampshire"]
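Outside the RegexTest class, the same idea can be sketched on its own. Since grep keeps whole elements, the word "mainly" is returned in full. The grep_v call is an addition not discussed above; it returns the elements that do not match.

```ruby
words = "The rain in Spain stays mainly on the plain".split(/\s|,/)

# grep keeps each whole element that matches somewhere inside it.
matching_words = words.grep(/\w+ain/)
puts matching_words.inspect   # ["rain", "Spain", "mainly", "plain"]

# grep_v is the inverse: it keeps the elements that do not match.
other_words = words.grep_v(/\w+ain/)
puts other_words.inspect      # ["The", "in", "stays", "on", "the"]
```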

Str =~ RegEx

Using the =~ basic matching operator, we can compare the string to the regular expression and get back the index at which the first match begins. It will return nil if there is no match.
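A minimal sketch of the operator, reusing the sentence from the earlier examples:

```ruby
sentence = "The rain in Spain stays mainly on the plain"

# Index where the first match ("rain") begins.
first_index = sentence =~ /\w+ain/
puts first_index.inspect   # 4

# No digits in the sentence, so the result is nil.
missing = sentence =~ /\d+/
puts missing.inspect       # nil

# Because nil is falsy, =~ can be used directly as a condition.
found_spain = false
if sentence =~ /Spain/
  found_spain = true
  puts "Found Spain"
end
```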

Conclusion

In this article, we discussed how to use regular expressions (RegEx) in Ruby. If you want to learn more about what you can build with Ruby, check out our article, “What Is Ruby Code Used For?”

About the Author

Christina Kopecky

Technical Writer at Career Karma

Christina is an experienced technical writer, covering topics as diverse as Java, SQL, Python, and web development. She earned her Master of Music in flute performance from the University of Kansas and a bachelor's degree in music with minors in French an... read more about the author

Jul 13, 2020