
There’s a cool new feature in GTM called the RegEx Table variable. Here’s a simple use case. Suppose your website has a subdirectory for each language version, for example /en/ and /zh/.

For many purposes it’s useful to track those language versions in different Google Analytics properties. With a RegEx Table variable this has become pretty simple, where previously you had to either create multiple tags (copies of each other) or write a JavaScript variable that extracted the language parameter.

For the example above, you can configure a RegEx Table variable in the following way:

Some deconstructing:

  • I unchecked full matches; that option just seems confusing to me for a small number of patterns (meaning rows in your table).
  • The Input variable is the Page Path. It contains strings like /language/subdirectory/.
  • Every pattern starts with ^ meaning whatever follows has to be at the beginning of Page Path.
  • . denotes any character. The quantifier * means the dot may repeat zero or more times. There’s also a quantifier for one or more times (+), which we do not want here because nothing has to follow the language directory.
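
To see the anchor and the quantifier in isolation, here is a small sketch in Python’s `re` module rather than in GTM itself. The pattern `^/en/.*` is my assumption of what one row of the table looks like, and `re.search` stands in for partial (not full) matching.

```python
import re

# Assumed patterns for a single table row; not copied from a real GTM container.
star = re.compile(r"^/en/.*")   # dot repeated zero or more times
plus = re.compile(r"^/en/.+")   # dot repeated one or more times

def matches(pattern, page_path):
    """Partial match, roughly what happens with full matches unchecked."""
    return pattern.search(page_path) is not None

print(matches(star, "/en/"))        # True: nothing has to follow /en/
print(matches(plus, "/en/"))        # False: + demands at least one more character
print(matches(star, "/blog/en/"))   # False: ^ pins /en/ to the start of the path
```

With `+` the bare language home page /en/ would fall through to the default, which is why `*` is the safer choice here.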

So for example

  • en/ is not matched (because it does not start with /en/); it falls back to the default value.
  • /en/ is matched with the first row
  • /zh is not matched (because it does not start with /zh/; the trailing slash is missing)
  • /zh/sub/thing/ is matched with the second row.
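
The lookup itself can be imitated in a few lines of Python. This is only a sketch of the behaviour described above, not GTM’s implementation: the two patterns, the output strings, and the function name `regex_table` are all hypothetical, and I’m assuming rows are tried top to bottom with the first match winning.

```python
import re

# Hypothetical rows (pattern, output); the outputs are placeholder property names.
ROWS = [
    (r"^/en/.*", "property-en"),
    (r"^/zh/.*", "property-zh"),
]
DEFAULT = "property-default"

def regex_table(page_path, rows=ROWS, default=DEFAULT):
    """Return the output of the first row whose pattern matches the input."""
    for pattern, output in rows:
        if re.search(pattern, page_path):  # partial match, full matches unchecked
            return output
    return default

for path in ["en/", "/en/", "/zh", "/zh/sub/thing/"]:
    print(path, "->", regex_table(path))
```

Running it reproduces the four cases listed above: two paths land on a row, and the other two end up with the default.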

A Different Input Variable

You could also use a different input variable than the one I used in the example above. Say you’re at the page:

Then you have the following built-in variables that return some part of that:

  1. Page Hostname, typically subdomain + domain, so
  2. Page Path would be just /blog/pillar/
  3. Page URL would be the full URL above https://….=1

So if you want to filter on the “protocol”, you have to use Page URL, not Page Path or Page Hostname. If you’re interested in subdomains, use Page Hostname (because it’s reduced to just that part), and in our case you could use either Page Path or Page URL.
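
If it helps to see the three pieces side by side, Python’s `urllib.parse` splits a URL in much the same way. The URL below is made up for illustration (only the /blog/pillar/ path comes from the example), and the variable names merely mirror GTM’s built-in variables; this is not how GTM computes them.

```python
from urllib.parse import urlsplit

# Hypothetical URL; the domain and query string are invented for this sketch.
page_url = "https://www.example.com/blog/pillar/?page=1"

parts = urlsplit(page_url)
page_hostname = parts.hostname   # subdomain + domain
page_path = parts.path           # everything after the hostname, before the query
protocol = parts.scheme          # only present in the full URL

print(page_hostname)  # www.example.com
print(page_path)      # /blog/pillar/
print(protocol)       # https
```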

If you want to use the Page URL, it could look like this.

Some final deconstructing:

  • I did not start the patterns with ^, because the string being matched usually appears somewhere inside the Page URL rather than at its start, and usually only once (although this is not always true).
  • Here we want to use the . as literal character so we have to escape it.
  • The pattern website\.com matches only the string website.com,
  • while the unescaped pattern website.com would also match websiteDcom or websitedcom.
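
The difference between the escaped and the unescaped dot is easy to check with Python’s `re` module. Again this is just a sketch outside GTM; the full URL in the last line is a hypothetical example.

```python
import re

escaped = r"website\.com"    # \. is a literal dot
unescaped = r"website.com"   # . is any single character

for candidate in ["website.com", "websiteDcom", "websitedcom"]:
    hit_escaped = bool(re.search(escaped, candidate))
    hit_unescaped = bool(re.search(unescaped, candidate))
    print(candidate, hit_escaped, hit_unescaped)

# Without a leading ^ the pattern may match anywhere in the input,
# so it still hits when the input begins with the protocol.
print(bool(re.search(escaped, "https://www.website.com/en/")))  # True
```

Only the first candidate matches the escaped pattern, while all three match the unescaped one.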

Hope you are now ready for your use case.
