Exclude Spam in Google Analytics

I see mixed advice about stopping referrer spam in Google Analytics: some of it is good, but some of it is not recommended. Here is a simple way to solve the problem.

There are two basic types of spam: CRAWLER and GHOST.

A crawler actually visits our website and can be filtered out by its source. Ghost spam never visits our site: it just guesses our GA number and sends fake hits, so it has to be handled differently, with a hostname filter.

Here is the procedure for stopping Crawler spam:

1. Create a new view. (It is good to keep the original unfiltered view intact.)
In Google Analytics, go to Admin/View/Dropdown list/Create new view. Then call it “Filtered bots”, or something like that.

2. Set up a filter which EXCLUDES a CAMPAIGN SOURCE, and paste this:


Excluding them one by one is a never-ending story. Doing it with a regex is much more powerful because it covers a wider range.
Sometimes a new one may appear, but it is manageable: just expand the list. (Add a pipe (“|”, which means “or”) and then the domain to exclude. Do not end the regex with a pipe.) Here is more about regex, in case somebody needs it: https://support.google.com/analytics/answer/1034324?hl=en.
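To make the pipe rule concrete, here is a minimal Python sketch of how such an exclude pattern could be put together and tried out before pasting it into the filter. The domains below are made-up placeholders, not a real spam list, and the helper names are my own.

```python
import re

# Hypothetical spam referrer domains, for illustration only.
SPAM_DOMAINS = ["spam-example.com", "bad-referrer.example", "free-traffic.example"]


def build_exclude_pattern(domains):
    """Join domains with '|' (regex "or"), escaping the dots.

    str.join never adds a trailing pipe, so the pattern cannot
    accidentally end with '|' (which would match everything).
    """
    return "|".join(d.replace(".", r"\.") for d in domains)


pattern = build_exclude_pattern(SPAM_DOMAINS)


def is_spam_source(source):
    """Return True if the campaign source matches any listed domain."""
    return re.search(pattern, source, flags=re.IGNORECASE) is not None


print(pattern)
print(is_spam_source("sub.spam-example.com"))  # matched, would be excluded
print(is_spam_source("google"))                # not matched, would be kept
```

When a new spam domain shows up, we would only append it to the list and rebuild the pattern, which is the same as adding a pipe and the domain by hand.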

Ghost spam:

1. Go to Technology/Network/Hostname and note all valid hostnames (your domain, translate.googleusercontent.com, webcache.googleusercontent.com, etc.).

2. Now set up another filter which INCLUDES a HOSTNAME.
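As a rough sketch of what the include pattern could look like, the Python below builds one from a list of valid hostnames and checks a few sample values. Here example.com stands in for your own domain, and the function names are just my own illustration.

```python
import re

# Hypothetical valid hostnames: replace example.com with your own domain
# and add whatever else you noted in step 1.
VALID_HOSTNAMES = [
    "example.com",
    "translate.googleusercontent.com",
    "webcache.googleusercontent.com",
]


def build_include_pattern(hostnames):
    """Join the valid hostnames with '|' and escape the dots."""
    return "|".join(h.replace(".", r"\.") for h in hostnames)


include_pattern = build_include_pattern(VALID_HOSTNAMES)


def is_valid_hostname(hostname):
    """Return True if the hit's hostname contains one of the valid hostnames."""
    return re.search(include_pattern, hostname, flags=re.IGNORECASE) is not None


print(include_pattern)
print(is_valid_hostname("www.example.com"))     # kept
print(is_valid_hostname("ghost-spam.example"))  # dropped
print(is_valid_hostname("(not set)"))           # dropped
```

Note that this is a partial match, so www.example.com passes because it contains example.com. Ghost hits usually carry a fake or missing hostname, so they would fall outside the pattern.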



(Historic data will not be affected, but we can apply the above rules in an advanced segment, which allows us to analyze clean historic data.)

And that’s it. Two filters, Google Analytics clean. 🙂

In addition to that, go to admin/view settings and turn on the “Exclude all hits from known bots and spiders” option.

Regarding robots.txt: if we use WordPress, it is not wise to block WP folders, as this would prevent Google from fully rendering and understanding our site. (This advice can still be seen on the Internet, but it belongs to the past.) In particular, do not block wp-includes.
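If we want to check what a given robots.txt actually does to Googlebot, Python's standard urllib.robotparser can do it offline. This is only a sketch: both robots.txt texts and the script path are made-up examples, not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical old-style robots.txt that blocks a WordPress folder.
OLD_STYLE = """\
User-agent: *
Disallow: /wp-includes/
"""

# Hypothetical robots.txt that blocks nothing (empty Disallow).
OPEN = """\
User-agent: *
Disallow:
"""

# Example path of a script that a WordPress page might load.
SCRIPT_PATH = "/wp-includes/js/jquery/jquery.js"


def can_googlebot_fetch(robots_txt, path):
    """Parse robots.txt text and ask whether Googlebot may fetch the path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("Googlebot", path)


print(can_googlebot_fetch(OLD_STYLE, SCRIPT_PATH))  # blocked
print(can_googlebot_fetch(OPEN, SCRIPT_PATH))       # allowed
```

With the old-style rules, Googlebot cannot load the script, and that is what gets in the way of rendering the page.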

One more thing. There is another type of spam, which is mostly not considered as such: the traffic from irrelevant countries.

If we sell (target) globally, that’s great. But if we sell locally, or only in several countries, then all other traffic distorts our numbers and conversion rates.
It is nice to have that traffic, and it shouldn’t be filtered out, of course, but we should create a segment which includes only the traffic from relevant countries. Only then can we focus on the correct numbers.
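A small Python sketch with invented numbers shows how much the conversion rate can shift once we look only at the relevant countries. The countries, session counts and conversions are all hypothetical.

```python
# Hypothetical traffic per country; the figures are invented for illustration.
ROWS = [
    {"country": "Germany", "sessions": 400, "conversions": 20},
    {"country": "Austria", "sessions": 100, "conversions": 5},
    {"country": "Elsewhere", "sessions": 500, "conversions": 0},
]

# Countries we actually sell in (an assumption for this example).
RELEVANT_COUNTRIES = {"Germany", "Austria"}


def segment(rows, countries):
    """Keep only the rows for the given countries, like a country segment."""
    return [r for r in rows if r["country"] in countries]


def conversion_rate(rows):
    """Total conversions divided by total sessions (0.0 if there are no sessions)."""
    sessions = sum(r["sessions"] for r in rows)
    conversions = sum(r["conversions"] for r in rows)
    return conversions / sessions if sessions else 0.0


all_traffic = conversion_rate(ROWS)
relevant_only = conversion_rate(segment(ROWS, RELEVANT_COUNTRIES))

print(f"All traffic:        {all_traffic:.1%}")    # 25 / 1000 sessions
print(f"Relevant countries: {relevant_only:.1%}")  # 25 / 500 sessions
```

In this made-up case the rate doubles from 2.5% to 5.0% without a single extra sale, simply because the irrelevant sessions no longer dilute it.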

What gets measured, gets managed.

