This post is short, but I thought it was worth sharing a recent issue we encountered just before Christmas, in which 60% of the pages on a client’s site disappeared from Google without warning. Luckily, we spotted it straight away: a search for the brand showed the site was no longer ranking #1.

Each month we carry out site snapshots for our clients, which allow us to monitor a site’s performance and trends and to spot opportunities for organic growth.

In this instance, we checked the page source and the browser’s inspect element tool and found nothing unusual. We crawled the site and didn’t see any issues there either. We then checked the URL Inspection tool in Google Search Console, which showed the following:

URL Inspection Tool Status

Then we noticed the number of indexed pages under ‘Coverage’ in GSC dropping rapidly. What was going on?

 

Index Pages Dropping in Google Search Console


The URL Inspection tool had detected a ‘noindex’ in the ‘X-Robots-Tag’ HTTP header. This was later confirmed by our in-house alert tool, which we use to monitor these kinds of issues so we can address them.
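A check like this is easy to automate. The following is only a minimal sketch of the idea, not our in-house alert tool: the function names `has_noindex` and `fetch_x_robots_tag` are made up for illustration, and it uses nothing beyond the Python standard library. It assumes the server answers HEAD requests with the same headers it would send for a normal page view.

```python
from urllib.request import Request, urlopen


def has_noindex(x_robots_values):
    """Return True if any X-Robots-Tag value carries a noindex directive.

    Simplification (assumption): a directive scoped to one crawler, such as
    "googlebot: noindex", is flagged no matter which crawler it names.
    """
    for value in x_robots_values:
        # Treat ":" like "," so "googlebot: noindex, nofollow" splits cleanly.
        for directive in value.lower().replace(":", ",").split(","):
            # "none" is shorthand for "noindex, nofollow".
            if directive.strip() in ("noindex", "none"):
                return True
    return False


def fetch_x_robots_tag(url):
    """Request only the headers for `url` and return all X-Robots-Tag values."""
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=10) as response:
        # A response may repeat the header, so collect every occurrence.
        return response.headers.get_all("X-Robots-Tag") or []
```

Calling `has_noindex(fetch_x_robots_tag("https://www.example.com/"))` (a placeholder URL) on a handful of key pages each day would flag a stray header like this one before the index count starts to fall.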

After looking in the .htaccess file, we noticed the following snippet of code:

<FilesMatch "Xmas_Menu\.pdf|Final-Menu-Web\.pdf|menu\.pdf|">
    header set x-robots-tag: noindex
</FilesMatch>

We had made no changes on our end. However, when we asked the client whether they had made any changes to the site, they mentioned that their developer had been trying to stop PDFs from being indexed. In doing so, the developer accidentally noindexed the entire site, causing pages to drop out of the index.

Here’s what happened: the FilesMatch directive was mistakenly matching every request to the website because of a small syntax error in the regular expression (see screengrab below). The filenames to be noindexed are separated by pipe “|” characters, but a trailing pipe at the end adds an empty alternative, and an empty pattern matches ALL filenames.

Regular Expression Coded Wrong
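The trailing-pipe behaviour described above can be reproduced outside Apache. The sketch below uses Python’s `re` module as a stand-in for Apache’s regex engine, on the assumption that both treat an empty alternative the same way and that FilesMatch applies its pattern as an unanchored search. The `FIXED` pattern is one possible repair we are suggesting for illustration, not what the client’s developer ended up deploying.

```python
import re

# Pattern copied from the .htaccess snippet, including the trailing pipe.
BROKEN = r"Xmas_Menu\.pdf|Final-Menu-Web\.pdf|menu\.pdf|"

# One possible repair (assumption): drop the trailing pipe, group the
# names, and anchor the end so only names ending in these PDFs match.
FIXED = r"(Xmas_Menu|Final-Menu-Web|menu)\.pdf$"


def matches(pattern, filename):
    """Unanchored search: True if the pattern matches anywhere in filename."""
    return re.search(pattern, filename) is not None


for name in ["index.php", "about.html", "menu.pdf", "Xmas_Menu.pdf"]:
    print(f"{name}: broken={matches(BROKEN, name)} fixed={matches(FIXED, name)}")
```

With the broken pattern every filename matches, because the empty alternative after the last pipe succeeds at the very first position of any string. With the repaired pattern only the PDF names match.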

The client would normally consult us before making such changes. Still, it’s a lesson learned: when blocking certain pages or PDFs via the .htaccess file, even a small change can have a devastating impact on organic performance. It took around 9 days for all the pages to be indexed again and for organic traffic and conversions to return to normal.

Organic Traffic Increased Again

Little things like this can happen from time to time, and are often out of your control. Educating clients and agreeing changes with their developers beforehand will help stop such things from causing further problems.