Digital Quark
Spread The Knowledge

Common XML Sitemap Errors in WordPress on Google and Bing Webmaster Tools

Adding a sitemap to webmaster tools is usually a breeze, but technical issues sometimes arise that can frustrate non-technical users. The most common XML sitemap errors and their solutions are listed below for both Google and Bing Webmaster Tools.

Sitemap Errors for Google Webmaster Tools

Error: XML declaration allowed only at the start of the document

The most common cause of this error is one or more lines of whitespace preceding the <?xml tag. To verify that this is the cause, open your sitemap URL and download the file.

Open it in a text editor and locate the <?xml tag. If you see blank lines before it, they are the cause of the error in Google Webmaster Tools. This is usually caused by a malfunctioning plugin (not the Yoast SEO plugin, which patched this long ago). Run a plugin conflict test, rechecking the sitemap after each change. Also check the source code of the sitemap, which usually contains helpful information about which plugins might be affecting it.
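
The manual check above can also be scripted. The following is a minimal sketch in Python, assuming you have already downloaded the sitemap and read it as raw bytes; the function name and the sample data are illustrative and not part of any plugin.

```python
def bytes_before_xml_declaration(body: bytes) -> int:
    """Return how many bytes precede '<?xml' in the body, or -1 if it is missing."""
    return body.find(b"<?xml")


# Illustrative sample data (not a real sitemap).
clean = b'<?xml version="1.0" encoding="UTF-8"?>\n<urlset></urlset>'
broken = b"\n\n" + clean  # two blank lines injected before the declaration

print(bytes_before_xml_declaration(clean))   # 0 -> declaration is at the very start
print(bytes_before_xml_declaration(broken))  # 2 -> two stray bytes come first
```

Any result greater than 0 means something is printed before the declaration. Note that a UTF-8 byte order mark would also be counted here, as 3 bytes, even though it is not visible in most editors.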

Error: Unsupported format. Your file appears to be an HTML page

The most common cause is a caching plugin such as W3 Total Cache. It happens when you enable the setting that prevents visitors from viewing your sitemap. Cache plugins usually convert pages into static HTML for faster loading, which can cause problems in combination with this setting.

The resolution below is specific to W3 Total Cache because it is known to have this issue. In the WordPress dashboard, go to Performance and then to User Agent Groups. Click the “Create a group” button. Name the group “Google” or “Googlebot”, and enter “googlebot” in the user agents field. This forces W3 Total Cache to keep a separate cache for Googlebot, so that Googlebot can access your sitemap.
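
To confirm whether the sitemap URL is really returning an HTML page, you can inspect the response body and its Content-Type header. The sketch below is a rough heuristic in Python, assuming you have fetched the body and header yourself; the function name is hypothetical.

```python
def looks_like_html(body: bytes, content_type: str = "") -> bool:
    """Guess whether a response is an HTML page rather than an XML sitemap."""
    if "text/html" in content_type.lower():
        return True
    # Look only at the start of the document, ignoring leading whitespace.
    head = body.lstrip()[:512].lower()
    return head.startswith(b"<!doctype html") or head.startswith(b"<html")


print(looks_like_html(b"<!DOCTYPE html><html><body>Cached</body></html>"))  # True
print(looks_like_html(b'<?xml version="1.0"?><urlset/>', "application/xml"))  # False
```

If this reports True for your sitemap URL, the cache plugin is the first place to look.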

Error: Your Sitemap or Sitemap index file doesn’t properly declare the namespace

This error appears when URLs in the sitemap are modified after the sitemap is generated. It usually happens without the user’s knowledge.

This error is most commonly seen on WP Engine WordPress hosting. WP Engine provides a troubleshooting feature called HTML Post Processing for adding support for CDNs and SSL certificates. It makes all elements on all web pages load over SSL, including those that were not loading securely before. Do not try to resolve this issue on your own, as the resolutions found online break more than they fix. Instead, file a support ticket with WP Engine for advice on how to exclude the sitemap namespace from HTML post processing.
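
Before filing the ticket, it can help to see what namespace your sitemap actually declares. The sketch below uses Python's standard XML parser to read the namespace of the root element and compare it with the one defined by the sitemap protocol. The "rewritten" sample assumes the post-processing changed http:// to https:// in the namespace, which is one plausible form of this problem and not confirmed behavior.

```python
import xml.etree.ElementTree as ET

# Namespace required by the sitemap protocol for <urlset> and <sitemapindex>.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def root_namespace(xml_bytes: bytes) -> str:
    """Return the namespace URI of the root element, or '' if none is declared."""
    root = ET.fromstring(xml_bytes)
    if root.tag.startswith("{"):
        return root.tag[1:].split("}", 1)[0]
    return ""


def has_valid_namespace(xml_bytes: bytes) -> bool:
    return root_namespace(xml_bytes) == SITEMAP_NS


good = (
    b'<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    b"<url><loc>https://example.com/</loc></url></urlset>"
)
# Assumed failure mode: the namespace URL itself was rewritten to https.
rewritten = good.replace(b"http://www.sitemaps.org", b"https://www.sitemaps.org")

print(has_valid_namespace(good))       # True
print(has_valid_namespace(rewritten))  # False
```

Including the output of `root_namespace` in the support ticket shows exactly what was changed.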

Error: General HTTP error: 404 not found

A 404 error means that the sitemap could not be found, which can happen for a variety of reasons. This particular error may be caused by issues in any of the following:

  • Issues with Sitemap Index. The possible resolutions are:
    1. Resetting the permalink settings may quickly solve a 404 error that has occurred after a recent change to your installation. Go to your WordPress Dashboard, navigate to Settings in the left sidebar, then to Permalinks, and click Save without changing anything.
    2. Run through the settings of the SEO/Sitemap plugin you are using. Check that at least one taxonomy is not excluded, because excluding all taxonomies may prevent the sitemap from being generated. Also exclude any post type or taxonomy that has no content yet, because a sitemap requires categories with published content.
    3. If you still receive a 404, check the rewrite rules (found below) for your setup. On Apache, they go in the .htaccess file, which you can open in the file manager of your hosting service’s cPanel.
    WARNING: Back up .htaccess before making any changes, and revert to the original file immediately if any issues occur.

For Apache-based hosting, add the following lines to the .htaccess file.

# XML Sitemap Rewrite Fix
# Dots are escaped so that they match only a literal "." in the file name.
RewriteEngine On
RewriteBase /
RewriteRule ^sitemap_index\.xml$ /index.php?sitemap=1 [L]
RewriteRule ^locations\.kml$ /index.php?sitemap=wpseo_local_kml [L]
RewriteRule ^geo_sitemap\.xml$ /index.php?sitemap=geo [L]
RewriteRule ^([^/]+?)-sitemap([0-9]+)?\.xml$ /index.php?sitemap=$1&sitemap_n=$2 [L]
RewriteRule ^([a-z]+)?-?sitemap\.xsl$ /index.php?xsl=$1 [L]
# END XML Sitemap Rewrite Fix
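
To see what these rules do with a given sitemap file name, you can replay the patterns outside the web server. The sketch below is a simplified simulation in Python, not Apache itself: it copies three of the patterns above (with dots escaped so they match only a literal ".") and applies the first one that matches, roughly as the [L] flag would. The `rewrite` helper is hypothetical.

```python
import re

# (pattern, target) pairs taken from the Apache rules above.
# In .htaccess context the path is matched without its leading slash.
RULES = [
    (r"^sitemap_index\.xml$", r"/index.php?sitemap=1"),
    (r"^([^/]+?)-sitemap([0-9]+)?\.xml$", r"/index.php?sitemap=\1&sitemap_n=\2"),
    (r"^([a-z]+)?-?sitemap\.xsl$", r"/index.php?xsl=\1"),
]


def rewrite(path: str):
    """Return the rewritten target for the first matching rule, or None."""
    for pattern, target in RULES:
        match = re.match(pattern, path)
        if match:
            # Groups that did not take part in the match expand to ''.
            return match.expand(target)
    return None


print(rewrite("sitemap_index.xml"))  # /index.php?sitemap=1
print(rewrite("post-sitemap2.xml"))  # /index.php?sitemap=post&sitemap_n=2
print(rewrite("main-sitemap.xsl"))   # /index.php?xsl=main
print(rewrite("robots.txt"))         # None
```

If a sitemap file name returns None here, none of the listed rules covers it, and the request will never reach the sitemap handler in index.php.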

For NGINX-based hosting, add the following lines to the server block of your NGINX configuration file (NGINX does not read .htaccess files).

# SEO Sitemaps fix
# Dots are escaped so that they match only a literal "." in the file name.
location ~ ([^/]*)sitemap(.*)\.x(m|s)l$ {
    ## this redirects sitemap.xml to /sitemap_index.xml
    rewrite ^/sitemap\.xml$ /sitemap_index.xml permanent;
    ## this makes the XML sitemaps work
    rewrite ^/([a-z]+)?-?sitemap\.xsl$ /index.php?xsl=$1 last;
    rewrite ^/sitemap_index\.xml$ /index.php?sitemap=1 last;
    rewrite ^/([^/]+?)-sitemap([0-9]+)?\.xml$ /index.php?sitemap=$1&sitemap_n=$2 last;
    ## The following lines are optional for the premium extensions
    ## News SEO
    rewrite ^/news-sitemap\.xml$ /index.php?sitemap=wpseo_news last;
    ## Local SEO
    rewrite ^/locations\.kml$ /index.php?sitemap=wpseo_local_kml last;
    rewrite ^/geo-sitemap\.xml$ /index.php?sitemap=wpseo_local last;
    ## Video SEO
    rewrite ^/video-sitemap\.xsl$ /index.php?xsl=video last;
}

  • Issues with Geo Sitemap. These can be resolved as follows:

This issue usually arises from a conflict between the Geo sitemap and a cache plugin. Fixing it is easy: add the name of your Geo sitemap, for example “geo-sitemap.xml?”, to the Browser Cache 404 error exception list in the cache plugin settings. If your cache plugin offers no such option, consider switching to another cache plugin and/or file a bug report with the cache plugin developer.

Alternatively, add the rewrite rules given above to your .htaccess file (Apache) or server configuration (NGINX).

  • Issues with Video sitemap or News sitemap when you have defined custom post types named Video or News respectively. In this case, disable the sitemap functionality and rename the custom post types. If that does not resolve the issue, contact the respective plugin developers for a solution.

Error: Unknown news site

The most likely cause of this error is that your website has not been approved for Google News. To resolve this issue, confirm that the URL submitted to Google News exactly matches the URL in your sitemap. If you have not submitted the URL yet, visit the Google News Publisher Center.

Error: URL blocked / restricted by robots.txt

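Robots.txt is a file that tells a search engine whether it is allowed to crawl a page. This error appears when Google cannot crawl a specific URL because of a robots.txt restriction. To resolve it, review the Disallow rules in your robots.txt and remove or narrow any rule that covers URLs listed in your sitemap.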
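
To find out which sitemap URLs a robots.txt file blocks, you can test them with Python's standard `urllib.robotparser` module. The sketch below parses robots.txt content supplied as a string; the rules and URLs are made-up examples. Python's parser uses simple prefix matching, so it may not agree with Google on rules that use wildcards.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Check whether the given robots.txt content lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt content for illustration.
robots = """\
User-agent: *
Disallow: /private/
"""

print(is_allowed(robots, "Googlebot", "https://example.com/private/page/"))  # False
print(is_allowed(robots, "Googlebot", "https://example.com/blog/post/"))     # True
```

Running every URL from the sitemap through `is_allowed` gives a list of the entries that trigger this error.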

Sitemap Errors for Bing Webmaster Tools

Bing is fairly resilient to most errors. Usually, the only error you will encounter there is the one below.

Error: Download of the sitemap or feed failed

This error means that Bing is not able to access your sitemap. To confirm this, use Bing’s Mobile Friendliness Verification Tool to check whether Bing can access the sitemap. If the tool can display the sitemap, delete the old sitemap and resubmit it to Bing. If the tool cannot load the sitemap, the cause may be slow website loading, or rules in robots.txt or the .htaccess file that restrict Bing from crawling.
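
One rough way to spot .htaccess rules that target Bing by user agent is to request the sitemap twice, once with a Bingbot-style User-Agent header and once with a browser-style one, and compare the status codes. The sketch below uses only Python's standard library. The helper names are hypothetical, the user agent string is the one Bing publishes for its crawler as far as I know, and sending it from your own machine will not reveal blocks based on IP address.

```python
import urllib.error
import urllib.request

# Assumed Bingbot user agent string; verify it against Bing's documentation.
BINGBOT_UA = "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"


def build_request(url: str, user_agent: str) -> urllib.request.Request:
    """Create a GET request that identifies itself with the given user agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_status(url: str, user_agent: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code the server sends for this user agent."""
    try:
        with urllib.request.urlopen(build_request(url, user_agent), timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 4xx and 5xx responses are raised as exceptions; report their code.
        return error.code


# Example usage (performs real network requests, so it is left commented out):
# print(fetch_status("https://example.com/sitemap_index.xml", BINGBOT_UA))
# print(fetch_status("https://example.com/sitemap_index.xml", "Mozilla/5.0"))
```

If the Bingbot-style request returns 403 or 404 while the browser-style request returns 200, a user agent rule is the likely culprit.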

About the Author Rajat Khanna

I am a full-time SEO consultant, blogger, and social media marketer. When I am not busy working, I enjoy reading, Netflix, and gaming. My long-term goal is to enable people in the online marketing space to earn a respectable passive income.