How to create a perfect Robots.txt for Magento

If you run a Magento-based ecommerce store, a so-called robots.txt file may be added to your site by default to prevent search engines from indexing it. This often happens at the development stage, when you don’t yet have real prices, products, or services, and it means your site won’t show up in top search engine results. Fortunately, this file can easily be set up and corrected through the Magento Admin Panel options.

Advantages of Using robots.txt in Magento

The first question is why such robots matter at all. Search engines send small crawlers (spiders) to your site and use what they find to index your pages in the search results. A robots.txt file is the standard way to tell those crawlers which parts of the site they should not crawl. Beyond the search engines themselves, such robots can also perform specific automated tasks, such as HTML and link validation. In Magento, the file’s goal is to keep crawlers away from the site’s JavaScript, session ID (SID) parameters, and pages that would otherwise cause duplicate content. This improves your Magento SEO and reduces the load on your server resources. Finally, specifying a Crawl-delay helps limit the footprint other web robots leave on your bandwidth allocation. So there are enough reasons to set up robots.txt for Magento, but it is crucial to do it the right way.
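As a minimal sketch of what these directives can look like (the paths and the delay value are illustrative only, not a recommended Magento configuration):

    User-agent: *
    # keep crawlers out of a path that only produces duplicate content
    Disallow: /catalog/product_compare/
    # hide URLs that carry a session ID (SID) parameter
    Disallow: /*?SID=
    # ask robots to wait 10 seconds between requests (not every crawler honors this)
    Crawl-delay: 10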

Things You Should Know Beforehand

Before you decide to add a robots.txt file, you should know that its rules cover one domain at a time. If you have any sub-domains (e.g. shop.example.com), each of them requires a separate robots.txt, and when you run multiple online stores it makes sense to use a separate file for each of them. On the whole, implementing robots.txt is very simple: it is nothing but a text file, so anyone can create it quickly in a preferred text editor such as Dreamweaver, Notepad, vim, or any other code editor. A range of different robots exists; Googlebot and Bingbot, for instance, are the crawlers used by Google and Bing. What really matters is that the finished robots.txt file must reside at the domain root: if your store domain is, for example, www.e-store.com, you should place robots.txt under the document root, where the app directory is also present, so that it is reachable as www.e-store.com/robots.txt (a short sketch after the list below illustrates this). Saving the file in any other directory or subdirectory is useless. Two more considerations to keep in mind when using robots.txt on a Magento site are:
  • The file is publicly available, so anyone can see which sections of your server you want to keep away from robots
  • Robots may simply ignore the file, especially malware bots that scan the web for security vulnerabilities
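To make the placement rule concrete, here is how the file is (and is not) picked up; the domains are just the examples mentioned above:

    www.e-store.com/robots.txt         covers www.e-store.com only
    shop.example.com/robots.txt        a sub-domain needs its own file
    www.e-store.com/shop/robots.txt    ignored, because it is not at the domain root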

Installation Process and Tips

There are several ways to install a Magento robots.txt. First, let us talk about how to do it manually. Ready-made versions of this file have been circulating on the web since 2010; it is enough to copy such a template and paste it into a newly created file named robots.txt. The sitemap.xml location has to be adjusted before uploading the file into the site’s root (even if the Magento installation is in a subdirectory). The version of robots.txt offered by byte.nl is regarded as an optimal one.
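As an abridged illustration of the kind of rules such Magento 1 templates typically contain (this is not necessarily the exact byte.nl version, so compare it with the file you actually download):

    User-agent: *

    # Magento system directories
    Disallow: /app/
    Disallow: /downloader/
    Disallow: /errors/
    Disallow: /includes/
    Disallow: /lib/
    Disallow: /pkginfo/
    Disallow: /shell/
    Disallow: /var/

    # Pages that create duplicate content
    Disallow: /catalogsearch/
    Disallow: /catalog/product_compare/

    # Checkout and account pages
    Disallow: /checkout/
    Disallow: /customer/
    Disallow: /wishlist/

    # Session ID parameters
    Disallow: /*?SID=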

Another way to install robots.txt for Magento is to follow these simple guidelines:

  1. Download the robots.txt file first (there are a lot of sources available).
  2. Whenever your Magento is installed within a subdirectory, you will have to modify the robots.txt file accordingly. That means, for instance, changing ‘Disallow: /404/’ to ‘Disallow: /your-sub-directory/404/’ and ‘Disallow: /app/’ to ‘Disallow: /your-sub-directory/app/’ (see the sketch after these steps).
  3. Check whether the domain you use has a sitemap.xml and add the URL of your sitemap.xml to the file afterwards.
  4. It’s time to upload the robots.txt file to your root folder: just place it in the ‘httpdocs/’ directory. This can be done in two ways: by logging in to your control panel with your credentials, or via the FTP client of your preference.
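Putting steps 2 and 3 together, a file for a store installed in a subdirectory could end up looking roughly like this; the subdirectory name and the sitemap URL are placeholders for your own values:

    User-agent: *
    Disallow: /your-sub-directory/404/
    Disallow: /your-sub-directory/app/
    # ... the remaining Disallow rules, prefixed the same way

    Sitemap: https://www.e-store.com/sitemap.xml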

Useful tools to check your robots.txt

To make sure that you have set up your robots.txt correctly, try one of the tools listed below. They will help you analyze the file and fix mistakes if there are any. The easiest and most reliable option is Google Webmaster Tools (Search Console), which lets you check your robots.txt for free right from its dashboard. To do so you should:
  1. Go to Google Webmasters
  2. Click on «Search Console»
  3. Type your website’s name
  4. Click on «Crawl» in the Dashboard panel
  5. Choose «Robots.txt Tester» in the drop-down menu
  6. Type «robots.txt» in the field after the slash
  7. Hit the «Test» button
  8. Enjoy
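If you also want a quick check outside any web interface, a few lines of Python using the standard urllib.robotparser module will fetch your live robots.txt and report whether a given URL is blocked; the store URL below is only a placeholder:

    from urllib.robotparser import RobotFileParser

    # Point the parser at the live robots.txt (placeholder domain).
    parser = RobotFileParser()
    parser.set_url("https://www.e-store.com/robots.txt")
    parser.read()

    # Ask whether a crawler identifying itself as "*" may fetch these URLs.
    print(parser.can_fetch("*", "https://www.e-store.com/checkout/cart/"))
    print(parser.can_fetch("*", "https://www.e-store.com/some-product.html"))
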
If you want to try some other options and avoid using Google Webmaster Tools for some reason, pay attention to the following tools, which can also help you check your settings:

For Magento Backend

This option involves using an extension to manage the robots.txt file. Instead of doing the whole job by hand, you can install dedicated tools that generate robots.txt for Magento. Their settings let you alter the main options, and the good thing is that you can add your own rules on top of the standard ones.

Reindexing robots.txt

Search engines often take a long time to re-read a changed Magento robots.txt; tools such as GWT can show when your site was last crawled. If you want Google or other search engines to pick up the updated version sooner than after 24 hours or a hundred visits, you can use a Cache-Control header in your .htaccess file. Apply a statement along the lines of the sketch below to your .htaccess file.
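A minimal sketch, assuming an Apache server with mod_headers enabled; the one-hour max-age is only an example value:

    <Files "robots.txt">
        # Ask crawlers and caches to treat robots.txt as stale after one hour
        Header set Cache-Control "max-age=3600"
    </Files>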

On the whole, most Magento agencies take a very similar approach to robots.txt. It is still better to get appropriate consultancy before copy-pasting any of the suggested rules, to avoid harming your Magento or Magento 2 store. If you have any questions about anything I’ve mentioned in this blog post, or anything else related to Magento, please feel free to drop me a message via this form. I am on the team performing Magento SEO audits for large Magento projects.

Need help with robots.txt?
