The last few releases of Magento 2 introduced some shiny new features around robots.txt, and they bring some interesting issues with them. In this blog post, I will walk you through the various issues you can face while modifying the Magento 2 robots.txt file.
Episode I: The Status Code Menace
First of all, you may notice (if you are a freak like me who actually checks the status code of most of the URLs you see) that on a clean Magento 2 install, if you do not add any robots.txt file yourself, requesting robots.txt does not return a 404. Instead, you and every search engine get a blank page served with a 200 status code:
Note that it even weighs a few bytes, yet this robots.txt file does not physically exist in the root of your store. It is a dynamically generated file, produced by some new configuration options in the Magento 2 admin that I will now go through with you.
Why is a blank, dynamically generated page served with a 200 status instead of a 404? Because Magento 2. Can you disable this somewhere in the admin and get a 404 instead? No.
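If you want to check this yourself, a small script like the sketch below will report the status code. The function name and the user-agent string are my own; point it at your store's base URL.

```python
# Minimal sketch: report the HTTP status code a store returns for /robots.txt.
import urllib.error
import urllib.request


def robots_status(base_url: str) -> int:
    """Return the HTTP status code for <base_url>/robots.txt."""
    req = urllib.request.Request(
        f"{base_url}/robots.txt",
        headers={"User-Agent": "robots-status-check"},
    )
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urlopen raises on 4xx/5xx responses; the code is what we want to see.
        return err.code
```

Running `robots_status("https://yourstore.com")` against a clean Magento 2 install returns 200 for the blank, dynamically generated file rather than the 404 you might expect.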
Episode II: The Sitemap Injection
If you navigate to “Stores > Configuration > Catalog > XML Sitemap” and scroll down to “Search Engine Submission Settings”, you will find a setting that lets you add a Sitemap: directive to your robots.txt file:
If you enable it, the URL of your main sitemap index file (which contains only two URLs: your actual sitemap and your image sitemap) gets appended to the dynamically generated robots.txt file, and if you go to yourstore.com/robots.txt, you will see something like this:
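The exact output depends on your domain and sitemap filename, but since the file was blank before, at this point it contains roughly a single line (yourstore.com and sitemap.xml here are placeholders):

```
Sitemap: https://yourstore.com/sitemap.xml
```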
Episode III: The Content Design Configuration
What does robots.txt have to do with design, one might ask? Nobody knows. Regardless, let’s navigate to “Content > Design > Configuration”.
In this very logical place, let’s edit a global or website scope:
Here you can open an accordion section called “Search Engine Robots”.
First of all, you will notice that the fields are empty. But doesn’t your dynamically generated file already include the Sitemap: directive, if you enabled it in the previous section of this blog post? Why isn’t it shown here? Because Magento 2.
Let’s add a few custom directives like in the screenshot below to the robots.txt field and see what happens, shall we?
PRO TIP: If you’re having trouble saving the configuration at this step locally, try this fix; it worked for me. Why is the fix needed? Because Magento 2.
With both the sitemap setting and custom robots.txt content enabled, what ends up in the dynamically generated robots.txt file? How will Magento 2 merge the two, and where will the Sitemap: directive appear? Well… this is what we get:
A mis-formatted robots.txt file that doesn’t break to a new line where it should, with the Sitemap: directive appended at the end of it.
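To illustrate the kind of breakage (the actual directives depend on what you typed into the field; these lines are made-up examples), the merged output runs directives together like this:

```
User-agent: *
Disallow: /checkout/Sitemap: https://yourstore.com/sitemap.xml
```

A crawler parsing this line by line will not recognize either the Disallow rule or the Sitemap directive on the joined line.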
One would expect that clicking the “Reset to Default” button would return a blank field, as that was the default state we found it in once Magento was installed, right? Wrong. What we get instead is this:
A badly written boilerplate Magento 2 robots.txt file that I wouldn’t recommend using, as it disallows every URL with a parameter on the store.
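The problematic pattern is a wildcard rule along these lines (paraphrased for illustration, not the full boilerplate):

```
User-agent: *
Disallow: /*?
```

A rule like this tells crawlers to skip any URL containing a query string, which on a typical store includes layered navigation, pagination, and sorting URLs you may well want crawled.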
Episode IV: A New Hope
What happens if we now add a custom robots.txt file that is actually physically present in the root of your store?
It completely overrides whatever we did in the previous steps. The sitemap injection from Episode II is ignored, as are all the directives you entered in Episode III. And if you have written the file correctly, it works and is formatted correctly.
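As a starting point, a hand-written file can be as simple as this sketch (the disallowed paths and the sitemap URL are examples; tailor them to your store):

```
User-agent: *
Disallow: /checkout/
Disallow: /customer/
Disallow: /wishlist/

Sitemap: https://yourstore.com/sitemap.xml
```

Each directive sits on its own line, and the Sitemap: line points at the same sitemap index the Episode II setting would have injected.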
So to conclude…
For now at least, stick with the old-school way of adding robots.txt in Magento 2: add an actual text file to the root of your store.