You’ve set up your store with painstaking detail, crafted the perfect combination of product offerings, and made it as human-friendly as possible. But are you also making sure the search engine robots are happy?
If you are not paying attention to the robots that crawl your store, it may cost you exposure to future shoppers (and therefore money). Have no fear, I’ve put together a list of the top 4 most important things to know so your store will keep the robots happy and coming back for more.
What better place to start pleasing the robots than a file built specifically for them: robots.txt.
The robots.txt file lets the robots know which sections of your site they can crawl, with two important caveats:
The robots.txt file can be used to keep well-behaved bots such as Google and Bing out of sections of your store that you do not want to appear in search engine results. This is basically the sledgehammer approach to excluding large portions of your site from the search engines.
If you have an entire directory or section you want to always exclude, then robots.txt may be the right tool for the job. However, use caution: blocking a section in robots.txt prevents search engines from crawling those pages at all. If a robot is unable to crawl a page on your store, it will never read page-level directives such as the Meta Robots tag covered below.
Create a robots.txt file and place it at the root of your store (i.e., http://example.com/robots.txt). This file should be publicly accessible. Here is an example robots.txt that blocks access to a /secret directory:
User-agent: *
Disallow: /secret/
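Before deploying a robots.txt file, it helps to sanity-check which URLs it actually blocks. One way to do this (a sketch, using Python’s built-in parser and a hypothetical example.com store) is to feed the file’s contents to `urllib.robotparser` and ask what a crawler may fetch:

```python
from urllib.robotparser import RobotFileParser

# The rules below mirror the example robots.txt above.
rules = """
User-agent: *
Disallow: /secret/
""".strip().splitlines()

parser = RobotFileParser()
parser.parse(rules)

# A well-behaved bot ("*" = any user agent) may not fetch /secret/...
print(parser.can_fetch("*", "https://example.com/secret/page.html"))  # False
# ...but the rest of the store remains crawlable.
print(parser.can_fetch("*", "https://example.com/products.html"))     # True
```

This is the same logic well-behaved bots apply, so it is a quick way to catch an overly broad Disallow rule before it hides half your store.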
The Meta Robots tag lets you control search-engine behavior at the page level for your store. If robots.txt is the sledgehammer, meta robots is the scalpel.
There are two main components you can control with meta robots: whether the page is indexed, and whether the links on the page are followed.
You should use meta robots within your store to exclude specific pages from being indexed or crawled. For example, if a page doesn’t provide much value from a search engine’s point of view, such as a sizing chart, you may choose to have robots skip it during their crawl. Google in particular has penalized e-commerce stores for “thin content” issues since 2012, so in today’s search landscape it is important to present only your best content to the search engines.
The default for meta robots is “index, follow”. The most common use of meta robots outside of the defaults is “noindex, follow”, which means “Don’t index this page, but continue crawling the links on this page”.
Here are a few cases where you may consider using meta robots within your store:
The Meta Robots tag is placed in the HTML HEAD section of each page in your store’s website:
<html>
  <head>
    <meta name="robots" content="noindex, follow">
  </head>
</html>
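To see how a bot reads this tag, here is a minimal sketch (not a production crawler) that extracts the meta robots directives from a page’s HTML using Python’s standard-library parser:

```python
from html.parser import HTMLParser

class MetaRobotsParser(HTMLParser):
    """Collects the directives from a page's <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and attrs.get("name", "").lower() == "robots":
            # Split "noindex, follow" into individual directives
            self.directives = [d.strip() for d in attrs.get("content", "").split(",")]

page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
parser = MetaRobotsParser()
parser.feed(page)
print(parser.directives)  # ['noindex', 'follow']
```

A bot that honors the tag would then skip indexing this page but still queue up its links for crawling.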
If Meta Robots is the scalpel, that would make the Rel Nofollow approach the… needle? Rel nofollow lets you control the way robots crawl your store on a per-link basis within each page. In other words, rel="nofollow" is a way to inform the robots which links you do NOT want them to follow. Robots may choose to ignore this directive, so consider it more of a suggestion than a rule.
In general, the rel="nofollow" technique has largely fallen out of favor and should not be used to control robots and crawling within your store; use the Meta Robots technique instead.
However, there are two legitimate cases for adding rel="nofollow" to outbound links from your store:
Links are still a primary search engine signal for passing around trust and authority. As a result, Google has cracked down in recent years to prevent abuse via updates such as the Google Penguin update.
Just add rel="nofollow" to any web link you want to nofollow like so:
<a href="https://example.com" rel="nofollow">Some link</a>
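For the user-generated-content case, it can be useful to audit a page for outbound links that are missing the attribute. Here is a rough sketch (the mystore.com domain and the sample links are hypothetical) that flags outbound links without rel="nofollow":

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

class LinkAuditor(HTMLParser):
    """Lists outbound links on a page that lack rel="nofollow"."""

    def __init__(self, own_domain):
        super().__init__()
        self.own_domain = own_domain
        self.missing_nofollow = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        href = attrs.get("href", "")
        host = urlparse(href).netloc
        # Flag links that leave our domain and carry no nofollow hint
        if host and host != self.own_domain and "nofollow" not in attrs.get("rel", ""):
            self.missing_nofollow.append(href)

page = ('<a href="https://mystore.com/cart">Cart</a>'
        '<a href="https://other.example/ad" rel="nofollow">Ad</a>'
        '<a href="https://forum.example/post">User link</a>')
auditor = LinkAuditor("mystore.com")
auditor.feed(page)
print(auditor.missing_nofollow)  # ['https://forum.example/post']
```

Internal links and already-nofollowed links pass the audit; only the bare outbound link is flagged.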
Canonicalization is a way to tell the robots which URL on your website should be treated as the authoritative address for a given page within your store. Duplicate content can negatively impact your standing with the search engines, so canonicalization is a tool to help the robots determine which content should actually be treated as the same page on your store.
You should be using canonicalization for every page within your store. On most stores, it is possible to reach the exact same content at different URLs, which can confuse search robots and possibly result in a penalty for your store.
Common examples of duplicate content issues within stores include:
You can set up canonical URLs by adding the following to the HEAD section of your HTML for each page:
<html>
  <head>
    <link rel="canonical" href="https://yourstore.com/some-widget" />
  </head>
</html>
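To make the duplicate-URL problem concrete, here is a sketch showing how several URL variants of the same product page (trailing slashes, mixed-case hosts, tracking parameters — all hypothetical examples on yourstore.com) collapse to one canonical form once you normalize them:

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Query parameters that track campaigns but never change page content.
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "ref"}

def normalize(url):
    parts = urlsplit(url)
    # Keep only query parameters that actually change the page content
    query = [(k, v) for k, v in parse_qsl(parts.query) if k not in TRACKING_PARAMS]
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(),
                       parts.path.rstrip("/") or "/", urlencode(query), ""))

# Four addresses, one product page — crawlers see four distinct URLs.
variants = [
    "https://yourstore.com/some-widget",
    "https://yourstore.com/some-widget/",
    "https://YourStore.com/some-widget?utm_source=newsletter",
    "https://yourstore.com/some-widget?ref=homepage",
]
print({normalize(u) for u in variants})  # a single canonical URL
```

The canonical link tag does for the robots what `normalize` does here: it declares, once per page, which of these variants is the real address.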
To summarize, you’ve learned the following key techniques to help keep your store robot-friendly:
If you have any questions or would like to share your experiences with any e-commerce robot techniques, leave a comment below.