Is Your E-Commerce Store Robot Friendly?


You’ve set up your store with painstaking detail, crafted the perfect combination of product offerings, and made it as human-friendly as possible. But are you also making sure the search engine robots are happy?

If you are not paying attention to the robots that crawl your store, it may cost you exposure to future shoppers (and therefore money). Have no fear: I’ve put together a list of the 4 most important things to know so your store will keep the robots happy and coming back for more.

1. Robots.txt

What better place to start pleasing the robots than a file built specifically for them: robots.txt.

The robots.txt file lets the robots know which sections of your site they can crawl, with two important caveats:

  1. Robots are not required to obey this file; malicious bots in particular tend to ignore it
  2. Since the robots.txt file is public, anyone can see which sections of your store you don’t want crawled

When should you use robots.txt?

The robots.txt file can be used to keep well-behaved bots such as Google and Bing out of sections of your store that you do not want to appear in search engine results. This is basically the sledgehammer approach to excluding large portions of your site from the search engines.

If you have an entire directory or section you want to always exclude, then robots.txt may be the right tool for the job. However, use it with caution: if a robot is unable to crawl a page on your store, it will also be unable to read page-level directives such as the Meta Robots tag covered below.

How do you use robots.txt?

Create a robots.txt file and place it at the root of your store (i.e. – “http://example.com/robots.txt”). This file should be publicly accessible. Here is an example robots.txt that blocks access to a /secret directory.
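  # Applies to all robots; keep them out of the /secret directory
  User-agent: *
  Disallow: /secret/

The User-agent: * line addresses every robot, and each Disallow line lists a path they should stay out of. An empty Disallow: value would instead permit crawling of the whole site.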

2. Meta Robots

The Meta Robots tag lets you control search engine behavior at the page level for your store. If robots.txt is the sledgehammer, Meta Robots is the scalpel.

There are two main components you can control with meta robots:

  • index / noindex – Should this page be included in the search index? index (default) means yes; noindex means no
  • follow / nofollow – Should robots follow links contained on this page? follow (default) means yes; nofollow means no

When should you use Meta Robots?

You should use meta robots within your store to exclude specific pages from being indexed or crawled. For example, if you have a page that doesn’t provide much value from a search engine point of view, such as a sizing chart, you may choose to have robots skip it during their crawl. Google in particular has penalized e-commerce stores for “thin content” issues since 2012, so in today’s search landscape it is important to present only your best content to the search engines.

The default for meta robots is “index, follow”. The most common use of meta robots outside of the defaults is “noindex, follow”, which means “Don’t index this page, but continue crawling the links on this page”.

Here are a few cases where you may consider using meta robots within your store:

  • Product pages that you are not finished building.
  • Product pages that have little to no original content, such as manufacturer descriptions.
  • Category or tag pages that don’t have much content (e.g. – thin content).
  • Search results pages within your store.

How do you use Meta Robots?

The Meta Robots tag is placed in the HTML HEAD section of each page of your store:
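For example, this is how the “noindex, follow” combination from the sizing chart scenario above would look:

  <head>
    <!-- Keep this page out of the index, but still crawl its links -->
    <meta name="robots" content="noindex, follow">
  </head>

Omitting the tag entirely is equivalent to the “index, follow” default.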

3. Rel Nofollow

If Meta Robots is the scalpel, that would make Rel Nofollow the… needle? Rel nofollow lets you control how robots crawl your store on a per-link basis. In other words, it is a way to inform the robots which links you do NOT want them to follow. Robots may choose to ignore this directive, so consider it more of a suggestion than a rule.

When should you use Rel Nofollow?

In general, the rel=”nofollow” technique has largely fallen out of favor and should not be used to control robots and crawling within your store. Use the Meta Robots technique for that instead.

However, there are two legitimate cases for adding rel=”nofollow” to outbound links from your store:

  1. Links to ads or other paid placements
  2. Links to untrusted (i.e. – low quality or spammy) sites

Links are still a primary signal search engines use to pass trust and authority around the web. As a result, Google has cracked down on link abuse in recent years through updates such as Google Penguin.

How do you use Rel Nofollow?

Just add rel=”nofollow” to any web link you want to nofollow like so:
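  <!-- The URL below is just a placeholder for an untrusted or paid link -->
  <a href="http://example.com/untrusted-page" rel="nofollow">An untrusted link</a>

The rel=”nofollow” attribute is the only change; everything else about the link stays the same.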

4. Canonicalization

Canonicalization is a way to let the robots know which URL you want treated as the authoritative version of a given page within your store. Duplicate content can hurt your standing with the search engines, so canonicalization helps the robots determine which URLs on your store actually point to the same content.

When should you use Canonicalization?

You should be using canonicalization on every page within your store. On most stores, it is possible to reach the exact same content at different URLs, which can confuse search robots and may even result in a penalty for your store.

Common examples of duplicate content issues within stores include:

  • Pages that allow extra URL parameters to filter or sort the same content (e.g. – /some-widgets?sort=recent and /some-widgets?sort=price)
  • The same product available within different category pages (e.g. – /toys/lego and /boys/lego)
  • Custom parameters added to URLs, such as ad tracking or other user parameters (e.g. – /some-widget?utm_source=burst)

How do you use Canonicalization?

You can set up canonical URLs by adding a link tag to the HEAD section of each page’s HTML like so:
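  <!-- Tells robots that this URL is the authoritative version of the page -->
  <link rel="canonical" href="http://example.com/some-widget">

With this tag in place, variations such as /some-widget?utm_source=burst will consolidate to the single canonical URL above.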

Summary

To summarize, you’ve learned the following key techniques to help keep your store robot-friendly:

  1. Robots.txt – Enables you to block large sections of your store from crawlers
  2. Meta Robots – Enables you to control indexing and crawling at the page level within your store
  3. Rel Nofollow – Enables you to control how robots should treat links within your store
  4. Canonicalization – Enables you to inform robots of the correct URL to attribute content to, preventing duplicate content issues

If you have any questions or would like to share your experiences with any e-commerce robot techniques, leave a comment below.
