What is a robots.txt file?
A robots.txt file is a plain-text file, placed at the root of a website, that tells search engine crawlers which pages they may and may not access. Crawlers from Google, Bing, and Yahoo typically honour the instructions in this file.
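For illustration, a minimal robots.txt might look like this (the paths and sitemap URL are hypothetical examples, not required directives):

```
User-agent: *
Disallow: /admin/
Disallow: /cart/
Allow: /

Sitemap: https://www.example.com/sitemap.xml
```

`User-agent` names the crawler a group of rules applies to (`*` means all crawlers), and each `Disallow` or `Allow` line matches URL paths on the site.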
How does robots.txt work?
Search engines crawl the web to discover and index web pages so that their contents can be shown in search results.
When visiting a website, a search engine crawler first looks for a robots.txt file and reads it before crawling any pages. The file tells the crawler which paths it may crawl and which it should skip. Note that robots.txt controls crawling, not indexing: a page blocked by robots.txt can still appear in search results if other sites link to it. If the robots.txt file contains no rules blocking a crawler, the crawler will simply proceed through the website.
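The check a crawler performs can be sketched with Python's standard-library robots.txt parser; the rules and URLs below are hypothetical examples:

```python
from urllib import robotparser

# Hypothetical rules; a real crawler would fetch them with
# rp.set_url("https://example.com/robots.txt") followed by rp.read().
rules = [
    "User-agent: *",
    "Disallow: /admin/",
    "Allow: /",
]

rp = robotparser.RobotFileParser()
rp.parse(rules)

# A well-behaved crawler consults can_fetch() before requesting a URL.
print(rp.can_fetch("*", "https://example.com/blog/post"))    # allowed
print(rp.can_fetch("*", "https://example.com/admin/login"))  # blocked
```

Because `/admin/login` matches the `Disallow: /admin/` rule, the parser reports it as off-limits, while `/blog/post` falls under `Allow: /` and may be crawled.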
Why is this important for SEO?
The goal of SEO is to get your web pages to rank in search engine results pages (SERPs). If search engine crawlers never discover and index a page, it cannot appear in SERPs. On large websites with thousands of pages, crawlers may struggle to index everything: each site receives a limited crawl budget, the amount of time and resources a crawler allocates to it. It is therefore important to prioritize which pages get crawled. By allowing important pages to be crawled and disallowing unimportant ones, you maximize the number of important pages that can appear in SERPs.
Nick Berns is a web developer & SEO specialist.