Should I use a separate robots.txt file for each subdomain on a root domain?

Started by seenalal, 04-28-2019, 00:13:12


seenalal (Topic starter)

My root domain is www.dontworry.com. I have multiple subdomains. Among them, two are:

(1) https://dontworry.com/realestate/ and (2) https://classified.dontworry.com/

So should I add a robots.txt file for each of them?
I have other addresses in the same form as (1) written above.

So what should I do? Please help.


John - Smith

It is recommended to have a robots.txt file for each subdomain to control how search engines crawl your website's content. The robots.txt file allows you to specify which areas of your website should be accessible to search engine crawlers.

For your specific case, note that (1) https://dontworry.com/realestate/ is actually a directory on your root domain, not a subdomain, so it is controlled by the robots.txt at https://dontworry.com/robots.txt. (2) https://classified.dontworry.com/ is a true subdomain, which search engines treat as a separate host, so it needs its own robots.txt file. This way, you can define different rules for each host, if necessary.

To create a robots.txt file, simply create a plain-text file named "robots.txt" and place it in the root directory of the host it applies to. Make sure to configure it with the directives you need.

Here's an example of what the robots.txt at https://dontworry.com/robots.txt might look like if you wanted to block crawling of the /realestate/ section:

User-agent: *
Disallow: /realestate/

And at https://classified.dontworry.com/robots.txt, if you wanted to block crawling of the entire subdomain:

User-agent: *
Disallow: /

The robots.txt file is a plain-text file that tells search engine crawlers which parts of your website they should or should not crawl. (Strictly speaking, it controls crawling rather than indexing: a blocked URL can still end up in the index if other sites link to it.) It is placed in the root directory of each host, and search engine bots request it before they start crawling.

In your case, since you have multiple subdomains, it's important to have separate robots.txt files for each subdomain. This is because each subdomain is treated as a separate website by search engines, and they will look for a robots.txt file specific to that subdomain.

Note, however, that crawlers only request robots.txt from the root of a host, never from a subdirectory. A file placed at https://dontworry.com/realestate/robots.txt would simply be ignored. Rules for the /realestate/ directory belong in https://dontworry.com/robots.txt, while https://classified.dontworry.com/ needs its own file at https://classified.dontworry.com/robots.txt.
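So, for your two examples, the layout looks like this:

https://dontworry.com/robots.txt (governs all of dontworry.com, including /realestate/)
https://classified.dontworry.com/robots.txt (governs classified.dontworry.com only)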

The content of the robots.txt file may vary based on your requirements. Here are some common directives you can include (a combined example follows this list):

1. User-agent: This directive specifies which search engine bots the rules apply to. "*" means it applies to all bots.

2. Disallow: This directive tells search engine bots which paths they should not crawl. For example, if you don't want a certain directory or page crawled, specify its path here. "Disallow: /" blocks the entire host, while an empty "Disallow:" allows everything.

3. Allow: This directive opens up specific directories or pages inside a path that a broader Disallow rule blocks. Major crawlers such as Googlebot and Bingbot support it, but not every bot does.
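For instance, a robots.txt combining all three directives might look like this (the /public-listings/ path is just a hypothetical example):

User-agent: *
Disallow: /realestate/
Allow: /realestate/public-listings/

This blocks crawling of everything under /realestate/ except the /public-listings/ subdirectory, for all bots.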

It's important to note that robots.txt directives are only instructions to search engine bots; they are not enforced rules. Well-behaved crawlers will respect them, but others may ignore them.

Once you have created or updated the robots.txt files, verify that they are accessible by visiting the respective URLs (e.g., https://dontworry.com/robots.txt and https://classified.dontworry.com/robots.txt). You can also use Google's robots.txt Tester in Search Console to check whether the directives are applied the way you intend.

Remember to update your robots.txt files whenever you make changes to your website's structure or want to modify search engine crawling behavior.



You can also list sitemap URLs in robots.txt using the Sitemap: directive. Keep in mind, though, that each subdomain is crawled as a separate host, so the reliable approach is to reference each subdomain's sitemap from that subdomain's own robots.txt rather than relying on the root domain's file to cover everything.
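A minimal sketch (the sitemap file names here are hypothetical):

In https://dontworry.com/robots.txt:

User-agent: *
Disallow:

Sitemap: https://dontworry.com/sitemap.xml

In https://classified.dontworry.com/robots.txt:

User-agent: *
Disallow:

Sitemap: https://classified.dontworry.com/sitemap.xml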


heenamajeed

If your test subdomain is configured as a virtual host, you need a robots.txt in that subdomain's document root too (this is the most common setup). But if you route the subdomain's traffic through an .htaccess file, you could adjust it to always use the robots.txt from the root of your main domain.

amayajace


But if you route the subdomain's traffic through an .htaccess file, you can modify it to always serve the robots.txt from the root of your main domain. When a crawler fetches test.domain.com/robots.txt, whatever is returned at that URL is the robots.txt file it will see; it will not look for any other robots.txt file.
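A minimal .htaccess sketch of that idea, assuming Apache with mod_rewrite enabled and a subdomain with its own document root (host names follow the test.domain.com example above):

# In the subdomain's .htaccess: redirect robots.txt requests
# to the main domain's copy. Googlebot follows a few redirect
# hops when fetching robots.txt, so the main file is applied.
RewriteEngine On
RewriteRule ^robots\.txt$ https://domain.com/robots.txt [R=301,L]

With that rule in place, a request for test.domain.com/robots.txt returns the main domain's file, so both hosts share one set of rules.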