Difference between index status, indexed and Site:domain

Started by indiainfo, 11-02-2016, 04:21:43


indiainfo (Topic starter)

Hi,

Index status in Webmaster Tools shows the number of pages indexed for a site, whereas the Sitemaps report shows the number of pages submitted and the number of pages indexed. But the count of indexed pages is different in Index status and in Sitemaps (indexed). Not only those two: if we use site:domain, we get a list of indexed pages, and the count there is different again. All three places show a different number of indexed URLs. Can anyone explain the difference?


damponting44

Hi guys, I'm rather new to SEO and tried my best to look for resources before asking here. Apologies in advance for any shallow questions.

The thing is, in Webmaster Tools, under "Google Index" - "Index Status", I can see that 65 of my URLs have been indexed, and when I do a "site:domain name" search in Google, it shows 54 pages in the results. There is some discrepancy, but I'm OK with that.

My real concern is the sitemap status: it shows 64 URLs submitted, but 0 of them indexed. I did have a fully indexed sitemap status 2 weeks ago, until I made some major changes to the URL format to make it more user friendly. Although I've resubmitted the sitemap a few times and made sure the URLs it contains are the updated versions, the status just won't update.

On top of that, I'm receiving periodic crawl errors for my old URLs (I've created a 301 redirect for each of them), which shows that Googlebot is somehow still crawling the outdated URLs instead of the updated ones. I'm not sure if this is related to the sitemap index status.

My biggest concern is: does the sitemap URL index status affect my site's ranking and search performance, even though the pages have been indexed by Google?

Sorry for the long post, any help will be greatly appreciated.


apwebsolutions

The difference in the count of indexed pages between the "Index status" report in Google Webmaster Tools, the sitemap, and the "site:" command in Google search results can be attributed to various factors.

Firstly, it's important to understand that each of these measures uses different methodologies to determine the number of indexed pages.

The "Index status" report in Google Webmaster Tools provides data specifically for your website. It may not always accurately reflect the current number of indexed pages as it relies on Google's indexing process, which can take time to update.

On the other hand, the sitemap is a file you submit to Google, which contains a list of all the URLs you want to be indexed. The number of pages listed in the sitemap represents the number of pages you have submitted to Google for indexing. However, it doesn't guarantee that all those pages will be indexed.

The "site:" command in Google search results shows pages from your website that Google has indexed, along with a result count. That count is only a rough estimate, so it often differs from the figures in the other two areas. It can also be lower than the number of pages you submitted, because Google may choose not to index certain pages for a variety of reasons, such as duplicate content, low quality, or crawling issues.

Here are some additional factors that can contribute to the difference in the count of indexed pages:

1. Crawling Frequency: Google's crawlers may not visit all pages on your website at the same frequency. Certain pages may be crawled more often, resulting in faster indexing and a higher count in some areas.

2. Indexing Priority: Google prioritizes indexing of webpages based on factors such as relevance, authority, and popularity. Higher-quality pages or those with more backlinks might be indexed more quickly and have a higher count.

3. Page Restrictions: Some pages on your website may have restrictions that prevent them from being indexed, such as robots.txt directives or meta tags instructing search engines not to index specific content.

4. Duplicate URLs: If your website has multiple URLs with the same or very similar content, Google may choose to index only one of those URLs instead of all of them. This can lead to discrepancies in the count.

5. Time Lag: There may be a time lag between when a page is indexed and when it is reflected in the various reports. Changes in the count can occur as Google's indexing process updates.

6. Site Structure: The structure of your website can impact how Google indexes it. If your website has a complex navigation structure or technical issues, it may result in some pages being missed or not properly indexed.

7. Redirects and Canonical URLs: If you have implemented redirects or canonical URLs on your website, it can affect how Google indexes your pages. This can sometimes lead to discrepancies in the count of indexed pages.

8. Server Issues: If your website experiences server downtime or other server-related issues, it can impact Google's ability to crawl and index your pages accurately. This can lead to inconsistent counts across different measurement methods.

9. Personalized Search Results: When using the "site:" command in Google search results, keep in mind that the count you see might be personalized based on your search history and preferences. This can result in different counts for different users.

10. Data Processing Delay: The processes of collecting and analyzing data from various sources can introduce delays, leading to variations in the count of indexed pages across different reports.

11. Dynamic Content: If your website has dynamically generated content, such as user-generated content or dynamically generated URLs, it can affect the indexing process. Google may not always index all variations of dynamically generated pages, leading to discrepancies in the count.

12. Content Quality: Google prioritizes high-quality, unique, and relevant content for indexing. If some of your pages are deemed to have low-quality content or duplicate content, Google may choose not to index them, resulting in differences in the count.

13. Crawl Budget: Google allocates a crawl budget to each website, which determines how many pages it will crawl and index within a given time frame. If your website has a large number of pages, not all of them may be crawled and indexed, leading to variations in the count.

14. Language and Location: Different reports and measurement methods may take into account language and location settings. This can result in different counts since Google may index different versions of your website tailored to specific regions or languages.

15. Algorithm Changes: Google's search algorithms are constantly being updated and refined. These changes can impact how pages are indexed and displayed in search results, leading to differences in the count of indexed pages.

16. URL Parameters: If your website uses URL parameters, such as session IDs or tracking codes, it can result in multiple variations of the same page being indexed. This can affect the count of indexed pages, especially if some variations are excluded or not crawled by Google.

17. Page Speed: The loading speed of your webpages can impact how Google crawls and indexes them. Slow-loading pages may experience difficulties in being fully indexed, resulting in differences in the count.

18. Mobile-Friendly Design: With the increasing emphasis on mobile search, Google considers the mobile-friendliness of websites when indexing pages. If your website has mobile-specific versions or responsive design issues, it can impact the indexing process and lead to differences in the count.

19. Website Updates: If you frequently update or modify your website's content, it can take time for Google to discover and index those changes. This lag between updates and indexing can cause variations in the count across different measurement methods.

20. Manual Actions: If your website has been issued a manual action by Google for violating its guidelines, specific pages or even the entire website may be penalized and not indexed. This can result in a lower count of indexed pages in certain reports.

21. Duplicate Content: If your website has a significant amount of duplicate content, Google may choose to index only a subset of those pages. This can result in differences in the count between reports.

22. Temporary Indexing: Google might temporarily index certain pages and then remove them later if it determines that the content is low-quality or violates its guidelines. This can lead to variations in the count over time.

23. URL Changes: If you have recently made changes to your website's URL structure or implemented redirects, it can affect Google's indexing process. It may take time for Google to update its index and reflect the changes, resulting in discrepancies in the count.

24. Website Authority: Websites with higher authority and trustworthiness are more likely to have their pages indexed by Google. If your website is relatively new or lacks authority, it may experience differences in the count compared to more established sites.

25. Localization and Personalization: Google's search results can be personalized based on the user's location and search history. This means that the count of indexed pages shown to different users might vary depending on their preferences and search settings.

26. Blocked Resources: If certain resources on your webpages, such as JavaScript or CSS files, are blocked from being crawled by search engines, it can affect how Google indexes those pages. This can lead to differences in the count of indexed pages.

27. Data Processing Inconsistencies: Differences in the way data is processed across different reporting tools or algorithms can also contribute to variations in indexed page counts. Each tool may use different data sources or update frequencies, leading to discrepancies.

28. Reciprocal Links: If your website has a large number of reciprocal links (where you link to another website and they link back to you), Google may choose not to index those pages or lower their priority for indexing.

29. URL Canonicalization: If your website has multiple URLs that serve the same content but are accessed through different variations (such as with or without "www"), Google may consolidate them under a canonical URL. This can result in variations in the count of indexed pages.

30. Server Response Codes: If certain pages on your website return a response code other than 200 (OK), such as 404 (page not found) or 301 (permanent redirect), it can affect how Google indexes and counts those pages.

31. Malware or Hacked Pages: If your website has been compromised by malware or hacking attempts, Google may flag and deindex affected pages. This can lead to discrepancies in the count of indexed pages.

32. Language Markup: If your website serves content in multiple languages and uses language markup, Google may index different language versions separately, resulting in variations in the count.

33. Content Accessibility: If certain pages on your website have restricted access, such as requiring login credentials or being blocked by robots.txt, Google may exclude them from indexing, leading to differences in the count.

34. Penalties or Filters: Websites that violate Google's guidelines or engage in manipulative practices may be subject to penalties or filters that impact the indexing process. This can result in a lower count of indexed pages in certain reports.

35. Local Search Differences: For websites that serve different content or have different versions based on location, the count of indexed pages can vary depending on the geographic area considered.

36. Content Changes: If you frequently make changes to your website's content, especially if it involves adding or removing pages, it can affect the count of indexed pages. Google may take some time to crawl and update its index, leading to discrepancies.

37. XML Sitemap Issues: If there are issues with your XML sitemap, such as incorrect URLs or missing pages, it can impact how Google indexes your website. Ensure that your sitemap is properly formatted and up to date to avoid discrepancies in the count.

38. Pagination: If your website has paginated content, such as blog posts or product listings spread across multiple pages, the count of indexed pages may vary depending on how Google interprets and indexes those paginated pages.

39. External Factors: Factors beyond your control, such as changes in Google's algorithms or updates to its crawling and indexing systems, can also influence the count of indexed pages. These external factors can result in fluctuations and differences in the reported counts.

40. Data Reporting Delay: The data presented in various reports may not be in real-time and could have a delay. This delay can lead to differences in the count of indexed pages between different reporting tools and methods.
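
To make the points about URL parameters and URL canonicalization (16 and 29 above) more concrete, here is a minimal Python sketch of how several URL variants can collapse into one page once they are normalized. It is only an illustration and not how Google actually canonicalizes URLs: the `IGNORED_PARAMS` set, the `normalize` and `count_distinct_pages` helpers, and the example.com URLs are all assumptions made up for this example.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical list of parameters that do not change the page content.
IGNORED_PARAMS = {"sessionid", "utm_source", "utm_medium", "utm_campaign"}


def normalize(url: str) -> str:
    """Reduce a URL to a simplified canonical form (illustrative rules only)."""
    parts = urlsplit(url)
    # Lowercase the host and treat www / non-www as the same site.
    # Ports and user info are dropped to keep the sketch short.
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    # Drop session and tracking parameters, keep everything else in order.
    kept = [(k, v) for k, v in parse_qsl(parts.query)
            if k.lower() not in IGNORED_PARAMS]
    path = parts.path or "/"
    # The fragment is discarded because it never reaches the server.
    return urlunsplit((parts.scheme.lower(), host, path, urlencode(kept), ""))


def count_distinct_pages(urls) -> int:
    """Count how many different pages remain after normalization."""
    return len({normalize(u) for u in urls})


crawled = [
    "http://example.com/page",
    "http://www.example.com/page?utm_source=news",
    "http://example.com/page?sessionid=abc",
    "http://example.com/other",
]
print(len(crawled), "URLs,", count_distinct_pages(crawled), "distinct pages")
```

A tool that counts raw URLs and a tool that counts consolidated pages would report different totals for the same list, which is one way the three numbers can drift apart.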

pablohunt2812

I'll start with your biggest concern: "does the sitemap URL index status affect my site ranking & searching response, despite the pages have been indexed by Google?" >> No, it does not :)
Long answer: Submitting a Sitemap through Google Webmaster Tools helps us better understand the structure of your website and sometimes discover URLs that we wouldn't have caught through the normal crawling process. However, a Sitemap does not guarantee that Google will crawl or index all the URLs you referenced in it. It is just a hint.

Now, about your real concern, "the sitemap status, [...] shows 64 URL submitted, but 0 of them were indexed" >> All the URLs listed in your XML Sitemap are non-www, and they all permanently redirect to the www version (which is your preferred domain). The sitemap's indexed count only covers the exact URLs you submitted, and it is the www versions that are indexed, which explains why these specific URLs show as not indexed.
In order to fix that, you'll need to specify the right URLs with www., resubmit your Sitemap and wait for it to be processed again.
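
If it helps, the check can be done before resubmitting with a few lines of Python. This is just a sketch under some assumptions: the `urls_off_preferred_host` helper, the `SAMPLE` sitemap and the example.com hosts are invented for illustration, and the sitemap is assumed to be a plain `urlset` file using the standard sitemaps.org namespace.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

# Namespace used by standard XML sitemaps.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def urls_off_preferred_host(sitemap_xml: str, preferred_host: str) -> list[str]:
    """Return the sitemap URLs whose host is not the preferred host."""
    root = ET.fromstring(sitemap_xml)
    mismatched = []
    for loc in root.iter(f"{SITEMAP_NS}loc"):
        url = (loc.text or "").strip()
        # urlsplit(...).hostname is already lowercased.
        if urlsplit(url).hostname != preferred_host.lower():
            mismatched.append(url)
    return mismatched


# Made-up sitemap: one non-www URL and one www URL.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://example.com/about</loc></url>
  <url><loc>http://www.example.com/contact</loc></url>
</urlset>"""

for url in urls_off_preferred_host(SAMPLE, "www.example.com"):
    print("Not on preferred host:", url)
```

Any URL this prints is one that would redirect instead of being indexed as submitted, so it should be rewritten to the www form in the sitemap.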

Regarding Crawl error, you may want to read this blog post: http://googlewebmastercentral.blogspot.ch/2011/05/do-404s-hurt-my-site.html

I hope the above explanations will be helpful.