Robots.txt is not an appropriate or effective way of blocking sensitive or confidential material. It only instructs well-behaved crawlers that the pages are not for them, but it does not prevent your server from delivering those pages to a browser that requests them. One reason is that search engines could still reference the URLs you block (showing just the URL, no title or snippet) if there happen to be links to those URLs somewhere on the Internet (like referrer logs). Also, non-compliant or rogue search engines that don't acknowledge the Robots Exclusion Standard could disobey the instructions of your robots.txt. Finally, a curious user could examine the directories or subdirectories in your robots.txt file and guess the URL of the content that you don't want seen.
QUOTE: “Keyword Stuffed” Main Content Pages may be created to lure search engines and users by repeating keywords over and over again, sometimes in unnatural and unhelpful ways. Such pages are created using words likely to be contained in queries issued by users. Keyword stuffing can range from mildly annoying to users, to complete gibberish. Pages created with the intent of luring search engines and users, rather than providing meaningful MC to help users, should be rated Lowest.” Google Search Quality Evaluator Guidelines 2017
In particular, the Google web spam team is currently waging a PR war on sites that rely on unnatural links and other ‘manipulative’ tactics (and handing out severe penalties if it detects them). And that’s on top of many algorithms already designed to look for other manipulative tactics (like keyword stuffing or boilerplate spun text across pages).
TASK – If running a blog, first, clean it up. To avoid creating pages that might be considered thin content in 6 months, consider planning a wider content strategy. If you publish 30 ‘thinner’ pages about various aspects of a topic, you can then fold all this together in a single topic page centred page helping a user to understand something related to what you sell.
When I think ‘Google-friendly’ these days – I think a website Google will rank top, if popular and accessible enough, and won’t drop like a f*&^ing stone for no apparent reason one day, even though I followed the Google SEO starter guide to the letter….. just because Google has found something it doesn’t like – or has classified my site as undesirable one day.
QUOTE: “The score is determined from quantities indicating user actions of seeking out and preferring particular sites and the resources found in particular sites. *****A site quality score for a particular site**** can be determined by computing a ratio of a numerator that represents user interest in the site as reflected in user queries directed to the site and a denominator that represents user interest in the resources found in the site as responses to queries of all kinds The site quality score for a site can be used as a signal to rank resources, or to rank search results that identify resources, that are found in one site relative to resources found in another site.” Navneet Panda, Google Patent
QUOTE: “We are a health services comparison website…… so you can imagine that for the majority of those pages the content that will be presented in terms of the clinics that will be listed looking fairly similar right and the same I think holds true if you look at it from the location …… we’re conscious that this causes some kind of content duplication so the question is is this type … to worry about? “
QUOTE: “Google Webmaster Tools notice of detected doorway pages on xxxxxxxx – Dear site owner or webmaster of xxxxxxxx, We’ve detected that some of your site’s pages may be using techniques that are outside Google’s Webmaster Guidelines. Specifically, your site may have what we consider to be doorway pages – groups of “cookie cutter” or low-quality pages. Such pages are often of low value to users and are often optimized for single words or phrases in order to channel users to a single location. We believe that doorway pages typically create a frustrating user experience, and we encourage you to correct or remove any pages that violate our quality guidelines. Once you’ve made these changes, please submit your site for reconsideration in Google’s search results. If you have any questions about how to resolve this issue, please see our Webmaster Help Forum for support.” Google Search Quality Team
If you link out to irrelevant sites, Google may ignore the page, too – but again, it depends on the site in question. Who you link to, or HOW you link to, REALLY DOES MATTER – I expect Google to use your linking practices as a potential means by which to classify your site. Affiliate sites, for example, don’t do well in Google these days without some good quality backlinks and higher quality pages.
The leading search engines, such as Google, Bing and Yahoo!, use crawlers to find pages for their algorithmic search results. Pages that are linked from other search engine indexed pages do not need to be submitted because they are found automatically. The Yahoo! Directory and DMOZ, two major directories which closed in 2014 and 2017 respectively, both required manual submission and human editorial review. Google offers Google Search Console, for which an XML Sitemap feed can be created and submitted for free to ensure that all pages are found, especially pages that are not discoverable by automatically following links in addition to their URL submission console. Yahoo! formerly operated a paid submission service that guaranteed crawling for a cost per click; however, this practice was discontinued in 2009.
In addition to on-page SEO factors, there are off-page SEO factors. These factors include links from other websites, social media attention, and other marketing activities outside your own website. These off-page SEO factors can be rather difficult to influence. The most important of these off-page factors is the number and quality of links pointing towards your site. The more quality, relevant sites that link to your website, the higher your position in Google will be.