You may not want certain pages of your site crawled because they might not be useful to users if found in a search engine's search results. If you do want to prevent search engines from crawling your pages, Google Search Console has a friendly robots.txt generator to help you create this file. Note that if your site uses subdomains and you wish to have certain pages not crawled on a particular subdomain, you'll have to create a separate robots.txt file for that subdomain. For more information on robots.txt, we suggest this Webmaster Help Center guide on using robots.txt files13.
QUOTE: “7.4.3 Automatically Generated Main Content Entire websites may be created by designing a basic template from which hundreds or thousands of pages are created, sometimes using content from freely available sources (such as an RSS feed or API). These pages are created with no or very little time, effort, or expertise, and also have no editing or manual curation. Pages and websites made up of autogenerated content with no editing or manual curation, and no original content or value added for users, should be rated Lowest.” Google Search Quality Evaluator Guidelines 2017
“Doorways are sites or pages created to rank highly for specific search queries. They are bad for users because they can lead to multiple similar pages in user search results, where each result ends up taking the user to essentially the same destination. They can also lead users to intermediate pages that are not as useful as the final destination.
In particular, the Google web spam team is currently waging a PR war on sites that rely on unnatural links and other ‘manipulative’ tactics (and handing out severe penalties if it detects them). And that’s on top of many algorithms already designed to look for other manipulative tactics (like keyword stuffing or boilerplate spun text across pages).
QUOTE: “Content which is copied, but changed slightly from the original. This type of copying makes it difficult to find the exact matching original source. Sometimes just a few words are changed, or whole sentences are changed, or a “find and replace” modification is made, where one word is replaced with another throughout the text. These types of changes are deliberately done to make it difficult to find the original source of the content. We call this kind of content “copied with minimal alteration.” Google Search Quality Evaluator Guidelines March 2017
As of 2009, there are only a few large markets where Google is not the leading search engine. In most cases, when Google is not leading in a given market, it is lagging behind a local player. The most notable example markets are China, Japan, South Korea, Russia and the Czech Republic where respectively Baidu, Yahoo! Japan, Naver, Yandex and Seznam are market leaders.
Try and get links within page text pointing to your site with relevant, or at least, natural looking, keywords in the text link – not, for instance, in blogrolls or site-wide links. Try to ensure the links are not obviously “machine generated” e.g. site-wide links on forums or directories. Get links from pages, that in turn, have a lot of links to them, and you will soon see benefits.
Inclusion in Google's search results is free and easy; you don't even need to submit your site to Google. Google is a fully automated search engine that uses web crawlers to explore the web constantly, looking for sites to add to our index. In fact, the vast majority of sites listed in our results aren't manually submitted for inclusion, but found and added automatically when we crawl the web. Learn how Google discovers, crawls, and serves web pages.3
Many search engine marketers think who you link out to (and who links to you) helps determine a topical community of sites in any field or a hub of authority. Quite simply, you want to be in that hub, at the centre if possible (however unlikely), but at least in it. I like to think of this one as a good thing to remember in the future as search engines get even better at determining topical relevancy of pages, but I have never actually seen any granular ranking benefit (for the page in question) from linking out.
When referring to the homepage, a trailing slash after the hostname is optional since it leads to the same content ("https://example.com/" is the same as "https://example.com"). For the path and filename, a trailing slash would be seen as a different URL (signaling either a file or a directory), for example, "https://example.com/fish" is not the same as "https://example.com/fish/".
When would this be useful? If your site has a blog with public commenting turned on, links within those comments could pass your reputation to pages that you may not be comfortable vouching for. Blog comment areas on pages are highly susceptible to comment spam. Nofollowing these user-added links ensures that you're not giving your page's hard-earned reputation to a spammy site.
Another excellent guide is Google’s “Search Engine Optimization Starter Guide.” This is a free PDF download that covers basic tips that Google provides to its own employees on how to get listed. You’ll find it here. Also well worth checking out is Moz’s “Beginner’s Guide To SEO,” which you’ll find here, and the SEO Success Pyramid from Small Business Search Marketing.