QUOTE: Each piece of duplication in your on-page SEO strategy is ***at best*** wasted opportunity. Worse yet, if you are aggressive with aligning your on page heading, your page title, and your internal + external link anchor text the page becomes more likely to get filtered out of the search results (which is quite common in some aggressive spaces). Aaron Wall, 2009
I’ve got by, by thinking external links to other sites should probably be on single pages deeper in your site architecture, with the pages receiving all your Google Juice once it’s been “soaked up” by the higher pages in your site structure (the home page, your category pages). This tactic is old school but I still follow it. I don’t need to think you need to worry about that, too much, in 2019.
While most of the links to your site will be added gradually, as people discover your content through search or other ways and link to it, Google understands that you'd like to let others know about the hard work you've put into your content. Effectively promoting your new content will lead to faster discovery by those who are interested in the same subject. As with most points covered in this document, taking these recommendations to an extreme could actually harm the reputation of your site.
I think the anchor text links in internal navigation is still valuable – but keep it natural. Google needs links to find and help categorise your pages. Don’t underestimate the value of a clever internal link keyword-rich architecture and be sure to understand for instance how many words Google counts in a link, but don’t overdo it. Too many links on a page could be seen as a poor user experience. Avoid lots of hidden links in your template navigation.
In 1998, two graduate students at Stanford University, Larry Page and Sergey Brin, developed "Backrub", a search engine that relied on a mathematical algorithm to rate the prominence of web pages. The number calculated by the algorithm, PageRank, is a function of the quantity and strength of inbound links. PageRank estimates the likelihood that a given page will be reached by a web user who randomly surfs the web, and follows links from one page to another. In effect, this means that some links are stronger than others, as a higher PageRank page is more likely to be reached by the random web surfer.
Webmasters and content providers began optimizing websites for search engines in the mid-1990s, as the first search engines were cataloging the early Web. Initially, all webmasters only needed to submit the address of a page, or URL, to the various engines which would send a "spider" to "crawl" that page, extract links to other pages from it, and return information found on the page to be indexed. The process involves a search engine spider downloading a page and storing it on the search engine's own server. A second program, known as an indexer, extracts information about the page, such as the words it contains, where they are located, and any weight for specific words, as well as all links the page contains. All of this information is then placed into a scheduler for crawling at a later date.
What about other search engines that use them? Hang on while I submit my site to those 75,000 engines first [sarcasm!]. Yes, ten years ago early search engines liked looking at your meta-keywords. I’ve seen OPs in forums ponder which is the best way to write these tags – with commas, with spaces, limiting to how many characters. Forget about meta-keyword tags – they are a pointless waste of time and bandwidth.
As of 2009, there are only a few large markets where Google is not the leading search engine. In most cases, when Google is not leading in a given market, it is lagging behind a local player. The most notable example markets are China, Japan, South Korea, Russia and the Czech Republic where respectively Baidu, Yahoo! Japan, Naver, Yandex and Seznam are market leaders.
QUOTE: “Google will now begin encrypting searches that people do by default, if they are logged into Google.com already through a secure connection. The change to SSL search also means that sites people visit after clicking on results at Google will no longer receive “referrer” data that reveals what those people searched for, except in the case of ads.
Don’t break Google’s trust – if your friend betrays you, depending on what they’ve done, they’ve lost trust. Sometimes that trust has been lost altogether. If you do something Google doesn’t like such as manipulate it in a way it doesn’t want, you will lose trust, and in some cases, lose all trust (in some areas). For instance, your pages might be able to rank, but your links might not be trusted enough to vouch for another site. DON’T FALL OUT WITH GOOGLE OVER SOMETHING STUPID
To avoid undesirable content in the search indexes, webmasters can instruct spiders not to crawl certain files or directories through the standard robots.txt file in the root directory of the domain. Additionally, a page can be explicitly excluded from a search engine's database by using a meta tag specific to robots (usually ). When a search engine visits a site, the robots.txt located in the root directory is the first file crawled. The robots.txt file is then parsed and will instruct the robot as to which pages are not to be crawled. As a search engine crawler may keep a cached copy of this file, it may on occasion crawl pages a webmaster does not wish crawled. Pages typically prevented from being crawled include login specific pages such as shopping carts and user-specific content such as search results from internal searches. In March 2007, Google warned webmasters that they should prevent indexing of internal search results because those pages are considered search spam.
The reality in 2019 is that if Google classifies your duplicate content as THIN content, or MANIPULATIVE BOILER-PLATE or NEAR DUPLICATE ‘SPUN’ content, then you probably DO have a severe problem that violates Google’s website performance recommendations and this ‘violation’ will need ‘cleaned’ up – if – of course – you intend to rank high in Google.
Some page titles do better with a call to action – a call to action which reflects exactly a searcher’s intent (e.g. to learn something, or buy something, or hire something. THINK CAREFULLY before auto-generating keyword phrase footprints across a site using boiler-plating and article spinning techniques. Remember this is your hook in search engines, if Google chooses to use your page title in its search snippet, and there is a lot of competing pages out there in 2019.
Comparing your Google Analytics data side by side with the dates of official algorithm updates is useful in diagnosing a site health issue or traffic drop. In the above example, a new client thought it was a switch to HTTPS and server downtime that caused the drop when it was actually the May 6, 2015, Google Quality Algorithm (originally called Phantom 2 in some circles) that caused the sudden drop in organic traffic – and the problem was probably compounded by unnatural linking practices. (This client did eventually receive a penalty for unnatural links when they ignored our advice to clean up).
I think it makes sense to have unique content as much as possible on these pages but it’s not not going to like sync the whole website if you don’t do that we don’t penalize a website for having this kind of deep duplicate content and kind of going back to the first thing though with regards to doorway pages that is something I definitely look into to make sure that you’re not running into that so in particular if this is like all going to the same clinic and you’re creating all of these different landing pages that are essentially just funneling everyone to the same clinic then that could be seen as a doorway page or a set of doorway pages on our side and it could happen that the web spam team looks at that and says this is this is not okay you’re just trying to rank for all of these different variations of the keywords and the pages themselves are essentially all the same and they might go there and say we need to take a manual action and remove all these pages from search so that’s kind of one thing to watch out for in the sense that if they are all going to the same clinic then probably it makes sense to create some kind of a summary page instead whereas if these are going to two different businesses then of course that’s kind of a different situation it’s not it’s not a doorway page situation.”
You may not want certain pages of your site crawled because they might not be useful to users if found in a search engine's search results. If you do want to prevent search engines from crawling your pages, Google Search Console has a friendly robots.txt generator to help you create this file. Note that if your site uses subdomains and you wish to have certain pages not crawled on a particular subdomain, you'll have to create a separate robots.txt file for that subdomain. For more information on robots.txt, we suggest this Webmaster Help Center guide on using robots.txt files13.