How Google Works
So, How Does Google Work? Everyone knows Google searches for websites based on the keywords you type into its search bar, but how does it index these websites beforehand, how does it store them, and how does it choose which results to display to you?
To truly understand Search Engine Optimization (SEO) and Google Ads Pay Per Click (PPC) / Clickpay, you need to understand how content is crawled, indexed, and displayed by search engines. There is no better search engine to analyze than the one most Americans use every day: Google.
1) Some type of fresh content is created by a user: a blogger publishes a new post, someone writes a new tweet, or you add a new product or a whole new page to your e-commerce site.
2) Eventually, Google’s crawl bots, which are constantly crawling the web, reach the web page and crawl it to find the new content.
- Google has various crawl bots that usually follow links, so the rule of thumb is that if no external or internal links lead to the new content, that content will not get crawled regularly. Only Google’s deep crawl bots will catch it.
- If links to your site carry a rel="nofollow" attribute, the crawl bots will generally not arrive at your site from those links.
- If links don’t lead to your site, there are other ways to alert their crawl bots about your site or new content. You can ping Google with your XML sitemap or
- The more links your site has from domains that Google considers an “authority” on a subject, the higher your own domain’s authority rises, until eventually your website becomes an authority in its own right.
- Google will not crawl your site if you tell it not to in a file called robots.txt.
- This file can also give Google instructions on how to crawl the site.
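To make the robots.txt bullets concrete, here is a minimal sketch using Python’s standard urllib.robotparser module. The robots.txt content and the example.com URLs are made up for illustration, and the parser shows how a well-behaved crawler could read such a file, not how Googlebot is actually implemented.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for an example site (an assumption).
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 5

User-agent: Googlebot
Disallow: /drafts/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# The Googlebot group only blocks /drafts/, so the blog post is crawlable.
print(parser.can_fetch("Googlebot", "https://example.com/blog/new-post"))
print(parser.can_fetch("Googlebot", "https://example.com/drafts/idea"))

# Any other bot falls back to the "*" group, which blocks /private/.
print(parser.can_fetch("SomeOtherBot", "https://example.com/private/data"))
print(parser.crawl_delay("SomeOtherBot"))
```

The Disallow lines are the "do not crawl" part, while a directive such as Crawl-delay is an example of an instruction on how to crawl. Not every crawler honors every directive.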
3) Once your website is crawled, it’s indexed within seconds. But how does it get indexed?
- The data and content are broken up and stored as cached information on Google’s servers at their secretive data centers.
- Page titles, H1-tagged elements, meta tag titles, and link data are stored in one index.
- The page content is placed in a separate index for long-tail and obscure searches. More weight is placed on bolded text (for example, text in a “<b>” or “<strong>” element) or on a keyword you have isolated than on plain text.
- When you use Google as a search engine, you are not searching the live web; you are searching Google’s cached stockpile of indexed pages.
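The idea of two separate indexes can be sketched with a toy inverted index. This is only an illustration of the general technique: the page data, the function names, and the 2-to-1 weighting of title matches over body matches are all assumptions, not Google’s real data structures or weights.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_indexes(pages):
    """Build one index for titles and a separate one for body content."""
    title_index = defaultdict(set)
    body_index = defaultdict(set)
    for url, page in pages.items():
        for term in tokenize(page["title"]):
            title_index[term].add(url)
        for term in tokenize(page["body"]):
            body_index[term].add(url)
    return title_index, body_index


def search(query, title_index, body_index):
    """Return URLs containing every query term, best score first."""
    terms = tokenize(query)
    if not terms:
        return []
    scores = defaultdict(int)
    candidates = None
    for term in terms:
        in_title = title_index.get(term, set())
        in_body = body_index.get(term, set())
        matches = in_title | in_body
        candidates = matches if candidates is None else candidates & matches
        for url in in_title:
            scores[url] += 2  # assumed weight: a title match counts double
        for url in in_body:
            scores[url] += 1
    return sorted(candidates, key=lambda url: (-scores[url], url))


# Hypothetical pages that a crawler has already fetched.
PAGES = {
    "https://example.com/a": {"title": "Blue widgets",
                              "body": "We sell widgets in many colors."},
    "https://example.com/b": {"title": "About us",
                              "body": "Our shop ships blue widgets worldwide."},
    "https://example.com/c": {"title": "Contact",
                              "body": "Email us any time."},
}

title_index, body_index = build_indexes(PAGES)
print(search("blue widgets", title_index, body_index))
```

Note that search never touches the pages themselves. It only looks up the prebuilt indexes, which is the point made in the last bullet above.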
4) Google estimates the domain’s overall authority based on links.
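One well-known way to turn links into an authority score is the PageRank idea. The sketch below is a simplified textbook version using power iteration. The link graph, the damping value of 0.85, and the iteration count are assumptions for illustration, and Google’s production ranking is certainly more involved than this.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate a score per page from a dict of page -> list of linked pages."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its score evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score everywhere.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical link graph: "a", "b", and "d" all link to "c".
LINKS = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}

ranks = pagerank(LINKS)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

In this graph "c" ends up with the highest score because three pages link to it, and "a" comes second because its single inbound link is from the highly ranked "c".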
5) Pages are rigorously examined by Google and checked against its policies and policy updates, and then penalties are applied. Once this is finished, each page will have many pieces of data attached.
- Spam reports are solicited by Google. They also receive notifications to remove pirated work.
- Google’s search quality and webspam teams review and refine its algorithm.
- The quality of searches is rated by 15,000+ testers in remote locations.
6) A user runs a search query for something in Google’s search bar.
- In most searches through Google, you are unknowingly placed into test groups.
- The data from those tests is used to refine the algorithm.
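A common way to place users into test groups without them noticing is to hash an identifier into a bucket. The sketch below shows that general technique. The function name, the experiment name, and the group labels are hypothetical, and nothing here describes how Google actually assigns its test groups.

```python
import hashlib


def assign_group(user_id, experiment, groups=("control", "variant")):
    """Deterministically map a user to one of the experiment's groups."""
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    # The same user and experiment always hash to the same bucket.
    return groups[int(digest, 16) % len(groups)]


print(assign_group("user-123", "new-ranking-test"))
print(assign_group("user-456", "new-ranking-test"))
```

Because the assignment depends only on the identifier and the experiment name, a user sees consistent results across searches, and the click data from each group can be compared afterward.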
7) Google’s algorithms decide which synonyms, if any, qualify to be included with the keywords from the user’s search.
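Synonym expansion can be pictured as swapping each query word for a set of acceptable alternatives. The synonym table below is a tiny hand-written assumption. A real search engine would learn such relationships from data and decide per query whether a synonym qualifies.

```python
# Hypothetical synonym table; real systems derive this from usage data.
SYNONYMS = {
    "cheap": {"inexpensive", "affordable"},
    "sofa": {"couch"},
}


def expand_query(query, synonyms=SYNONYMS):
    """Return one set of acceptable words per word in the query."""
    expanded = []
    for word in query.lower().split():
        expanded.append({word} | synonyms.get(word, set()))
    return expanded


for alternatives in expand_query("Cheap sofa"):
    print(sorted(alternatives))
```

A page would then match a query position if it contains any word from that position’s set.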
8) The initial result set is created for display.
- Localization applied: local websites are promoted within the search results.
9) PageRank and authority sort the current result set, while duplicated content is removed.
- Google finds relevant ads based on keywords, ad match type, and user location.
- Advertisements are loaded at the top, before the organic search results, as well as in other areas. Each has an “Ad” label in its top left corner signifying that it’s an ad. There are usually 4 to 5 of them.
- Advertisers operating outside the guidelines can get in serious trouble, and even get their accounts banned.
- If the keyword has low search volume or the ad generates too few clicks (a low click-through rate), the ad may be disabled for this query.
- Special consideration is given to companies that Google tends to favor, like Amazon for instance.
- Most ads have their content set in stone. However, you can create dynamic ads with dynamic keyword content to make the ad seem more relevant.
- Google also provides extensions so you can add things like location, telephone number, similar products, alternate email addresses, and different links.
- If the ads have a high enough click-through rate (the percentage of people who see the ad and click it), some of them will appear on the first page before the search results.
- Some don’t break through, while others are placed further back in the results.
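The ad bullets above describe a filter: match on keyword and match type, check the user’s location, and drop ads whose click-through rate is too low. Here is a minimal sketch of that filter. The Ad fields, the 1% threshold, the four-slot limit, and the simplified match rules are all assumptions for illustration, and the real Google Ads auction also weighs bids and quality signals that are left out here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ad:
    name: str
    keyword: str
    match_type: str       # "exact", "phrase", or "broad" (simplified)
    locations: frozenset  # locations where the ad may be shown
    ctr: float            # historical click-through rate, 0.0 to 1.0


def keyword_matches(ad, query):
    """Apply a simplified version of the three match types."""
    query_words = query.lower().split()
    keyword_words = ad.keyword.lower().split()
    if ad.match_type == "exact":
        return query_words == keyword_words
    if ad.match_type == "phrase":
        span = len(keyword_words)
        return any(query_words[i:i + span] == keyword_words
                   for i in range(len(query_words) - span + 1))
    if ad.match_type == "broad":
        return any(word in query_words for word in keyword_words)
    return False


def select_ads(query, user_location, ads, min_ctr=0.01, max_slots=4):
    """Keep ads that match, are in range, and clear the CTR threshold."""
    eligible = [ad for ad in ads
                if user_location in ad.locations
                and ad.ctr >= min_ctr
                and keyword_matches(ad, query)]
    eligible.sort(key=lambda ad: ad.ctr, reverse=True)
    return eligible[:max_slots]


# Hypothetical ad inventory.
ADS = [
    Ad("exact-red-shoes", "red shoes", "exact", frozenset({"US"}), 0.05),
    Ad("phrase-shoes", "shoes", "phrase", frozenset({"US", "CA"}), 0.03),
    Ad("broad-running", "running shoes", "broad", frozenset({"US"}), 0.002),
    Ad("uk-red-shoes", "red shoes", "phrase", frozenset({"UK"}), 0.08),
]

print([ad.name for ad in select_ads("buy red shoes", "US", ADS)])
print([ad.name for ad in select_ads("red shoes", "US", ADS)])
```

In the first query the exact-match ad is skipped because the query has an extra word, the broad-match ad is dropped for its low click-through rate, and the UK ad is dropped for location, leaving only the phrase-match ad.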
10) Google’s filters are applied to the result set. RankBrain, another one of Google’s algorithms, is a machine learning system that helps Google better decipher the meaning behind queries and serve the best-matching search results in response to those queries.
- Websites users have visited in the past are promoted in the results.
- Websites that use black-hat techniques to reach the top of the search results are banned if they are caught.
- Ex: a site owner who uses excessive anchor tags in order to boost the website’s SEO score.
- Once your website is banned from Google, you can try to clear it up with them, but if the violation is serious enough, Google won’t stop at your website; it will ban your account and other websites tied to it.
- Local Interconnectivity: If pages in the result set are well linked with other high ranking pages in the result set, their pages will get a boost.
- User bias: Websites the user has visited before may receive a boost in rank.
11) Updates are applied.
- Panda Update – a Google update that tries to diminish those websites which are purely created to rank in the search engines. The first Panda update was released in 2011 and Google has re-run this update periodically.
- Penguin update – a Google update that judges the links websites got from other sites. If the links turn out to be artificial (e.g. created by buying or exchanging), Google no longer assigns link value. The first Penguin update was rolled out in 2012. Google has re-run this update several times and it is now said to be run continuously.
- Hummingbird Update – a Google update that laid the groundwork for voice search. It pays more attention to each word in a query, ensuring that the whole search phrase is taken into account, rather than just particular words. The Hummingbird update was released in 2013. Unlike the Panda and Penguin updates, this was not an extension but a full change in the core algorithm.
- SSL – July 1st, 2018. Google had been announcing for months that it would start giving serious penalties to websites that were not encrypted with SSL by July 1st.
- How SSL works – you visit a website, and your browser detects SSL encryption and requests the site’s certificate. The certificate gives your browser the website’s identity details and public key. Your browser and the server then agree on what level of encryption to use, and once that is determined, encrypted communication is passed back and forth so it cannot be eavesdropped on.
- Possum Update – a Google update that was released in 2016. Since Possum, Google has shown more varied results depending on the physical location of the searcher and the phrasing of the query.
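The “How SSL works” bullet above can be tried out with Python’s standard ssl module. This is a rough sketch: the function names are made up for illustration, the hostname in the usage note is a placeholder, and modern connections actually negotiate TLS, the successor to SSL, even though the older name is still widely used.

```python
import socket
import ssl


def make_context():
    """Create a client context that verifies certificates and hostnames."""
    return ssl.create_default_context()


def fetch_certificate(hostname, port=443, timeout=5.0):
    """Connect, complete the handshake, and report what was negotiated."""
    context = make_context()
    with socket.create_connection((hostname, port), timeout=timeout) as sock:
        # wrap_socket performs the handshake: the server presents its
        # certificate, it is verified, and an encryption level is agreed on.
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return {
                "protocol": tls.version(),
                "cipher": tls.cipher()[0],
                "subject": tls.getpeercert().get("subject"),
            }
```

Calling fetch_certificate("example.com") on a machine with internet access would return the negotiated protocol version, the cipher name, and the certificate subject. If the certificate cannot be verified, the handshake raises an error instead of silently continuing.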