There’s much confusion about how Google ranks pages.
Everyone guesses at how it works; that’s why many believe Google uses dwell time, even though Google has never stated that it does.
According to Sundar Pichai, there are over 200 factors used in Google’s ranking. This makes it even more challenging to know which ranking factors to focus on.
If we want to rank, we know that we should create excellent content that matches the search intent of search engine users.
One of the greatest misconceptions about Google RankBrain is how it works for SEO content.
Some have claimed that the algorithm behind it is LSI (an old technique for modeling relationships between words) but that’s absolutely not the case.
We also know that it’s one of the factors used to make a website rank.
But what exactly is this algorithm?
Previously, Google introduced us to Panda, Hummingbird, and Penguin to reward high-quality content or penalize duplicate or poor content.
Google RankBrain is a bit different. Let’s see why.
What is Google Rankbrain?
First and foremost, it’s a Google algorithm that uses Machine Learning (mathematical, mostly statistical, models that make predictions) to understand search queries.
We first have to understand that Google Search’s goal is to match search queries with public web pages from all over the internet.
The field where computers learn to understand human language is called Natural Language Processing (NLP).
A simplified way to understand NLP goes like this:
- Human: Writes/speaks a query to a computer
- Computer: Extracts words from the query
- Computer: Analyses the words
- Computer: Gives the expected results back to the Human
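The four steps above can be sketched in a few lines of Python. The tiny `search_index` and the word-lookup matching rule here are invented purely for illustration; a real search engine’s analysis step is vastly more sophisticated.

```python
# Toy "search index" mapping a word to a page title (made up for illustration).
search_index = {
    "rankbrain": "What is Google RankBrain?",
    "embedding": "Word Embeddings explained",
}

def handle_query(query: str) -> list[str]:
    # 1. Extract words from the query (tokenization).
    words = query.lower().split()
    # 2. Analyse the words: here, just look each one up in the index.
    # 3. Give the matching results back to the human.
    return [page for word in words if (page := search_index.get(word))]

print(handle_query("RankBrain embedding"))
# → ['What is Google RankBrain?', 'Word Embeddings explained']
```

The hard part, as the next paragraph explains, is that the “analyse the words” step cannot be a simple dictionary lookup in practice.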
This process is what every NLP algorithm does. It’s simple in theory yet very complicated in practice for engineers, because language is a living thing. The same word can mean different things from one country to another, even within the same language.
Software engineers wrote many rules about what to show when a query arrives at the search engine. But the number of rules they can write is limited. Additionally, as the rules pile up, the system becomes more complex and harder to adapt to new situations.
But what does Machine Learning add to their work?
Machine Learning inverts the process by which software developers create rules.
Engineers feed sample data to a mathematical model to “train” it (learn the rules present in the data), then give it completely new data to see if it has “learned” to effectively generalize the rules it saw in the sample data.
The process looks like this:
- Split your data: 70% to train the model, 30% to test it
- Give the training data (70%) to the model so it learns the rules
- Test the model on the test data (30%), which it has never seen before
- If it can predict well on the test set, the model is ready!
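Here is a minimal sketch of that 70/30 split using only the standard library. The dataset and the split function are toy stand-ins for what real ML pipelines (for example, scikit-learn’s `train_test_split`) provide:

```python
import random

def train_test_split(data, train_ratio=0.7, seed=42):
    """Shuffle a dataset and split it into train and test portions."""
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)  # fixed seed for reproducibility
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

# Toy labeled dataset: 10 (example, label) pairs.
data = [(f"example {i}", i % 2) for i in range(10)]
train, test = train_test_split(data)
print(len(train), len(test))  # → 7 3
```

The model only ever “sees” `train` during learning; `test` is held back to measure how well the learned rules generalize.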
In the context of the Google search algorithm, RankBrain’s role is to understand new queries coming from Google Search, specifically those it has never seen before. The difficulty is deciding what to show the searcher when the search engine has never encountered the query.
How does Rankbrain decide what to show?
It tries to answer never-seen-before queries by looking at previous search data and drawing relationships between the words in that data and the unknown query, using a technique called Word Embedding.
“If RankBrain sees a word or phrase it isn’t familiar with, the machine can make a guess as to what words or phrases might have a similar meaning and filter the result accordingly, making it more effective at handling never-before-seen search queries.” – Greg Corrado, Research Scientist at Google AI
Before we talk about what word embedding is, we have to understand why it is needed.
We have two kinds of languages:
- Formal Languages: Mostly programming languages; they can be fully specified. Reserved words are defined, the correct ways to use them are specified, and those rules can’t be broken.
- Natural Languages: They are not designed; they emerge, so there’s no formal specification. Natural languages involve vast numbers of terms that can be used in ways that introduce all kinds of ambiguities, yet can still be understood by other humans. They are always evolving, and the meaning of a word can change drastically.
A Language Model is an algorithm that can predict the next word of a text, and Word Embedding uses language modeling techniques to draw relationships between words.
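To make “predict the next word” concrete, here is a toy bigram language model. The corpus is made up, and real language models are neural networks trained on billions of words, but the core idea of learning which words follow which is the same:

```python
from collections import Counter, defaultdict

# Tiny made-up corpus; real models train on billions of words.
corpus = "the cat sat on the mat the cat ate the fish".split()

# Count how often each word follows each other word (a bigram model).
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict_next(word):
    """Return the most frequently observed next word, or None if unseen."""
    counts = following[word]
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("the"))  # → "cat" ("cat" follows "the" twice in the corpus)
```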
Word Embeddings are a way for Google to look at a text (a query, a page, a whole site) and understand the words in it better.
The model can tell when a word or a sentence fits a given context, and this helps Google RankBrain decide what to rank for new queries, because it can take into account the context and meaning of the words in any query.
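The intuition behind word embeddings can be shown with a few lines of Python: each word becomes a vector of numbers, and words with similar meanings end up with similar vectors. The 3-dimensional vectors below are entirely made up for illustration; real embeddings are learned from data and have hundreds of dimensions.

```python
import math

# Invented 3-dimensional "embeddings" (real ones are learned, not hand-picked).
embeddings = {
    "king":  [0.90, 0.80, 0.10],
    "queen": [0.88, 0.82, 0.15],
    "car":   [0.10, 0.20, 0.95],
}

def cosine_similarity(a, b):
    """Similarity of two vectors: 1.0 means same direction, 0 unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

print(cosine_similarity(embeddings["king"], embeddings["queen"]))  # close to 1
print(cosine_similarity(embeddings["king"], embeddings["car"]))    # much lower
```

Because “king” and “queen” point in nearly the same direction, a system using embeddings can treat a never-seen query containing one as related to documents about the other.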
The word embedding model used in this case is named BERT; it was released as open source by Google in November 2018 and was rolled out to Google Search in October 2019.
Can we optimize for Rankbrain?
Yes! You can!
“Optimizing for RankBrain is actually super easy, and it is something we’ve probably been saying for fifteen years now, is – and the recommendation is – to write in natural language. Try to write content that sounds human. If you try to write like a machine then RankBrain will just get confused and probably just pushes you back.” – Gary Illyes, Google
When you write, write like another human being.
Cover your subject the way another person would talk about it.
Apart from that, you can’t do anything to make RankBrain like your content. However, it’s an opportunity to cover the subject as deeply as possible.
Covering your blog post idea with in-depth research can create more relationships around your topic.
But remember that Google RankBrain will not make you rank higher by itself; it gives you the possibility of being listed for unknown queries. All the other factors are then taken into account to decide whether your content should be positioned first or last, depending on its quality and the search intent.