Former Googler: Google ‘using clicks in rankings’

Does this mean clicks are a direct ranking factor? No. In fact, BERT and MUM are making user data less important.


“Pretty much everyone knows we’re using clicks in rankings. That’s the debate: ‘Why are you trying to obscure this issue if everyone knows?'”

That quote comes from Eric Lehman, who spent 17 years at Google as a software engineer working on search quality and ranking. He left Google in November.

Lehman testified last Wednesday as part of the ongoing U.S. v. Google antitrust trial.

If you haven’t heard this quote yet, expect to hear it. A lot.

But. That’s not all Lehman had to say. Google’s machine learning systems BERT and MUM are becoming more important than user data, he said.

  • “In one direction, it’s better to have more user data, but new technology and later systems can use less user data. It’s changing pretty fast,” Lehman said, as reported by Law360.

Lehman believes Google will rely more heavily on machine learning to evaluate text than user data, according to an email Lehman wrote in 2018, as reported by Fortune:

  • “Huge amounts of user feedback can be largely replaced by unsupervised learning of raw text,” he wrote.

User vs. training data. There was also confusion around "user data" vs. "training data" when it came to BERT. Big Tech on Trial reported:

“DOJ’s attempt to impeach Lehman’s testimony also seemed to backfire. In response to a DOJ question about whether Google had an advantage in using BERT over competition because of its user data, Lehman testified that Google’s ‘biggest advantage in using BERT’ over its competitors was that Google invented BERT. DOJ then put up an exhibit titled ‘Bullet points for presentation to Sundar.’ One of the bullets on this exhibit said the following (according to my notes): ‘Any competitor can use BERT or similar technologies. Fortunately, our training data gives us a head-start. We have the opportunity to maintain and extend our lead by fully using the training data with BERT and serving it to our users…’

This likely would have been an effective impeachment of Lehman if “training data” meant some kind of user data. But after DOJ concluded its re-direct examination, Judge Mehta asked Lehman what “training data” referred to. Lehman explained it was different from user search data.”


Sensitive Topics. Lehman was also asked by DOJ attorney Erin Murdock-Park about a slide from one of his slide decks on “Sensitive Topics” that instructed employees to “not discuss the use of clicks in search…”

According to reporting from Big Tech on Trial (via X), Lehman said “we try to avoid confirming that we use user data in the ranking of search results.”

The reporter's X post says "I didn't get great notes on this, but I think the reason had something to do with not wanting people to think that SEO could be used to manipulate search results."

Google = liars? Since discovering this testimony, SEOs have been quick to use Lehman’s quotes as definitive proof that Google has been lying about using clicks or click-through rate for all of its 25 years.

The question of whether Google uses clicks was the first question asked last week during an AMA with Google's Gary Illyes at Pubcon Pro in Austin. Illyes' answer was "technically, yes," because Google uses historical search data for its machine-learning algorithm RankBrain.

Technically yes, translated from Googler speak, means yes. RankBrain was trained on user search data.

We know this because Illyes already told us this in his "I am Gary Illyes, Google's Chief of Sunshine and Happiness & trends analyst. AMA" on Reddit in 2018. He said RankBrain:

  • “uses historical search data to predict what would a user most likely click on for a previously unseen query.”

RankBrain was used for all searches, impacting “lots” of them, starting in 2016.

So how is Google Search using clicks? The fact that Google tracks every click in Search does not mean clicks are necessarily used as a direct ranking factor. In other words, it's not the case that if site A gets 100 clicks and site B gets 101 clicks, site B automatically jumps up to Position 1.

Much like how Google employs people (quality raters) to rate the quality of its search results, Google has said it uses click data for evaluating experiments and for personalization.

“…we take a subset of users and force the experiment, ranking and/or ux, on them. Let’s say 1% of users get the update or launch candidate, the rest gets the currently deployed one (base). We run the experiment for some time, sometimes weeks, and then we compare some metrics between the experiment and the base. One of the metrics is how clicks on results differ between the two.”

– Gary Illyes, I am Gary Illyes, Google’s Chief of Sunshine and Happiness & trends analyst. AMA.

In a 2017 interview, Illyes said clicks are a “very noisy signal”:

“Clicks in general are a very noisy signal. I worked on trying to make observations from click data. It’s like a Gordian knot. Because there are tons of people who are scraping the results and trying to fetch ranking data, and for whatever reason, they also decide to click on things automatically. Links. It’s just a huge mess.

When we have controlled experiments, then obviously we have to look at click data. Before we launch a ranking change, typically what we do is to isolate 1% of the users and give them modified search results, modified by the new ranking algorithm or a piece of the algorithm and see how they like the new results. And in these instances, we do look for long clicks, short clicks, and so on. But in general, as I said, it’s a huge mess.”

Let Me Google That For You – An Interview with Gary Illyes
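The experiment setup Illyes describes — bucket roughly 1% of users into an experiment arm, leave the rest on the base ranking, then compare aggregate click metrics between the two arms — can be sketched in a few lines. This is a hypothetical illustration, not Google's actual system; the function names, the bucketing scheme, and the click rates are all invented for the example.

```python
# Hypothetical sketch of experiment-vs-base click evaluation.
# Not Google's implementation; names and numbers are illustrative.
import random

random.seed(42)

def assign_arm(user_id: int, experiment_pct: float = 0.01) -> str:
    """Deterministically bucket a user into 'experiment' (~1%) or 'base'."""
    return "experiment" if hash(user_id) % 10_000 < experiment_pct * 10_000 else "base"

def compare_click_metrics(logs):
    """Aggregate click-through rate per arm from (arm, clicked) events."""
    totals = {"base": [0, 0], "experiment": [0, 0]}  # [clicks, impressions]
    for arm, clicked in logs:
        totals[arm][0] += int(clicked)
        totals[arm][1] += 1
    return {arm: clicks / impressions if impressions else 0.0
            for arm, (clicks, impressions) in totals.items()}

# Simulate traffic where the experiment ranking earns slightly more clicks.
logs = []
for user_id in range(20_000):
    arm = assign_arm(user_id)
    ctr = 0.33 if arm == "experiment" else 0.30
    logs.append((arm, random.random() < ctr))

print(compare_click_metrics(logs))
```

The point of the sketch is the shape of the comparison: clicks are not fed back into ranking per-result, they are aggregated per arm and used to judge whether the launch candidate is an improvement over the base.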

And here’s an exchange between Illyes and Search Engine Land co-founder Danny Sullivan (now at Google) from 2015:

Sullivan: Okay. How about click through rate? We know you measure what clicks are going on. Is that part of the algorithm?

Illyes: So… We use… clicks… in a few different ways. The main things we use clicks on are evaluation and experimentation. These are the two main things. There are many many people who are trying to induce noise in clicks. One would be Rand Fishkin. Using those clicks directly in ranking, would be pretty…

Sullivan: is Rand just clicking on stuff to mess things up?

Illyes: I think what he’s doing is hiring people to click and stuff, experiment etc. Using clicks directly in ranking would not make too much sense with that noise.

Sullivan: But do you use it at all?

Illyes: Okay, yes. In certain cases. Okay, let me give you an example. In certain cases it makes sense to use clicks directly. For instance if you have personalized results, and you search for apple, the first time you searched for apple we would most likely serve you a disambiguation box. Do you mean the company, or the fruit? If you had clicked on Apple the company in the past, we know you are most likely interested in Apple the company. The second time you click on Apple the company, we become more convinced that’s what you’re looking for.

If you’re a programmer, after a few searches, your searches will be dominated by the programming language results.

Sullivan: So you’re using it for personalization?

Illyes: Yes exactly, the thing [click through rates] is about personalization, if you want to mess up your own search results by randomly clicking on stuff, go ahead.

AMA with Google Search SMX 2015: Danny Sullivan and Gary Illyes
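The "apple" disambiguation Illyes walks through — show a disambiguation box the first time, then lean on the user's own past clicks to pick an interpretation — is simple enough to sketch. Again, this is a made-up illustration of the idea, not Google's code; the class name, threshold, and labels are assumptions.

```python
# Hypothetical sketch of click-based personalization for ambiguous queries.
# Invented for illustration; not Google's implementation.
from collections import Counter

class ClickPersonalizer:
    def __init__(self):
        # (query, interpretation) -> number of past clicks by this user
        self.clicks = Counter()

    def record_click(self, query: str, interpretation: str) -> None:
        self.clicks[(query, interpretation)] += 1

    def resolve(self, query: str, interpretations: list):
        """Return the interpretation this user has clicked most often,
        or None (i.e. show a disambiguation box) if there is no history."""
        counts = {i: self.clicks[(query, i)] for i in interpretations}
        best = max(counts, key=counts.get)
        return best if counts[best] > 0 else None

p = ClickPersonalizer()
# First search: no history, so disambiguate.
print(p.resolve("apple", ["Apple the company", "apple the fruit"]))
# User clicks the company result; later searches resolve directly.
p.record_click("apple", "Apple the company")
print(p.resolve("apple", ["Apple the company", "apple the fruit"]))
```

This matches the behavior in the exchange: the clicks influence only that user's own results, which is also why Illyes can shrug off someone "randomly clicking on stuff" — they would only mess up their own personalization.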

Why we care. Does Google use clicks? Clearly, yes. But again, probably not as a direct ranking signal (though admittedly I can’t say that with 100% certainty as I don’t work at Google or have access to the algorithm). I know clicks are noisy and easy to manipulate. And for many sites/queries, there simply wouldn’t be enough data to evaluate to make it a useful ranking signal for Google.

Dig deeper. The biggest mystery of Google’s algorithm: Everything ever said about clicks, CTR and bounce rate

Additional reading. In Google Patents Click-Through User Feedback on Search Results to Improve Rankings (2015), Bill Slawski described a patent explaining how Google might rank pages based on user feedback (clicks) in response to rankings for those pages.


About the author

Danny Goodwin
Staff
Danny Goodwin has been Managing Editor of Search Engine Land & Search Marketing Expo - SMX since 2022. He joined Search Engine Land in 2022 as Senior Editor. In addition to reporting on the latest search marketing news, he manages Search Engine Land’s SME (Subject Matter Expert) program. He also helps program U.S. SMX events.

Goodwin has been editing and writing about the latest developments and trends in search and digital marketing since 2007. He previously was Executive Editor of Search Engine Journal (from 2017 to 2022), managing editor of Momentology (from 2014-2016) and editor of Search Engine Watch (from 2007 to 2014). He has spoken at many major search conferences and virtual events, and has been sourced for his expertise by a wide range of publications and podcasts.
