Extractive Summaries – New Google Algorithm

Google has published a research paper describing a new algorithm that can take content from different sites and use it to create “logically coordinated” articles.

By generating original content, the new algorithm can answer users’ questions without redirecting them to other sites.

How the new algorithm works

At the first stage, the new algorithm summarizes web content using an algorithm that “extracts” site content and then cuts off the irrelevant parts, similar to the algorithms used to create featured snippets.

The results generated by this algorithm are called “Extractive Summaries” in the article, because they consist of content extracted from web pages. In effect, these summaries are a selection of the most important sentences relevant to the user’s question.
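The extraction step described above can be illustrated with a minimal Python sketch. It is not the method from Google's paper: the scoring rule (counting how often query words occur in a sentence), the naive sentence splitter, and the names `extractive_summary` and `top_k` are all assumptions chosen for illustration.

```python
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter: break after '.', '!' or '?' followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(text):
    # Lowercase alphanumeric tokens; punctuation is dropped.
    return re.findall(r"[a-z0-9]+", text.lower())


def extractive_summary(query, documents, top_k=3):
    """Return up to top_k sentences that share the most words with the query.

    Hypothetical scoring rule: a sentence's score is the number of
    query-word occurrences it contains. Sentences scoring 0 are cut off
    as irrelevant.
    """
    query_terms = set(tokenize(query))
    scored = []
    for doc in documents:
        for sentence in split_sentences(doc):
            counts = Counter(tokenize(sentence))
            score = sum(counts[term] for term in query_terms)
            if score > 0:
                scored.append((score, sentence))
    # Python's sort is stable, so ties keep their original document order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [sentence for _, sentence in scored[:top_k]]


if __name__ == "__main__":
    pages = [
        "Solar panels convert sunlight into electricity. Our shop opens at nine.",
        "Sign up for the newsletter! Sunlight hits the solar panels and frees electrons.",
    ]
    for line in extractive_summary("how do solar panels work", pages):
        print(line)
```

The output contains only sentences copied verbatim from the source pages, which is what makes a summary extractive rather than abstractive.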

At the second stage, the new algorithm uses another kind of algorithm called Abstractive Summary, which is a form of paraphrasing. The drawback of machine paraphrasing is that almost a third of such summaries contain fabricated facts.

According to the article, Google researchers have found a way to combine the best aspects of both approaches. They use Extractive Summaries to pull the most important facts from web documents and then apply the Abstractive Summary to paraphrase this content. The result is a new document based on information found on the Internet. In effect, Google creates its own version of Wikipedia.
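The two-phase structure can be sketched as a small Python pipeline in which each stage is a pluggable function. This shows only how the stages are chained. The names `generate_article`, `keyword_extract` and `join_abstract` are hypothetical, and the abstractive stand-in simply joins the extracted text; it does not paraphrase, whereas a real system would use a trained language model at that step.

```python
from typing import Callable, List

# Stage signatures (assumed for this sketch).
Extractor = Callable[[str, List[str]], List[str]]
Abstractor = Callable[[List[str]], str]


def generate_article(
    query: str,
    documents: List[str],
    extract: Extractor,
    abstract: Abstractor,
) -> str:
    """Run the extractive stage, then hand its output to the abstractive stage."""
    facts = extract(query, documents)
    if not facts:
        # Nothing relevant was found, so there is nothing to paraphrase.
        return ""
    return abstract(facts)


def keyword_extract(query: str, documents: List[str]) -> List[str]:
    # Stand-in extractor: keep passages that share at least one word with the query.
    terms = set(query.lower().split())
    return [d for d in documents if terms & set(d.lower().split())]


def join_abstract(facts: List[str]) -> str:
    # Stand-in abstractor: concatenates the facts without rewriting them.
    return " ".join(facts)


if __name__ == "__main__":
    passages = ["Solar panels make electricity.", "Buy shoes today."]
    print(generate_article("solar panels", passages, keyword_extract, join_abstract))
```

Keeping the stages separate means the abstractive step only ever sees content that the extractive step has already judged relevant, which is the combination the article describes.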

Featured snippets are the first step

Answer boxes (featured snippets) are an example of the extractive summarization described above. The new two-phase algorithm can be applied to books, open databases, and any public web pages.

In the study, Wikipedia topics were used as search queries, and Google search results served as the source of the extracted summaries. The algorithm then paraphrased this content to create completely new articles. The researchers also ran a parallel test, generating a second set of articles using only the references cited in the Wikipedia articles.

Results of the experiment

Summing up, the researchers note that the experiment was successful. Google can create its own content by aggregating the contents of web pages, thereby answering users’ questions without redirecting them to other sites.

However, there is no mention of when Google will begin to apply this algorithm.