Estimating the Total Volume of Queries to Google
Ruggieri S.
2019-01-01
Abstract
We study the problem of estimating the total volume of queries in a specific domain that were submitted to the Google search engine in a given time period. Our statistical model assumes that the population of queries in the reference domain follows a Zipf's law distribution, and that queries are sampled non-uniformly or with noise. The parameters of the distribution are estimated using nonlinear least squares regression. Estimates, with their errors, are then derived for the total number of queries and for the total number of searches (volume). We apply the method to the recipes and cooking domain, where a sample of queries is collected by crawling popular Italian websites specialized in this domain. The relative volumes of the queries in the sample are computed using Google Trends and converted to absolute frequencies after estimating a scaling factor. Our model estimates that Italian recipes and cooking queries with at least 10 monthly searches submitted to Google in 2017 amount to a total volume of 7.2B searches.
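As a rough illustration of the estimation step summarized in the abstract, the sketch below fits a Zipf-like curve f(r) = C / r^s to rank-frequency data by nonlinear least squares and then extrapolates the number of queries and the total volume above a minimum frequency. It is a minimal sketch under stated assumptions, not the paper's procedure: the function names `fit_zipf` and `estimate_totals`, the search interval for the exponent, and the synthetic data are all hypothetical, and the non-uniform or noisy sampling model and the error estimates mentioned in the abstract are not reproduced.

```python
import math


def fit_zipf(ranks, freqs, s_lo=0.1, s_hi=3.0, tol=1e-9):
    """Fit f(r) = C / r**s by nonlinear least squares (illustrative only).

    For a fixed exponent s the model is linear in C, so the best C has a
    closed form. Only s needs a one-dimensional search; golden-section
    search is used here, assuming the squared error has a single minimum
    in [s_lo, s_hi] (an assumption of this sketch).
    """
    def best_c(s):
        num = sum(f * r ** -s for r, f in zip(ranks, freqs))
        den = sum(r ** (-2 * s) for r in ranks)
        return num / den

    def sse(s):
        c = best_c(s)
        return sum((f - c * r ** -s) ** 2 for r, f in zip(ranks, freqs))

    phi = (math.sqrt(5) - 1) / 2
    a, b = s_lo, s_hi
    while b - a > tol:
        x1 = b - phi * (b - a)
        x2 = a + phi * (b - a)
        if sse(x1) < sse(x2):
            b = x2  # minimum lies in [a, x2]
        else:
            a = x1  # minimum lies in [x1, b]
    s = (a + b) / 2
    return best_c(s), s


def estimate_totals(c, s, min_freq):
    """Extrapolate the fitted curve down to a minimum frequency.

    Returns the number of ranks r with c / r**s >= min_freq (up to
    floating-point rounding at the boundary) and the sum of the fitted
    frequencies over those ranks.
    """
    n_queries = int((c / min_freq) ** (1 / s))
    volume = sum(c / r ** s for r in range(1, n_queries + 1))
    return n_queries, volume


# Synthetic, noise-free data (hypothetical values, not from the paper).
sample_ranks = list(range(1, 51))
sample_freqs = [1000 / r ** 1.2 for r in sample_ranks]

c_hat, s_hat = fit_zipf(sample_ranks, sample_freqs)
n_hat, volume_hat = estimate_totals(c_hat, s_hat, min_freq=10)
print(c_hat, s_hat, n_hat, volume_hat)
```

Splitting the fit into a closed-form scale and a one-dimensional search over the exponent keeps the example dependency-free. A general-purpose nonlinear least squares routine from a numerical library would be a reasonable substitute and would also make it easier to obtain parameter uncertainties.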