Google Reveals Two New Web Crawlers

Google has introduced two new web crawlers that fetch image and video content for research and development purposes.

The new crawlers are variations of the GoogleOther crawler, which launched in April 2023. The original GoogleOther crawler fetches publicly accessible content for various product teams within Google, often for one-off research and development crawls. The two new versions, named GoogleOther-Image and GoogleOther-Video, specialize in crawling binary data: images and videos, respectively.

The division of labor is simple: GoogleOther-Image crawls only image content, and GoogleOther-Video crawls only video content. Publishers who want to keep these crawlers off their sites can add the user agent tokens Google has provided to their robots.txt file. The user agent tokens and full user agent strings for each are as follows:

GoogleOther-Image:

– User agent tokens: GoogleOther-Image, GoogleOther
– Full user agent string: GoogleOther-Image/1.0

GoogleOther-Video:

– User agent tokens: GoogleOther-Video, GoogleOther
– Full user agent string: GoogleOther-Video/1.0
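
As a rough illustration of how these tokens might be used, the sketch below writes a robots.txt that blocks both new crawlers and checks it with Python's standard urllib.robotparser module. The example.com URLs and the rules in ROBOTS_TXT are assumptions made for illustration, and Python's parser only approximates the way Google matches user agents, so treat the result as a sanity check rather than a guarantee.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks both new crawlers site-wide.
# The tokens come from the list above; the rules are an assumption.
ROBOTS_TXT = """\
User-agent: GoogleOther-Image
Disallow: /

User-agent: GoogleOther-Video
Disallow: /
"""

# Placeholder URLs, not real resources.
IMAGE_URL = "https://example.com/media/photo.jpg"
VIDEO_URL = "https://example.com/media/clip.mp4"


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text in memory, without any network request."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Ask whether each crawler may fetch a URL under these rules.
image_allowed = parser.can_fetch("GoogleOther-Image/1.0", IMAGE_URL)
video_allowed = parser.can_fetch("GoogleOther-Video/1.0", VIDEO_URL)
googlebot_allowed = parser.can_fetch("Googlebot", IMAGE_URL)

print("GoogleOther-Image allowed:", image_allowed)
print("GoogleOther-Video allowed:", video_allowed)
print("Googlebot allowed:", googlebot_allowed)
```

Because both crawlers also list the GoogleOther token, a single group addressed to GoogleOther would presumably cover them along with the original crawler, though that is worth confirming against Google's documentation before relying on it.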

Additionally, Google has updated the user agent strings for the original GoogleOther crawler to reflect the specific technologies in use, such as the version of Chrome. This makes it easier for publishers to identify these crawlers in their server logs.
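
For publishers who want to spot these crawlers in their logs, a minimal sketch of user agent matching might look like the following. The classify_user_agent helper and the sample strings passed to it are hypothetical; only the three tokens come from the article. A match shows what a request claims to be, since a user agent string can be sent by any client.

```python
from typing import Optional

# Tokens from the article, most specific first so that "GoogleOther"
# does not shadow the image and video variants.
TOKENS = ("GoogleOther-Image", "GoogleOther-Video", "GoogleOther")


def classify_user_agent(user_agent: str) -> Optional[str]:
    """Return the first GoogleOther token found in the string, or None."""
    for token in TOKENS:
        if token in user_agent:
            return token
    return None


# Sample strings for illustration; the last two are made-up examples,
# not verified user agent strings.
samples = [
    "GoogleOther-Image/1.0",
    "GoogleOther-Video/1.0",
    "Mozilla/5.0 (compatible; GoogleOther)",
    "Mozilla/5.0 (compatible; SomeOtherBot/1.0)",
]

for ua in samples:
    print(ua, "->", classify_user_agent(ua))
```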

With these updates, Google is extending its research and development crawling to cover image and video content with dedicated crawlers. Publishers who do not want their media fetched can block the new crawlers with the user agent tokens listed above.