Looking across languages: Mass translation of the world’s news

Kalev H. Leetaru
George Washington University; Founder of the GDELT Project

Imagine a world where language is no longer a barrier to information access: where anyone can access real-time information from anywhere in the world in any language, seamlessly translated into their native tongue, and where their voice is equally accessible to speakers of all the world’s languages. Authors from Douglas Adams to Ethan Zuckerman have long articulated such visions of a post-lingual society in which mass translation eliminates barriers to information access and communication. Yet, even as technologies like the web have broken down geographic barriers and increasingly made it possible to access information from anywhere in the world, linguistic barriers mean most of those voices remain steadfastly inaccessible. What would it look like if one simply translated the entirety of the world’s news coverage each day in real time using massive machine translation?

For the past two years the GDELT Project (http://gdeltproject.org/) has been monitoring global news media, identifying the people, locations, counts, themes, emotions, narratives, events and patterns driving global society. Working closely with governments, media organizations, think tanks, academics, NGOs, and ordinary citizens, GDELT has been steadily building a high-resolution catalog of the world’s local media, much of which is in languages other than English. Enabling GDELT to look across this material holistically required building a massive machine translation infrastructure capable of translating the world’s daily journalistic output in 65 languages in real time. This talk will explore what it looks like to use massive computing power to look across the world’s information, cultures, and languages to try to better understand the driving forces of global society and what it means to be human.