Q&A with Vered Shwartz – Lost in Automatic Translation

Headshot of Dr. Vered Shwartz with the cover of her book to the right
June 10, 2025

How do language technologies like ChatGPT, Siri, and Google Translate shape our everyday lives, and what do they miss? In this engaging book, LangSci member Dr. Vered Shwartz (Assistant Professor, UBC Computer Science) explores the hidden costs and surprising consequences of relying on AI to navigate an English-speaking world. We spoke with Vered about the opportunities and risks of these tools, what remains uniquely human in communication, and how both developers and users can push for more inclusive and responsible technologies.

Read the Q&A below as we look forward to the release of her book Lost in Automatic Translation: Navigating Life in English in the Age of Language Technologies. The book will be released in July 2025 and will be available in hardcover, paperback, and e-book. Pre-order it now:

Get the book! 

1. What motivated you to explore the impact of language technologies on how we communicate, especially across languages and cultures?

I’ve been working on natural language processing for over a decade, predominantly developing tools for English. These technologies have made incredible progress since I started, first with automatic translation maturing and becoming fairly reliable for many language pairs, and more recently with large language models (LLMs) enabling smooth natural-language interaction between users and computers and performing a range of tasks for us. As a non-native English speaker, I went through a parallel process of improving my English, acquiring everything from vocabulary to figurative expressions to cultural references after moving to the US and then Canada. Having gone through this process as both a user of language technologies and a researcher studying them motivated me to explore this question.

2. Where do you see the greatest risks and the greatest opportunities in language technologies? 

We are experiencing an exciting era in which language technologies are increasingly deployed and used. LLMs are already being applied in fields such as education, medicine, and law. There is a lot of hope that they will automate tasks, generate ideas and knowledge, and free us to work less or to work on more interesting things. But we have yet to see their long-term impact on the job market (for example, whether they will make us obsolete or also create new jobs) and on society at large (for example, their effect on human connection). Among my more immediate concerns are the technical limitations that LLMs still suffer from, such as their “hallucination” problem, their limited reasoning abilities, their overconfidence, and their societal biases. I’m worried that we’re rushing to deploy them in sensitive domains and are tempted to cut costs by replacing people with LLMs. Instead, we should take the time to use them responsibly, augmenting people where appropriate.

3. In your view, what is still uniquely "human" about communication that technology struggles to replicate? 

I believe that interactions with language technologies are mostly transactional, and that people likely don’t feel the same way they do when interacting with other people, even when these technologies behave in seemingly human ways, such as expressing empathy or using fillers like “uh”. To give a more concrete example, pragmatics is one aspect of communication that is still far more present in human interactions. Human conversations have a situational context: where and when they take place, the relationship between the participants, the cultural background of each participant, previous interactions, and more. Human interaction is efficient, so we leave a lot of implied meaning unsaid. Language technologies for the most part don’t have this context, and things are often “lost in translation”. My favorite example is a cake recipe I translated from Hebrew to English that called for “preheating the oven to 180 degrees”. The recipe omitted the implied Celsius unit, which its intended audience in Israel could infer. When I used Google Translate to translate the recipe for my Canadian partner, it faithfully produced “preheat the oven to 180 degrees”. It worked exactly as expected, but had he been less experienced with baking, he would have underbaked the cake at 180 degrees Fahrenheit, the unit implied in recipes for him.
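The arithmetic behind the anecdote makes the stakes concrete. Here is a minimal sketch in Python using the standard Celsius/Fahrenheit conversion formulas (the function names are mine, not from the book):

```python
def c_to_f(celsius: float) -> float:
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32

def f_to_c(fahrenheit: float) -> float:
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (fahrenheit - 32) * 5 / 9

# The Israeli recipe means 180 degrees Celsius:
print(f"180 C = {c_to_f(180):.0f} F")  # 180 C = 356 F

# A North American reader may assume 180 degrees Fahrenheit,
# which is only about 82 C, far too cool to bake a cake:
print(f"180 F = {f_to_c(180):.1f} C")  # 180 F = 82.2 C
```

The same phrase, “180 degrees”, thus denotes oven temperatures roughly 100 degrees Celsius apart depending on the reader’s cultural context, exactly the kind of unstated pragmatic knowledge that a literal translation silently drops.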

4. Are there things users, especially non-native speakers, can do to improve their experience with language technologies, or is the responsibility primarily on developers?

That’s a great question. I think developers have a huge responsibility to present a realistic picture of what these technologies can do, what they still can’t do, and what their risks are. Unfortunately, developers have a financial incentive to sell us these technologies as a finished product. So even when they tell us about the limitations, they do so in a way that minimizes them. For example, the disclaimer about hallucinations is a line of tiny print at the bottom of the page, “ChatGPT/Gemini can make mistakes”, that most people not already familiar with this problem would not notice or take seriously. Another example is some companies’ focus on the “existential threats” from LLMs: they are trying to convince us that their technologies are so powerful that they threaten humanity, distracting us from the actual harm that is already happening as these products, with all their limitations, are deployed in real-world applications. As users, we need to be careful, curious, and skeptical when we use these technologies, and to find ways to use them productively without overly trusting them. To enable that, I think we need to educate people about AI and teach critical thinking skills in schools.

5. If you could redesign one aspect of today's language technologies to better reflect the diversity of voices, what would you change? 

One of the key enablers of progress in language technologies was the availability of massive amounts of data to train these models. When that data is unavailable, the results are far less impressive. For example, translation mostly works well for language pairs with a lot of online data, such as English and French, but it is far worse for low-resource languages such as Igbo. Modern LLMs are trained on multiple languages, and you can interact with them in multiple languages, but the quality is not equal, because most of the training data, which is web text, is in English. And since most of the English text online comes from users in the US, a further side effect is that LLMs learn about the world through a North American lens. One way to present LLM users with diverse perspectives could be to train multiple specialized LLMs, prompt them in multiple languages to get different perspectives, and synthesize the results through a user interface that lets users explore the different perspectives, as in the sketch below. It’s not perfect, but it’s better than presenting a single perspective or generating a wall of text that users wouldn’t want to read.
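As a rough illustration of that idea, here is a minimal sketch, assuming hypothetical query_llm and translate helpers (neither is a real API) and illustrative model names. It simply asks each culturally specialized model the same question in its own language and lays the answers out side by side:

```python
# Hypothetical sketch of the multi-perspective pipeline described above.
# query_llm() and translate() are placeholders for real model and
# machine-translation calls; the model names are illustrative only.

SPECIALIZED_MODELS = {
    "en": "llm-english",   # trained mostly on English web text
    "he": "llm-hebrew",    # trained mostly on Hebrew text
    "ja": "llm-japanese",  # trained mostly on Japanese text
}

def gather_perspectives(question_en, query_llm, translate):
    """Ask each specialized model the same question in its own
    language, then translate the answers back to English."""
    answers = {}
    for lang, model in SPECIALIZED_MODELS.items():
        prompt = translate(question_en, source="en", target=lang)
        answer = query_llm(model=model, prompt=prompt)
        answers[lang] = translate(answer, source=lang, target="en")
    return answers

def present(answers):
    """Lay the perspectives out side by side so users can explore
    them, rather than collapsing them into a single point of view."""
    return "\n".join(f"[{lang}] {text}" for lang, text in answers.items())
```

The key design choice is the final step: rather than averaging the answers into one voice, the interface keeps them separate so users can see where cultural perspectives agree and where they diverge.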

Don't forget to pre-order her book! 

To learn more about Dr. Vered Shwartz's research, particularly her insights into the causes and solutions of biases in AI, check out our earlier Q&A with her as part of the LangSci Meets AI series: 

LangSci Meets AI Series #3: Dr. Vered Shwartz 



