Pirate AI | Eurozine

ChatGPT and generative AI truly burst onto the scene in late 2022. There’s a good chance that the intersection between books, publishing and AI first crossed your radar sometime in 2023, quite possibly after The Atlantic’s investigation revealed hundreds of thousands of pirated books had been used to train this technology. It was all going so fast.

In January 2024 the Society of Authors in the UK surveyed its members about generative AI and its impact on them professionally. The results, published in the spring, certainly gave pause for thought. Among the respondents, more than 40 per cent of translators reported reduced income due to generative AI, and more than 75 per cent of translators expected generative AI would negatively affect their future income.

Yet my own AI-related stomach drop moment came in February 2024 when I read an op-ed in the Swedish newspaper Aftonbladet provocatively headlined ‘AI will replace all of us translators’. In it, Kalle Hedström Gustafsson described his panic at the rapid encroachment of AI into the world of translation and the likely and imminent decline of the profession. Accusing the translation profession and interested bystanders of burying their heads in the sand, he made a sensational claim. He declared that, with minimal editing, AI is already producing publishable outputs for some types of literature. And, more damning, that technology’s dominance across all types of literature is inevitable and imminent.

I recognize a doom-monger when I read one, but my stomach hadn’t exaggerated – much of what he said rang true. If the robots were coming for the translators into Swedish, then they would presumably be coming for me working out of Swedish too. Feeling queasy for weeks, for the first time ever I devoted hours to thinking seriously about careers I might explore outside of translation. I was bound to be out of a job soon – why hadn’t I given this a moment’s thought until now?

‘Pirates Used to Do That to Their Captains Now and Then’, illustration of a dead captain left on shore. Originally published in Thomas Janvier, ‘Sea Robbers of New York’, Harper’s Magazine, November 1894. Image via Wikimedia Commons

The tech

In the halcyon days of yore when I started out at the beginning of the 2010s, Google had already been providing an online translation option of one type or another for a decade. By 2012 the service had reached 200 million monthly users and had been using a statistical machine translation (SMT) approach for six years. Google engineer Franz Och said: ‘What all the professional human translators in the world produce in a year, our system translates in roughly a single day.’ He invited the reader to ‘imagine a future where anyone in the world can consume and share any information, no matter what language it’s in.’

No surprise that it was a running joke with colleagues starting out on their own translation journeys that our translating days were already numbered.

Yet our inevitable doom didn’t arrive – at least not as quickly as we had expected. This was perhaps partly because in the world of commercial translation, CAT (computer-assisted translation) tools had been commonplace for years. In lay terms, the software cuts source texts up into segments (usually by sentence) and then the human translator translates each segment, which is saved in a translation memory. This can subsequently be consulted and leveraged by the software and its user. CAT tools provided the translation industry with a major productivity boost, but it wasn’t altogether clear if machine translation would deliver the same shot in the arm. In short, the quality of SMT was simply not good enough. But the creep of these (online) machines into a commercial translator’s daily existence was nevertheless palpable.
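For readers who have never seen a CAT tool, the segment-and-remember workflow described above can be sketched in a few lines of Python. This is only a rough illustration under my own assumptions, not how any commercial tool actually works: the names `segment`, `TranslationMemory`, `store` and `lookup` are hypothetical, the sentence splitter is deliberately naive, and the 0.75 similarity threshold is an arbitrary choice.

```python
import re
from difflib import SequenceMatcher


def segment(text):
    """Cut a source text into sentence-like segments (naive rule:
    split on whitespace that follows '.', '!' or '?')."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


class TranslationMemory:
    """Hypothetical in-memory store of human-translated segments."""

    def __init__(self):
        self._entries = {}  # source segment -> human translation

    def store(self, source, target):
        """Save the translator's rendering of one segment."""
        self._entries[source] = target

    def lookup(self, source, threshold=0.75):
        """Return (stored_source, target, score) for the most similar
        stored segment scoring at or above the threshold (an assumed
        value), or None if nothing is similar enough."""
        best = None
        for stored, target in self._entries.items():
            score = SequenceMatcher(None, source, stored).ratio()
            if score >= threshold and (best is None or score > best[2]):
                best = (stored, target, score)
        return best


if __name__ == "__main__":
    tm = TranslationMemory()
    # The human translator works through the first text segment by segment.
    tm.store("Boken är röd.", "The book is red.")
    # A later text is segmented and each segment is checked against the memory.
    for seg in segment("Boken är röd. Boken är blå."):
        print(seg, "->", tm.lookup(seg))
```

An exact repeat comes back with a score of 1.0, while a near-repeat is offered only as a suggestion for the translator to adapt. The point of the design is that the human remains the author of every stored segment; the software merely remembers.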

Then in the mid-2010s, a new breakthrough: the rise of neural machine translation (NMT). NMT was a harbinger of what was to come – machine translation suggestions were getting better. A lot better. Or at least, that was the case for language pairs like Swedish and English, which are relatively close linguistically and have large corpora on which to build these systems. Rates weren’t going up. Deadlines were getting shorter. After the oddity of COVID-19’s early days, the 2020s seemed to be characterized by this onward technological march. And then ChatGPT was launched on 30 November 2022 and the world went a bit mad.

Rumblings in Scandi books

Unsurprisingly, the arrival of fully fledged generative AI has done nothing to improve the situation for commercial translators. But there’s no impact on literary translators, right? Wrong.

There was obviously something in the air in 2022. The first hint that a publisher had caught on to the opportunities offered by the machine-translation-based race to the bottom arrived in my inbox early that summer. It came from a major Danish publisher for whom I had previously worked, and who was now attempting to push into translated e-book and audiobook foreign literature markets. The covering e-mail read ‘we have recently started developing ideas that could help make the translation pipeline more efficient’. The accompanying survey focused almost entirely on the use of CAT tools and, more importantly, on the post-editing and use of machine translation output.

Not long after that, a colleague told me about their own experience with another major Scandinavian publisher using a similar ‘take-over-the-world’ approach. Media coverage at the time made it clear that the company was in a budgetary hole. Where they had previously been funding human translators at good rates, they began to outsource and offshore these operations to translation companies outside the publishing space in jurisdictions where the likeliest outcome was a race to the bottom and the use of whatever wizardry could be found online for free.

Scandinavia has a strong culture of books and literacy, and a robust publishing market that has been increasingly innovative in the twenty-first century. While the e-book has never quite taken off in Scandinavia as it did elsewhere (possibly because Amazon has only recently entered the market), the region has been a pioneer in the world of audiobook streaming, the development of cutting-edge ‘hybrid’ publishing (authors part-pay to publish) and the professionalization of the foreign rights scene in a way that rivals the cut and thrust of the Anglo-American agenting world. By autumn 2022, a new literary agency had launched in Sweden promising to help the authors it took on to secure publication in Sweden but, more importantly, to make the jump abroad, with euphemisms for ensuring cost-effective translation that hinted at potential shortcuts circumventing ‘expensive’ translators. By early 2023 Swedish media was covering publisher Lind & Co’s decision to use AI to translate genre fiction into Swedish, with translators on social media aghast.

Back to 2024

Kalle Hedström Gustafsson’s gloomy piece in Aftonbladet was an omen. Just a day after reading it, I received a call from a Swedish literary agent asking whether I could assist in the 36-hour turnaround of a large chunk of text translated from an early draft using AI. My stomach dropped again. The professionalization of the Scandinavian rights industry has been heavily driven by the use of long English-language sample translations of original works, and the production of these is a valuable source of work for translators like me. This time, I was genuinely busy and could turn down the project without a moment’s thought.

LBF and spring

Of course, the reason for the rush had been that it was just days before the London Book Fair (LBF). I arrived feeling glum. The front page of The Bookseller magazine on day one hardly improved matters. It reported that literary scouts were ‘pivoting’ to the use of AI in their operations, including the preparation of samples. Later on, standing on the fringes of the packed Literary Translation Centre venue listening to a panel discussing the issue of ‘translation and AI’, I overheard two LBF visitors walking past wondering why there was so much of a crowd. ‘It’s about translation and AI’ one said to the other. ‘Just use it’ the other quipped, leaving this eavesdropper reeling.

Having put the hubbub of LBF behind me, some of the nausea that I felt in late February and early March began to dissipate. My ego had been flattered by the kindness of those who bought and appreciated my translated words. The op-ed was pushed to the back of my mind as I received commissions for me-generated translations from clients. But then came another request – for a ‘skilled translator’ to edit a full book translated using DeepL Pro to get it to ‘tip-top quality’. After over-thinking this at length – after all, these sample translations are expensive, and a chance to save costs is perhaps a wise business decision by a literary agent – I settled for politely declining without giving any reason.

Then – bam – the Scandinavian tendency towards exploring new platforms struck again. In May, the launch of a new publisher, Aniara Press, was announced. This Swedish company would handle the translation, publication and distribution of books across seven different languages in fourteen different markets – all with the help of AI-generated translations and post-editing. Prospective authors were reassured by the founder that there would be translators and editors around the world standing by to ‘check’ AI translations of their works. All a bit unsettling.

At cross-purposes

Part of the problem is unclear terminology. ‘AI’ is often thrown around without clarifying whether we’re referring to large language models (LLMs) and generative AI, like ChatGPT, or to other, non-generative, analytical or task-focused AI. In the context of translation, we are more likely to use ‘AI’ to refer to task-focused AI in the form of statistical machine translation and NMT. But not always. Even within this one sphere, everyone is at cross-purposes.

In a piece for the 2024 summer issue of the UK Society of Authors’ magazine, The Author, translator Ruth Ahmedzai Kemp explores issues around literary translation and AI. Drawing on a range of informants, she reaches some interesting conclusions. It’s noteworthy that she collates experience with AI from inside the translation profession, typifying the way that translators are engaging with a multitude of tools and activities when they say ‘AI’.

A Kazakh-English translator in Kemp’s article highlights the advantages and disadvantages of doing their work in a CAT tool with a machine translation option available for consultation – similar to having an old-school dictionary open in front of you that you are able to flip through very quickly and effectively. Another translator, working with French, describes the pitfalls of post-editing texts where the source has been fully machine translated. Yet another translator, working with German, highlights that the very term ‘post-editing’ is nebulous – there’s a perception among cash-strapped hopeful publishers that once the machine has done the heavy lifting, a human can add a little polish and be done quicker and at a fraction of the cost. However, post-editing is laborious if it is to be thorough, as the entire text has to be reviewed against the source.

There’s an all too common assumption by non-translators that we can simply ‘feed’ a book to machine translation or AI and accept whatever we’re given in return. Hence, it is bound to be quick and cheap. Translators often assume that they will be handed one of these ‘translations’ and then asked to check it – a very unsatisfactory workflow that is boring and time-consuming to boot. Yet, in the instances in Sweden described by Hedström Gustafsson, and in the case of my Danish publisher mentioned above, publishers are likely to be cutting out the translator altogether – they’re simply getting an editor to polish the target text without consulting the original.

This leads to the problem of quality and perception of quality. Hedström Gustafsson insists that AI can translate most things well. The Swedish publisher Kristoffer Lind largely agrees, although his company only uses machine translation on genre fiction at present. There has been significant discussion around this approach, noting that it is often used for works and authors who would otherwise simply remain untranslated. Yet the Swedish translator Johanna Svartström argues that most translations turned out by AI are ‘amateurish’. She disputes Hedström Gustafsson’s view that all that remains to be done is a final polish.

Once again, they are arguably referring to different things. Hedström Gustafsson (and Lind) are suggesting that AI produces translations good enough to allow a professional editor (without necessarily having any source language knowledge) to turn the output into a text that is publishable. It may not be a good translation, but it will be a readable book. Svartström, meanwhile, is almost solely focused on the quality of the translated output.

Perceptions around what ‘quality’ is matter too. Roy Youdale questions whether what we are seeing is a mirage, referring particularly to generative AI such as ChatGPT. As he puts it, these tools prioritize fluency over all else and tend to make stuff up. This ‘mirage’ is what is frequently referred to elsewhere as ‘hallucinations’ (i.e., the tendency for AI to get stuff wrong). In a peer-reviewed article provocatively titled ‘ChatGPT is bullshit’ published in the journal Ethics and Information Technology this June, the authors argue that LLMs like ChatGPT do more than just ‘hallucinate’ – they are in fact bullshit machines that are designed to churn out untruths.

All this raises the question: what are readers looking for from a translated book? A smooth finish or something that represents the original?

What about copyright?

In her piece, Kemp explored the situation around copyright for literary translators and their translations in this new era, presenting a robust argument in favour of translators retaining copyright even when working with NMT or AI. She noted not only that ‘the process remains a complex, creative one where the human translator balances two parallel texts: one fixed, and one emerging’, but also that what is key is that ‘a human – and indeed a trained, experienced bilingual human translator – was in control’ of any tools used, ‘and remains in control (in copyright and moral terms) of the translated text post-submission.’

This contrasts with the recent view taken by Denmark’s Agency for Culture and Palaces, which oversees literary policy matters in the country. The Agency’s statement was in response to an inquiry from the Danish Translators Association (part of the Danish Authors’ Society) about post-editing. The specific context was the emergence of a practice that saw publishers translate full books using AI and then have them heavily edited by monolingual editors prior to publication; these editors were then credited as ‘translators’. The issue raised was whether these ‘translators’ were entitled to receive public lending right (PLR) payments for their input. The government’s response in June 2024? A hard no. Post-editors were not creators of works and were not entitled to PLR cash.

Quality and translation experts

Franz Och concluded in 2012, ‘for nuanced or mission-critical translations, nothing beats a human translator’. This is as true today in the world of LLMs and ChatGPT as it was when the technology was still SMT-based.

In contrast to the dismissive comment I overheard at the LBF this year, I also experienced positive encounters. One literary agent I spoke to said of their latest client’s new novel: ‘just as an experiment, we tried running the first few chapters through AI, but it just wasn’t good enough.’ Phew. Another agent told me they were avoiding AI precisely because the investment of money and time into a high-quality human-translated sample was a key pillar in their sales pitch – they believed so much in the book and its author they were willing to spend top dollar to show it to prospective buyers. Phew again. Yet another agent shared the splendid news with me that they had just sold one of their authors to a UK publisher (this writer’s first outing in English), attributing this to the sample translation I had delivered to them. These agents are not unlike the translator Frank Wynne, who noted, while accepting the French-American Foundation’s 2024 Translation Prize, that he is ‘all in favour of AI translation if you simply remove the “A”, and leave the “I”.’

Long-term AI cynic Ed Zitron is sceptical about the real-life applications of AI. Much of his writing offers hope to the jaded translator fearing for their career, and his analysis of what the public want from the media they consume is apt. ‘The assumption is that audiences are stupid, and ignorant, and “just won’t care,” and I firmly disagree – I think regular people will find this stuff deeply offensive.’

A tool?

You may recall the Writers Guild of America’s (WGA) 2023 strike action against the Alliance of Motion Picture and Television Producers, which disrupted a number of productions and led to a shutdown lasting several months. Unlike the situation for literary translators in Europe, the WGA negotiates agreements with studios on a collective basis and individuals are effectively required to join the union in order to work. While the strike centred on various issues, AI was one of them. In its negotiated resolution, the WGA secured undertakings stating that AI can’t be used to rewrite literary material and that AI-generated material can’t be used to undermine a writer’s credit. Importantly, writers may choose to use AI as a tool when writing but can’t be forced to use it. Studios also have to disclose if materials they supply are AI-generated and prevent the dissemination of writers’ materials for training AI.

Ruth Ahmedzai Kemp, who supports the use of AI, exudes an air of pragmatic optimism in her article for The Author: she doesn’t see an end to human translation but rather believes that human-machine symbiosis represents ‘an evolution in professional roles’ … ‘in a context where we’ll always need human, bilingual insight, instinct and intuition.’ She advocates for the emergence of a market for ‘human-crafted translation, for international literature with a human connection’. As Kemp notes in a separate piece, ‘even with machine translation as an aid, literary translation is a badly paid form of demanding, highly skilled labour; if tools can speed up our work that should be to our benefit not our detriment in terms of pay.’

Swedish publishing commentator Sölve Dahlgren agrees. He notes that the winners will be competent publishing professionals who adapt to technological change – after all, while ‘screwdrivers may have been replaced by power tools, good carpenters are still in demand’.

Beyond quality

Frank Wynne delivers a compelling view: ‘If we entrust our art to machines, they will in time, perhaps, create a simulacrum of art that is adequate. But adequate is a poor substitute for human.’ While I tend to agree, having written that ‘for many there is a deep-seated desire to translate and I also believe there is an audience that desires human-translated content’, I consider this a privileged position for established translators.

We might cry from the rooftops about the value of human translators, but what can be done in practical terms? It’s tempting to focus on the quality of the output from AI-driven ‘bullshit machines’, but this is somewhat of a red herring. Hedström Gustafsson suggests we move away from discussions of quality and soul, and instead grapple with the thorny issue found in every industry threatened by AI: is there intrinsic value in humans performing certain types of work, and if so, what is this? The real issue at stake – at least for us translators – is our livelihood. We can argue that the quality isn’t up to scratch, and that readers want human-translated content, but ultimately, if no one will pay, that doesn’t matter.

And in the meantime, we need to continue advocating for ourselves both to readers and publishers; we need to hold the crooks who stole our work to build this technology to account; we need the New York Times to win its lawsuit against OpenAI and Microsoft; we need robust unions to support translators and all other writing creatives; we need organizations like the CEATL to survey the situation across borders and markets so that we can better respond; we need to develop AI licensing schemes that are fit for purpose; and we desperately need regulation of AI. And we need to do all that while we worry about the future and hustle to make a living now.

We’d better hope that Ed Zitron is right and that our readers do care.
