The Business Logic of an Interactive Machine Translation Product | Interview

Andrew Joscelyne, 机器翻译观察 (Machine Translation Observer), 2022-04-24

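This article is based on a recent TAUS (Translation Automation User Society) interview with the CEO of Lilt, the company behind an interactive machine translation product. It offers a glimpse of the business logic behind that product. Key points:

1. Machine translation and language processing have reached an inflection point.

2. Ever since machine translation began in the 1950s, people have disliked post-editing, and in any case we think that giving customers machine translation plus cut-price post-editing is a flawed model.

3. We have built machines that can translate random sentences, but we have not yet successfully translated complete documents.

4. Our competitor is today's fragmented market.

The full text below was translated jointly by yeekit machine translation and the editor of this account. It is provided for discussion only and does not represent the position of any institution.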

Lilt, the San Francisco-based translation platform, recently received a major new round of funding from Sequoia. TAUS talked with Lilt's CEO about the origins, mission and working model of this technology-centered company, and how this approach will enable Lilt to overcome what he identifies as the major challenge: industry fragmentation.

总部位于旧金山的翻译平台Lilt最近获得了红杉资本新一轮的大额投资。TAUS与Lilt的CEO讨论了这家以技术为中心的公司的起源、使命和工作模式,以及这种方法将如何使Lilt能够克服他所认为的主要挑战:行业碎片化。

1. What led you to start Lilt in the first place?

1. 是什么让你创立了Lilt?

Fifteen years ago I was working in the Middle East and noted that, of all the books published in Arabic translation from English, the majority were pulp fiction, with almost no biographies or textbooks on a wider range of subjects. In other words, there was very little access to useful information from around the world.

15年前,我在中东工作时注意到,在所有从英语翻译成阿拉伯语出版的书籍中,大部分是低俗小说,几乎没有传记作品,也没有涉及更广泛主题的教科书。换句话说,人们很少有机会获得来自世界各地的有用信息。

Later in 2011, I met my co-founder when we were working on Arabic at Google Translate during the Arab Spring, and we were surprised to learn that much of the translation industry was not using machine translation technology, so we decided to work on a solution. 

后来,在2011年阿拉伯之春期间,我在谷歌翻译(Google Translate)团队研究阿拉伯语时遇到了我的联合创始人。我们惊讶地发现,翻译行业的大部分从业者并没有使用机器翻译技术,因此我们决定研究一个解决方案。

2. Why use a technology-driven solution for what has been a very human-centered service?

2、为什么要在一个以人为中心的服务体系内使用技术驱动的解决方案呢?

We see our mission as evangelizing “Information access.” We believe this is worth focusing on now that the next half a billion people will be coming online in the coming years. There are two key factors that condition this ambition. One is the state of the technology, which has now reached an inflection point in the case of machine translation and language processing and which people have been using to help translators since 2009. The other is that the world has now figured out how to assemble large groups of people to deliver services of various kinds.

我们的使命是传播"信息获取" 之道。我们相信这是值得关注的,因为接下来的5亿人将在未来几年接入互联网。 有两个关键因素为这个使命创造了条件—— 一个是技术的现状,在机器翻译和语言处理方面已经达到了拐点,自2009年以来人们一直在使用它来帮助翻译。 另一个是大众现在已经知道如何整合大量的人群来提供各种各样的服务。

The larger question, of course, is why this technology solution has not really been used effectively until now. This is due to the fragmented state of the industry, and the difficulty of building a sustainable business. 

当然,更大的问题是,为什么这种技术解决方案直到现在才真正得到有效应用。 这是由于该行业的分散现状,以及建立一个可持续的业务的难度。

The underlying cause of this fragmentation is that there are no inherent economies of scale in a labor-intensive business. So you need to use technology to produce the economies of scale that enable you to go further and differentiate. This means first solving the business problem of serving customers. We believe that technology is now poised in a way that helps us solve the next problem - delivering on this vision of information access. 

这种分散的根本原因是在劳动密集型企业中没有固有的规模经济。 因此,你需要使用技术来创造规模经济,使你能够更进一步和差异化。 这意味着首先要解决服务客户的业务问题。 我们认为,现在的技术已经成熟,可以帮助我们解决下一个问题——实现信息获取的愿景。

So our business strategy is based on building a strong translator community around a technology core that is specifically tailored to localization and translation problems. 

因此,我们的业务战略是基于一个强大的翻译社区设定的,这个社区的基础是旨在解决本地化和翻译问题的核心技术。

3. How do you operationalize this business model?

3、你如何运作这种商业模式?

Our main task is to create a good work environment for our suppliers. After all, translators would go away and do something else if the work became mostly sweatshop labor. We are very concerned to build a community of people who are ready to embrace the technology and work enthusiastically to achieve their own ends. 

我们的主要任务是为供应商创造一个良好的工作环境。 毕竟,如果工作变成了血汗工厂模式,译者就会离开去做别的事情。 我们专注于建立一个拥抱技术并热情工作以实现自己目标的人组成的社区。

This involves, for example, being outspoken about post-editing as a dumb practice. Ever since the 1950s when MT began, people have disliked post-editing, and in any case we think this approach of giving customers MT plus a price reduction is a flawed model. They might as well do it themselves. But training a large-scale MT system involves high-level skills in data management. So that is our added value.

例如,这包括公开直言译后编辑是一种愚蠢的做法。自从20世纪50年代有机器翻译以来,人们就不喜欢译后编辑,无论如何,我们认为这种给客户提供机器翻译加降价的做法是一种有缺陷的模式。他们还不如自己动手。但是训练一个大规模的机器翻译系统需要高水平的数据管理技能。这就是我们的附加值。

We also know that the fully-automated translation problem is a strong AI problem and is not going to be solved soon. So for business applications you need a human in the loop. But simply using an API doesn’t work well enough. You need to build systems to personalize translation to customer needs, yet which also respond to user practices. 

我们还认为,全自动翻译问题是一个重大的人工智能问题,这个问题短期内不会得到解决。因此,对于业务应用,这个闭环需要人工介入。 但是仅仅使用 API 行不通,你需要构建系统实现个性化翻译以满足客户的需求,同时还要对用户的实践进行反馈。

Our system is unique because it learns all the time, so we don’t think in terms of “retraining” an engine. We have customers in production who have never retrained. Our baseline model is trained on general domain data that we collect, and domain-specific data is initialized off that. These processes are persistent and learn as they are used, which means that every time a segment is updated it goes into the system. Algorithms are not the differentiators - it’s the data. 

我们的系统是独一无二的,因为它一直在学习,所以我们不会考虑对引擎进行"再训练"。我们有一些已在生产环境中使用的客户,他们从未做过再训练。我们的基线模型是根据我们收集的一般领域数据进行训练的,特定领域的数据则在此基础上进行初始化。这些过程是持续不断的,并且在使用过程中不断学习,这意味着每次更新一个句段时,该句段都会进入系统。让一切显出区别的不是算法,而是数据。
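
To make the "learns as it is used" idea concrete, here is a minimal sketch. It is not Lilt's system: the class name, the dictionary-based "model" and the sample French strings are all assumptions chosen for illustration. It shows only the control flow described above, where a general-domain baseline is the starting point and every confirmed segment is absorbed immediately, with no separate retraining step.

```python
class AdaptiveMemory:
    """Toy stand-in (hypothetical) for a system that learns from each confirmed segment."""

    def __init__(self, baseline):
        # baseline: general-domain source -> target pairs (illustrative data only)
        self.baseline = dict(baseline)
        # Domain-specific knowledge starts empty and grows with use.
        self.domain = {}

    def suggest(self, source):
        # Domain-specific entries take precedence over the general baseline.
        if source in self.domain:
            return self.domain[source]
        return self.baseline.get(source)

    def confirm(self, source, target):
        # Each confirmed or updated segment is stored right away,
        # so later suggestions reflect it without any retraining pass.
        self.domain[source] = target


memory = AdaptiveMemory({"Save": "Enregistrer"})
before = memory.suggest("Save")        # baseline suggestion
memory.confirm("Save", "Sauvegarder")  # translator's confirmed choice
after = memory.suggest("Save")         # now reflects the update
```

A production system would adapt model parameters rather than look up exact strings; the lookup table here only stands in for that behavior.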

4. What sort of team do you need to handle this agenda?

4、 你需要什么样的团队来做这个事情?

One analog for what we are is a company trying to build an electric vehicle. Just as a specific technology – the electric engine - stands at the heart of the EV, so we assume that MT is the always-on core technology at Lilt. This is very different from the sort of supplier that dips their toe into MT when they think they can use it successfully. 

对于我们来说,一个类似的例子就是一家试图制造电动汽车的公司。正如一种特定的技术(电动机)处于电动汽车的核心位置,我们认为机器翻译是Lilt始终在线的核心技术。这与那些自认为可以用好机器翻译时才浅尝一下的供应商是完全不同的。

So we have a core team of 7 or 8 research scientists and NLP community people responsible for the data pipelines and domain adaptation. A small team like this can both focus on MT automation and make a contribution to society. Around that, we have senior engineers, customer and service people, marketing people, and so on. In all Lilt has a staff of 25 today, and we are hiring experienced people who care about the overriding translation problem of information access and have the requisite domain knowledge. 

因此,我们有一个由7到8名研究人员和 NLP 技术人员组成的核心团队,负责数据和领域优化。 像这样的一个小团队既可以专注于机器翻译自动化,也可以为社会做出贡献。 围绕这一点,我们有高级工程师、客户和服务人员、营销人员等等。 现在Lilt总共有25名员工,我们正在招聘有经验的员工,他们关心信息获取过程中棘手的翻译问题,并且具备必要的领域知识。

5. What do you make of the ‘human parity’ argument in advancing translation automation?

5. 你如何看待推进翻译自动化达到"人类水平"论点?

I disagree with this concept, primarily because the quality-testing regime that has been used ever since the first Georgetown experiments in the 1950s evaluates sentences in isolation, and this is still used today. The fact is that so far we have been able to build machines that can translate random sentences, but we haven’t yet successfully translated complete documents. So we need a better evaluation scheme to identify when MT is suitable for specific cases.

我不同意这个概念。 这主要是因为自从20世纪50年代乔治敦大学第一个实验以来一直使用的质量评估体系对句子进行孤立评估,这种体系至今仍在使用。 事实上,到目前为止,我们已经造出能翻译随机句子的机器,但是我们还没有成功地翻译完整的文档。 因此,我们需要一个更好的评估方案,以确定什么时候机器翻译适合于特定案例。

We believe that it is thoughtless to apply a given technology to a whole profession without any differentiation. The fact is that translators are writers at heart. And at Lilt, we build bilingual augmented writing tools (i.e. predicting the next words) for them, with a more attractive interface. This dates back to an idea from Martin Kay in the 1980s. We also think that translators want to be paid by the hour rather than by their output, much as coders are.

我们认为,不加区分地将一种特定的技术应用于整个行业是欠考虑的。 事实上,译者本质上就是作家。 在 Lilt,我们为译者构建了双语增强写作工具(即预测下一个单词) ,辅以更加友好的界面。 这可以追溯到20世纪80年代马丁 · 凯的一个想法。 我们还认为,翻译人员希望按小时收费,而不是按输出多少字收费。
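
The "predicting the next words" idea can be illustrated with a small sketch. This is an assumption-laden toy, not Lilt's implementation: the function name `predict_next_words` and the ranked list of candidate translations are hypothetical. Given what the translator has typed so far, it proposes the continuation from the best-ranked candidate that agrees with that prefix.

```python
def predict_next_words(candidates, typed_prefix, n=1):
    """Return up to n next words from the first candidate that extends typed_prefix.

    candidates: full-sentence translation hypotheses, best first (hypothetical input).
    typed_prefix: the words the translator has typed so far.
    """
    prefix_words = typed_prefix.split()
    for candidate in candidates:
        words = candidate.split()
        # A candidate is compatible only if it starts with exactly the typed words.
        if words[:len(prefix_words)] == prefix_words:
            return words[len(prefix_words):len(prefix_words) + n]
    # No hypothesis matches what the translator typed.
    return []


hypotheses = [
    "the contract is valid",
    "the agreement is valid",
    "this contract is valid",
]
suggestion = predict_next_words(hypotheses, "the agreement", n=2)
```

Here the translator's own wording ("the agreement") overrides the top-ranked hypothesis, and the tool completes the sentence around it, which is the sense in which the translator stays the writer.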

6. What sort of growth do you anticipate for Lilt’s language mix?

6、对于Lilt的语言组合方案,你预期会有什么样的增长?

To ensure a sustainable business, most translation suppliers aspire to a ceiling of 30 to 40 languages. But our mission should be to make it easier to translate into all languages. There are two problems here - data availability for low-resource languages for MT engines, and the fact that translators can be hard to find. For example, we have one customer who needs translation into Pashto and Igbo, so we are looking for data on these. 

为了确保业务的可持续性,大多数翻译供应商都渴望达到30至40种语言的上限。 但是我们的任务应该是使翻译成所有语言变得更加容易。 这里有两个问题——机器翻译引擎使用的低资源语言的数据可用性,以及很难找到翻译人员的事实。 例如,我们有一个客户需要翻译成Pashto和Igbo语,所以我们正在寻找这方面的数据。

We currently handle 29 language pairs into and out of English and support translation between all 29 languages. Our ambition is to cover any language that a business would need. 

我们目前支持29种语言对英语以及29种语言之间的互译。 我们的目标是涵盖企业需要的任何语言。
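
As a quick check on what "translation between all 29 languages" implies: if each ordered source-target combination counts as one pair, 29 languages give 29 × 28 directed pairs. The sketch below assumes that counting convention, which the interview does not spell out; the function name is illustrative.

```python
from itertools import permutations


def directed_pairs(n_languages):
    """Count ordered (source, target) pairs among n distinct languages."""
    return n_languages * (n_languages - 1)


# Cross-check the closed form by enumerating every ordered pair.
enumerated = sum(1 for _ in permutations(range(29), 2))
total = directed_pairs(29)
print(total)  # 812
```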

7. How do you continue to inspire your translator community while prioritizing your customers?

7、 你如何在优先考虑客户的同时继续激励你的翻译团队?

Our people take pride in their work, so instead of anonymizing our translators, we give them direct access to the customers for whom they work. Our mission is to build a tech company, so we want to build the tools and tech to make workflows more efficient. This means that it is less important for us to know where the human comes from. Breaking down the wall between customer and worker adds value to our business service. At the same time, it is very important to talk to our translators and build a community, as most of them work alone yet want to share our burden. So we need to invest in both sides of this equation – customers and community. 

我们的员工以他们的工作为荣,所以我们不是匿名翻译,而是让他们直接接触他们所服务的客户。 我们的使命是建立一个科技公司,用工具和技术使工作流程更有效率。 这意味着对我们来说,知道人来自哪里并不那么重要。 打破客户和译者之间的藩篱,就是我们的商业价值。 与此同时,与我们的翻译人员进行交流并建立一个社区也是非常重要的,因为他们中的大多数都是单独工作的,而且都想分担我们的重担。 因此,我们需要在这个等式的两边——客户和社区——加大投入。

8. Who is your main competitor?

8、你的主要竞争对手是谁?

I tell the team: our competitor is the fragmentation of the market today! Marc Andreessen says that bad markets always need great teams. We see the dynamics of our fragmented market as challenging, so we need to focus more on the underlying causes of these market challenges and less on our business competitors. 

我告诉团队:我们的竞争对手是今天碎片化的市场!马克•安德森(Marc Andreessen)表示,糟糕的市场总是需要伟大的团队。我们认为分散的市场动态是一项挑战,因此我们需要更多地关注这些市场挑战的深层原因,而不是我们的商业竞争对手。

9. Finally, Lilt’s key goal?

9、最后一个问题,Lilt的关键目标是什么?

We certainly say yes to the idea of becoming a unicorn! We naturally want to be a big sustainable customer-focused business that makes translation more affordable. But the real challenge is to be sustainable. So our recent fund-raising effort validates the progress we have made. But it does not prove yet that we are a successful company!

我们当然同意成为独角兽的想法! 我们自然希望成为一家可持续发展的、以客户为中心的大型企业,使翻译更便宜。 但真正的挑战是可持续性。 因此,我们最近的融资努力验证了我们所取得的进展。 但这还不能证明我们是一家功成名就的公司!

Original title: How a Tech Company Tackles Global Information Access. Source: the TAUS website.
