How large is the machine translation market? Is it $67.5 billion, as Language Weaver recently claimed? Well, just as T&I Business has previously noted about the size of the human translation industry and the size of the interpretation industry, it all depends on how you define it.
The key word used to define this estimate is "untapped." This is an estimate of the "untapped" machine translation market in 2011, a definition dramatically different from the ones used to estimate the human translation industry size and interpretation industry size mentioned above.
Until this "mind boggling" number (as Language Weaver CEO Mark Tapling describes it), industry estimates of the machine translation market had always been in the hundreds of millions (Example 1, Example 2), a far cry even from the human translation market, pegged at $14.25 billion by Common Sense Advisory. So, where does a number like $67.5 billion come from? To satisfy our curiosity, Tapling was kind enough to explain the calculations behind this estimate in a podcast that accompanied this bold announcement.
Here is a summary of Language Weaver's calculations for the untapped machine translation market in 2011:
1,800 billion GB of new digital content in 2011 [according to IDC]
x 30% produced in a workplace [thus implying longer term value]
= 540 billion GB of new digital content with longer term value

540 billion GB of new digital content with longer term value
x 0.000001 [one ten-thousandth of 1% - a small # with no explanation]
x 12.5 gigawords per GB [implied, but not mentioned by Tapling]
= 6.75 million gigawords to translate [Tapling rounds down to 6.7]

6.75 million gigawords to translate [in 2011 "untapped" MT market]
x $0.0001 per word [one 100th of a cent for high-volume MT]
= $67.5 billion "untapped" market for machine translation.
So there you have it. That's the math behind Language Weaver's estimate for the 2011 "untapped" machine translation market. And it all sounds much more believable when you know the thinking behind it.
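For anyone who wants to check the chain of multiplications, here is a minimal Python sketch. The constant and function names are my own, not Tapling's, and the 12.5 gigawords per GB is the implied figure from the summary above rather than something stated in the podcast. Using exact fractions, the factors as listed multiply out to $675 billion at $0.0001 per word, and the $67.5 billion headline corresponds to a rate of $0.00001 per word. So at least one of the factors as transcribed here differs by a factor of ten from what the headline requires.

```python
from fractions import Fraction

# Figures as transcribed in the summary above (names are illustrative).
NEW_CONTENT_GB = 1_800 * 10**9               # 1,800 billion GB of new content (IDC)
WORKPLACE_SHARE = Fraction(30, 100)          # 30% produced in a workplace
TRANSLATABLE_SHARE = Fraction(1, 10**6)      # one ten-thousandth of 1%
WORDS_PER_GB = Fraction(125, 10) * 10**9     # 12.5 gigawords per GB (implied, an assumption)
HEADLINE_USD = Fraction(675, 10) * 10**9     # the $67.5 billion headline figure


def words_to_translate():
    """Words in the 'untapped' pool: content x workplace share x tiny share x words per GB."""
    long_term_gb = NEW_CONTENT_GB * WORKPLACE_SHARE
    return long_term_gb * TRANSLATABLE_SHARE * WORDS_PER_GB


def market_size(rate_usd_per_word):
    """Dollar value of translating the whole pool at a given per-word rate."""
    return words_to_translate() * rate_usd_per_word


def implied_rate():
    """Per-word rate that would reproduce the headline figure exactly."""
    return HEADLINE_USD / words_to_translate()


if __name__ == "__main__":
    words = words_to_translate()
    print(f"words to translate: {float(words) / 1e15:.2f} million gigawords")
    print(f"market at $0.0001/word: ${float(market_size(Fraction(1, 10_000))) / 1e9:,.1f} billion")
    print(f"rate implied by headline: ${float(implied_rate()):.5f} per word")
```

Changing any one constant and rerunning shows how sensitive the total is to each assumption, especially the unexplained one ten-thousandth of 1%.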

4 comments:
You are off by a factor of 10 on the number they forecast.
The biggest flaw with this is that just because the data exists digitally does not mean somebody cares to translate it.
The real MT market is less than $50M in 2008. So to $67.5B all I can say is Oh Really ?1?!!
I still think that he was counting Zimbabwe dollars.
Thanks for catching the misplaced decimal, Curtis. [now fixed]
There really doesn't seem to be a solid explanation for the multiplier of 1/10,000th of 1%. On the surface, it appears to be nothing more than a really small percentage chosen to make the rest of the calculation sound more believable, and I don't know if a more accurate multiplier exists.
There is also some ambiguity and wiggle room granted by defining this as "untapped" market size.
What numbers are you using to get 50 million? And what is your reasoning behind those numbers?
I think it would be great to publicize some solid alternative numbers, whether or not they define the market in the same way.
There is a pretty substantial multi-page discussion on this subject in the LinkedIn Automated Language Translation Group at http://is.gd/rhVb but you need to be a member.
The $50M is obtained by adding up the sales (as in actual reported revenue) of all the MT companies.
This forecast has been the cause of much hilarity in the machine translation industry, as it really is equivalent to saying something like:
If every organic being in the solar system (and the next closest 3) were to buy a Ferrari you would sell 67.5M Ferraris.
However, common sense will dictate that maybe everybody does not want to buy a Ferrari, and even if they had the money they may prefer a minivan or something else.
This "forecast" is so silly that it begs to be mocked, and many were mocking it at the AMTA conference in Hawaii last year.
Other analogies that might clarify this further:
If Haiti were to grow to triple the size of the US economy because the voodoo market really took off
Or if you and I made a magic juice that every human and animal on the planet preferred to water and was willing to pay us $10/day for, then you and I would be richer than Gates and Buffett added together in 2 years
A lil bit out there, no? Get the point?
Curtis, here is my response to the discussion thread you posted. I've posted this comment on LinkedIn.
Does this old discussion thread include the same illogical disagreements that we see every time a new estimate for “market size” comes out?
Looking at this logically, we must consider the following: Are we trying to answer the same question (or estimate the size of the same market type) that Tapling and Language Weaver (LW) were answering? If not, then is there a better question that we should be answering?
We need to remember that the size of the machine translation “market” depends heavily (if not completely) on how you define that market.
Some of us in the industry are stating that LW is wrong, but our “evidence” is irrelevant to the definition that LW uses. LW estimated the “untapped” machine translation markets will total at least $67.5 billion in 2011. And, to paraphrase the way Don DePalma put it, this is not a solid estimate, it is a guess at the opportunity that could be targeted.
Some of us argue that LW is wrong because software sales are smaller. BUT the LW estimate appears to have included potential service revenues, not just software sales.
Some of us imply that LW is wrong because the value would be larger based on current human translation rates. BUT LW was defining the untapped market size based on potential MT rates.
If we think LW’s numbers are wrong based on LW’s definition, then we should look at the calculations explained in Tapling’s podcast and argue for a different total: http://tandibusiness.blogspot.com/2008/09/how-large-is-machine-translation-market.html
Otherwise, if we still disagree, then we should argue that $67.5 billion is not the number we should focus on (which is certainly debatable). If we think we should focus on a different number, we can say that without labeling LW’s number as wrong, when it is “wrong” only because we moved the target. We can more accurately make one of the following statements:
- It would be better to focus on (and estimate) the actual revenue that will be generated by MT service providers in 2011.
- It would be better to focus on (and estimate) revenue from software sales in 2009, which is $X for this reason and that reason.
- It would be better to focus on (and estimate) the total size of digital content, and then estimate that X% will be translated if quality/satisfaction of major raw MT engines is up to a certain level, but only x% will be translated by MT if that level is not met. And MT rates will likely be XX for one reason and another.
Again, the size of the machine translation market depends on how you define it. This kind of dispute seems to arise every time someone comes out with a new number estimating the market size of some language industry segment.
For example, when CSA first came out with their solid estimates of translation industry size, some critics said CSA was wrong because the “market” was a different size when calculated based on the critics’ different definitions. CSA came out and gave more background to explain what was included and what was excluded from its defined market size, and everyone saw the logic behind it. CSA was right, and the critics may also have been right about their own estimates for a different definition of the market.
It is ironic that these arguments among linguists can go on for so long before someone notes that each side's argument is based on a different and distinct definition of the same term or phrase.
See examples of similar definition problems here:
http://tandibusiness.blogspot.com/2005/12/size-of-t9n-l10n-industry-based-on-how.html
http://tandibusiness.blogspot.com/2008/08/worlds-most-translated-author.html
http://tandibusiness.blogspot.com/2006/02/how-large-is-interpretation-industry.html