This is from Quora. My answer is below…
When is machine learning better than crowdsourcing?
It depends on the complexity of the task and the quality you need, but if you have to guarantee high-quality results, it is best to augment ML with crowdsourcing.
In fact, over the next few years, I think you can expect to see an increasingly symbiotic relationship between crowdsourcing and machine learning.
- Crowdsourcing is an infant industry, the inner dynamics of which are poorly understood and for which there is only the scantest academic research. The innovation that's happening at companies like Crowdflower (disclaimer: the CEO is an advisor to my company) and SpeakerText (disclaimer: this is my company) is just barely scratching the surface of what can be done by applying ML and advanced statistics to crowdsourcing processes. Expect this space to heat up in a massive way over the next five years.
- Crowdsourcing is the most cost-efficient way to get labeled training data, and for certain kinds of ML problems, most notably speech-to-text, the size, quality and relevance of your training data set matter more than the algorithm itself. Historically, speech companies budgeted millions of dollars just to acquire a sufficient set of training data. Now, because of the internet and things like Mechanical Turk, you can easily tap into cheap, on-demand labor and even turn what was once considered "work" into a game played by people across the globe.
- ML + Crowdsourcing = Supervised Learning. Quite simply, if you first attempt to solve a problem such as speech-to-text with an ML algorithm and then assign a human to correct any errors, you can, if you do it right, not only reap huge short-term efficiency gains (vs. pure human effort) but also improve machine accuracy over the long term by feeding more and more labeled training data back into the system. The result: constantly increasing labor efficiency, lower costs, and smarter machines, coupled with extremely high-quality results.
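To make the last point concrete, here is a minimal Python sketch of that loop: the machine guesses first, a human corrects the errors, and every corrected answer goes back in as training data. Everything in it is an assumption made up for illustration, not how SpeakerText or anyone else actually does it: `ToyModel` is a lookup table standing in for a real speech model, `human_label` stands in for a crowd worker, and the clip IDs and words are invented.

```python
from collections import Counter, defaultdict


class ToyModel:
    """Hypothetical stand-in for an ML model: it predicts the label it has
    seen most often for a given input, or None if the input is new."""

    def __init__(self):
        self.counts = defaultdict(Counter)

    def train(self, labeled_examples):
        for x, y in labeled_examples:
            self.counts[x][y] += 1

    def predict(self, x):
        if x not in self.counts:
            return None  # no guess yet
        return self.counts[x].most_common(1)[0][0]


def run_batch(model, batch, human_label):
    """Machine goes first, a human reviews and fixes the errors.

    Returns (final_outputs, number_of_human_corrections). The reviewed
    answers are added to the model's training data at the end of the batch.
    """
    outputs, corrections, new_training = [], 0, []
    for x in batch:
        guess = model.predict(x)
        truth = human_label(x)  # hypothetical crowd worker's reviewed answer
        if guess != truth:
            corrections += 1  # the human had to do real work here
        outputs.append(truth)
        new_training.append((x, truth))
    model.train(new_training)
    return outputs, corrections


# Made-up data: clip IDs mapped to the words a human would transcribe.
ground_truth = {"clip1": "hello", "clip2": "world", "clip3": "speech"}
batch = ["clip1", "clip2", "clip3", "clip1"]

model = ToyModel()
_, first_round = run_batch(model, batch, ground_truth.get)
_, second_round = run_batch(model, batch, ground_truth.get)
print(first_round, second_round)
```

In this toy setup the human corrects every item in the first round and none in the second, because the model has by then seen all the answers. Real data is noisier and the gains come much more slowly, but the shape is the same: output quality stays pinned at human level while the amount of human work per item goes down.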
These ideas are fundamental to the technology that we've built behind SpeakerText. If we're wrong, then the company is fucked. If we're right, well, I'll let you figure it out…and we'll see what happens.