Cloud computing has, in many ways, made the modern practice of “crowdsourcing” feasible. Of course, crowdsourcing itself isn’t a new idea: it essentially means using a huge number of people to accomplish a task that would be impossible or very difficult for one person or a few. The pyramids were built with crowdsourced labor. What cloud computing has done is allow us to efficiently unify the intellectual efforts of a large crowd by removing barriers to communication, data collection, organization, and analysis.
Cloud Crowdsourcing in Hard Science
Most predictably, cloud computing brings scientists and academics who are separated by vast distances into a common forum where they can interact with the same system simultaneously to solve problems. Doctors, IT analysts, physicists, or any other group of analytical scientists can share data and cooperate to develop new technologies or solve problems quickly and efficiently. Additionally, they can use a cloud system to automatically harvest, federate, and analyze vast data sets, helping people pre-empt possible issues in real time.
Another way the cloud has been harnessed to capture human brain power is gamification: turning problem solving into a game in order to attract people to work on it. Hard sciences benefit from this because many non-specialized people from all over the world can work on issues that would ordinarily take vast amounts of time, specialized education, and effort to complete. For example, pharmaceutical companies have used this approach to create a game that approximates the behavior of real chemical bonds in molecular structures. The result is a puzzle game in which players’ solutions address real medical problems and can contribute to new drugs.
Cloud Crowdsourcing in Computational Speech Processing
Another very interesting use of the cloud is in computational speech processing. Until now, speech recognition and production have been about teaching a computer the grammatical rules of a language and developing listening software that can detect and parse spoken language. That is incredibly complicated, which is why speech recognition programs are so notoriously finicky.
Now, with cloud computing, it’s possible to scale up processing power, data storage, and, most importantly, data input. This means that for the first time it’s technically possible to build statistical language and acoustic models that are not rule-based, but instead match entire words and phrases to existing data on the server. The programmer would then no longer need to code a huge, complicated language-processing program, but rather relatively simple acoustic matching software.
Of course, this requires a vast amount of data against which to match incoming speech, which is where crowdsourcing comes in. Every time a user inputs speech and the program reaches the correct interpretation, it can save the new input into its databanks, so that the next time it hears something similar it will be able to match it more accurately.
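The match-and-store loop described above can be illustrated with a deliberately tiny sketch. It assumes that each utterance has already been reduced to a fixed-length list of numbers (a hypothetical "feature vector") and uses plain nearest-neighbour matching; the class and method names are invented for illustration, and real speech systems are far more elaborate than this.

```python
import math


class CrowdsourcedMatcher:
    """Toy matcher: stores confirmed (features, transcript) pairs and
    returns the transcript of the closest stored example."""

    def __init__(self):
        # The "databank" of confirmed examples (hypothetical structure).
        self.examples = []

    @staticmethod
    def _distance(a, b):
        # Euclidean distance; assumes both vectors have the same length.
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    def recognize(self, features):
        """Return the transcript of the nearest stored example, or None
        if nothing has been stored yet."""
        if not self.examples:
            return None
        _, transcript = min(
            self.examples,
            key=lambda example: self._distance(example[0], features),
        )
        return transcript

    def confirm(self, features, transcript):
        """Save an input whose interpretation a user confirmed as correct,
        so that similar inputs can be matched against it later."""
        self.examples.append((tuple(features), transcript))


matcher = CrowdsourcedMatcher()
matcher.confirm([0.1, 0.9], "hello")
matcher.confirm([0.8, 0.2], "goodbye")
print(matcher.recognize([0.2, 0.8]))  # closest stored example is "hello"
```

Every call to `confirm` grows the databank, which is the crowdsourcing step: the more confirmed inputs users contribute, the more likely it is that a new input lands close to a stored one.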
Google is already applying this basic idea to translation by storing existing translations between languages and returning them when prompted, then allowing users to submit “better” translations if they have them. Because of this, Google Translate can become a more sophisticated translation tool as more people use it and contribute corrections.
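As a rough illustration of the store-and-improve idea, here is a minimal sketch of a translation memory in which users vote for translations. It is not how Google Translate is implemented; the `TranslationMemory` class, its methods, and the simple majority-vote rule are all assumptions made for this example.

```python
from collections import Counter, defaultdict


class TranslationMemory:
    """Toy translation memory: stores user-submitted translations and
    returns the one with the most votes for a given phrase."""

    def __init__(self):
        # Maps a normalized source phrase to vote counts per translation.
        self._votes = defaultdict(Counter)

    @staticmethod
    def _normalize(phrase):
        # Hypothetical normalization: ignore case and surrounding spaces.
        return phrase.strip().lower()

    def suggest(self, source, translation):
        """Record one user's vote for a translation of `source`."""
        self._votes[self._normalize(source)][translation] += 1

    def translate(self, source):
        """Return the most-voted stored translation, or None if unknown."""
        votes = self._votes.get(self._normalize(source))
        if not votes:
            return None
        return votes.most_common(1)[0][0]


memory = TranslationMemory()
memory.suggest("Good morning", "bonjour")
memory.suggest("good morning", "bon matin")
memory.suggest("good morning ", "bonjour")
print(memory.translate("Good morning"))  # "bonjour" has two of three votes
```

Each `suggest` call is a user contribution, so the stored answers improve only as fast as people submit and agree on better translations.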