For years now, even before ChatGPT placed artificial intelligence firmly at the forefront of the public imagination, A.I. has been slowly taking hold across industries from medicine to aerospace. However, the technology hasn’t quite lived up to its potential. Not even close.
A recent study found that only 11% of firms using A.I. have gained financial benefits. Even technology giants have struggled. IBM’s $20 billion diagnostic A.I. system Watson Health diagnosed cancer more accurately than doctors in laboratory experiments but flopped in the field. It became a commercial and reputational disaster for the storied American company.
The failure could hardly be blamed on a lack of technical expertise. IBM had tasked an army of engineers to work on Watson. Our extensive research on the challenges of developing A.I. in various commercial settings points to a surprising cause: Watson was developed and brought to market in a way that works well for traditional I.T. but not for A.I. This is due to a fundamental difference between conventional software and A.I.: While the former processes data, A.I. continually learns from the data and becomes better over time, transcending even its intended capabilities if nurtured properly.
Practices analogous to the best parenting styles can accelerate A.I. development. We prescribe an A.I. development approach that is based on nurturing and learning and has been implemented in more than 200 A.I. projects for industrial and other customers.
Let it learn from mistakes
Children do not learn to cycle by watching an educational video but by clambering onto a bike and stepping on the pedals, learning valuable lessons from each painful fall. Before long, the magic happens.
The same logic applies to A.I. Many companies like IBM think they should collect vast amounts of data to perfect the algorithms before deployment. This is misguided. Putting A.I. to work in the real world, rather than sequestering it in controlled environments, helps generate more data that in turn feeds back into the development process.
Although early deployment is inherently riskier, it also initiates a continuous feedback loop through which the algorithm is enriched by new data. Further, it is important that the data stems from both standard and difficult or atypical situations that, taken together, support comprehensive A.I. development.
ChatGPT is a great example. The chatbot was released to the public by OpenAI last November while still wet behind the ears, albeit more for reasons concerned with getting ahead of the competition. In any case, the gamble worked: Not only has ChatGPT become a worldwide phenomenon, leaving the likes of Google’s Bard scrambling, but its early launch also garnered millions of users and generated vast amounts of data for OpenAI to push out GPT-4, an improved version of the bot, only months later.
Another example is Grammarly, whose finessing of its writing assistance system with the help of user feedback showcases the power of continuous A.I. improvement and adaptation, particularly in the complex and context-sensitive realm of languages.
Similarly, Apodigi, a frontrunner in the digitalization of the pharmacy business, launched in June 2020 an A.I.-assisted pharmacy app that can be described as learning on the job. Called Treet, the app proposes medication based on doctors’ prescriptions, which a pharmacist then reviews and tweaks. The pharmacist’s responses coalesce into a stream of continuous feedback that fine-tunes the algorithm and contributes to better recommendations that address the complexities of each patient’s needs and preferences.
By comparison, IBM developed and tested Watson Health extensively in the laboratory and pushed the diagnostic tool to market without incorporating continuous learning from real-world data. This traditional build-test-deploy process proved inadequate for training A.I.
Keep it safe
Safety mechanisms that protect consumers and safeguard reputations are essential in A.I. development. For example, Tesla runs new versions of its self-driving software in the background while a human drives the car. Decisions made by the software, say turning the steering wheel, are compared to the driver’s. Any significant deviations or unusual decisions are analyzed, and the A.I. is retrained if required. Simulator environments such as AILiveSIM allow full-scale A.I. systems to be safely and comprehensively tested before their deployment in the real world.
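The background comparison described above can be illustrated with a minimal sketch. It assumes each logged frame holds the driver's steering angle and the angle the background software would have chosen; the `Frame` fields, the `flag_deviations` helper, and the 5-degree threshold are hypothetical and are not drawn from Tesla's actual system.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    """One logged moment of driving (hypothetical structure)."""
    timestamp: float
    driver_steering_deg: float  # what the human driver actually did
    model_steering_deg: float   # what the background software would have done


def flag_deviations(frames, threshold_deg=5.0):
    """Return the frames where software and driver disagree by more than the threshold.

    The threshold is an illustrative assumption; flagged frames would be
    candidates for analysis and possible retraining.
    """
    return [
        f for f in frames
        if abs(f.driver_steering_deg - f.model_steering_deg) > threshold_deg
    ]
```

Because the software's decisions are only recorded and never executed, a disagreement costs nothing on the road but still yields a training example.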
A.I. developed for creative applications arguably needs stronger guardrails. Analogous to children mixing with bad company and learning undesirable habits, A.I. could be exposed to training data that are riddled with biases and discriminatory content.
To preempt this, OpenAI, for one, employs an approach called adversarial training to train its A.I. model not to be fooled by rogue inputs from attackers. This method involves exposing chatbots to adversarial content that threatens to overcome the bot’s standard constraints, enabling it to recognize rogue content and avoid falling for it in the future.
In the ideal A.I. development cycle, developers log all user reactions and behavior to feed further development of the algorithm without questioning the accuracy or value of a recommendation or prediction. The Netflix A.I. content recommender, for example, simply notes whether a user watches the recommended content and the viewing duration. The algorithm learns from each response to make a better recommendation next time.
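As a rough illustration of logging behavior rather than asking for opinions, the sketch below turns a viewing record into a training signal. The `ViewingEvent` fields and the `implicit_score` formula (fraction of the runtime watched, capped at 1) are assumptions made for this example and do not describe Netflix's actual recommender.

```python
from dataclasses import dataclass


@dataclass
class ViewingEvent:
    """What the system observes after making a recommendation (hypothetical)."""
    user_id: str
    title_id: str
    watched_seconds: float  # 0 if the user ignored the recommendation
    runtime_seconds: float


def implicit_score(event):
    """Map observed behavior to a score between 0.0 and 1.0.

    No rating is requested from the user; the score is simply the
    fraction of the title that was actually watched.
    """
    if event.runtime_seconds <= 0:
        return 0.0
    return min(event.watched_seconds / event.runtime_seconds, 1.0)
```

A recommendation that was ignored scores 0.0, and one watched halfway scores 0.5, so every response becomes a data point without interrupting the user.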
The developers of Watson Health could have achieved better outcomes if they had subscribed to this principle. Rather than programming the algorithm to solicit doctors’ evaluation of the A.I.-generated recommendation, they could have trained the system to simply record doctors’ prescriptions. Integrating Watson Health into patient information systems would also have immersed it in a feedback loop for continuous training based on actual cases and patient outcomes.
User feedback provides excellent training data for vertical applications with a specific focus.
But rather than relying on humans to label data, developers should think of ways to automate the process. For example, connecting a vehicle’s front camera feed with the steering wheel can automatically create labels for winding roads and feed into A.I. models learning to drive a car on complicated routes.
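The camera-and-steering example can be sketched as follows, assuming each camera frame arrives with the steering angle recorded at the same moment. The function names, the two label values, and the 15-degree threshold are hypothetical choices for illustration.

```python
def label_from_steering(steering_angles_deg, winding_threshold_deg=15.0):
    """Derive a road label for each frame from the steering angle alone.

    The threshold is an illustrative assumption, not a calibrated value.
    """
    return [
        "winding" if abs(angle) >= winding_threshold_deg else "straight"
        for angle in steering_angles_deg
    ]


def pair_frames_with_labels(frame_ids, steering_angles_deg):
    """Attach an automatically generated label to every camera frame."""
    if len(frame_ids) != len(steering_angles_deg):
        raise ValueError("each frame needs exactly one steering reading")
    return list(zip(frame_ids, label_from_steering(steering_angles_deg)))
```

The driver's own steering acts as the annotator here, so the labeled dataset grows with every kilometer driven and no one has to review the footage by hand.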
In fact, developers should deploy many automated data collectors and design explicit feedback loops for learning at scale. In the above example of driving assistance development, many vehicles can cover a wider variety of situations than just a few. A vehicle cutting in front of a Tesla triggers an upload of video from the few seconds preceding the event. The system feeds the footage into Tesla’s deep neural network, which learns the signals that predict a cut-in, such as a gradual movement towards the lane divider, so that the car can take appropriate action like slowing down. In contrast, traditional car companies are often mired in a fixed mindset, developing and deploying driving assistance software with little automated feedback collection or data updating.
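One common way to build an event-triggered collector of this kind is a rolling buffer that always holds the most recent frames and is copied out when the event fires. The sketch below assumes that design; the `ClipRecorder` class, its buffer size, and the `cut_in_detected` flag are hypothetical, and the `uploads` list stands in for a real upload to a training pipeline.

```python
from collections import deque


class ClipRecorder:
    """Keeps only the latest frames and saves a clip when an event occurs."""

    def __init__(self, buffer_size=5):
        # A deque with maxlen silently drops the oldest frame when full.
        self._buffer = deque(maxlen=buffer_size)
        self.uploads = []  # stand-in for clips sent off for training

    def add_frame(self, frame, cut_in_detected=False):
        """Record a frame; on a cut-in, snapshot the frames leading up to it."""
        self._buffer.append(frame)
        if cut_in_detected:
            self.uploads.append(list(self._buffer))
```

Only the moments surrounding an interesting event leave the vehicle, which keeps data volumes manageable while still capturing the cues that preceded the cut-in.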
Just as children do not stay in kindergarten forever, the training methodology for A.I. should be continuously upgraded. But too often, A.I. developers focus on the latest developments in A.I. algorithms and individual use cases rather than engineering the system to cover a large number of use cases and data streams.
Going one step further, companies can develop a simulation environment that generates synthetic data and allows for faster development cycles. For example, Tesla captures data from its fleet of cars to feed a simulator that simulates complex traffic environments, resulting in new synthetic training data.
Tero Ojanpera, Ph.D., has been a professor of practice on intelligent platforms at Aalto University since 2021. He is also co-founder and executive chairman of Silo AI, collaborating with numerous leading global corporations. Previously, he was Nokia’s CTO, chief strategy officer, and head of research. Treet is a former customer of Silo AI. AILiveSIM is a technology partner of Silo AI.
Timo Vuori, Ph.D., has been a professor of strategic management at Aalto University since 2013 and was a visiting scholar at INSEAD from 2013 to 2015.
Quy Nguyen Huy, Ph.D., has been a professor of strategy at INSEAD since 1998 and chair of the school’s Strategy department from 2010 to 2012. He is known for his pioneering work linking social-emotional and temporal factors to the organizational processes of strategic change and innovation.
The opinions expressed in Fortune.com commentary pieces are solely the views of their authors and do not necessarily reflect the opinions and beliefs of Fortune.