
Synthetic Data: a new Business Model?

By Falk Hedemann / 04.06.2018

Data is the raw material for AI systems. But what can companies do if they do not generate enough data?

Large technology companies such as Google, Facebook or Amazon have a common success factor: They are data-driven and consistently build their business models on artificial intelligence, algorithms and deep learning. However, this approach can be transferred to the established companies of the old economy in only a few cases. They must first transform themselves digitally and convert their business models to the data-driven approach. Even large companies initially lack the data needed to benefit immediately from the possibilities of artificial intelligence, because algorithms, and deep learning in particular, require large data sets. The more data that is available, the better the results.

The data-driven companies therefore have a clear advantage here: Their intelligent systems receive more input and can learn to recognize the important patterns more quickly. With smaller amounts of data, learning not only takes longer but also produces less precise results. This makes competition more difficult for the otherwise agile small and medium-sized enterprises and, under the pressure of digitalization, makes them more prone to errors. Those who have enough data, on the other hand, can actively shape digitalization themselves and consolidate or even expand their position through new business models. But what should you do if you do not have enough data of your own?

Synthetic data feeds AI systems

Artificial data for artificial intelligence? Admittedly, this sounds a little bizarre, but it has already become a veritable business model. In fact, synthesized data has long been used in scientific co-simulations, for example when new components are developed for highly complex, multi-level systems, such as when digitalizing the power supply system or setting up a digital transport system with autonomous vehicles. Such a simulation is then based entirely on data from another simulation. This data is generated by special algorithms that are capable of imitating real data. Together with the existing data, it forms a data set large enough to trigger a learning effect in the AI application.
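To make the idea a little more concrete, here is a minimal sketch in Python. It assumes the simplest conceivable "imitating" algorithm: a normal distribution is fitted to a handful of real measurements, and artificial values are then drawn from it. The measurements, the function names and the choice of a normal distribution are all invented for illustration; the generators used in real co-simulations are far more elaborate.

```python
import random
import statistics


def fit_gaussian(real_values):
    """Estimate mean and standard deviation from the real measurements."""
    return statistics.mean(real_values), statistics.stdev(real_values)


def synthesize(real_values, n_synthetic, seed=0):
    """Draw artificial values from a normal distribution fitted to the real ones."""
    mu, sigma = fit_gaussian(real_values)
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [rng.gauss(mu, sigma) for _ in range(n_synthetic)]


# Hypothetical real measurements (invented for this example).
real = [41.2, 39.8, 40.5, 42.1, 38.9, 40.0, 41.7, 39.5]

synthetic = synthesize(real, n_synthetic=1000)

# Existing and synthetic data together form the larger training set.
training_set = real + synthetic
print(len(real), "real values ->", len(training_set), "training values")
```

Eight real values become a training set of more than a thousand. Whether the artificial values are any good depends entirely on how well the chosen model captures the real process, which is why the quality of the simulation matters as much as the quantity it produces.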

Quantity is not the only thing that matters when training AI systems; quality is also very important. Data from simulations have the advantage that the framework conditions are strictly defined. This makes them highly valid, because they do not first have to be analyzed and put into context.

What can synthetic data be used for?

Among other things, synthetic data can be helpful for the AI-supported evaluation of image and video material in areas such as healthcare, security, robotics, logistics or production. These areas often produce so much material that people can no longer evaluate it, or doing so would simply take too long. In addition, the data often lack context, so they first have to be analyzed at great effort.

This is why the first companies are now specializing in synthesizing data. One of these companies is the start-up TwentyBN from Berlin. The four founders have developed their own “data factory”, which produces high-quality videos for learning systems. What’s special about them: The videos are already enriched with predefined labels, so that they no longer have to be analyzed by the AI system, but can serve directly as a basis for further training. The data thus basically provide the pattern and enable the AI systems to learn quickly and reliably.
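Why pre-labelled data saves so much effort can be shown with a toy example in Python. This is only a sketch under simple assumptions and not how TwentyBN produces its videos: instead of videos, it generates tiny black-and-white images that each contain a single bar. Because the generator itself decides whether the bar is horizontal or vertical, the correct label is known from the start and nobody has to annotate anything afterwards. All names and sizes are invented for illustration.

```python
import random


def make_sample(rng, size=8):
    """Create one size x size binary image containing a single bar, plus its label."""
    label = rng.choice(["horizontal", "vertical"])
    position = rng.randrange(size)
    image = [[0] * size for _ in range(size)]
    for i in range(size):
        if label == "horizontal":
            image[position][i] = 1  # fill one complete row
        else:
            image[i][position] = 1  # fill one complete column
    return image, label


def make_dataset(n, seed=0):
    """Generate n labelled samples; the label is known by construction."""
    rng = random.Random(seed)
    return [make_sample(rng) for _ in range(n)]


dataset = make_dataset(100)
image, label = dataset[0]
print(label)
for row in image:
    print("".join("#" if pixel else "." for pixel in row))
```

Every sample arrives as a pair of image and label, so a learning system can be trained on it right away.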

LDV Capital from New York follows a similar approach. Here, too, categorized synthetic data are used to train deep learning systems to evaluate image material. This solves the “cold start problem” for start-ups from different industries: They all need a sufficiently large database of high-quality, context-relevant data sets to train their algorithms.

One of LDV Capital’s customers is AiFi. The start-up from Silicon Valley wants to make cashierless shopping possible by means of artificial intelligence and evaluates extensive image and video material from the shops to this end. We are already familiar with the principle from Amazon Go, but AiFi does not want to act as a retailer itself. Instead, it wants to make the technology available to all retailers.

This shows not only the great potential of synthetic data and the resulting business models, but also that the data giants by no means have an unassailable advantage. Synthetic data therefore democratize the data-based business of the future.

About the author

Falk Hedemann

Falk Hedemann is a freelance journalist and content editor. He is not only a passionate writer for specialist media, but also brings his journalistic experience to companies' digital communication and content projects in a variety of ways. His tour through the colourful world of content is rounded off by his role as co-editor of the UPLOAD magazine.