Givaudan has a super efficient way to let panelists evaluate flavors and rate them on a hedonic scale from 1 (I dislike this extremely) to 9 (I love it extremely). A total of 40 flavors were synthesized in an experiment and evaluated by 70 people (so the data contained 70x40 experiments, with 8 ingredients as inputs and one KPI being the sensory evaluation score, i.e. “liking”). The kicker in sensory science is that if you mix the same ingredients in the same proportions but different volumes, the liking results can differ wildly! E.g. I may love a little bit of vanilla, but with twice as much I might hate it.
So the company wanted to know which ingredients drive liking and how, and what the optimal flavor composition should be to maximize liking. Another issue is that ingredients might influence liking in completely non-linear ways: as I keep increasing the volume of one ingredient, the average liking might slowly go up, but then unexpectedly drop.
What to do? Apply non-linear predictive analytics, of course.
By building gazillions of predictive models, one per panelist, we created models of liking and identified which ingredients drive liking for each person, and how (apparently, different people are driven by different things, AND in different directions). The propensity to like flavors might be driven by vanilla for both you and me, but I might love it and you might hate it!
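The per-panelist modeling step can be sketched roughly as follows. Everything here is a hypothetical stand-in: the data is synthetic, and a random forest substitutes for whatever nonlinear model was actually used, since the post does not name one.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n_panelists, n_flavors, n_ingredients = 70, 40, 8

# X: ingredient volumes for each of the 40 synthesized flavors (synthetic stand-in)
X = rng.uniform(0, 1, size=(n_flavors, n_ingredients))
# ratings: 70 panelists x 40 flavors on the 1-9 hedonic scale (synthetic stand-in)
ratings = rng.integers(1, 10, size=(n_panelists, n_flavors))

# One nonlinear model per panelist: liking = f(ingredient volumes)
models = [
    RandomForestRegressor(n_estimators=30, random_state=0).fit(X, ratings[p])
    for p in range(n_panelists)
]

# Per-model feature importances approximate "which ingredients drive liking"
# for each person (direction of the effect needs a separate check, e.g.
# partial-dependence curves)
drivers = np.array([m.feature_importances_ for m in models])
print(drivers.shape)  # (70, 8)
```

With one model per person, person-level differences in drivers fall out naturally instead of being averaged away by a single panel-wide model.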
So once we turned people’s data into robust predictive models, we could evaluate the liking scores of ANY flavor, consisting of an arbitrary composition of the eight ingredients. HOW COOL IS THAT? As long as our models showed reliable trustability measures, we could use them to predict the opinion scores of our (cyber-)panel on thousands of flavor compositions in a fraction of a second. MANY DAYS of experimentation were saved there, but not just that.
We could use the models to identify the most promising flavor compositions, the ones that both maximize the overall opinion score of the panel AND minimize the deviation in opinions. These were the experiments that Givaudan wanted to test on actual people, and so data-driven flavor design was born.
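The search step above can be sketched as scoring thousands of candidate compositions against the whole cyber-panel and ranking them by mean liking minus disagreement. The panelist "models" below are toy preference surfaces (a peak around each person's personal optimum), and the mean-minus-std objective is one illustrative way to combine the two goals, not necessarily the rule used in the original work.

```python
import numpy as np

rng = np.random.default_rng(1)
n_panelists, n_ingredients = 70, 8

# Stand-in panelist models: each person has a personal optimum composition,
# and their liking falls off with distance from it (toy nonlinear surface)
optima = rng.uniform(0, 1, size=(n_panelists, n_ingredients))

def predict_panel(compositions):
    """Predicted liking of every panelist for every candidate composition."""
    # distance of each composition from each panelist's personal optimum
    d = np.linalg.norm(compositions[None, :, :] - optima[:, None, :], axis=2)
    return 9.0 - 4.0 * d  # closer to the optimum -> higher liking

# Thousands of candidate compositions, scored in one vectorized pass
candidates = rng.uniform(0, 1, size=(5000, n_ingredients))
preds = predict_panel(candidates)  # shape: (70 panelists, 5000 candidates)

# Objective: high mean panel liking AND low spread of opinions
score = preds.mean(axis=0) - preds.std(axis=0)
best = candidates[np.argmax(score)]
```

The top-scoring compositions are exactly the candidates worth sending to a real tasting panel, which is the "data-driven flavor design" loop described above.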
Of course, knowing the models gave us the possibility to segment the panelists by their propensity to like products, as well as by the product features (flavor ingredients in this case) that drive their liking. We published extensively on this topic - let me know if you want to read more.
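One simple way to do that segmentation is to cluster panelists on their per-ingredient driver vectors. The driver matrix below is synthetic; in the real study it would come out of each panelist's fitted model (e.g. importances signed by the direction of the effect), and k=3 is an arbitrary illustrative choice.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(2)

# 70 panelists x 8 ingredients: signed driver strengths (synthetic stand-in;
# positive = more of the ingredient raises this person's liking, negative = lowers it)
drivers = rng.normal(0, 1, size=(70, 8))

# Cluster panelists who are driven by similar ingredients in similar directions
segments = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(drivers)
print(np.bincount(segments))  # panelists per segment
```

Each segment then gets a readable description ("driven up by vanilla, down by citrus"), which is far more actionable for product design than a single panel average.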
And here is the main point of this story: we used the very same approach of identifying the KPI drivers, building non-linear predictive models, and segmenting the people for an entirely different problem, domain and industry - video quality prediction! In this case, the client wanted to identify how much distortion they can introduce into a video before the viewer (e.g. a digital television subscriber) notices it, and how much is enough to drive the viewer nuts and make them stop watching. The data contained 800 samples of different video sequences with their quantified features (37 features in total, including the content type - static news, active sports, animation, etc. - and the level and location of the distortion in the video), and the KPI was a video quality metric: an average rating constructed by a representative expert panel.
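Structurally, the video problem is the same recipe with different columns: a feature matrix, a panel-derived KPI, and a nonlinear model validated before use. A minimal sketch, again with synthetic data and a gradient-boosting model standing in for the unnamed method:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(3)

# 800 video samples x 37 quantified features (content type, distortion
# level and location, etc.) -- synthetic stand-in data
X = rng.uniform(0, 1, size=(800, 37))
# KPI: panel-averaged quality rating (here on an assumed 1-5 scale)
y = rng.uniform(1, 5, size=800)

# Same pipeline shape as the flavor problem: fit a nonlinear model and
# check its trustability (here via cross-validated R^2) before using it
model = GradientBoostingRegressor(random_state=0)
scores = cross_val_score(model, X, y, cv=5, scoring="r2")
```

On the random data above the cross-validated scores will hover around zero, which is precisely the point of the trustability check: only a model that validates well earns the right to replace the expert panel.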
Guess what - the same approach worked for a different data set, a different problem, and completely different models, AND the best known metric for video quality estimation was created in the process!
Think about it: the main issue in solving almost any data science problem is not so much figuring out which algorithm to apply and the exact parameters to use (this has been solved), but rather setting up the problem and transforming it so that it becomes easy (almost trivial for those who know what they are doing) to solve. The algorithms exist, the technology exists, the theory has been developed. The challenge is in the formulation of the problem - any problem. INCLUDING YOURS.