Breakthrough Technology - Try us
We own and develop technology for mining critical information from data, telling which factors impact performance, and how to achieve the desired performance within the constraints of your business. We've been shipping predictive analytics and data modeling workflows with minimal development cost and discovery time since 2005.
For example, our solutions can predict how satisfied your customers will be with your product and who is likely to buy. They can also identify which variables to look at when building an investment portfolio, which factors influence your firm's profitability, how to minimize energy consumption, how to balance the load of a cluster of virtual machines, and how to bring new materials to market earlier than the competition.
The technology behind DataStories is based on our own blends of deep learning methods with a solid foundation in statistical learning theory, combined with state-of-the-art machine learning methods. In addition, we have spent more than 10 years researching the computational analysis of mutual information content in high-dimensional data.
All of these methods are relatively expensive computationally, but the rewards are more than worth it: a deep understanding of the underlying system that generates the data. Deep learning and mutual information analysis do not assume that relationships in the data must be linear. They also account for the fact that a variable may have no influence on your key performance indicators (KPIs) individually, yet have a significant impact in combination with other drivers.
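To make the last point concrete, here is a minimal sketch on made-up toy data. It is not the DataStories algorithm. It assumes a hypothetical KPI that is the exclusive-or of two binary factors `a` and `b`, and uses a plain empirical estimate of mutual information. Each factor alone shows zero correlation and zero mutual information with the KPI, while the pair together determines it completely.

```python
from collections import Counter
from math import log2, sqrt


def mutual_information(xs, ys):
    """Empirical mutual information, in bits, between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def pearson(xs, ys):
    """Pearson correlation coefficient of two numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


# Toy data (an assumption for illustration): every combination of two
# binary factors, repeated 25 times, with the KPI defined as their XOR.
a = [0, 0, 1, 1] * 25
b = [0, 1, 0, 1] * 25
kpi = [x ^ y for x, y in zip(a, b)]

corr_a = pearson(a, kpi)                          # linear view of factor a alone
mi_a = mutual_information(a, kpi)                 # information in a alone
mi_b = mutual_information(b, kpi)                 # information in b alone
mi_ab = mutual_information(list(zip(a, b)), kpi)  # information in the pair

print(f"corr(a, kpi) = {corr_a:.3f}")
print(f"MI(a; kpi) = {mi_a:.3f} bits, MI(b; kpi) = {mi_b:.3f} bits")
print(f"MI((a, b); kpi) = {mi_ab:.3f} bits")
```

A one-variable-at-a-time screen would discard both factors here, even though together they carry one full bit of information about the KPI.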
The foundations of our approach to data can be described in the following theses:
- Deep knowledge can almost never be found in the data alone. There are many fantastic data visualization tools out there, but just looking at data is not enough - a lot of additional context comes from computational modeling (to understand what happens "in between the points").
- Optimal combinations of factors leading to optimal values of the KPIs are almost never observed in the data - going beyond the observed data points is critical when searching for optima.
- Simple analysis that looks at relationships one variable at a time is incomplete, as deep insights almost always require searching for relationships among combinations of factors.
- Linear modeling is only sufficient if underlying processes are linear. Most natural laws are not linear in the coefficients.
- Specifying the direction of influence for variables only makes sense if the relationships are linear. Our favorite quote: "Dividing problems into linear and non-linear is like dividing the universe into bananas and non-bananas".
- A factor matters for predicting the KPI if its presence or absence matters.
- In real-world systems, a factor can be unrelated to the performance indicator on its own but critically important when combined with another factor. Exploring combinations is critical, albeit computationally intensive.
- Problems in data analytics are extremely diverse, and each one is different. However, all of them require finding what drives the KPIs, how those drivers act on the KPIs, and what the exact relationships and what-if scenarios are.
- Outliers other than measurement errors can only be identified by using models. Observations can look just fine when one variable at a time is considered, but they can suddenly turn into outliers when two or more factors are considered together.
- Most of the time there are many paths to solutions. A single model almost never exists (only conditionally on the modeling assumptions). Often there are many different combinations of factors that can predict the KPIs (especially in problems where factors are correlated with each other, not necessarily to a high degree). Our algorithms deliver the smallest possible set of factors necessary and sufficient to predict the KPI.
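The thesis about outliers can be sketched with a small example. It uses invented reference data and a deliberately simple model (a mean and covariance estimate scored with the Mahalanobis distance), which stands in for, and should not be read as, the models DataStories actually uses. The names `xs`, `ys`, and `candidate` are hypothetical. The candidate observation is unremarkable on each factor taken alone, yet sits far from the joint pattern of the two factors.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical reference data: two factors that track each other closely.
xs = [0.5 * i for i in range(21)]
ys = [x + (0.3 if i % 2 == 0 else -0.3) for i, x in enumerate(xs)]

# Simple model of the data: means, variances, and the covariance.
mx, my = mean(xs), mean(ys)
sxx, syy = variance(xs), variance(ys)
sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def z_scores(x, y):
    """Standardized distance from the mean, one variable at a time."""
    return (x - mx) / sqrt(sxx), (y - my) / sqrt(syy)


def mahalanobis(x, y):
    """Distance from the mean that accounts for how the two factors co-vary."""
    dx, dy = x - mx, y - my
    det = sxx * syy - sxy ** 2
    return sqrt((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)


# Each coordinate lies well inside the observed range of its own factor.
candidate = (3.0, 7.0)
zx, zy = z_scores(*candidate)
joint = mahalanobis(*candidate)

print(f"z-score of x alone: {zx:.2f}")
print(f"z-score of y alone: {zy:.2f}")
print(f"joint Mahalanobis distance: {joint:.2f}")
```

Both per-variable z-scores stay below 1 in magnitude, while the joint distance is more than ten times larger, because the candidate breaks the relationship between the two factors and not the range of either one.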