Kyle Polich, Principal Data Scientist, DataScience, Inc.
We can consider the year 2016 as the year of the data scientist. Not only is it the “sexiest” job of the 21st century, according to Harvard Business Review, but our data-driven economy has created such a void in the business world that data scientists now play a crucial role in driving business decisions and results. However, a highly skilled data scientist is much more than his or her ability to crunch and analyze numbers. The data scientists needed by organizations today are those with deep technical knowledge as well as the ability to communicate technical concepts to a non-technical audience. In a recent workshop put on by DataScience, Inc. and the West Big Data Innovation Hub, we homed in on this very topic in great detail. Decision makers in any business are rarely well-versed in the programming languages used by data scientists, and often lack the highly technical knowledge base to interpret complex information, data models and graphs on their own. What business leaders and decision makers want is a simplified version of a data scientist’s findings. In other words, “How does this data matter?”
The Power of Four
Making business sense out of complex data is, like most things, a skill that data scientists master over the course of years in the field. Not every CEO, for example, will understand a technical concept in the same way as a tech-savvy CIO might, but data scientists can easily circumvent this challenge by following what we call the four D’s of data science:
– Diagnostics: Find the simplest summary statistics to describe a complex topic using an anecdote that the audience can relate to. If the client is a baseball fan, compare the statistics to a double play, for example.
– Discussion: To convey ideas to decision makers who need the information, first understand their base of knowledge, then understand the decisions they need to make. Guide their understanding of the subject by explaining how it relates to their business goals.
– Demonstration: Prove to your audience that your product or offering can indeed achieve the results it claims through an actual demonstration or by illustrating previous successes.
– Data visualization: The simplest visualizations are often the most effective. While cool visuals and effects are fun to see or interact with, they can detract from the key message and ultimately make things harder to understand for the audience.
At the workshop, we broke down several examples of how to use data in storytelling, not only to communicate the value of the data to decision makers but to engage broader audiences for public awareness. In the same vein, data scientists can embed statistics within storytelling. In fact, a data-driven narrative has been shown to spark more engagement and discussion than other methods. Content marketers and journalists, for example, are using data-driven stories to communicate trends, issues and ideas to their audiences. Data scientists can use a similar practice to communicate within an organization as well.
Overcoming Communication Challenges
In practice, a data scientist may run into communication challenges outside his or her control. Standing in opposition to archaic business practices is a difficult task when the company culture is resistant to change, no matter what the data implies. While working at a large yellow-pages company years ago, I experienced this challenge firsthand.
The company was looking to take the leap into the digital age at a time when the leadership team had operated successfully under the previous ad-space model, and had limited experience in a data-driven, digital economy. At the time, it was difficult to communicate what the data scientists were trying to do and how this related to the company’s standard ad-space buying practices. The digital team presented data, linear models, charts and graphs that only confused and overwhelmed the audience. It was a valuable experience for us all, and we came away with an important lesson: We needed to make the data easy to understand in order to help our audience make important business decisions. Using the four D’s formula, we came up with a simple solution. We summarized the value of the data and technical plan with a simplified scenario and anecdote that our audience understood.
Of course, there’s always a chance that we may oversimplify the data, which can lead to an unwanted outcome as well. At one company, for instance, the sales team saw a lot of traffic going to its online white pages. They took the page down, assuming it would drive more traffic to the main site. However, once the white pages disappeared, so did the lead generation. A data scientist can help formulate the right diagnostics and KPIs to monitor, and measure the impact that decisions and changes have on them. While a p-value might be the key statistic a data scientist wants to analyze, not all business users will be trained to interpret it. Expressing risk in context, in a metric they understand (perhaps average revenue per user), would more successfully convey the importance of the message.
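As one way to picture the idea of expressing risk in a business metric rather than a p-value, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the function names, the bootstrap approach and the revenue figures are hypothetical and do not come from the workshop or from any real company. It estimates the change in average revenue per user between two groups, attaches a plausible range, and phrases the result in dollars.

```python
import random
from statistics import mean


def arpu_change_interval(before, after, n_boot=2000, seed=0):
    """Return (point estimate, low, high) for the change in average
    revenue per user, using a simple percentile bootstrap.

    `before` and `after` are lists of revenue per user (hypothetical data).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    diffs = []
    for _ in range(n_boot):
        # Resample each group with replacement and record the difference in means.
        b = rng.choices(before, k=len(before))
        a = rng.choices(after, k=len(after))
        diffs.append(mean(a) - mean(b))
    diffs.sort()
    low = diffs[int(0.025 * n_boot)]
    high = diffs[int(0.975 * n_boot) - 1]
    return mean(after) - mean(before), low, high


def describe_for_business(before, after, total_users):
    """Phrase the result in dollars instead of a p-value."""
    point, low, high = arpu_change_interval(before, after)
    return (
        f"Estimated change in average revenue per user: {point:+.2f} dollars "
        f"(plausible range {low:+.2f} to {high:+.2f}). "
        f"Across {total_users:,} users that is roughly {point * total_users:+,.0f} dollars."
    )


# Made-up revenue-per-user samples, for illustration only.
before_change = [0, 0, 5, 10, 0, 20, 0, 5]
after_change = [0, 5, 10, 10, 0, 25, 5, 5]

if __name__ == "__main__":
    print(describe_for_business(before_change, after_change, total_users=10_000))
```

The bootstrap is only one reasonable choice here. The point is that the sentence handed to the decision maker talks about dollars per user and a range of likely outcomes, which a non-technical audience can weigh directly.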
Presenting complex data is not just about technical knowledge. It’s a skill that requires the right balance between what we know to be true and how helpful it is to the person on the receiving end of the information. Oversimplifying might lead to skewed results or a biased understanding of the data. Too little information might not answer the important questions decision makers have, while too much information puts the decision maker at risk of misunderstanding the information, if they absorb it at all. It takes time and experimentation to get it just right.
Kyle Polich is the Principal Data Scientist at DataScience, Inc.
Kyle leverages his data science acumen in solving challenging core business problems for DataScience, Inc. clients. Polich specializes in applying probabilistic reasoning, decision theoretic approaches, and machine learning to large, data-driven questions. With deep expertise in artificial intelligence, Polich is well known within the data science community and publishes a bi-monthly podcast (dataskeptic.com), where he discusses data science and machine learning topics for a subscriber base of 2,000+ practitioners.
Source: Data Informed