Howard Friedman and Akshay Swaminathan on Winning with Data Science

Data science is becoming increasingly prevalent across a variety of industries, yet many businesses lack the tools or understanding to succeed in this evolving landscape. Howard Steven Friedman and Akshay Swaminathan address this situation in Winning with Data Science: A Handbook for Business Leaders. Taking an engaging narrative approach, the book covers the fundamental concepts without getting bogged down in complex equations or programming languages. The authors provide relatable scenarios and real-world applications, helping readers grasp and apply complex concepts in their own contexts. From understanding the basics of data science to navigating the challenges of communication and decision making, it serves as a comprehensive guide for anyone seeking to harness the power of data science in their organization.

Q: Regardless of industry, all businesses are, or soon will be, using data science, yet few organizations have the tools to understand what that means, let alone how to succeed at it. How does your book help people and businesses with this paradigm shift?

Howard Friedman and Akshay Swaminathan: Our book prepares people and businesses by arming them with the basic vocabulary needed to understand data science and providing specific questions and frameworks. For example, if you are working in a finance company and need a model to predict risk or fraud, this book will help you understand a little about what the data science teams are going to do for you, as well as what questions you should ask to ensure that you are getting the high-quality product that you need.

Q: It is unusual to see a business book written as a narrative. Why did you choose to write Winning with Data Science in this way?

Friedman and Swaminathan: Our goal was to write not a textbook but a book that people can relate to easily. The narrative approach, with two people moving up the corporate ladder in different industries, allows the reader to empathize and, we believe, learn more easily. It also presents realistic workplace projects, ranging from evaluating the efficacy of a medication to prioritizing overdue account recovery efforts, to help readers understand how their own industries might use data. Our goal is engagement and learning, and we are convinced that the dialogue and storytelling will help readers become better data science customers.

Q: People may be further surprised to learn that you wrote the book by leveraging your own personal experiences and discussing how these characters would evolve. How did this process help you to develop the story? Where did you find inspiration?

Friedman and Swaminathan: We both have deep experience working in the private sector. Akshay has worked in health care for a number of years in a variety of data science roles, while Howard has spent decades working and consulting in consumer finance, health care, private equity, and other areas. The fictional characters in Winning with Data Science, Steve and Kamala, were inspired by our experiences. The challenges they face closely reflect challenges we encountered working with customers, as well as the solutions we found to bridge communication gaps.

Q: Beyond your fictional account, can you provide real-world examples of data science in action that are particularly successful or instructive? How does it add value to a business?

Friedman and Swaminathan: There are countless examples. Our book focuses on the stories of Kamala and Steve in health care and finance because we have deep experience in those industries. Kamala is evaluating new research showing that a medication is not as effective in the real world as previously thought. Should her insurance company continue covering the expensive treatment or encourage their customers to use another medication? This is just one example of the applications in health care. Others might include drug discovery, personalized medicine applications, treatment suggestions, risk analysis, diagnosis in medical imaging, medical data collection, digital health monitoring, image recognition, and outcomes prediction.

Steve has a number of projects that include using customer data to increase the efficiency of the account recovery team and creating a model that can predict whether a new credit application is fraudulent. Financial companies use data in many ways, including credit scoring, fraud detection, customer churn prediction, cross-selling, and recommendation engines. While the focus is on health care and finance, readers will see their own work needs reflected and make the leap to applying data in their own field. For example, retail and telecom leverage data for applications very similar to those in the finance industry, as well as some applications that are unique to each industry.

Q: You write that key decision makers risk getting distracted by the buzz around deep learning or artificial intelligence or large language models. What is an effective way to prioritize?

Friedman and Swaminathan: It is easy to get trapped by fear of missing out when everyone else is talking about a buzzy new technology or trend. This can lead to the common mistake of starting with a tool and then looking for a problem, whether the tool is an AutoML software solution, an LLM, or a deep learning application. This often ends in difficulties stemming from a mismatch between problem and solution. To avoid this, always start with the question, “What problem am I really trying to solve?” It opens the conversation regarding what data sources are needed, the requirements for the solution (timeliness, costs), and what types of solutions would be appropriate. If the solution turns out to be one of the “hot topics,” that is fine, but often the first analysis will point to a different solution.

Q: What is your message to businesses who may not have the resources to invest in a trained data scientist but still want to take advantage of the benefits of data science?

Friedman and Swaminathan: You do not need to hire a team of PhDs to begin dipping your company’s toes in data science. You do need data, some trained human resources, some software, and a problem you are trying to solve. Your company is probably already collecting some data from a combination of internal and external sources, though you may not be leveraging it yet. To start, the human resources can be external consultants or internal talent, such as skilled analysts. The software can be basic analysis software or AutoML software; the latter can be very powerful, allowing a good analyst to build world-class predictive models with insight into features and tracking of model performance and data drift.

But the thing that will really make a difference is a culture shift that makes your workforce more fluent with data: they need to understand what data you’re collecting, how it is stored, how it can be used, and what types of questions it can answer. It’s also important to nurture a customer mindset. Think about hiring a carpenter to redo your kitchen. You don’t have to build the cabinets yourself, but you do need to provide instructions about what types of cabinets you want, where you want the stove to go, and what type of finishes you prefer. The key to success is preparing your managers and leaders to think like data science customers.

Q: The growing pains Steve and Kamala experience come about, in part, from the fact that data science is relatively new and the role of “data scientist” is often nebulous or slapped on to existing roles. How can hiring managers and HR professionals avoid this?

Friedman and Swaminathan: The phrase “data scientist” is overused, and at this stage it often has little meaning. You might see a job posting for a “data scientist” when what is really required is a “data engineer” or “data analyst.” Until there is more awareness and there are standards built around these job titles, a better tactic is to focus on the specific skills needed. These can include mathematics, statistics, program evaluation, programming, data engineering, data visualization, and other critical areas. Hiring managers should work with HR professionals to create a list of necessary skills and require evidence of them in the form of work experience, completed projects, and education. Companies should implement ways to assess skill levels to distinguish candidates who only appear great on paper from those who can deliver value.

Q: Steve and Kamala are occasionally frustrated because it feels like they aren’t on the same page as their data scientist colleagues. How do they overcome this disconnect in the story, and what is your advice to people who face similar issues in their work?

Friedman and Swaminathan: Communication disconnects are frustrating for everyone involved. In our experience, the most common source of disconnects is definitions: different people often have different definitions of the same word and find themselves talking past one another. Achieving clarity on definitions is a good first step. Discussing expectations is another critical early step. Clearly communicating what the customer expects in terms of deliverables, time, cost, resources, and other considerations, and getting feedback from the project team, is necessary to avoid disconnects.

Q: Your book also dives into questions about data sources and quality of data. Where does all of this data come from? And how much of it is useful?

Friedman and Swaminathan: If we think about this from a company perspective, there are internal and external data sources. Internal data include transactional data, customer relationship management (CRM) data, communications history, employee data, productivity and operations data, website data, and unstructured internal data (emails, meeting notes, documents). External data can include industry and market trend data, competitor data, social media data, government data, and aggregated data sold by third parties. The deeper question is how much of it is useful, and that is where data scope, accuracy, relevance, timeliness, availability, and costs are most relevant. Companies need to make detailed assessments to determine whether a data source is useful and, if so, whether it is cost-effective to leverage.

Q: In the book, Steve uses the analogy of a “data science toolbox.” What are the essential elements of the new data science toolbox, and what do you say to workers who feel overwhelmed at the thought of having to learn these new skills?

Friedman and Swaminathan: It’s unreasonable to expect a project manager or data science collaborator to be well-versed in all the tools and techniques, especially when the fields of AI and ML are advancing rapidly each day. Instead of trying to learn all the tools, focus on understanding the main types of questions or problems and the general approach to tackling them. For example, descriptive or exploratory data analysis can help generate hypotheses or answer basic questions like “What does the data look like?” or “What are the key features and characteristics?” Statistical inference attempts to answer questions like “Are there significant differences between groups?” or “Is there a relationship between variables?” Predictive analytics aims to build models that can make predictions or forecasts based on historical data.
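To make the three question types concrete, here is a minimal sketch in Python, using only the standard library and entirely hypothetical customer-spend data: a descriptive summary, a simple two-sample inference statistic, and a naive mean-based forecast.

```python
# Hypothetical monthly spend for two customer groups (illustrative only).
import random
import statistics

random.seed(0)
group_a = [random.gauss(100, 15) for _ in range(200)]
group_b = [random.gauss(110, 15) for _ in range(200)]

# 1. Descriptive/exploratory: "What does the data look like?"
mean_a = statistics.mean(group_a)
mean_b = statistics.mean(group_b)

# 2. Statistical inference: "Are there significant differences between
# groups?" — a Welch-style two-sample t statistic.
def t_statistic(x, y):
    vx, vy = statistics.variance(x), statistics.variance(y)
    pooled_se = (vx / len(x) + vy / len(y)) ** 0.5
    return (statistics.mean(y) - statistics.mean(x)) / pooled_se

# 3. Predictive: "What will next month look like?" — the simplest
# possible forecast predicts each group's future spend as its mean.
forecast_a, forecast_b = mean_a, mean_b

print(round(mean_a), round(mean_b), round(t_statistic(group_a, group_b), 1))
```

In practice an analyst would reach for tools like pandas, SciPy, or scikit-learn for these tasks; the point is only that each question type calls for a different kind of method.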

Q: You note that many of us fall prey to the “myth of the single study,” which can introduce bias into data. How can there possibly be biases in the world of data science?

Friedman and Swaminathan: Despite all the glamor and hype, the language data scientists use to describe data paints a different picture: “this data is messy,” “I need to clean this dataset,” “garbage in, garbage out.” These are just some of the common phrases describing the nature of real-world data. In the book we talk about how data is presented in misleading ways or used inappropriately in ways that perpetuate biases.

One common pitfall is relying on a single study or dataset as the sole source of truth, neglecting the potential for sampling bias or limited generalizability. Additionally, there’s the risk of algorithmic bias, where machine learning models may perpetuate existing societal biases present in training data. Human biases can also creep in during data collection, interpretation, or model development.

Bias in data matters because there are real-world implications, ranging from rendering your product useless to creating bad press for your company. Data scientists and collaborators should be vigilant, conduct thorough validation, and continually question their assumptions to mitigate these biases.
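The sampling-bias pitfall described above can be shown with a small, entirely hypothetical sketch: when the data at hand comes from an unrepresentative subgroup, every downstream estimate inherits the distortion.

```python
# Hypothetical illustration of sampling bias: estimating average
# customer spend from a convenience sample of loyalty-program members,
# who spend far more than typical customers.
import random
import statistics

random.seed(1)

# Full customer population: 90% spend about $50, 10% (loyalty
# members) spend about $120.
population = [random.gauss(50, 10) for _ in range(9000)] + \
             [random.gauss(120, 10) for _ in range(1000)]
loyalty_members = population[9000:]  # the easy-to-reach subgroup

true_mean = statistics.mean(population)             # roughly $57
biased_estimate = statistics.mean(loyalty_members)  # roughly $120

# The convenience sample overstates average spend by about a factor
# of two; any model or decision built on it inherits that distortion.
print(round(true_mean), round(biased_estimate))
```

The same mechanism underlies algorithmic bias: a model trained only on the over-represented subgroup will generalize poorly to everyone else.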

Q: Whether it is signing up for a loyalty card at a grocery store or registering a kid for soccer, we give away data every day. How can we safeguard against the misuse of data in an increasingly networked world?

Friedman and Swaminathan: First, individuals can practice data mindfulness by sharing only necessary details and reviewing privacy settings. Business leaders must prioritize data ethics, ensuring responsible data collection and usage while fostering a culture of transparency. For data teams, robust cybersecurity measures, regular audits, and employee training are essential. Collaboration among individuals, businesses, and data professionals, underpinned by strong data protection regulations, is the key to striking the right balance between data utility and privacy.

Q: Companies are increasingly concerned with their Environmental, Social, and Governance (ESG) impact. How can data science help businesses achieve ESG goals?

Friedman and Swaminathan: By leveraging data analytics, companies can measure and manage their environmental footprint, track social responsibility initiatives, and ensure robust governance practices. Data-driven insights pinpoint areas for improvement, leading to informed decisions that reduce waste, promote diversity, and strengthen ethical governance. Furthermore, data science aids in reporting ESG metrics transparently, fostering trust among stakeholders. It’s the bridge that connects business strategies with sustainable practices, aligning profitability with purpose to drive positive change in our world.
