What is Data Science?

Channel: IBM Technology Published: 2022-06-13 1,031 words Source: auto_caption

Transcript

Let's talk about data science and some of the related terms you may have heard, such as predictive analytics, machine learning, advanced analytics, and others. Let's start with the textbook definition: data science is the field of study that involves extracting knowledge and insights from noisy data, and then turning those insights into actions that our business or organization can take.

Let's dig into it a little more and discuss the different areas that data science covers. Data science is really the intersection of three disciplines. We start with computer science, then we cover mathematics, and then what I think is the most important: business expertise. The intersection of these three disciplines is data science, and true data science initiatives involve collaboration across all three areas.

Now let's touch on the different types of data science that you can do. What we need to understand here is that we have different data science methods for different questions that we might ask in an organization, and these questions vary in complexity and in the value that we get out of them. So let's chart them by complexity and value.

The first one is descriptive analytics. This is really about what is happening in my business, and it involves accurate data collection to make sure that we know what's happening. A good question we could ask here is: did sales go up or down?

The next level is diagnostic analytics. This is more about why something happened. Why did sales go up or down? It involves drilling down to the root cause of our problem.

The next one is predictive analytics. This is about what is likely to happen next. What will our sales performance be next quarter? It involves using historical patterns in our data to predict outcomes in the future.

Finally, we have prescriptive analytics. This is about what I need to do next: what is the recommended best action for a particular outcome? A question we could ask here is: what do I need to do to improve sales by 10%?

Now we can talk about how data science is done and who actually does it, so let's look at the data science life cycle. The first thing we must always start with is business understanding. This is really critical to make sure that we're asking the right question before we go down a lengthy data science initiative, and this is where you can see that having the business expertise and the domain expertise can be incredibly critical.

Once we've defined that, we can move on to data mining. This is the process of actually going out into our data landscape and procuring the data that we need for our analysis.

Once we've done that, we can move on to data cleaning. The reality of the marketplace is that when we find data, it's probably not in the format that we need, and it probably has some issues. It might have rows with missing values, and it might have duplicates in it. So there's some preparation and cleaning that we have to do before it's ready for our analysis.

Once we've done that cleansing, we can move on to exploration. This is the part of the process that allows us to use different analytical tools that can start helping us answer some of the types of questions I mentioned earlier. If we want to get into the higher-value questions, like predictive and prescriptive, then we must start using advanced analytical tools such as machine learning: tools that leverage massive amounts of computing power and massive amounts of high-quality data to make predictions and prescribe actions for the future.

Once we've done our exploration, and perhaps our advanced analytics, what do we do next? We need to visualize the insights and outcomes of our analysis.

Now I want to quickly touch on who does what in this life cycle. In an organization you may have roles like business analysts, data engineers, and data scientists. Business analysts are obviously involved in formulating the questions. They have the domain expertise and can help with the business understanding, but they're also involved in visualizing our insights in a way that's useful for the business. Then we have the data engineers. These are the people who can help us find the data, clean the data, and also help with some of the exploration. Then we move on to our data scientists. These are the people who will really help us with the exploration, they'll help us with the advanced machine learning techniques, and they'll also assist in the visualization.

So you can see there's some overlap between the roles, and that's why it's critical to have collaboration across them. What you also start seeing nowadays in the marketplace is that sometimes business analysts have to do some machine learning or help out with exploration, and data scientists sometimes need to go and find the data on their own. So there's a lot of overlap, and these different roles must collaborate with each other.

I hope you can see now how the data science life cycle can help us take noisy data, turn it into knowledge and insights, and then turn those into meaningful action for our business.

Thank you. If you have questions, please drop us a line below, and if you want to see more videos like this in the future, please like and subscribe.
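
Editor's illustration (not part of the video): the transcript describes cleaning data with missing values and duplicates, then asking a descriptive question ("did sales go up or down?") and a predictive one ("what will our sales performance be next quarter?"). The sketch below walks through those three steps on a tiny, made-up data set. The sample figures, the function names, and the choice of a straight-line least-squares trend are all assumptions made for illustration; the speaker does not name a specific method, and a real project would likely use richer models and tooling.

```python
from statistics import mean

# Hypothetical raw sales records as (quarter, units_sold) tuples.
# None marks a missing value. All numbers are invented for illustration.
raw_rows = [
    (1, 100.0),
    (2, 110.0),
    (2, 110.0),  # exact duplicate of the previous row
    (3, None),   # row with a missing value
    (4, 130.0),
    (5, 140.0),
]


def clean(rows):
    """Data cleaning: drop rows with missing values and exact duplicates."""
    seen = set()
    cleaned = []
    for row in rows:
        if any(value is None for value in row):
            continue  # skip rows with missing values
        if row in seen:
            continue  # skip exact duplicates, keeping the first occurrence
        seen.add(row)
        cleaned.append(row)
    return cleaned


def describe(rows):
    """Descriptive analytics: did sales go up or down from first to last record?"""
    change = rows[-1][1] - rows[0][1]
    if change > 0:
        return "up"
    if change < 0:
        return "down"
    return "flat"


def fit_trend(rows):
    """Fit a least-squares straight line through the (quarter, sales) points."""
    xs = [quarter for quarter, _ in rows]
    ys = [sales for _, sales in rows]
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(rows, quarter):
    """Predictive analytics (simplified): extrapolate the fitted trend."""
    slope, intercept = fit_trend(rows)
    return slope * quarter + intercept


cleaned_rows = clean(raw_rows)
direction = describe(cleaned_rows)
next_quarter_forecast = predict(cleaned_rows, 6)
```

With this sample data, cleaning leaves four rows, the descriptive answer is "up", and the fitted line extrapolates to 150 units for quarter 6. Extrapolating a straight line is only a stand-in for the "historical patterns" idea in the transcript, not a recommendation for real forecasting.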