What Are Data Science and a Data Scientist, Really?
About Data Science & Data Scientist
Hello and welcome to my blog. This is Vijay. Oh yes, Vijay Simha Reddy. This blog is about what Data Science really is and what a Data Scientist needs to know.
Bored of reading books, Googling, and searching various online platforms to find out what Data Science is, what a Data Scientist should do, which skills they should acquire, and how and who can enter this domain? And are you still confused? If this is what you are looking for, you have come to the right site.
Here I won't give any professional definitions or jargon, and I won't make the topic look scary or complex.
What is Data Science:
Basically, Data Science comprises:
1) Mathematics
2) Statistics
3) Computer Science
Knowledge of these three is used to derive insights from data in its various forms, i.e., structured and unstructured data.
1) Data Science begins with:
Where should one start with Data Science? This is a big question for many people.
Data Science begins with STATISTICS.
STATISTICS is a branch of MATHEMATICS.
A Data Scientist should have a very good knowledge of Statistics and a good command of Maths.
Statistics is easier to learn for someone who is already good at Maths.
Data Science starts with Statistics, and Statistics comes from Mathematics.
One cannot become a Data Scientist without knowing Statistics.
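To give a feel for the kind of Statistics meant here, below is a minimal sketch using Python's built-in `statistics` module. The sales numbers are made up purely for illustration; they are not from any real dataset.

```python
import statistics

# Hypothetical sample: monthly sales figures (made-up numbers for illustration).
# The last value is deliberately much larger than the rest.
sales = [120, 135, 150, 110, 160, 145, 500]

mean_sales = statistics.mean(sales)      # average value
median_sales = statistics.median(sales)  # middle value after sorting
spread = statistics.stdev(sales)         # sample standard deviation

print("mean:", round(mean_sales, 2))
print("median:", median_sales)
print("standard deviation:", round(spread, 2))
```

Notice that the single large value pulls the mean well above the median. Spotting things like this is exactly why Statistics comes first.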
2) Data Science continues with:
i) Machine Learning
ii) Deep Learning
iii) Artificial Intelligence
A coder, programmer, learner, or student who builds knowledge of these four mini domains of Data Science, i.e., Statistics, Machine Learning, Deep Learning, and Artificial Intelligence, becomes a Data Scientist.
i) What is Machine Learning:
Here comes the most awaited part of Data Science: MACHINE LEARNING
Machine Learning is most commonly coded in two languages: R and Python.
Of these two, Python is the one I suggest: it is simple, easy to learn and code in, and efficient to work with.
Machine Learning can be explained in two parts:
a) Data Cleaning/Exploratory Data Analysis:
Here the data is explored and cleaned using various techniques before any model is built. This stage is known as Exploratory Data Analysis (EDA), and data cleaning is performed as part of it. (FYI, Statistics will be used here.)
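A minimal sketch of what cleaning can look like, assuming a tiny made-up list of records. The field names, the `clean` helper, and the choice to fill missing ages with the median are all assumptions for illustration, not a fixed recipe.

```python
import statistics

# Hypothetical raw records (made-up values); None marks a missing age.
raw_rows = [
    {"name": " Asha ", "age": 29},
    {"name": "Ravi", "age": None},
    {"name": "Ravi", "age": None},   # exact duplicate of the previous row
    {"name": "meena", "age": 41},
]

def clean(rows):
    """Tidy names, fill missing ages with the median, and drop duplicate rows."""
    known_ages = [r["age"] for r in rows if r["age"] is not None]
    fill_value = statistics.median(known_ages)  # Statistics at work
    seen = set()
    cleaned = []
    for r in rows:
        name = r["name"].strip().title()        # remove stray spaces, fix casing
        age = r["age"] if r["age"] is not None else fill_value
        key = (name, age)
        if key in seen:                         # skip rows we have already kept
            continue
        seen.add(key)
        cleaned.append({"name": name, "age": age})
    return cleaned

cleaned_rows = clean(raw_rows)
for row in cleaned_rows:
    print(row)
```

Filling with the median is only one option; depending on the data, dropping the row or using another estimate may be the better choice.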
Data is of two types: Unstructured Data and Structured Data.
What is Unstructured and Structured Data:
Unstructured Data is any raw information we collect in various forms (free text, images, audio, and so on) that the machine cannot readily understand.
Structured Data is organized into a fixed format, such as rows and columns, which the machine can understand. It is often produced by cleaning/mining Unstructured Data.
Our final aim is to make the machine understand the data we give as input. So we clean the data we call unstructured, and after cleaning it becomes Structured Data, which the machine can understand.
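As a rough illustration of turning unstructured text into structured rows, here is a small sketch. The note format, the `PATTERN` expression, and the `to_rows` helper are assumptions made up for this example; real text is usually messier.

```python
import re

# Hypothetical free-text notes (made-up); the wording format is an assumption.
raw_notes = [
    "Order 101: 3 apples for Rs 90",
    "order 102 - 12 mangoes for Rs 480",
    "call customer back tomorrow",
]

# Captures: order id, quantity, item name, amount.
PATTERN = re.compile(
    r"order\s+(\d+)\D+?(\d+)\s+(\w+)\s+for\s+Rs\s+(\d+)",
    re.IGNORECASE,
)

def to_rows(notes):
    """Return (structured rows, notes that could not be parsed)."""
    rows, skipped = [], []
    for note in notes:
        match = PATTERN.search(note)
        if match is None:
            skipped.append(note)   # keep unparsed text for manual review
            continue
        order_id, quantity, item, amount = match.groups()
        rows.append({
            "order_id": int(order_id),
            "quantity": int(quantity),
            "item": item.lower(),
            "amount": int(amount),
        })
    return rows, skipped

rows, skipped = to_rows(raw_notes)
print(rows)
print("could not parse:", skipped)
```

Each parsed note now has the same named fields, which is what makes it structured.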
b) Building the Model and Predicting:
A model is built, trained, and tested using different Machine Learning algorithms, and predictions are then made with it.
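The build, train, test, and predict steps can be sketched in plain Python with one of the simplest algorithms, a straight-line fit (simple linear regression by least squares). The data is made up, and `fit_line` and `predict` are hypothetical helper names for this sketch. In practice a library is normally used, and the train/test split is usually random rather than "first six, last two".

```python
# Hypothetical data (made-up): hours studied -> exam score.
hours = [1, 2, 3, 4, 5, 6, 7, 8]
scores = [52, 55, 61, 64, 70, 73, 79, 82]

# Split the data: first six points for training, last two for testing.
train_x, test_x = hours[:6], hours[6:]
train_y, test_y = scores[:6], scores[6:]

def fit_line(xs, ys):
    """Train: find the slope and intercept of the least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(model, x):
    """Predict a score for x hours using the trained line."""
    slope, intercept = model
    return slope * x + intercept

model = fit_line(train_x, train_y)                      # build and train
predictions = [predict(model, x) for x in test_x]       # predict on unseen data
mean_abs_error = sum(
    abs(p - y) for p, y in zip(predictions, test_y)
) / len(test_y)                                         # test

print("slope, intercept:", model)
print("predictions:", predictions)
print("mean absolute error:", mean_abs_error)
```

The test points are kept away from training on purpose, so the error tells us how the model does on data it has not seen.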
This is a simple glance at Machine Learning.
Deep Learning and Artificial Intelligence will be covered in further blogs, after exploring Machine Learning.
So that is all about it. In this session, we learned what Data Science is and what domain knowledge a Data Scientist should have. Hope you liked it. Do like and comment on the article if you gained knowledge from it, and motivate me to write more like this.