Article originally posted on Data Science Central.
I was at a crossroads, choosing my next leap. Cloud, AWS, Agile, Blockchain and many others were the words I heard from experts, along with talk of how they were booming in the industry. I was confused about which subject and technology I should study in depth for my professional growth. I had been working on Business Intelligence tools for quite some time and had an inclination towards data and facts that could be supported by data, understanding what data means, and uncovering insights from it. So, I learnt of Data Science and decided to explore this field.
What is Data Science to me?
Data Science, to me, is collecting data relevant to my business interest from diverse sources, investigating it to find the crux by applying scientific and statistical methods, and visually presenting the results to the world. Data science is the art of painting a picture with the available colors and brushes. The colors are the different forms of data, and the brushes are the Data Science tools used for investigation.
According to Wikipedia, Data Science is a “concept to unify statistics, data analysis, machine learning and their related methods” in order to “understand and analyze actual phenomena” with data. Data science is an interdisciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from data in various forms, both structured and unstructured.
Having understood Data Science conceptually, I had a few questions:
Why do we need it, how does it impact our lives, and what does it mean for my future?
With the use of the internet, mobile apps and the expansion of the “Internet of Things”, the data generated at each stage of a process is increasing day by day. Such a huge volume of data can be a source of business value for an organization, be it in sales, marketing, hiring or at the process level, provided the data is analyzed to extract meaningful insight. In today’s competitive world, businesses need to understand the data they generate in order to make quicker and better decisions, provide better service, and communicate efficiently with consumers and customers.
For example, a bank maintains the account and purchase history of each customer. It can provide a graphical representation of the customer’s spending, identify and predict patterns, and suggest additional services based on that analysis, which helps build a healthy relationship with customers.
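To make the bank example a little more concrete, here is a minimal Python sketch of the "identify the pattern" step. The transactions, category names and function names are all made up for illustration; a real bank would work with far richer data and proper predictive models.

```python
from collections import defaultdict

# Hypothetical purchase history for one customer: (month, category, amount).
transactions = [
    ("2024-01", "groceries", 220.0),
    ("2024-01", "travel", 90.0),
    ("2024-02", "groceries", 240.0),
    ("2024-02", "travel", 310.0),
    ("2024-03", "groceries", 230.0),
    ("2024-03", "travel", 350.0),
]


def spending_by_category(rows):
    """Add up the amount spent in each category."""
    totals = defaultdict(float)
    for _month, category, amount in rows:
        totals[category] += amount
    return dict(totals)


def top_category(rows):
    """Return the category with the highest total spend."""
    totals = spending_by_category(rows)
    return max(totals, key=totals.get)


print(spending_by_category(transactions))
print(top_category(transactions))
```

A customer whose top category turns out to be travel might, for instance, be a reasonable candidate for a travel-related service. That final suggestion step is a business decision and is not something this sketch attempts.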
The application of Data Science is not limited to any specific domain or industry. If you generate sufficient data and have questions about your business, Data Analytics can bring answers to those questions. Having said that, the benefits of Data Science vary from organization to organization, depending on the business needs and problems.
Data Science can add value to a business through the insights obtained from statistical analysis.
Organizations have started to understand the power of using data science to uncover the hidden meaning of their data, and so the demand for trained data scientists will grow in the coming years. Data science is an emerging field, and trained professionals are the need of the hour.
All of the above is fine, but what will I gain from learning Data Science?
Data Science has existed for ages; however, organizations are only now exploring and utilizing its benefits. This means opportunities will arise for Data Scientists. It seems like a dream job with attractive pay, but along with it come huge responsibility and the expectation that a Data Scientist will work miracles. You would get to master data skills, gain domain knowledge, be a team player and, more importantly, understand the business.
With Data Science you open your path to Machine Learning and Artificial Intelligence.
Who can be a Data Scientist?
“A data scientist is someone who is better at statistics than any software engineer and better at software engineering than any statistician.”
Anyone who can:
- Think analytically about the business problem, understand it, and ask the right questions.
- Be curious about the given data.
- Take an interest in and enjoy statistics and programming.
- Stay constantly in learning mode, keeping up with the latest technologies in data wrangling and management.
- Get their hands dirty with messy and huge data (structured and unstructured) and apply analytical powers to identify solutions to business problems.
What should be my first step?
You have already taken your first step towards Data Science by reading this blog. 🙂
You should start by learning about the business value gained with the help of Data Science. Once this interests you, you will start relating what you learn to your own business data. You will then wonder how to get it working. And this is your first step towards Data Science. Your learning starts here.
The next step would be exploring the skills required to be a Data Scientist and the various tools required for doing the job.
As a Data Scientist, some of the essential skills you should start with are: Python, Fundamentals of Statistics, Database Management Systems, Query Language, ETL, BI Visualization and, more importantly, Communication Skills.
Coming from Infrastructure Management, I had to spend some time understanding the basics of Python. I tried to work out how Python is used in Data Science and concluded that you are not expected to be a Python programmer writing thousands of lines of code. As a Data Scientist, your main task is to process and analyze data and identify patterns. Python provides various libraries to ease that job. Start getting acquainted with these libraries, as they are used regularly in a Data Science job. R and SAS are other popular languages for data analysis. R is more likely to be preferred when performing advanced data analysis tasks. Python and R are open source programming languages, while SAS is an enterprise tool offering a large set of statistical functions and a good GUI for quick learning.
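As a rough sketch of what "a few lines with a library" can look like, the snippet below uses only modules from Python's standard library (`csv`, `io` and `collections`) to total a column per group. The sample data, column names and the function name are hypothetical, chosen purely for illustration.

```python
import csv
import io
from collections import Counter

# Hypothetical sample data standing in for a real CSV export.
RAW_CSV = """region,product,units
North,Widget,4
South,Widget,7
North,Gadget,2
South,Gadget,9
North,Widget,6
"""


def units_per_region(csv_text):
    """Sum the units column for each region."""
    totals = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["region"]] += int(row["units"])
    return totals


totals = units_per_region(RAW_CSV)
print(totals.most_common(1))  # the region with the most units sold
```

In day-to-day work you would more likely reach for a dedicated data analysis library, but the idea is the same: the library does the heavy lifting, and your effort goes into asking the right question of the data.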
You should also have good hands-on experience with visualization tools. Some for reference are Cognos BI, Tableau, QlikView and Watson Analytics.
Love for Statistics
If you never loved Statistics in high school, start loving it now. I was never fond of Statistics during high school, but when I decided on my journey towards Data Science, I had to start Statistics all over again. Get the concepts and fundamentals of Statistics down thoroughly. I started with a book, Fundamental of Statistics. At first, I found it difficult to understand, until I found a good mentor. I would suggest finding someone who can explain statistics in a simpler way and then going back to the books. If you cannot find a mentor or coach, refer to online material and content. YouTube videos can be your savior. This is an important learning step on the path to becoming a Data Scientist.
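One way to practise the fundamentals is to compute them on a small set of numbers and check the results by hand. Here is a minimal sketch using Python's built-in `statistics` module; the sample values and the `describe` helper are made up for illustration.

```python
import statistics

# Hypothetical sample: daily order counts over five days.
daily_orders = [12, 15, 11, 18, 14]


def describe(values):
    """Return a few basic descriptive statistics for a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
    }


summary = describe(daily_orders)
print(summary)
```

Working out the mean (the total divided by the count) and the median (the middle value once sorted) on paper, and then comparing with the output, is a simple way to make the concepts stick.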
Tools for Data Scientists
Each job posting and project for Data Science comes with a lengthy list of technical skills and expects you to be proficient in all of them: skills in data technologies, scripting languages and statistical programming languages.
Below are a few frequently mentioned tools that Data Scientists are expected to know (at least one from each space). The list can expand depending on the business domain.
- Hadoop tools – Spark, Hive, Impala
- Knowledge of Big Data
- DBMS – Oracle, Hadoop, MongoDB
- Business Intelligence reporting tools – OBIEE, Business Objects, Cognos, Tableau, MicroStrategy etc
The learning path for Data Science never really stops; it is a continual process. Start with open source tools so you can do practice exercises while learning. Some of the learning sites I continue to refer to are Data Science Central, Cognitiveclass.ai and Coursera.
Take the initiative to start your Data Science journey. Data Science is fun when data talks to you! Never Stop Learning!