Course Overview
Chapter 1: What is Data Science?
- Module 1:
- Name: Introduction to Data Science
- Description: In this module, you will view the course syllabus to learn what will be taught in this course. You will hear from data science professionals to discover what data science is, what data scientists do, and what tools and algorithms data scientists use on a daily basis. Finally, you will complete a reading assignment to find out why data science is considered the sexiest job of the 21st century.
- Learning Objectives:
- Define data science and its importance in today’s data-driven world.
- List some paths that can lead to a career in data science.
- Summarize advice given by seasoned data science professionals to data scientists who are just starting out.
- Articulate why data science is considered the most in-demand job in the 21st century.
- Describe what a typical day in the life of a data scientist looks like.
- Define some of the commonly used terms in data science.
- Summarize the impact of cloud technologies on data science.
- Identify some of the key qualities of a successful data scientist.
- Module 2:
- Name: Big Data and Data Science Skills
- Description: In this module, you will hear from Norman White, the Faculty Director of the Stern Center for Research Computing at New York University, as he talks about data science and the skills required for anyone interested in pursuing a career in this field. He also offers advice to those looking to start a career in data science. Finally, you will complete reading assignments to learn about the process of mining a given dataset and about regression analysis.
- Learning Objectives:
- Define Big Data and its distinguishing characteristics, such as velocity, volume, veracity, and value.
- Describe how Hadoop and other big data tools, combined with distributed computing power, are triggering digital transformation.
- List some of the skills required to be a data scientist and analyze big data.
- Explain what data mining is.
- Summarize the importance of establishing goals, data selection, preprocessing, transformation, and storage of data in preparation for data mining.
- Explain the difference between deep learning and machine learning.
- Describe regression and how it might be used to predict market behavior and trend analysis.
- Module 3:
- Name: Data Science Applications and Qualities
- Description: In this module, you will learn about the approaches companies can take to start working with data science. You will learn about some of the qualities that differentiate data scientists from other professionals. You will also learn about analytics, story-telling, and the pivotal role data scientists play in creating an effective final deliverable. Finally, you will apply what you learned about data science by answering open-ended questions.
- Learning Objectives:
- Describe the application of data science in healthcare.
- Explain how companies can start on their data science journey.
- Describe some of the ways in which data is generated by consumers.
- Describe how businesses such as Netflix, Amazon, UPS, Google, and Apple are using data generated by their consumers and employees.
- Compare some of the qualities that differentiate data scientists from qualities of other data professionals.
- Articulate the purpose of the final deliverable of a data science project and the role of storytelling in the final deliverable.
- Describe what the final report of a Data Science project should cover and how it should be structured for best results.
- Demonstrate your understanding of data science by articulating what data scientists do and what a data science report contains.
Chapter 2: Tools for Data Science
- Module 1:
- Name: Programming Languages and Data Science Tools
- Description: In this module, you will get an overview of the programming languages commonly used, including Python, R, Scala, and SQL. You’ll be introduced to the open source and commercial data science tools available. You’ll also learn about the packages, APIs, data sets and models frequently used by Data Scientists.
- Learning Objectives:
- Cite popular open source, commercial, and cloud-based tools used by data scientists.
- Explain the function of an API and list some common API-related terms.
- Discuss the characteristics of a dataset and the ways data can be structured.
- Identify some of the libraries used in data science and the types of functionalities a library can provide.
- Compare and contrast machine learning and deep learning, along with their use cases.
- Discuss the different classes of machine learning models.
- Summarize the advantages of using a pre-trained model as opposed to training a model from scratch.
- Identify the languages, tools, and data used by data scientists.
- Access and explore data sets in the Data Asset Exchange (DAX).
- Examine deep learning models on the Model Asset Exchange (MAX) and interact with an image-detection deep learning model.
- Module 2:
- Name: Essential Data Science Tools: GitHub, Jupyter, and RStudio
- Description: In this module, you will learn about three popular tools used in data science: GitHub, Jupyter Notebooks, and RStudio IDE. You will become familiar with the features of each tool, and what makes these tools so popular among data scientists today.
- Learning Objectives:
- Describe the fundamentals of Jupyter Notebooks and the JupyterLab platform and why they are popular for data science projects.
- Cite some examples of how Jupyter is used in data science for recording experiments and projects.
- Use Jupyter notebooks to experiment with code cells, create data visualization slides, and report results using Markdown.
- Create a Jupyter notebook with Python and use it to run code, change kernels, and save your work.
- Define the R programming language and discuss its application in data science.
- Use RStudio packages to experiment with plotting.
- Explain versioning, branching and why they’re vital to the software development process.
- Create, edit, and upload new files in GitHub.
- Use GitHub to create a branch, commit changes, and initiate a pull request.
- Module 3:
- Name: IBM Data Science Platform: Watson Studio
- Description: In this module, you will learn about an enterprise-ready data science platform by IBM, called Watson Studio. You’ll learn about some of the features and capabilities of what data scientists use in the industry. You’ll also learn about other IBM tools used to support data science projects, such as IBM Watson Knowledge Catalog, Data Refinery, and the SPSS Modeler.
- Learning Objectives:
- Define the Watson Studio.
- Explain how Watson Studio can streamline data projects and make them easier to manage.
- List some of the primary assets included in Watson Studio.
- Create a project in Watson Studio and add an interactive Jupyter notebook to work with a real data set.
- Link GitHub to Watson Studio and explain when to use public vs. private repositories.
- Describe other tools essential to data science and their applications.
- Describe Data Refinery and how it benefits the data scientists who use it.
- Describe the IBM SPSS Modeler, its usage, and some of the capabilities it offers.
- Load and run a sample project in Watson Studio and examine the results.
- Use Watson Studio to run an autonumeric model, examine the results and get predictions for new use cases.
- Describe the use case for the IBM SPSS Modeler and list a few of its main features.
- Summarize the characteristics of the IBM SPSS Statistics software application and its use by data scientists.
- Explain how models are deployed into production environments.
- Summarize how Auto AI in Watson Studio benefits organizations who lack a fully staffed data science team.
- List the features included with IBM Watson and explain how it addresses fairness, explainability, and model drift.
- Module 4:
- Name: Jupyter Notebook Creation and Sharing
- Description: In this module, you will demonstrate your skills by creating and configuring a Jupyter Notebook. As part of your grade for this course, you will share your Jupyter Notebook with your peers for review.
- Learning Objectives:
- Create and share a public Jupyter notebook, in any language, in Watson Studio and create markdown cells for peer review.
- Evaluate and grade final assignments submitted by fellow learners using the given rubric. Provide constructive feedback and offer ideas and suggestions that fellow learners can apply right away.
Chapter 3: Data Science Methodology
- Module 1:
- Name: Introduction to Data Science Methodology
- Description: In this module, you will learn why we are interested in data science, what a methodology is, and why data scientists need a methodology. You will also learn about the data science methodology and its flowchart. You will learn about the first two stages of the data science methodology, namely Business Understanding and Analytic Approach. Finally, through a lab session, you will learn how to complete the Business Understanding and Analytic Approach stages, as well as the Data Requirements and Data Collection stages, for any data science problem.
- Learning Objectives:
- List the six key stages of the Cross-Industry Standard Process for Data Mining (CRISP-DM), an industry-standard data science methodology.
- Analyze the first four phases of CRISP-DM.
- Apply the first four phases of the data science methodology to a case study.
- Compose clearly defined questions that address a business problem.
- Analyze a case study to determine data requirements.
- Apply the data science methodology to a case study.
- Determine data content, data formats, and data sources prior to data collection and data preparation phases.
- Create a decision tree to classify outcomes in a case study.
- Identify appropriate data sources to address a business problem.
- Module 2:
- Name: Data Understanding, Preparation, Modeling, and Evaluation
- Description: In this module, you will learn what it means to understand data, and prepare or clean data. You will also learn about the purpose of data modeling and some characteristics of the modeling process. Finally, through a lab session, you will learn how to complete the Data Understanding and the Data Preparation stages, as well as the Modeling and the Model Evaluation stages pertaining to any data science problem.
- Learning Objectives:
- Prepare a data set by handling missing, invalid, or misleading data.
- Describe the purpose and characteristics of the data modeling process.
- Evaluate a decision tree model using a training and a test dataset.
- Build a decision tree to determine the cuisine type for a data set of recipes.
- Summarize the data understanding, data preparation, modeling, and evaluation phases of the data science methodology.
- Module 3:
- Name: Deployment and Feedback in Data Science
- Description: In this module, you will learn about what happens when a model is deployed and why model feedback is important. Also, by completing a peer-reviewed assignment, you will demonstrate your understanding of the data science methodology by applying it to a problem that you define.
- Learning Objectives:
- Describe the deployment and feedback phases of the data science methodology.
- Determine when a model is ready to deploy.
- Devise a plan to elicit feedback from stakeholders involved in the data analysis process.
- Examine how feedback helps to refine a model.
- Assess the performance and impact of a data model.
- Explain why deployment and feedback should be an iterative process.
- Devise a business problem to be solved with data related to either email, hospitals, or credit cards.
- Demonstrate your understanding of data science methodology by applying it to a given problem. Construct responses that address each phase of the CRISP-DM based on a chosen business problem.
- Evaluate your peers’ final projects using the given rubric. Provide constructive feedback and offer ideas and suggestions that fellow learners can apply right away.
Chapter 4: Python for Data Science, AI & Development
- Module 1:
- Name: Python Basics: Types, Expressions, and Variables
- Description: This module teaches the basics of Python and begins by exploring some of the different data types such as integers, real numbers, and strings. Continue with the module and learn how to use expressions in mathematical operations, store values in variables, and the many different ways to manipulate strings.
- Learning Objectives:
- Demonstrate an understanding of types in Python by converting or casting data types such as strings, floats, and integers.
- Interpret variables and solve expressions by applying mathematical operations.
- Describe how to manipulate strings by using a variety of methods and operations.
- Build a program in JupyterLab to demonstrate your knowledge of types, expressions, and variables.
- Work with, manipulate, and perform operations on strings in Python.
- Module 2:
- Name: Python Data Structures: Lists, Tuples, Dictionaries, and Sets
- Description: This module begins a journey into Python data structures by explaining the use of lists and tuples and how they are able to store collections of data in a single variable. Next learn about dictionaries and how they function by storing data in pairs of keys and values, and end with Python sets to learn how this type of collection can appear in any order and will only contain unique elements.
- Learning Objectives:
- Describe and manipulate tuple combinations and list data structures.
- Execute basic tuple operations in Python.
- Perform list operations in Python.
- Write structures with correct keys and values to demonstrate understanding of dictionaries.
- Work with and perform operations on dictionaries in Python.
- Create sets to demonstrate understanding of the differences between sets, tuples, and lists.
- Work with sets in Python, including operations and logic operations.
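As a taste of what this module covers, here is a minimal sketch of the four structures side by side. The album names and ratings are hypothetical sample values chosen for illustration; they are not taken from the course labs.

```python
# Lists are ordered and mutable; tuples are ordered but cannot be changed.
ratings_list = [10, 9, 6, 5]
ratings_list.append(8)          # lists can grow in place
ratings_tuple = (10, 9, 6, 5)   # a tuple stays fixed once created

# Dictionaries store data as key/value pairs.
release_year = {"Thriller": 1982, "Back in Black": 1980}
release_year["The Bodyguard"] = 1992   # add a new key/value pair

# Sets keep only unique elements and have no guaranteed order.
genres = set(["rock", "pop", "rock", "soul"])
common = genres & {"pop", "jazz"}      # set intersection

print(ratings_list)
print(sorted(genres))   # sort only to get a stable display order
print(common)
```

Note that the duplicate "rock" disappears when the list is converted to a set, which is the "unique elements" behaviour described above.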
- Module 3:
- Name: Python Fundamentals: Conditions, Branching, Loops, Functions, and Classes
- Description: This module discusses Python fundamentals and begins with the concepts of conditions and branching. Continue through the module and learn how to implement loops to iterate over sequences, create functions to perform a specific task, perform exception handling to catch errors, and how classes are needed to create objects.
- Learning Objectives:
- Classify conditions and branching by identifying structured scenarios with outputs.
- Work with objects and classes.
- Explain objects and classes by identifying data types and creating a class.
- Use exception handling in Python.
- Explain what functions do.
- Build a function using inputs and outputs.
- Explain how for loops and while loops work.
- Work with condition statements in Python, including operators and branching.
- Create and use loop statements in Python.
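The ideas in this module can be sketched in a few lines. The names `Circle`, `safe_divide`, and `classify` below are hypothetical, invented for this illustration rather than taken from the course material.

```python
import math


class Circle:
    """A small class: it holds data (radius) and a method that uses it."""

    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return math.pi * self.radius ** 2


def safe_divide(a, b):
    """Return a / b, or None when b is zero (exception handling)."""
    try:
        return a / b
    except ZeroDivisionError:
        return None


def classify(age):
    """Branching with if / elif / else."""
    if age < 18:
        return "minor"
    elif age < 65:
        return "adult"
    else:
        return "senior"


# A for loop iterates over a sequence.
labels = []
for age in (12, 30, 70):
    labels.append(classify(age))

# A while loop repeats until its condition becomes false.
countdown = []
n = 3
while n > 0:
    countdown.append(n)
    n -= 1

print(labels, countdown, safe_divide(1, 0), Circle(1).area())
```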
- Module 4:
- Name: Working with Data in Python: Files, Pandas, and NumPy
- Description: This module explains the basics of working with data in Python and begins the path with learning how to read and write files. Continue the module and uncover the best Python libraries that will aid in data manipulation and mathematical operations.
- Learning Objectives:
- Explain how Pandas uses data frames.
- Use the Pandas library for data analysis.
- Read text files using Python’s built-in “open” function and the “with” statement.
- Utilize NumPy to create one-dimensional and two-dimensional arrays.
- Write and save files in Python.
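The file-handling part of this module might look roughly like the sketch below. It uses only the standard library; the file name `example.txt` is hypothetical, and the file lives in a temporary directory so nothing is left behind. Pandas and NumPy are not shown here.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example.txt")

    # Mode "w" creates (or overwrites) the file; the with block closes it automatically.
    with open(path, "w") as f:
        f.write("line one\n")
        f.write("line two\n")

    # Mode "r" opens the file for reading; iterating over it yields one line at a time.
    with open(path, "r") as f:
        lines = [line.rstrip("\n") for line in f]

print(lines)
```

Using `with` means the file is closed even if an error occurs inside the block, which is why it is usually preferred over calling `close()` by hand.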
- Module 5:
- Name: Data Collection in Python: APIs and Web Scraping
- Description: This module delves into the unique ways to collect data by the use of APIs and web scraping. It further explores data collection by explaining how to read and collect data when dealing with different file formats.
- Learning Objectives:
- Explain how to use the HTTP protocol with the Requests library.
- Describe how the URL Request Response HTTP protocol works.
- Invoke simple, open-source APIs.
- Perform basic web scraping using Python.
- Work with different file formats using Python.
- Explain the difference between APIs and REST APIs.
- Summarize how APIs receive and send information.
Chapter 5: Python Project for Data Science
- Module 1:
- Name: Python Data Science Project
- Description: In this module, you will demonstrate your skills in Python – the language of choice for Data Science and Data Analysis. You will apply Python fundamentals, Python data structures, and work with data in Python. By working on a real project, you will model a Data Scientist or Data Analyst’s role, and build a dashboard using Python and popular Python libraries using Jupyter notebook.
- Learning Objectives:
- Perform web scraping in Python to obtain data.
- Extract data by using a Python library.
Chapter 6: Databases and SQL for Data Science with Python
- Module 1:
- Name: Introduction to Databases and Basic SQL
- Description: Introduction to databases, cloud database instance creation, basic SQL statements. Hands-on practice on a live database.
- Learning Objectives:
- Demonstrate how to write basic SQL statements.
- Describe SQL and Databases.
- Create and execute CREATE TABLE, SELECT, INSERT, and DELETE SQL statements with a live database.
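To give a feel for these four statements, here is a minimal sketch. It assumes Python's built-in `sqlite3` module with an in-memory database as a stand-in for the live cloud database used in the course; the `instructor` table and its rows are hypothetical.

```python
import sqlite3

# In-memory SQLite database: an assumption for illustration, not the course's database.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# CREATE TABLE defines the columns and their types.
cur.execute("CREATE TABLE instructor (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")

# INSERT adds rows; the ? placeholders keep values separate from the SQL text.
cur.executemany(
    "INSERT INTO instructor (id, name, city) VALUES (?, ?, ?)",
    [(1, "Ada", "Toronto"), (2, "Grace", "Toronto"), (3, "Linus", "Chicago")],
)

# DELETE removes the rows that match the WHERE clause.
cur.execute("DELETE FROM instructor WHERE id = ?", (2,))

# SELECT reads the remaining rows back.
rows = cur.execute("SELECT id, name FROM instructor ORDER BY id").fetchall()
conn.close()

print(rows)
```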
- Module 2:
- Name: Relational Database Concepts and Table Manipulation
- Description: Exploration of database fundamentals, tables, relationships. Creation of a database instance, table manipulation using SQL.
- Learning Objectives:
- Compose and execute CREATE TABLE, ALTER, DROP, and TRUNCATE statements hands-on on a live database.
- Create a database instance on the Cloud.
- Describe basic relational database concepts including tables, primary keys and foreign keys.
- Compare and contrast Data Definition Language and Data Manipulation Language.
- Develop ALTER, DROP, and TRUNCATE statements.
- Explain the syntax of the CREATE TABLE statement.
- Module 3:
- Name: Data Retrieval and Manipulation with SQL
- Description: Using string patterns, ranges, sorting, grouping, nested queries, and multi-table data access.
- Learning Objectives:
- Group data into result sets.
- Use string patterns and ranges in SQL queries.
- Demonstrate how to use string patterns and ranges in SQL queries.
- Sort and order result sets.
- Compose sub-queries and nested select statements.
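The query techniques listed for this module can be sketched as follows. This again assumes Python's built-in `sqlite3` module as a stand-in for the course database, and the `employee` table and its values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, dept TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [("Alice", "Sales", 50000), ("Alan", "Sales", 70000),
     ("Bob", "IT", 60000), ("Carla", "IT", 90000)],
)

# String pattern: LIKE with % matches any run of characters after "Al".
like_rows = conn.execute(
    "SELECT name FROM employee WHERE name LIKE 'Al%' ORDER BY name"
).fetchall()

# Range: BETWEEN is inclusive at both ends; ORDER BY ... DESC sorts high to low.
range_rows = conn.execute(
    "SELECT name FROM employee WHERE salary BETWEEN 60000 AND 80000 "
    "ORDER BY salary DESC"
).fetchall()

# Grouping: one result row per department, with an aggregate per group.
group_rows = conn.execute(
    "SELECT dept, AVG(salary) FROM employee GROUP BY dept ORDER BY dept"
).fetchall()

# Sub-query: the inner SELECT computes the overall average used by the outer filter.
above_avg = conn.execute(
    "SELECT name FROM employee "
    "WHERE salary > (SELECT AVG(salary) FROM employee) ORDER BY name"
).fetchall()
conn.close()

print(like_rows, range_rows, group_rows, above_avg)
```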
- Module 4:
- Name: Python Database Connectivity
- Description: Connecting to databases using Python, creating tables, loading data, querying, and analyzing data.
- Learning Objectives:
- Access databases from Python using SQL magic.
- Describe concepts related to accessing Databases using Python.
- Create tables and insert data using Python.
- Connect to a database using the ibm_db Python library.
- Analyze a data set using SQL and Python.
- Module 5:
- Name: Real-World SQL Data Analysis
- Description: Working with real-world datasets (Chicago), translating analysis questions into SQL queries.
- Learning Objectives:
- Invoke SQL queries using Python.
- Retrieve SQL query results and analyze data.
- Interpret and translate data analysis questions into SQL queries.
- Module 6:
- Name: Advanced SQL Techniques for Data Engineers
- Description: Advanced SQL techniques (views, transactions, stored procedures, joins).
- Learning Objectives:
- Generate joins to query data from multiple tables.
- Create stored procedures and invoke them from other code.
- Describe the significance of ACID transactions.
- Implement transactions in SQL statements.
- Compare and contrast the benefits and disadvantages of using stored procedures.
- Define views and describe the benefits they provide.
- Compare and contrast the different JOIN operators.
- Create and query a view.
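A minimal sketch of joins, views, and transactions follows. It assumes Python's built-in `sqlite3` module as a stand-in for the course database; the tables, the view name `employee_dept`, and the simulated failure are all hypothetical. SQLite does not support stored procedures, so they are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
    CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, dept_id INTEGER);
    INSERT INTO department VALUES (1, 'Sales'), (2, 'IT');
    INSERT INTO employee VALUES (1, 'Alice', 1), (2, 'Bob', 2), (3, 'Carla', NULL);
    -- A view is a saved query that can be selected from like a table.
    CREATE VIEW employee_dept AS
        SELECT e.name, d.dept_name
        FROM employee e LEFT JOIN department d ON e.dept_id = d.dept_id;
""")

# INNER JOIN keeps only employees that have a matching department.
inner_rows = conn.execute(
    "SELECT e.name, d.dept_name FROM employee e "
    "INNER JOIN department d ON e.dept_id = d.dept_id ORDER BY e.name"
).fetchall()

# The view uses a LEFT JOIN, so Carla appears with no department (None).
view_rows = conn.execute(
    "SELECT name, dept_name FROM employee_dept ORDER BY name"
).fetchall()

# Transaction: "with conn" commits on success and rolls back if an exception escapes.
try:
    with conn:
        conn.execute("UPDATE employee SET dept_id = 2 WHERE name = 'Carla'")
        raise RuntimeError("simulated failure before commit")
except RuntimeError:
    pass

# The update was rolled back, so Carla still has no department.
carla_dept = conn.execute(
    "SELECT dept_id FROM employee WHERE name = 'Carla'"
).fetchone()[0]
conn.close()

print(inner_rows, view_rows, carla_dept)
```

The rollback illustrates the atomicity part of ACID: either every statement in the transaction takes effect or none does.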
Chapter 7: Data Analysis with Python
- Module 1:
- Name: Data Understanding and Importing
- Description: Understanding data, using Python libraries for data import, basic data exploration.
- Learning Objectives:
- Analyze Python data using a dataset.
- Identify three Python libraries and describe their uses.
- Read data using Python’s Pandas package.
- Demonstrate how to import and export data in Python.
- Module 2:
- Name: Data Wrangling and Preprocessing
- Description: Handling missing values, data formatting, normalization, binning, categorical variable conversion.
- Learning Objectives:
- Describe how to handle missing values.
- Describe data formatting techniques.
- Describe data normalization and standardization.
- Demonstrate the use of binning.
- Demonstrate the use of categorical variables.
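The preprocessing steps named above can be sketched in plain Python. The course itself works with Pandas, so treat this as an illustration of the arithmetic only; the price and fuel values are hypothetical.

```python
import statistics

prices = [5000, 10000, 15000, 20000, 45000]

# Min-max normalization rescales values to the range [0, 1].
lo, hi = min(prices), max(prices)
min_max = [(p - lo) / (hi - lo) for p in prices]

# Standardization (z-score) centres values on 0 with unit standard deviation.
mean = statistics.mean(prices)
std = statistics.pstdev(prices)   # population standard deviation
z_scores = [(p - mean) / std for p in prices]

# Binning: three equal-width bins labelled low / medium / high.
bin_labels = ["low", "medium", "high"]
width = (hi - lo) / len(bin_labels)


def price_bin(p):
    # Clamp the index so the maximum value falls into the last bin.
    index = min(int((p - lo) / width), len(bin_labels) - 1)
    return bin_labels[index]


bins = [price_bin(p) for p in prices]

# Categorical variable conversion: one indicator (0/1) column per category.
fuel = ["gas", "diesel", "gas"]
categories = sorted(set(fuel))
one_hot = [[int(f == c) for c in categories] for f in fuel]

print(min_max, bins, categories, one_hot)
```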
- Module 3:
- Name: Exploratory Data Analysis (EDA)
- Description: Descriptive statistics, grouping, correlation analysis, Chi-square test.
- Learning Objectives:
- Implement descriptive statistics.
- Demonstrate the basics of grouping.
- Describe data correlation processes.
- Describe why and how to apply the Chi-Squared test.
- Module 4:
- Name: Regression Analysis
- Description: Simple and multiple linear regression, polynomial regression, model evaluation.
- Learning Objectives:
- Transform data into polynomial features, then use linear regression to fit the parameters.
- Apply model evaluation using visualization in Python.
- Apply polynomial regression techniques in Python.
- Predict and make decisions based on Python data models.
- Describe the use of R-squared and MSE for in-sample evaluation.
- Define the term “curvilinear relationship”.
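To make the in-sample evaluation measures concrete, here is a minimal sketch that fits a simple linear regression by least squares and then computes MSE and R-squared by hand. The data points are hypothetical, and in practice a library would usually do this work.

```python
# Hypothetical sample data: y is roughly 2 * x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.2, 9.8]
n = len(x)

x_mean = sum(x) / n
y_mean = sum(y) / n

# Least-squares estimates for the line y = intercept + slope * x.
s_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
s_xx = sum((xi - x_mean) ** 2 for xi in x)
slope = s_xy / s_xx
intercept = y_mean - slope * x_mean

predictions = [intercept + slope * xi for xi in x]

# MSE: the average squared difference between actual and predicted values.
ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, predictions))
mse = ss_res / n

# R-squared: the share of the variance in y that the fitted line explains.
ss_tot = sum((yi - y_mean) ** 2 for yi in y)
r_squared = 1 - ss_res / ss_tot

print(slope, intercept, mse, r_squared)
```

A lower MSE and an R-squared closer to 1 both indicate a closer fit to the data the model was trained on; neither says how the model behaves on unseen data.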
- Module 5:
- Name: Model Evaluation and Refinement
- Description: Model evaluation, overfitting, underfitting, Ridge Regression, Grid Search.
- Learning Objectives:
- Describe data model refinement techniques.
- Explain overfitting, underfitting, and model selection.
- Apply ridge regression to regularize and reduce the standard errors to avoid overfitting a regression model.
- Apply grid search techniques to Python data.
- Module 6:
- Name: Real Estate Data Analysis Project
- Description: Final project: real estate data analysis and price prediction.
- Learning Objectives:
- Evaluate different data models by splitting data into training and testing sets.
- Perform polynomial transformation on data sets.
Chapter 10: Applied Data Science Capstone
- Module 1:
- Name: Capstone Project Introduction and Setup
- Description: Overview of the Falcon 9 landing prediction problem, tools setup.
- Learning Objectives:
- Develop Python code to manipulate data in a Pandas data frame.
- Create a Python Pandas data frame by converting a JSON file.
- Create a Jupyter notebook and make it sharable using GitHub.
- Utilize data science methodologies to define and formulate a real-world business problem.
- Utilize your data analysis tools to load a dataset, clean it, and find out interesting insights from it.
- Module 2:
- Name: Data Collection and Wrangling
- Description: Collecting data via RESTful API and web scraping, data wrangling.
- Learning Objectives:
- Create scatter plots and bar charts by writing Python code to analyze data in a Pandas data frame.
- Write Python code to conduct exploratory data analysis by manipulating data in a Pandas data frame.
- Create and execute SQL queries to select and sort data.
- Utilize your data visualization skills to visualize the data and extract meaningful patterns to guide the modeling process.
- Module 3:
- Name: Interactive Dashboards and Maps
- Description: Building interactive dashboards with Plotly Dash, interactive maps with Folium.
- Learning Objectives:
- Build an interactive dashboard that contains pie charts and scatter plots to analyze data with the Plotly Dash Python library.
- Calculate distances on an interactive map by writing Python code using the Folium library.
- Generate interactive maps, plot coordinates, and mark clusters by writing Python code using the Folium library.
- Build a dashboard to analyze launch records interactively with Plotly Dash.
- Build an interactive map to analyze the launch site proximity with Folium.
- Module 4:
- Name: Machine Learning for Landing Prediction
- Description: Using machine learning (SVM, Classification Trees, Logistic Regression) for landing prediction.
- Learning Objectives:
- Split the data into training and testing sets.
- Train different classification models.
- Optimize hyperparameters using grid search.
- Utilize your machine learning skills to build a predictive model to help a business function more efficiently.
- Module 5:
- Name: Capstone Project Delivery
- Description: Compiling results, delivering data-driven insights, peer review.
- Learning Objectives:
- Establish how to structure and build your data-findings report.
- Submit your final report for peer review.
- Review the work submitted by your peers.
Curriculum
- 1 Section
- 48 Lessons
- 96 Weeks