2 |
Data Science and Python:
Discovering the match between data science and Python:
Considering the emergence of data science, Outlining the core competencies of
a data scientist, Linking data science, big data, and AI, Understanding the role
of programming, Creating the Data Science Pipeline, Preparing the data,
Performing exploratory data analysis, Learning from data, Visualizing,
Obtaining insights and data products, Understanding Python’s Role in Data
Science, Considering the shifting profile of data scientists, Working with a
multipurpose, simple, and efficient language, Learning to Use Python
Fast, Loading data, Training a model, Viewing a result.
Introducing Python’s Capabilities and Wonders:
Why Python?, Grasping Python’s Core Philosophy, Contributing to data science,
Discovering present and future development goals, Working with Python,
Getting a taste of the language, Understanding the need for indentation, Working
at the command line or in the IDE, Performing Rapid Prototyping and
Experimentation, Considering Speed of Execution, Visualizing Power, Using
the Python Ecosystem for Data Science, Accessing scientific tools using SciPy,
Performing fundamental scientific computing using NumPy, Performing data
analysis using pandas, Implementing machine learning using Scikit-learn,
Going for deep learning with Keras and TensorFlow, Plotting the data using
matplotlib, Creating graphs with NetworkX, Parsing HTML documents using
Beautiful Soup. |
10 |
3 |
Getting Your Hands Dirty With Data:
Understanding the tools:
Using the Jupyter Console, Interacting with screen text, Changing the window
appearance, Getting Python help, Getting IPython help, Using magic functions,
Discovering objects, Using Jupyter Notebook, Working with styles, Restarting
the kernel, Restoring a checkpoint, Performing Multimedia and Graphic
Integration, Embedding plots and other images, Loading examples from online
sites, Obtaining online graphics and multimedia.
Working with Real Data:
Uploading, Streaming, and Sampling Data, Uploading small amounts of data
into memory, Streaming large amounts of data into memory, Generating
variations on image data, Sampling data in different ways, Accessing Data in
Structured Flat-File Form, Reading from a text file, Reading CSV delimited
format, Reading Excel and other Microsoft Office files, Sending Data in
Unstructured File Form, Managing Data from Relational Databases, Interacting
with Data from NoSQL Databases, Accessing Data from the Web.
Conditioning Your Data:
Juggling between NumPy and pandas, Knowing when to use NumPy, Knowing
when to use pandas, Validating Your Data, Figuring out what’s in your data,
Removing duplicates, Creating a data map and data plan, Manipulating
Categorical Variables, Creating categorical variables, Renaming levels,
Combining levels, Dealing with Dates in Your Data, Formatting date and time
values, Using the right time transformation, Dealing with Missing Data, Finding
the missing data, Encoding missingness, Imputing missing data, Slicing and
Dicing: Filtering and Selecting Data, Slicing rows, Slicing columns, Dicing,
Concatenating and Transforming, Adding new cases and variables, Removing
data, Sorting and shuffling, Aggregating Data at Any Level.
Shaping Data:
Working with HTML Pages, Parsing XML and HTML, Using XPath for data
extraction, Working with Raw Text, Dealing with Unicode, Stemming and
removing stop words, Introducing regular expressions, Using the Bag of Words
Model and Beyond, Understanding the bag of words model, Working with
n-grams, Implementing TF-IDF transformations, Working with Graph Data,
Understanding the adjacency matrix, Using NetworkX basics. |
21 |
4 |
Data Visualization:
Visualizing Information:
Starting with a Graph, Defining the plot, Drawing multiple lines and plots,
Saving your work to disk, Setting the Axis, Ticks, Grids, Getting the axes,
Formatting the axes, Adding grids, Defining the Line Appearance, Working
with line style, Using colors, Adding markers, Using Labels, Annotations, and
Legends, Adding labels, Annotating the chart, Creating a legend.
Visualizing the Data:
Choosing the Right Graph, Showing parts of a whole with pie charts, Creating
comparisons with bar charts, Showing distributions using histograms, Depicting
groups using boxplots, Seeing data patterns using scatterplots, Creating
Advanced Scatterplots, Depicting groups, Showing correlations, Plotting Time
Series, Representing time on axes, Plotting trends over time, Plotting
Geographical Data, Using an environment in Notebook, Getting the Basemap
toolkit, Dealing with deprecated library issues, Using Basemap to plot
geographic data, Visualizing Graphs, Developing undirected graphs,
Developing directed graphs. |
11 |
5 |
Data Wrangling:
Wrangling Data:
Playing with Scikit-learn, Understanding classes in Scikit-learn, Defining
applications for data science, Performing the Hashing Trick, Using hash
functions, Demonstrating the hashing trick, Working with deterministic
selection, Considering Timing and Performance, Benchmarking with timeit,
Working with the memory profiler, Running in Parallel on Multiple Cores,
Performing multicore parallelism, Demonstrating multiprocessing.
Exploring Data Analysis:
The EDA Approach, Defining Descriptive Statistics for Numeric Data,
Measuring central tendency, Measuring variance and range, Working with
percentiles, Defining measures of normality, Counting for Categorical Data,
Understanding frequencies, Creating contingency tables, Creating Applied
Visualization for EDA, Inspecting boxplots, Performing t-tests after boxplots,
Observing parallel coordinates, Graphing distributions, Plotting scatterplots,
Understanding Correlation, Using covariance and correlation, Using
nonparametric correlation, Considering the chi-square test for tables, Modifying
Data Distributions, Using different statistical distributions, Creating a Z-score
standardization, Transforming other notable distributions. |
14 |