Data and AI Concepts

Table of Contents

I Preface

Topics Covered: Why This Book?, Who Should Read This Book?, Scope of This Book, Outline of This Book

Note: This section is complete.

II Data and AI Foundation

Topics Covered: Data and AI Introduction, Mathematics, IT/Programming, Business Domain

Note: This section is still being written…

In this section, I am going to build the foundation you need to grasp before looking at the components of a data and AI platform.

We will start from scratch. First we will cover the basic concepts of data and AI and how these fields are connected; then we will focus on the core concepts of technology and the business domain.

1 Data Concepts

Topics Covered: Data, Data Vs Information, DIKW Pyramid, Different Aspects of Data (Formats, Scope, Biases), Structured, Semi-structured and Unstructured Data, Data Usage (Scientific Research, Business Management, Finance, Governance), Data Analysis

Note: This lesson is complete.

2 Technology Concepts

Topics Covered: Technology, Information Technology, Data Structures and Algorithms, Data Processing and Storage, Data Models, Operational & Analytical Data, Databases, Data Warehouses, Streaming and Batch Data, ETL/ELT

Note: This lesson is under development.

3 AI Concepts

Topics Covered: Intelligence, Intelligent Agents, Applications (Web Search, Recommendation Systems, Self-driving Cars, Strategic Games), Aspects of AI (Search, Knowledge, Uncertainty, Optimization, Learning, Neural Networks, Language), Strong and Weak AI

4 From Data To AI

Topics Covered: Business Intelligence, Data Science, Machine Learning, Deep Learning, Artificial Intelligence

5 Cloud Computing

Topics Covered: Introduction, Public, Private and Hybrid Clouds, IaaS, PaaS and SaaS, Data and AI on Cloud, AWS, Azure and GCP

6 Business Domain

Topics Covered: Problem Solving, Problem Identification, Problem Definition, Prioritization, Root-Cause Analysis, Possible Solutions, Solution Evaluation, Cost-Benefit Analysis, Planning and Implementation

III Data and AI Components

Topics Covered: Data Governance, Data Architecture, Data Ingestion, Data Storage, Data Engineering, Data Science, Data Visualization, Data Operationalization

7 Data Governance 

Topics Covered: Data Governance Basics, Why Is Data Governance Important?, Aspects of Data Governance, How to Do Data Governance

8 Data Architecture

Topics Covered: Data Architecture Basics, Why Is Data Architecture Required?, How to Build a Data Architecture

9 Data Ingestion

Topics Covered: Data Ingestion Basics, Types of Data Ingestion, Tools for Data Ingestion

10 Data Storage

Topics Covered: Data Storage Basics, Types of Data Storage, Tools for Data Storage

11 Data Engineering

Topics Covered: Data Engineering Basics, Tools for Data Engineering, Building Data Pipelines

12 Data Science

Topics Covered: Data Science Basics, Overall Process, Algorithms, Tools for Data Science

13 Data Visualization

Topics Covered: Data Visualization Basics, Why Is Data Visualization Important?, Tools for Data Visualization

14 Data Operationalization

Topics Covered: Operationalization Basics, Why Is Operationalization Required?, Tools for Data and AI Operationalization

IV Data and AI Platforms

Topics Covered: Open Source, AWS, Azure, GCP, Databricks, Snowflake

15 Open Source

Topics Covered: Building Data and AI Platform in Open Source

16 AWS

Topics Covered: Building Data and AI Platform in AWS

17 Azure

Topics Covered: Building Data and AI Platform in Azure

18 GCP

Topics Covered: Building Data and AI Platform in GCP

19 Databricks

Topics Covered: Building Data and AI Platform in Databricks

20 Snowflake

Topics Covered: Building Data and AI Platform in Snowflake

V Appendix

Topics Covered: SQL, Python, UNIX and Shell Scripting, Data Structures and Algorithms

A Linear Algebra

Topics Covered: Scalars, Vectors, Matrices and Tensors, Multiplying Matrices and Vectors, Identity and Inverse Matrices, Linear Dependence and Span, Norms, Special Kinds of Matrices and Vectors, Eigendecomposition, Singular Value Decomposition (SVD), The Moore-Penrose Pseudoinverse, The Trace Operator, The Determinant, Principal Component Analysis

B Multivariate Calculus

Topics Covered: Functions, Derivatives, Product Rule, Chain Rule, Integrals, Partial Derivatives, The Gradient, The Jacobian, The Hessian, Multivariate Chain Rule, Approximate Functions, Power Series, Linearization, Multivariate Taylor

C Probability

Topics Covered: Probability, Conditional Probability, Random Variables, Probability Distributions

D Statistics

Topics Covered: Statistics, Descriptive Statistics (Univariate, Bivariate, Multivariate Analysis, Function Models), Inferential Statistics (Sampling Distributions & Estimation, Hypothesis Testing, Correlation, Causation & Regression), Bayesian Statistics (Frequentist Vs Bayesian Statistics, Bayesian Inference, Test for Significance), Statistical Learning (Prediction & Inference, Parametric & Non-parametric methods, Prediction Accuracy and Model Interpretability, Bias-Variance Trade-Off)

E Operating System Basics

Topics Covered:

F Data Structures and Algorithms Basics

Topics Covered: Data Structures (Array, Linked List, Stack, Queue, Heap, Hashing, Binary Tree, Binary Search Tree, Graph, Matrix), Algorithms (Asymptotic Analysis, Searching and Sorting, Greedy Algorithms, Recursion, Dynamic Programming)

G Programming Basics

Topics Covered:

H Database Systems Basics

Topics Covered:

I SQL

Topics Covered: SQL, Data Models, ER Diagrams, Tables, Temporary Tables, Selecting (SELECT, FROM, DISTINCT), Filtering (WHERE, AND, OR, IN, NOT, BETWEEN, NULLs, Wildcards), Ordering (ORDER BY, DESC), Aggregating (GROUP BY, HAVING, AVG, COUNT, MAX, MIN), Subqueries, Joins (Cartesian, Inner, Left/Right Outer, Self), Sets (UNION, UNION ALL, INTERSECT), Aliases, Views, Common Table Expressions (WITH ... AS)

J Python

Topics Covered: Programming, Installation, Basic Syntax & Variable Types, Data Types and Conversion, Basic Operators and Loops, Functions, Exceptions and Modules, Data Science Specific Modules (NumPy, SciPy, pandas, Matplotlib, scikit-learn)

K UNIX and Shell Scripting

Topics Covered: Operating System, Architecture, Basic UNIX Commands, Shell Scripting