---
title: "Comprehensive Technical Machine Learning Topics"
visibility: public
---

# Comprehensive Technical Machine Learning Topics

Category: [[technical|Technical]]

[Read the original document](https://docs.google.com/document/d/1pVkFRu2GtM7kUMchfrAAJJgJ1NW-jYJljUGOjhPty-k/edit?usp=sharing&sa=D&ust=1596495076373000&usg=AOvVaw1N0SnxMp4k0L3ym5TUVxIU)

---

Machine Learning

1. Algorithms
   1. Linear Regression
      1. Derivation of Normal Form Equations / Gradient
      2. L1 / L2 / Elastic Net Regularization
      3. Bayesian Linear Regression
   2. Logistic Regression
      1. Derivation from probability theory
      2. Derivation of Gradient, Hessian
      3. Multiclass vs. Binary logistic regression
      4. L1 / L2 / Elastic Net Regularization
      5. Bayesian Logistic Regression
         1. Laplace Approximation
   3. Discriminant Functions
      1. Linear Discriminant Analysis
      2. Fisher’s linear discriminant
      3. The Perceptron Algorithm
   4. Neural Networks
      1. Chain rule, Backprop derivation
      2. Feedforward Neural Networks
         1. Layer Types
         2. Activations
         3. Softmax
      3. Convolutional Neural Networks
         1. Convolution Operation
         2. Pooling
         3. Assumptions & Properties
            1. Invariance to location of features (Equivariant to translation)
            2. Weight Sharing for efficiency
            3. Sparse weights / Sparse connectivity
               1. Data Locality (Infinitely strong prior on local interactions mattering)
      4. Recurrent Neural Networks
         1. Structure
      5. Word2Vec / Embeddings
      6. Autoencoders
      7. Batch Normalization
      8. Regularization
         1. Early Stopping
         2. Weight Decay
         3. Dropout
         4. Data Augmentation
      9. Bayesian Neural Networks
   5. Decision Trees
      1. Random Forests
         1. Properties around bias-variance tradeoff
         2. Ensemble Modeling
         3. Extremely Randomized Trees (Extra-Trees)
      2. Gradient Boosting
         1. Nature of Boosting
      3. Feature Importance
      4. Regularization
         1. Pruning
         2. Bootstrap
         3. Randomness
      5. Decision Trees for Regression
   6. Maximum Margin Classifiers
      1. Support Vector Machine
         1. Kernel Trick
         2. For Regression
         3. Optimizers
            1. Quadratic Programming
            2. Pegasos
   7. Naive Bayes
   8. KNN
   9. Splines and Piecewise Polynomials
      1. MARS
   10. Principal Component Analysis
      1. Maximum variance formulation
      2. Minimum error formulation
      3. Probabilistic PCA
      4. Kernel PCA
   11. Clustering
      1. t
   12. Generative vs. Discriminative Models
   13. Gaussian Processes
   14. Variational Inference
   15. MCMC
      1. Markov Chains
      2. Metropolis-Hastings
      3. Gibbs Sampling
   16. Graphical Models
      1. Bayesian Networks
      2. Markov Random Fields
      3. Inference over Graphical Models
   17. Optimizers
      1. Gradient Descent
         1. Momentum
            1. Nesterov Momentum
         2. RMSProp
         3. Adadelta
         4. Adagrad
         5. Adam
      2. Newton’s Method
         1. Newton-Raphson
         2. Trust Region Methods
            1. Trust Newton
      3. BFGS
         1. L-BFGS
      4. Iteratively Reweighted Least Squares
      5. Conjugate Gradients
      6. Coordinate Descent
      7. Line Search
      8. Expectation Maximization
         1. Lloyd’s Algorithm
      9. Evolutionary Strategies
      10. Finite Differences
      11. Convex Optimization
         1. Linear Programming
            1. Simplex Method
               1. Branch, Bound and Cut
            2. Interior-Point Methods
         2. Quadratic Programming
            1. Karush-Kuhn-Tucker (KKT) System
      12. Forward-Backward Algorithm
      13. Matrix Factorization
         1. LU Decomposition
         2. QR Factorization
         3. Cholesky Decomposition
         4. Singular Value Decomposition
         5. Non-Negative Matrix Factorization?
      14. MCMC
         1. Metropolis-Hastings
         2. Gibbs Sampling
      15. CART
      16. Constrained Optimization
         1. Lagrange Multiplier Constrained Optimization
         2. Penalty Methods
      17. Automatic Differentiation
      18. Closed Form Solutions
         1. Normal Form Equations
2. Evaluation
   1. Loss Functions
      1. KL Divergence
      2. Cross Entropy
         1. Negative Log Likelihood
   2. Validation
      1. Cross Validation
         1. Leave one out
         2. Situations where valuable
      2. Proper validation methodology
   3. Regression
      1. RMSE
      2. MAE
      3. Median
      4. R²
      5. Visualization (Especially of large errors)
   4. Classification
      1. ROC Curve
      2. Confusion Matrix
      3. F1 Score
      4. Heat Map
      5. Overall accuracy rate
      6. Kappa Statistic
      7. Sensitivity
      8. Specificity
      9. AUC 
      10. Visualization (Especially of errors)
3. Testing
   1. t-test
   2. F-test
   3. A/B Testing
4. Conceptual
   1. Bias-Variance Tradeoff
   2. Curse of Dimensionality
   3. Parametric vs. Non-Parametric Algorithms
   4. Softmax and its properties
5. Classification Class Imbalance
   1. Model Tuning (Tune Parameters For Sensitivity)
   2. Alternate Cutoffs (Using ROC Curve)
   3. Adjusting Prior Probability
   4. Unequal Class Weights
   5. Down Sampling
   6. Up Sampling
   7. Alter Cost Function
   8. Dynamic Structure (Cascade of classifiers)
6. Hyperparameter Tuning
   1. Cross Validation
   2. Bootstrap
   3. Grid Search
   4. Random Search
   5. Bayesian Optimization
7. Procedural Data Science
   1. Preprocessing
   2. Exploratory Data Analysis
   3. Feature Evaluation
      1. Coefficients in Linear Models
      2. Random Forest Importances (variance for regression, information gain for classification)
      3. Pearson Correlation with Outcome
      4. Maximal Information Coefficient (MIC)
      5. Distance Correlation (code)
      6. Model with/without feature
      7. Randomly shuffle the feature between data points, check difference in model quality
      8. Lasso Automatic Selection
      9. Mean Decrease Accuracy (code)
      10. Stability Selection
      11. Recursive Feature Elimination

Computer Science

1. Data Structures
   1. Hash Table
   2. Linked List
   3. Graphs
      1. Adjacency List
      2. Adjacency Matrix
      3. Pointers and Objects
   4. Heap
      1. Fibonacci Heaps
      2. Priority Queue
   5. Binary Tree
   6. Binary Search Tree
      1. Balanced Binary Search Tree
         1. Red Black Trees
         2. AVL Trees
   7. Queue
   8. Stack 
   9. Deque
   10. Arrays
   11. Disjoint Set
2. Algorithms (Non-ML)
   1. Graph Algorithms
      1. Shortest Path
         1. Dijkstra
         2. Floyd-Warshall
         3. Bellman-Ford
         4. A*
      2. Search
         1. BFS
         2. DFS
            1. Topological Sort
            2. Strongly Connected Components
      3. Minimum Spanning Tree
         1. Kruskal
         2. Prim
      4. Min-Cut / Max-Flow
         1. Ford-Fulkerson
         2. Maximum bipartite matching
   2. Recursion
      1. Fibonacci
   3. Dynamic Programming / Recursion
      1. Knapsack
      2. Traveling Salesman Problem
      3. Longest Common Subsequence
      4. Rod Cutting
      5. Matrix-chain multiplication
      6. Optimal Binary Search Trees
   4. Divide and Conquer
      1. Maximum-subarray
      2. Strassen's Algorithm
   5. Greedy
      1. Huffman Codes
      2. Matroids
         1. Task-scheduling
   6. Sorting
      1. O(n log(n))
         1. MergeSort
         2. Quicksort
         3. Heapsort
      2. O(n)
         1. Radix Sort
         2. Bucket Sort
      3. O(n²)
         1. Insertion Sort
         2. Selection Sort
         3. Bubble Sort
         4. Shell Sort
   7. Multithreaded
      1. Multithreaded Matrix Multiplication
      2. Multithreaded MergeSort
   8. Linear Programming
      1. Simplex
      2. Branch, Bound and Cut
   9. Fast Fourier Transform
   10. String Matching
      1. Rabin-Karp
      2. Knuth-Morris-Pratt
   11. NP Completeness
3. Testing
4. Programming Languages
   1. Functions
      1. Passing arguments by value / reference
      2. Main: Handling command-line options
      3. Return types and the return statement
      4. Overloading (Differences in the input parameters determine the function called)
      5. Polymorphism (Different behavior depending on class / type)
      6. Default Arguments
   2. Types
   3. Variables
      1. Val vs. Var
   4. Expressions
      1. Order of Evaluation
      2. Logical and Relational Operators 
      3. Assignment
      4. Increment / Decrement Operators
      5. Conditionals
      6. Type Conversions
      7. Implicit / Explicit Conversions
   5. Scope
   6. Constants
   7. Pointers, Arrays, References
   8. Compilation
   9. Namespaces
   10. Error Handling
   11. Regular Expressions
   12. Iterators
   13. Predicates
   14. Resource Management
   15. Garbage Collection vs. Reference Counting 
5. Concurrency
   1. Tasks and Threads
   2. Passing Arguments
   3. Sharing Data
   4. Waiting for Events
   5. Communication Tasks
6. Object Oriented Programming
   1. Classes (C++)
      1. Concrete Types
      2. Abstract Types
      3. Virtual Functions (Polymorphism)
      4. Class Hierarchies
      5. Copy and Move
      6. Constructor
   2. Objects (Instances of classes, determining their type)
   3. Mixins
   4. Inheritance
   5. Data structure framing of programming rather than logic / action based framing
   6. Immutable State
7. Distributed Computing
   1. Map-Reduce
   2. In-Memory Compute
   3. How to parallelize algorithms
8. Memory Workings & Optimization
   1. Pointers
   2. Bits
      1. Bit Manipulation
9. System Design

Mathematics

1. Differentiation
   1. Limits & Limit Rule
   2. Partial Differentiation
      1. Chain Rule over several variables
   3. Chain Rule
   4. Product Rule
   5. Quotient Rule
   6. Logarithmic Differentiation
   7. Gradient Computations
   8. Jacobian
   9. Hessian
   10. Newton’s Method
   11. Convexity
   12. Critical Points
   13. Lagrange Multipliers
2. Integration
   1. U-substitution (inverse chain rule)
   2. Integration by parts (inverse product rule)
   3. Multiple Integration
3. Functions
   1. Exponential Functions
      1. Exponential Manipulation Rules
   2. Logarithm Functions
      1. Logarithm Manipulation Rules
   3. Series
      1. Convergence / Divergence
      2. Special Series
      3. Power Series
      4. Taylor Series
   4. Functions of Several Variables
      1. Vector Functions
      2. Calculus over Vector Functions
4. Sequences and Series
   1. Taylor Series Approximation
   2. Summation Manipulation
5. Linear Algebra
   1. Vector Norms
   2. Projection
   3. Important Matrices
      1. Diagonal Matrices
      2. Positive Semi-definite Matrices
      3. Conjugate Matrices
      4. Triangular matrices
      5. Symmetric Matrices
      6. Orthogonal Matrices
   4. Inversion
   5. Trace
   6. Matrix Factorization
      1. LU Decomposition
      2. QR Factorization
      3. Cholesky Decomposition
      4. Singular Value Decomposition
      5. Non-Negative Matrix Factorization
   7. Gram-Schmidt
   8. Matrix Multiplication
   9. Vector Spaces
   10. Linear Independence
   11. Basis
   12. Linear Transformations
   13. Determinants
   14. Eigenvalues
   15. Eigenvectors
   16. Positive Definiteness, tests
   17. Pseudoinverses
   18. Cross Product

Probability Theory

1. Expectation, Mean, Variance
2. Conditional Probability
   1. Bayes Rule
3. Sum and Product Rule
4. Independence
5. Covariance, Correlation
6. Probability Mass Function, Probability Density Function, Cumulative Distribution Function
7. Distributions
   1. Discrete (Probability Masses)
      1. Binomial
      2. Bernoulli
      3. Multinomial
      4. Poisson
   2. Continuous (Probability Densities)
      1. Gaussian
         1. Conditional Gaussian
         2. Marginal Gaussian
         3. Mixtures of Gaussians
      2. Student’s t-distribution
      3. Beta
   3. Exponential Family
      1. Maximum likelihood for the exponential family
      2. Conjugate priors
      3. Noninformative priors
8. Information Theory
   1. Cross Entropy
   2. Mutual Information
9. Limit Theorems
   1. Weak Law of Large Numbers
   2. Strong Law of Large Numbers
   3. Central Limit Theorem
10. Bayesian Statistical Inference
   1. Bayesian inference and the posterior distribution
   2. Point Estimation
   3. Hypothesis Testing
   4. Maximum a Posteriori (MAP) Rule
11. Classical Statistical Inference
   1. Binary Hypothesis Testing
   2. Significance Testing
12. Moment Generating Functions
