Contents

CS480/680 Intro to Machine Learning

Lecture 12

Gaussian process

  • an infinite-dimensional Gaussian distribution: any finite collection of function values is jointly Gaussian
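
A minimal sketch of the "infinite-dimensional Gaussian" view, assuming a zero mean and an RBF kernel (both my choices, not from the notes): we only ever evaluate the process at finitely many points, whose values are jointly Gaussian.

```python
import numpy as np

def rbf_kernel(x1, x2, length_scale=1.0):
    # squared-exponential kernel k(a, b) = exp(-(a - b)^2 / (2 * length_scale^2))
    d = x1[:, None] - x2[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

x = np.linspace(-3, 3, 100)                      # any finite set of inputs
K = rbf_kernel(x, x) + 1e-8 * np.eye(len(x))     # small jitter for numerical stability
# the corresponding function values are jointly Gaussian:
samples = np.random.multivariate_normal(np.zeros(len(x)), K, size=3)
```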

Lecture 16

Convolutional NN

  • a rule of thumb: stacking many layers with small filters is better than one layer with a big filter, since going deep captures better features and also uses fewer parameters
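
A small worked example of the parameter argument; the channel count C = 64 is an illustrative assumption. Two stacked 3x3 convolutions cover the same 5x5 receptive field as a single 5x5 convolution but use fewer weights.

```python
# illustrative parameter count (bias terms ignored); C = 64 channels is an assumption
C = 64
two_3x3_layers = 2 * (3 * 3 * C * C)   # two stacked 3x3 convs: 73,728 weights
one_5x5_layer = 5 * 5 * C * C          # one 5x5 conv, same receptive field: 102,400 weights
print(two_3x3_layers, one_5x5_layer)
```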

Residual Networks

  • even after using ReLU, NNs can still suffer from vanishing gradients
  • the idea is to add skip connections so that we create shorter paths for the gradient
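
A minimal sketch of a skip connection, assuming PyTorch and a simple fully connected block (the lecture's exact block structure may differ):

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.fc1 = nn.Linear(dim, dim)
        self.fc2 = nn.Linear(dim, dim)

    def forward(self, x):
        # output = x + F(x): the identity path gives gradients a shorter route back
        return x + self.fc2(torch.relu(self.fc1(x)))
```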

Lecture 18

LSTM vs GRU vs Attention

  • LSTM: 3 gates, a forget gate for the cell state, an input gate, and an output gate
  • GRU: only two gates, a reset gate and an update gate that weights the contributions of the new input and the previous hidden state
    • uses fewer parameters than an LSTM
  • Attention: at every step of producing the output, create a new context vector that gives more attention to the input tokens that are important for this output token
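
A minimal numpy sketch of one attention step; the scaled dot-product form is my assumption. It builds the context vector for a single output token:

```python
import numpy as np

def attention_step(query, keys, values):
    # query: (d,), keys and values: (n_input_tokens, d)
    scores = keys @ query / np.sqrt(query.shape[0])   # similarity to each input token
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                          # softmax: how much attention each token gets
    return weights @ values                           # context vector for this output step
```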

Lecture 20

Autoencoder

takes an input, compresses it into a latent code, and tries to reconstruct the same input at the output (see the sketch after the list below)

used in:

  • compression
  • denoising
  • sparse representation
  • data generation
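
A minimal sketch, assuming PyTorch and arbitrary layer sizes (the 784-dimensional input and 32-dimensional code are illustrative):

```python
import torch.nn as nn

input_dim, code_dim = 784, 32
encoder = nn.Sequential(nn.Linear(input_dim, code_dim), nn.ReLU())
decoder = nn.Sequential(nn.Linear(code_dim, input_dim), nn.Sigmoid())
autoencoder = nn.Sequential(encoder, decoder)
# trained to reproduce its own input, e.g. loss = nn.MSELoss()(autoencoder(x), x)
```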

Lecture 21

Generative models

Variational autoencoders

  • idea: train the encoder to output a distribution over the latent space that stays close to a fixed prior (e.g. a standard Gaussian)
  • at generation time we sample from that fixed distribution; since it is close to the encoder's distribution, the decoder produces outputs similar to the training inputs, but not identical
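
A minimal sketch of the sampling step (the reparameterization trick), assuming PyTorch and an encoder that outputs mu and log_var; variable names are mine:

```python
import torch

def sample_latent(mu, log_var):
    std = torch.exp(0.5 * log_var)
    eps = torch.randn_like(std)     # noise drawn from the fixed prior N(0, I)
    return mu + eps * std           # a differentiable sample from the encoder's distribution

# training adds a KL term pushing the encoder's N(mu, std) toward the fixed prior N(0, I):
# kl = -0.5 * torch.sum(1 + log_var - mu ** 2 - log_var.exp())
```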

GANs:

Lecture 22

Ensemble Learning:

  • the idea is to combine the hypotheses of several models to produce a better one

Bagging

   - train each hypothesis on a bootstrap sample of the training data, then choose the class with the majority of votes
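
A minimal sketch, assuming numpy arrays, scikit-learn decision trees as the base models, and integer class labels (all assumptions, not from the lecture):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def bagging_predict(X_train, y_train, X_test, n_models=10, seed=0):
    rng = np.random.default_rng(seed)
    votes = []
    for _ in range(n_models):
        idx = rng.integers(0, len(X_train), size=len(X_train))        # bootstrap sample
        model = DecisionTreeClassifier().fit(X_train[idx], y_train[idx])
        votes.append(model.predict(X_test))
    votes = np.array(votes)
    # majority vote across models for each test point
    return np.array([np.bincount(col).argmax() for col in votes.T])
```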

Weighted majority

- decrease the weight of correlated hypotheses
- increase the weight of good hypotheses
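
A minimal sketch of the weighted vote itself; how the weights are set (higher for accurate models, lower for correlated ones) follows the rules above, and the shapes are assumptions:

```python
import numpy as np

def weighted_vote(predictions, weights, n_classes):
    # predictions: (n_models, n_points) integer labels; weights: one weight per model
    scores = np.zeros((n_classes, predictions.shape[1]))
    for preds, w in zip(predictions, weights):
        scores[preds, np.arange(predictions.shape[1])] += w   # add each model's weight to its chosen class
    return scores.argmax(axis=0)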

Boosting

- idea: when an instance is misclassified by a hypothesis, increase its weight so that the next hypothesis is more likely to classify it correctly

- can boost a weak learner

- maintains a weighted training set, so that it can focus on misclassified examples

- at the end, combine the hypotheses with weights based on the accuracy of each hypothesis (see the AdaBoost-style sketch after the advantages list)

- Advantages:
    - no need to learn a perfect hypothesis
    - can boost any weak learning algorithm
    - boosting is very simple
    - has good generalization
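
A minimal AdaBoost-style sketch, assuming decision stumps as the weak learner and labels in {-1, +1} (both assumptions; the lecture's exact boosting variant may differ):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def boost(X, y, n_rounds=10):
    n = len(X)
    w = np.ones(n) / n                                    # weighted training set, initially uniform
    hypotheses, alphas = [], []
    for _ in range(n_rounds):
        h = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=w)
        pred = h.predict(X)
        err = w[pred != y].sum()
        alpha = 0.5 * np.log((1 - err) / (err + 1e-10))   # hypothesis weight from its accuracy
        w *= np.exp(-alpha * y * pred)                    # raise weights of misclassified examples
        w /= w.sum()
        hypotheses.append(h)
        alphas.append(alpha)
    return hypotheses, alphas

def boosted_predict(hypotheses, alphas, X):
    return np.sign(sum(a * h.predict(X) for h, a in zip(hypotheses, alphas)))
```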

Netflix challenge 2006

Lecture 23

Normalizing flows

Lecture 24

Gradient boosting

  • boosting for regression: each new model is fit to the residuals (the negative gradient of the loss) of the current ensemble
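
A minimal sketch for squared-error regression, assuming shallow scikit-learn regression trees and a fixed learning rate (both assumptions):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def gradient_boost(X, y, n_rounds=50, lr=0.1):
    base = y.mean()
    pred = np.full(len(y), base)          # start from a constant model
    trees = []
    for _ in range(n_rounds):
        residuals = y - pred              # negative gradient of the squared loss
        tree = DecisionTreeRegressor(max_depth=2).fit(X, residuals)
        pred += lr * tree.predict(X)      # each new tree fits the current residuals
        trees.append(tree)
    return base, trees

def gb_predict(base, trees, X, lr=0.1):
    return base + lr * sum(t.predict(X) for t in trees)
```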