Transformers

To deal with sequential data we have two options:
1-D convolutional NN:
- processing can be parallel
- not practical for long sequences

Recurrent NN:
- can't run in parallel
- has the vanishing-gradient problem if the sequence becomes too long
- we have a bottleneck at the end of the encoder

RNN with attention mechanism:
- to solve the bottleneck problem, we add encoder-decoder attention
- the decoder utilizes a context vector: a weighted sum of the hidden states (h1, h2, ...) from the encoder

Transformer encoder:
- first we do input embedding and positional embedding
- for self-attention, we multiply the input by learned matrices to do the linear transformations that produce q, k, v
- self-attention: q * k^T --> scale down --> softmax --> * v
- multi-head attention works like using many filters in a CNN
- in wide attention: every word is passed whole to each head
- in narrow attention: every word's vector is split up across the heads
- but didn't we lose the advantage of using multi-head as multiple perspectives, as we do with filters in CNN?
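The self-attention pipeline above (q * k^T --> scale --> softmax --> * v) can be sketched in NumPy. This is a minimal single-head sketch; the token count, dimensions, and random projection matrices are made-up toy values, not trained weights:

```python
import numpy as np

def softmax(x, axis=-1):
    # subtract the max for numerical stability before exponentiating
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # scores has shape (seq_len, seq_len); scale by sqrt(d_k) before softmax
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    weights = softmax(scores, axis=-1)   # each row sums to 1
    return weights @ V                   # weighted sum of value vectors

# toy example: 4 tokens, model dimension 8
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
# the linear transformations that produce q, k, v (random here, just for shape)
W_q, W_k, W_v = (rng.normal(size=(8, 8)) for _ in range(3))
out = scaled_dot_product_attention(X @ W_q, X @ W_k, X @ W_v)
print(out.shape)  # (4, 8)
```

In narrow (standard) multi-head attention, the same computation runs per head on a d_model/num_heads slice of each projection, and the head outputs are concatenated back together.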
TTS

TTS can be viewed as a sequence-to-sequence mapping problem: from a sequence of discrete symbols (text) to a real-valued time series (speech signals). A typical TTS pipeline has two parts: 1) text analysis and 2) speech synthesis. The text analysis part typically includes a number of natural language processing (NLP) steps, such as sentence segmentation, word segmentation, text normalization, part-of-speech (POS) tagging, and grapheme-to-phoneme (G2P) conversion. It takes a word sequence as input and outputs a phoneme sequence with a variety of linguistic contexts.
Lecture two

P-value: determines whether some numbers have a relationship or are random (whether they are independent or dependent).

Suppose we have the temperature and R (transmissivity) values of 100 cities in China and we want to see if there's a relation between them. We generate many sets of random numbers for each parameter, then we calculate the P-value, which tells us the probability that this slope is random and that there's no relation.

A P-value is the probability of an observed result assuming that the null hypothesis (there's no relation) is true.

PS: the P-value also depends on the size of the set you used, so it does not measure the importance of the result.
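The "generate many random sets and see how often the relation appears by chance" idea above is a permutation test. A minimal sketch, assuming made-up temperature/transmissivity data for 100 hypothetical cities:

```python
import numpy as np

rng = np.random.default_rng(42)

# hypothetical data: temperature and transmissivity for 100 cities
temp = rng.normal(20, 5, size=100)
r = 0.3 * temp + rng.normal(0, 5, size=100)  # a weak relation is built in

observed = abs(np.corrcoef(temp, r)[0, 1])

# permutation test: shuffle one variable to destroy any real relation,
# then count how often random pairings give a correlation this large
n_trials = 10_000
count = 0
for _ in range(n_trials):
    shuffled = rng.permutation(r)
    if abs(np.corrcoef(temp, shuffled)[0, 1]) >= observed:
        count += 1
p_value = count / n_trials
print(p_value)
```

A small p-value means a slope this strong rarely arises from random pairings, so we reject the null hypothesis; with a larger sample, even tiny (unimportant) effects can produce small p-values.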
Lecture 1

```python
import numpy as np

a = np.array([[6, 5, 3, 1],
              [3, 6, 2, 2],
              [3, 4, 3, 1]])
b = np.array([[1.5, 1],
              [2, 2.5],
              [5, 4.5],
              [16, 17]])

for c in (a @ b):
    print(c)
```

Output:

```
[50. 49.]
[58.5 61. ]
[43.5 43.5]
```

Lecture 2

Matrix decomposition: we decompose matrices into smaller ones that have special properties.
Singular Value Decomposition (SVD): it's an exact decomposition, so you can retrieve the original matrix again.

Some SVD applications:
- semantic analysis
- collaborative filtering / recommendation
- data compression
- PCA (principal component analysis)

Non-negative Matrix Factorization (NMF)
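The "exact decomposition" claim can be checked directly with NumPy, reusing the matrix from Lecture 1; truncating the singular values is the step behind the compression and PCA applications:

```python
import numpy as np

A = np.array([[6., 5., 3., 1.],
              [3., 6., 2., 2.],
              [3., 4., 3., 1.]])

# SVD: A = U @ diag(s) @ Vt; it is exact, so A is fully recoverable
U, s, Vt = np.linalg.svd(A, full_matrices=False)
A_rebuilt = U @ np.diag(s) @ Vt
print(np.allclose(A, A_rebuilt))  # True

# rank-1 truncation: keep only the largest singular value
# (this is the lossy approximation used for compression)
A_rank1 = s[0] * np.outer(U[:, 0], Vt[0])
```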
Lecture 1

We learn about the big picture behind matrix-vector multiplication
we learn about the row picture and column picture
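The two pictures can be shown side by side on a small made-up system: the row picture computes each entry as a dot product, the column picture builds the same result as a combination of the matrix's columns:

```python
import numpy as np

A = np.array([[2., 1.],
              [1., 3.]])
x = np.array([4., 5.])

# row picture: each entry of A @ x is a row of A dotted with x
row_pic = np.array([A[0] @ x, A[1] @ x])

# column picture: A @ x is a combination of A's columns, weighted by x
col_pic = x[0] * A[:, 0] + x[1] * A[:, 1]

print(row_pic)  # [13. 19.]
print(col_pic)  # [13. 19.]
```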
Lecture 2

We learned about the elimination method to solve a system of equations
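A minimal sketch of elimination on a made-up 2x2 system (2x + y = 5, x + 3y = 10): forward elimination zeroes out the entry below the pivot, then back substitution recovers the unknowns:

```python
import numpy as np

A = np.array([[2., 1.],
              [1., 3.]])
b = np.array([5., 10.])

# forward elimination: subtract (A[1,0]/A[0,0]) * row 0 from row 1
m = A[1, 0] / A[0, 0]
A[1] -= m * A[0]   # row 1 becomes [0, 2.5]
b[1] -= m * b[0]   # b[1] becomes 7.5

# back substitution
y = b[1] / A[1, 1]                   # y = 3
x = (b[0] - A[0, 1] * y) / A[0, 0]   # x = 1
print(x, y)  # 1.0 3.0
```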
Lecture 3

In this lecture we learned about matrix multiplication:
we can do that in five ways:
- row * column ==> gives an entry (1 cell)
- column * row ==> sum of outer products: (c1 * r1) + (c2 * r2) + ...
- by columns ==> A * c1 = a combination of A's columns
- by rows ==> r1 * B = a combination of B's rows
- by blocks ==> split A into blocks (A1, A2, A3, A4) and B into blocks (B1, B2, B3, B4); then C1 = A1 * B1 + A2 * B3, and so on

Then we learned about Gauss-Jordan elimination to find the matrix inverse
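The first four ways can be verified on a small made-up 2x2 example; each view builds the same product C = A @ B:

```python
import numpy as np

A = np.array([[1., 2.],
              [3., 4.]])
B = np.array([[5., 6.],
              [7., 8.]])

# (1) row * column: entry C[i, j] is row i of A dotted with column j of B
C_entry = np.array([[A[i] @ B[:, j] for j in range(2)] for i in range(2)])

# (2) column * row: A @ B is a sum of outer products (col k of A)(row k of B)
C_outer = sum(np.outer(A[:, k], B[k]) for k in range(2))

# (3) by columns: column j of C is A times column j of B
C_cols = np.column_stack([A @ B[:, j] for j in range(2)])

# (4) by rows: row i of C is row i of A times B
C_rows = np.vstack([A[i] @ B for i in range(2)])

print(np.allclose(C_entry, A @ B))  # True
```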
You and your Research https://www.cs.virginia.edu/~robins/YouAndYourResearch.html
Main Ideas

- Commit to your idea.
- Ask yourself: what are the important problems in my field?
- Communicate with the bright minds.
- Ask the important questions.
- Closed door or open door:
  - when you work with the door closed, you work harder, but you risk working on the wrong thing
  - working with the door open, you get many interruptions, but you also get important clues

My take on this is that we should keep an open mind about what other people are doing, and where the research is heading.