Autoencoders aren’t too useful in practice, but they can be used to denoise images quite successfully just by training the network on noisy images. We can generate noisy images by adding Gaussian noise to the training images, then clipping the values to be between 0 and 1.
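That noise-then-clip step can be sketched in a few lines of NumPy; the `noise_factor` value and the zero "images" batch here are arbitrary stand-ins, not part of any particular training pipeline:

```python
import numpy as np

def add_noise(images, noise_factor=0.5, seed=0):
    """Add Gaussian noise to images in [0, 1], then clip back to [0, 1]."""
    rng = np.random.default_rng(seed)
    noisy = images + noise_factor * rng.standard_normal(images.shape)
    return np.clip(noisy, 0.0, 1.0)

clean = np.zeros((2, 8, 8))  # stand-in batch of "images"
noisy = add_noise(clean)
```

The noisy versions become the network's inputs while the clean originals remain the reconstruction targets.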
“Denoising auto-encoder forces the hidden layer to extract more robust features and restrict it from merely learning the identity. Autoencoder reconstructs the input from a corrupted version of it.”
A denoising auto-encoder does two things:
You Only Look Once (YOLO) is a real-time object detection algorithm that avoids spending too much time on generating region proposals. Instead of locating objects perfectly, it prioritises speed and recognition.
Architectures like Faster R-CNN are accurate, but the model itself is quite complex, with multiple outputs that are each a potential source of error. Even once trained, they're still not fast enough to run in real time.
Consider a self-driving car that sees this image of a street. It's essential for a self-driving car to be able to detect the location of objects all around it, such as pedestrians, cars, and traffic…
So let’s get started!
To understand the Kalman filter, we need to start with the basics. In Kalman filters, the distribution is given by what's called a Gaussian.
What is a Gaussian though?
A Gaussian is a continuous function over the space of locations, and the area underneath it integrates to 1.
The Gaussian is defined by two parameters: the mean, often denoted by the Greek letter mu, and the width of the Gaussian, called the variance (sigma squared). …
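To make those two parameters concrete, here is a minimal sketch of the 1-D Gaussian density, plus a quick numerical check that the area under it really is 1 (the grid bounds are arbitrary choices for the check):

```python
import numpy as np

def gaussian(x, mu, sigma2):
    """1-D Gaussian with mean mu and variance sigma2 (sigma squared)."""
    return np.exp(-(x - mu) ** 2 / (2.0 * sigma2)) / np.sqrt(2.0 * np.pi * sigma2)

x = np.linspace(-10.0, 10.0, 20001)
area = gaussian(x, 0.0, 1.0).sum() * (x[1] - x[0])  # numerically ~1
```

Shifting `mu` slides the peak along the x-axis; growing `sigma2` widens and flattens the curve while the area stays 1.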
Facial key-points are relevant for a variety of tasks, such as face filters, emotion recognition, pose recognition, and so on. So if you’re onto these projects, keep reading!
In this project, facial key-points (also called facial landmarks) are the small magenta dots shown on each of the faces in the image below. In each training and test image, there is a single face and 68 key-points, with (x, y) coordinates, for that face. These key-points mark important areas of the face: the eyes, corners of the mouth, the nose, etc.
Dataset used: We'll be using the YouTube Faces Dataset, which includes videos…
Classify beer bottles on the go!
I came across a beer label classification competition and thought, why not give ORB, SURF, and SIFT a try to see which one of them performs best.
So here's my attempt at classifying beer labels. I have to admit that scraping for the dataset made me familiar with hundreds of beer companies I never knew existed.
Okay, let’s jump right into coding!!
You’ll need “query” (test images) and “database” (training images). …
I know installing CUDA separately is truly bothersome, and setting the CUDA path is especially troublesome for a user who is not so familiar with Linux.
Before proceeding, I suggest you back up your data, because things can go south given the complex nature of the Linux graphics stack.
sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt-get update
sudo apt-get install nvidia-driver-410
Now reboot to apply the changes.
Run "nvidia-smi" (without the quotes), which will show the correct driver version, 410.xx, like this:
Notice that installing…
Let’s learn by connecting theory to code!
Now, as per the Deep Learning Book, an autoencoder is a neural network that is trained to attempt to copy its input to its output. Internally, it has a hidden layer that describes a code used to represent the input. The network may be viewed as consisting of two parts: an encoder function "h = f(x)" and a decoder that produces a reconstruction "r = g(h)".
Okay okay, I know what you’re thinking! Just another post with no proper explanation? No! That’s not how we’ll proceed here. Let’s take a breath & connect our theoretical knowledge to…
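To ground that encoder/decoder split, here is a minimal PyTorch sketch; the layer sizes (784 inputs, a 32-dimensional code) are arbitrary assumptions for a flattened 28x28 image, not values from any particular post:

```python
import torch
from torch import nn

class Autoencoder(nn.Module):
    """Minimal encoder/decoder pair: h = f(x), r = g(h)."""
    def __init__(self, n_inputs=784, n_code=32):  # sizes are arbitrary assumptions
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_inputs, n_code), nn.ReLU())
        self.decoder = nn.Sequential(nn.Linear(n_code, n_inputs), nn.Sigmoid())

    def forward(self, x):
        return self.decoder(self.encoder(x))  # r = g(f(x))

model = Autoencoder()
x = torch.rand(4, 784)  # fake batch of flattened 28x28 images
r = model(x)            # reconstruction, same shape as the input
```

The hidden code being much smaller than the input is what forces the network to learn a compressed representation rather than the identity.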
Custom data loading startled me too when I first started my computer vision journey. Now that I think about it, a few missing details here and there can make a lot of difference in understanding, which obviously reflects in the code. So, I decided to curate a little post summarizing what I've learned, and if it helps you even in the tiniest way possible, I'll consider this post a win.
So, let’s get into it then.
I'll be building a multilabel classifier in PyTorch. The dataset used here is from http://people.duke.edu/~sf59/Chiu_BOE_2014_dataset.htm
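The core of custom loading is subclassing `torch.utils.data.Dataset`. Here is a minimal sketch; the `MultiLabelDataset` name and the in-memory toy samples are hypothetical placeholders, not the actual loading code for the Duke dataset linked above:

```python
import torch
from torch.utils.data import Dataset, DataLoader

class MultiLabelDataset(Dataset):
    """Hypothetical multilabel dataset: each sample is (image, label_vector)."""
    def __init__(self, samples):
        self.samples = samples  # list of (image tensor, float label vector)

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        image, labels = self.samples[idx]
        return image, labels

# In-memory toy samples stand in for images read from disk.
samples = [(torch.rand(1, 64, 64), torch.tensor([1.0, 0.0, 1.0])) for _ in range(8)]
loader = DataLoader(MultiLabelDataset(samples), batch_size=4, shuffle=False)
images, labels = next(iter(loader))
```

For a real dataset, `__init__` would collect file paths, and `__getitem__` would read and transform one image on demand; the float label vector is what a multilabel loss like `BCEWithLogitsLoss` expects.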
In this blog post, I'm going to show you how you can use the SCP command to copy your files or directories from your local machine to a remote server. So let's say I have a remote server (shown below):
This one is my local machine,
because Attention Is All You Need, literally!
"One important property of human perception is that one does not tend to process a whole scene in its entirety at once. Instead, humans focus attention selectively on parts of the visual space to acquire information when and where it is needed and combine information from different fixations over time to build up an internal representation of the scene, guiding future eye movements and decision making" — Recurrent Models of Visual Attention, 2014
In this post, I will show you how attention is implemented. The main focus would be on implementing attention in isolation…
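As a taste of attention in isolation, here is a minimal NumPy sketch of scaled dot-product attention, the building block from "Attention Is All You Need"; the query/key/value shapes are arbitrary stand-ins:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    weights = softmax(Q @ K.T / np.sqrt(d_k))
    return weights @ V, weights

rng = np.random.default_rng(0)
Q = rng.standard_normal((2, 8))  # 2 queries of dimension 8
K = rng.standard_normal((5, 8))  # 5 keys, same dimension as queries
V = rng.standard_normal((5, 4))  # 5 values of dimension 4
out, weights = attention(Q, K, V)
```

Each output row is a weighted average of the value rows, with the weights for every query summing to 1.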
A Machine Learning Research scholar who loves to moonlight as a blogger.