Deep Learning - A Brief Introduction

Ever wondered how YouTube is able to give us relevant recommendations based on our taste in videos or how self-driving cars operate? All of this is possible because of Deep Learning.

Deep learning is a machine learning approach that teaches machines to learn from examples and experience.

It is a technique where machines acquire skills without human intervention.

What is Deep Learning?

Deep learning can be described as a machine learning technique that enables a computer to perform classification tasks directly on images, text, sound, and other data.

Deep learning models are trained on large amounts of data using neural network architectures that contain many layers.

As a result, deep learning models are able to achieve state-of-the-art accuracy, which may exceed human-level performance in some scenarios.

Machine Learning vs Artificial Intelligence vs Deep Learning: Are they all the same?

Artificial Intelligence is a generic term for techniques that enable computers to imitate human intelligence. Machine Learning is a set of algorithms that are trained on data and improve their performance with experience.

Deep Learning, in turn, is a machine learning technique inspired by the structure of the human brain. It uses a multi-layered model architecture called a neural network.

Deep Learning is a subset of Machine Learning, which is a subset of Artificial Intelligence. This can be understood by the Venn diagram given below:

Domain of Deep Learning

Deep Learning models provide accurate results which enable consumer electronics to meet user expectations. Also, accuracy is important in the case of safety-critical products like self-driving cars.

The following are the main reasons why deep learning models are more accurate than ever:

  • Deep learning models are trained using massive amounts of labelled data; a self-driving car model, for example, is trained on huge numbers of images and hours of video.
  • Deep learning requires substantial computing power: high-performance GPUs, powerful clusters, and cloud computing. This hardware greatly reduces the training time of deep learning models.

Another reason these models have gained so much popularity is that they do not require a separate feature extraction step.

Traditional machine learning models such as logistic regression, decision trees, and SVMs cannot be used directly on raw data. They require a separate preprocessing step called feature extraction.

The artificial neural networks used in deep learning, on the other hand, do not need this separate step.

Basic Process of Deep Learning implementation approach

In other words, feature extraction happens automatically inside the artificial neural network itself: the layers learn the features as part of training. A rough sketch of this difference is shown below.
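
To make the contrast concrete, here is a small Python sketch (assuming scikit-learn is installed; the dataset and the hand-crafted features are purely illustrative): a logistic regression model fit on a few manually extracted features, next to a small neural network fit directly on the raw pixel values.

```python
# Classical models need hand-crafted features; a neural network can learn from raw inputs.
# Assumes scikit-learn is installed; the feature choices are illustrative, not a recipe.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

digits = load_digits()                      # 8x8 grayscale digit images
X_raw, y = digits.data, digits.target       # raw pixel intensities (64 values per image)

# Classical approach: manually extract a few summary features per image.
def extract_features(images):
    return np.stack([
        images.mean(axis=1),                # average intensity
        images.std(axis=1),                 # contrast
        (images > 8).sum(axis=1),           # count of bright pixels
    ], axis=1)

X_feat = extract_features(X_raw)

Xr_train, Xr_test, Xf_train, Xf_test, y_train, y_test = train_test_split(
    X_raw, X_feat, y, random_state=0)

# Logistic regression on the hand-crafted features.
clf = LogisticRegression(max_iter=1000).fit(Xf_train, y_train)

# A small neural network trained directly on the raw pixels.
net = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500,
                    random_state=0).fit(Xr_train, y_train)

print("logistic regression (hand-crafted features):", clf.score(Xf_test, y_test))
print("neural network (raw pixels):", net.score(Xr_test, y_test))
```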

How does deep learning work?

Deep Learning models implement Artificial Neural Networks which imitate the way the human brain computes information.

During training, the network uses the structure hidden in the input data to extract features, group similar objects, and discover useful patterns.

This learning happens at multiple levels of the network, which is what makes the final computations so accurate.

No single model is perfect; we need to choose an algorithm depending on the nature of the task to be performed.

A proper understanding of the elementary algorithms is therefore required to choose the right one.

Deep learning algorithms

Deep learning is among the fastest-growing technologies, and implementing it requires familiarity with its main models.

There are two types of models in deep learning: supervised and unsupervised.

Supervised models are trained on a labelled dataset, i.e. the algorithm has an answer key against which it can evaluate its accuracy during training.

In unsupervised models, unlabelled data is used, and the algorithms try to extract features and patterns on their own.

Supervised Models 

Convolutional Neural Networks (CNNs)

The Convolutional Neural Network, or CNN, is built to handle much of the pre-processing and computation of complex data, such as images, within the network itself.

It is an advanced and more powerful variation of the classic artificial neural network, originally developed for image detection and image classification problems.

CNN Deep Learning model demo

When to use the CNNs:

  • While using Image Datasets            
  • OCR document analysis                  
  • When the model requires high complexity in computing the output   
  • When the input data is 2-D but can be transformed to 1-D internally for rapid processing.
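
As a minimal illustration of these points, here is a small CNN sketch using TensorFlow/Keras (assumed to be installed); the 28×28 input shape and the layer sizes are illustrative choices, not a prescribed architecture.

```python
# Minimal CNN sketch for classifying 28x28 grayscale images into 10 classes.
# Assumes TensorFlow/Keras is installed; all layer sizes are illustrative.
import tensorflow as tf
from tensorflow.keras import layers

model = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28, 1)),
    layers.Conv2D(32, (3, 3), activation="relu"),   # learn local image features
    layers.MaxPooling2D((2, 2)),                    # downsample the feature maps
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),                               # 2-D feature maps -> 1-D vector
    layers.Dense(64, activation="relu"),
    layers.Dense(10, activation="softmax"),         # class probabilities
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# model.fit(train_images, train_labels, epochs=5)   # train_images: (N, 28, 28, 1)
```

The Flatten layer is where the 2-D feature maps are turned into a 1-D vector internally, as noted in the last bullet above.
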
Classic Neural Networks (Multilayer Perceptrons)

A classic neural network, or multilayer perceptron, passes its inputs through a series of fully connected layers and adapts to elementary patterns in the data, loosely resembling how a human brain learns.

Multilayer perceptron based Deep Learning model

When to use the Classic Neural Networks:

  • Classification problems where a set of real values is given as input.
  • Tabular datasets in the form of rows and columns, i.e. CSV files.
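
A minimal multilayer perceptron sketch for tabular data, again assuming TensorFlow/Keras; the number of input columns and the layer sizes are placeholders.

```python
# Minimal multilayer perceptron sketch for a tabular (CSV-style) dataset.
# Assumes TensorFlow/Keras is installed; column count and layer sizes are illustrative.
import tensorflow as tf
from tensorflow.keras import layers

n_features = 20                                     # e.g. 20 numeric columns from a CSV file

model = tf.keras.Sequential([
    tf.keras.Input(shape=(n_features,)),
    layers.Dense(64, activation="relu"),            # fully connected hidden layer
    layers.Dense(32, activation="relu"),
    layers.Dense(1, activation="sigmoid"),          # binary classification output
])

model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# model.fit(X_train, y_train, epochs=10)            # X_train: (N, 20) array of real values
```
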
Recurrent Neural Networks (RNNs)

Recurrent Neural Networks, or RNNs, were developed for predicting sequences. LSTM, or Long Short-Term Memory, is a renowned RNN variant with a wide range of use cases.

Recurrent Neural Network

When to use the RNNs:

  • One to one mapping: a single input mapped to a single output, example: Image classification.    
  • One to many mapping: a single input mapped to a sequence of outputs, example: Image captioning i.e. multiple words from a single image.        
  • Many to one mapping: A sequence of inputs produces a single output, example: Sentiment Analysis i.e. binary output from multiple words       
  • Many to many mapping: A sequence of inputs produces a sequence of outputs, example: Video classification i.e. splitting the video into multiple frames and labelling each frame separately.
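
Below is a minimal sketch of the many-to-one case (sentiment analysis), assuming TensorFlow/Keras; the vocabulary size, sequence length, and layer sizes are illustrative.

```python
# Minimal many-to-one RNN sketch: a sequence of word indices in, one sentiment score out.
# Assumes TensorFlow/Keras is installed; vocabulary size and sequence length are illustrative.
import tensorflow as tf
from tensorflow.keras import layers

vocab_size, seq_len = 10000, 100

model = tf.keras.Sequential([
    tf.keras.Input(shape=(seq_len,)),
    layers.Embedding(vocab_size, 32),               # word index -> dense vector
    layers.LSTM(64),                                # reads the sequence, keeping a memory state
    layers.Dense(1, activation="sigmoid"),          # single output: positive / negative
])

model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# model.fit(padded_sequences, sentiment_labels, epochs=3)
```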

Unsupervised Models

Boltzmann Machines

Boltzmann machines, unlike the models above, have no fixed direction of information flow (direction here means input layer → hidden layers → output layer).

Instead, their nodes are connected to one another in a circular, undirected fashion, as in the image below.

model based on Boltzmann Machines

When to use the Boltzmann Machines:

  • While working with a very specific set of data      
  • To build a binary recommendation system             
  • Monitoring a system
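
A practical, trainable variant of this idea is the Restricted Boltzmann Machine (RBM). The sketch below uses scikit-learn's BernoulliRBM on synthetic binary user-item data to suggest how a simple binary recommendation system might begin; the data and hyperparameters are illustrative only.

```python
# Sketch of a Restricted Boltzmann Machine (a trainable Boltzmann-machine variant)
# on binary user-item data, as might be used in a simple recommendation system.
# Assumes scikit-learn is installed; the data and component count are illustrative.
import numpy as np
from sklearn.neural_network import BernoulliRBM

# Binary matrix: rows are users, columns are items (1 = liked, 0 = not liked).
rng = np.random.default_rng(0)
user_item = rng.integers(0, 2, size=(200, 50)).astype(float)

rbm = BernoulliRBM(n_components=16, learning_rate=0.05, n_iter=20, random_state=0)
rbm.fit(user_item)

# The hidden-unit activations act as a compact "taste profile" for each user,
# which can be compared across users to recommend items.
hidden = rbm.transform(user_item)
print(hidden.shape)   # (200, 16)
```
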
Self-Organising Maps (SOMs)

Self-Organising Maps, or SOMs, work on unlabelled data and help reduce the number of random variables in a model (dimensionality reduction).

The output produced by a self-organising map is always two-dimensional.

model of Self-organizing map

When to use the SOMs:

  • When the data does not have an output or Y column
  • Creative projects like music, text, or videos produced by Artificial Intelligence
  • Dimensionality reduction for feature detection
  • Exploratory projects aimed at understanding the framework behind a dataset
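
To show the mechanics, here is a minimal SOM sketch in plain NumPy; the grid size, learning rate, and number of iterations are arbitrary illustrative choices.

```python
# Minimal self-organising map sketch in plain NumPy: maps input vectors onto a 2-D grid.
# Grid size, learning rate, and iteration count are illustrative choices.
import numpy as np

rng = np.random.default_rng(0)
data = rng.random((500, 4))                              # 500 unlabelled samples, 4 features each

grid_w, grid_h, n_iter, lr0, sigma0 = 10, 10, 1000, 0.5, 3.0
weights = rng.random((grid_w, grid_h, data.shape[1]))    # one weight vector per grid node

# Grid coordinates of every node, used for neighbourhood distances.
gx, gy = np.meshgrid(np.arange(grid_w), np.arange(grid_h), indexing="ij")

for t in range(n_iter):
    x = data[rng.integers(len(data))]
    # Best-matching unit: the node whose weights are closest to the sample.
    dists = np.linalg.norm(weights - x, axis=2)
    bi, bj = np.unravel_index(dists.argmin(), dists.shape)
    # Learning rate and neighbourhood radius both shrink over time.
    lr = lr0 * (1 - t / n_iter)
    sigma = sigma0 * (1 - t / n_iter) + 1e-3
    grid_dist2 = (gx - bi) ** 2 + (gy - bj) ** 2
    h = np.exp(-grid_dist2 / (2 * sigma ** 2))[..., None]
    # Pull the weights of nearby nodes towards the sample.
    weights += lr * h * (x - weights)

# Each sample can now be placed at its best-matching node on the 2-D map.
```
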
AutoEncoders

Autoencoders work by automatically encoding the input data into a compressed representation, applying activation functions, and then decoding that representation back into an output that approximates the original input.

Auto encoder

When to use the AutoEncoders:

  • Feature detection or dimensionality reduction        
  • Building powerful recommendation systems       
  • Performing encoding on massive datasets.
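
A minimal autoencoder sketch, assuming TensorFlow/Keras; the 784-dimensional input and 32-dimensional code are illustrative.

```python
# Minimal autoencoder sketch: compress 784-dimensional inputs into a 32-dimensional code
# and reconstruct them. Assumes TensorFlow/Keras; the dimensions are illustrative.
import tensorflow as tf
from tensorflow.keras import layers

input_dim, code_dim = 784, 32

autoencoder = tf.keras.Sequential([
    tf.keras.Input(shape=(input_dim,)),
    layers.Dense(128, activation="relu"),           # encoder
    layers.Dense(code_dim, activation="relu"),      # compressed representation (the "code")
    layers.Dense(128, activation="relu"),           # decoder
    layers.Dense(input_dim, activation="sigmoid"),  # reconstruction of the input
])

# The target is the input itself: the network learns to reproduce what it was given.
autoencoder.compile(optimizer="adam", loss="mse")
# autoencoder.fit(X, X, epochs=10)
```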

Deep learning Applications

Deep Learning, a subset of machine learning, has become a buzzword in the field of artificial intelligence.

It enables computers to learn from past experience and examples, helping them solve complicated problems without human involvement.

What exactly are the problems being tackled by deep learning?

Following are a few major examples where deep learning plays an important role:

Healthcare: Deep learning is a fast-growing trend in the healthcare industry. Sensors and devices that provide real-time data about patients, such as overall health condition, heart rate, and blood sugar level, rely on deep learning. Pharmaceutical companies also use these algorithms for disease detection, image segmentation, and more.

Virtual Assistant: Virtual assistants have many applications nowadays; they act as chatbots, online training instructors, and more. Their main areas of application are speech recognition and speech-to-text (and text-to-speech) conversion using natural language processing. All of this is possible thanks to deep learning. Siri, Alexa, Cortana, and Google Assistant are some of the most popular virtual assistants.

Social Media: Deep learning helps Twitter enhance its service; its models access and analyse large amounts of data to learn about user preferences. Facebook uses it to improve the user experience by recommending relevant pages, posts, and friends, and Instagram uses deep learning models to prevent cyberbullying and remove offensive comments.

Chatbots: Chatbots solve customer problems in seconds, chatting via text or text-to-speech. They are used for customer interaction, marketing on social media platforms, and instant responses to clients, and they rely on machine learning and deep learning models to generate appropriate responses.

Self-driving cars: Self-driving cars operate using machine learning and deep learning algorithms.

“Self-driving cars are the natural extension of active safety and obviously something we should do.” - Elon Musk

These cars can detect objects near the vehicle, understand traffic signals, measure the distance to other vehicles, and more. Tesla is the most renowned maker of self-driving cars on the market.

Limitations and Challenges

Although deep learning is an expanding technology across many domains, it comes with a number of limitations and challenges:

  • A large amount of data is required to train the models to achieve accurate results.
  • Training deep learning models is expensive, since high-quality GPUs and hundreds of powerful machines may be required.
  • There is no predefined framework to help in selecting the relevant deep learning tools. As a result, adopting deep learning skills becomes difficult.
  • The data needs to be cleaned before any algorithm is applied to it. No matter how efficient the model is, without data cleansing it will deliver inaccurate results.

Conclusion

With the growing availability of big data, deep neural network architectures, and computational power, conventional predictive models have improved markedly in accuracy and efficiency.

The number of organizations adopting big data and advanced technologies like artificial intelligence, machine learning, and the Internet of Things has grown and will continue to grow in the near future.
