
Deep Learning: A Brief Introduction

Ever wondered how YouTube is able to give us relevant recommendations based on our taste in videos or how self-driving cars operate? All of this is possible because of Deep Learning.

Deep learning is a machine learning approach that teaches machines to learn from examples and experience.

It is a technique where machines acquire skills without human intervention.

What is Deep Learning?

It can be described as a machine learning technique that enables a computer to perform classification tasks directly on images, text, sound, and other data.

Deep learning models are trained on large amounts of data using neural network architectures that contain many layers.

As a result, deep learning models can achieve state-of-the-art accuracy, which may exceed human-level performance in some scenarios.

Machine Learning vs Artificial Intelligence vs Deep Learning: are they all the same?

Artificial Intelligence is a generic term for techniques that enable computers to imitate human behaviour. Machine Learning can be described as a set of algorithms that are trained on data to improve their performance.

Deep Learning, in turn, is a machine learning technique inspired by the structure of the human brain. It uses a multi-layered model framework called a neural network.

Deep Learning is a subset of Machine Learning, which is a subset of Artificial Intelligence. This can be understood by the Venn diagram given below:

[Figure: Domain of Deep Learning (Venn diagram)]

Deep Learning models provide accurate results which enable consumer electronics to meet user expectations. Also, accuracy is important in the case of safety-critical products like self-driving cars.

The following are the reasons why deep learning models are more accurate than ever:

  • Deep learning models are trained using massive amounts of labelled data; self-driving car models, for example, are trained on millions of images and hours of video.
  • Deep learning requires substantial computing power, such as high-quality GPUs, powerful clusters, and cloud computing. These resources significantly reduce the training time of deep learning models.

Another reason why these models have gained so much popularity is that they do not require the feature extraction step.

Traditional machine learning models such as logistic regression, decision trees, and SVMs cannot be used on raw data directly; they require a separate preprocessing step called feature extraction.

On the other hand, the artificial neural networks used in Deep Learning do not require the feature extraction step.

[Figure: Basic process of the deep learning implementation approach]

In other words, feature extraction happens inside the artificial neural network itself as part of training.
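
To make the contrast concrete, here is a minimal sketch, assuming scikit-learn and TensorFlow/Keras are installed and using scikit-learn's small built-in digits dataset purely for illustration: the logistic regression model needs hand-crafted features, while a small neural network consumes the raw pixels and learns its own features.

```python
# A minimal sketch contrasting classic ML (hand-crafted features) with a
# neural network that learns features from raw pixels. Assumes scikit-learn
# and TensorFlow/Keras; the feature choice below is deliberately simplistic.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from tensorflow import keras

digits = load_digits()                     # 8x8 grayscale digit images
X_raw, y = digits.images / 16.0, digits.target

# Classic ML: we must engineer features ourselves (here, row and column sums).
features = np.concatenate([X_raw.sum(axis=1), X_raw.sum(axis=2)], axis=1)
Xf_tr, Xf_te, yf_tr, yf_te = train_test_split(features, y, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(Xf_tr, yf_tr)
print("logistic regression on hand-crafted features:", clf.score(Xf_te, yf_te))

# Deep learning: the network takes the raw pixels and learns features itself.
Xr_tr, Xr_te, yr_tr, yr_te = train_test_split(X_raw, y, random_state=0)
model = keras.Sequential([
    keras.Input(shape=(8, 8)),
    keras.layers.Flatten(),
    keras.layers.Dense(64, activation="relu"),     # hidden layer learns features
    keras.layers.Dense(10, activation="softmax"),  # one score per digit class
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(Xr_tr, yr_tr, epochs=20, verbose=0)
print("neural network on raw pixels:", model.evaluate(Xr_te, yr_te, verbose=0)[1])
```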

How does deep learning work?

Deep Learning models implement Artificial Neural Networks which imitate the way the human brain computes information.

During training, the model learns to identify unknown structure in the input distribution on its own, extracting features and discovering useful representations of the data.

This learning happens at multiple levels of abstraction, which is what allows the model to produce accurate results.

No single model is perfect; the algorithm has to be chosen according to the nature of the task at hand.

A proper understanding of the elementary algorithms is therefore required to choose the right one.

Deep learning algorithms

Deep learning is one of the fastest-growing technologies, and implementing it requires familiarity with its main families of models.

There are two types of models in deep learning: supervised and unsupervised.

Supervised models are trained on a labelled dataset, i.e. the algorithm has an answer key against which it can evaluate the accuracy of its predictions during training.

In unsupervised models, unlabelled data is used, and the algorithms gather information by extracting features and patterns on their own.

Supervised Models 

Convolutional Neural Networks (CNNs)

A Convolutional Neural Network, or CNN, is built to handle the heavy pre-processing and computation needed for high-dimensional data such as images.

It is a more advanced and powerful variant of the classic artificial neural network, originally developed for image detection and image classification problems; a minimal sketch follows the usage list below.

[Figure: CNN deep learning model]

When to use the CNNs:

  • While using Image Datasets            
  • OCR document analysis                  
  • When the model requires high complexity in computing the output   
  • When the input data is 2-D but can be transformed to 1-D internally for rapid processing.
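
Below is a minimal CNN sketch, assuming TensorFlow/Keras is installed; it downloads the small MNIST digit dataset as a convenient stand-in for an image classification task.

```python
# A minimal convolutional network for image classification (TensorFlow/Keras).
from tensorflow import keras
from tensorflow.keras import layers

(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
x_train = x_train[..., None] / 255.0   # add a channel dimension, scale to [0, 1]
x_test = x_test[..., None] / 255.0

model = keras.Sequential([
    keras.Input(shape=(28, 28, 1)),
    layers.Conv2D(16, 3, activation="relu"),  # learn local image filters
    layers.MaxPooling2D(),                    # downsample the feature maps
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(10, activation="softmax"),   # one score per digit class
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=2, batch_size=128, validation_split=0.1)
print("test accuracy:", model.evaluate(x_test, y_test, verbose=0)[1])
```
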
Classic Neural Networks (Multilayer Perceptrons)

The classic neural network, or multilayer perceptron, learns elementary patterns from a series of inputs passed through fully connected layers, loosely resembling the learning patterns of a human brain.

[Figure: Multilayer perceptron based deep learning model]

When to use the Classic Neural Networks:

  • Classification problems where the set of real values is given as input.
  • Tabular datasets in the form of rows and columns, e.g. CSV files (see the sketch after this list).
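
Here is a minimal multilayer perceptron sketch, assuming scikit-learn is installed; the built-in iris dataset stands in for any CSV-style table of rows and columns.

```python
# A small multilayer perceptron on tabular data (scikit-learn).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Two hidden layers of 16 neurons each; scaling the inputs helps convergence.
mlp = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(16, 16), max_iter=2000, random_state=0),
)
mlp.fit(X_train, y_train)
print("test accuracy:", mlp.score(X_test, y_test))
```
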
Recurrent Neural Networks (RNNs)

Recurrent Neural Networks, or RNNs, were designed for predicting sequences. LSTM, or Long Short-Term Memory, is a well-known RNN variant with many possible use cases; a minimal many-to-one sketch follows the list below.

[Figure: Recurrent neural network]

When to use the RNNs:

  • One to one mapping: a single input mapped to a single output, example: Image classification.    
  • One to many mapping: a single input mapped to a sequence of outputs, example: Image captioning i.e. multiple words from a single image.        
  • Many to one mapping: A sequence of inputs produces a single output, example: Sentiment Analysis i.e. binary output from multiple words       
  • Many to many mapping: A sequence of inputs produces a sequence of outputs, example: Video classification i.e. splitting the video into multiple frames and labelling each frame separately.
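
The following is a minimal many-to-one sketch, assuming TensorFlow/Keras is installed: an LSTM reads a short window of values from a toy sine wave and predicts the next value.

```python
# A minimal LSTM: a sequence of 20 values in, a single predicted value out.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Build toy sequences from a sine wave: 20 past values -> 1 future value.
wave = np.sin(np.linspace(0, 100, 2000))
window = 20
X = np.array([wave[i:i + window] for i in range(len(wave) - window)])
y = wave[window:]
X = X[..., None]                    # shape (samples, timesteps, features)

model = keras.Sequential([
    keras.Input(shape=(window, 1)),
    layers.LSTM(32),                # the hidden state carries sequence context
    layers.Dense(1),                # a single output value (many-to-one)
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=5, batch_size=64, verbose=0)
print("predicted next value:", model.predict(X[:1], verbose=0)[0, 0])
```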

Unsupervised Models

Boltzmann Machines

Boltzmann machines, unlike the models above, do not follow a fixed direction of data flow (direction here means input layer → hidden layer → output layer).

Instead, their nodes are connected to each other in a circular, undirected fashion, as pictured below; a small sketch using a restricted Boltzmann machine follows the usage list.

[Figure: Model based on Boltzmann machines]

When to use the Boltzmann Machines:

  • While working with a very specific set of data      
  • To build a binary recommendation system             
  • Monitoring a system
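
Below is a minimal sketch using scikit-learn's BernoulliRBM (a restricted Boltzmann machine) in the spirit of a binary like/dislike recommendation setting; the user-item matrix is made up purely for illustration.

```python
# A tiny restricted Boltzmann machine on a made-up binary user-item matrix.
import numpy as np
from sklearn.neural_network import BernoulliRBM

# Rows = users, columns = items; 1 means the user liked the item.
ratings = np.array([
    [1, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 0, 1],
], dtype=float)

rbm = BernoulliRBM(n_components=2, learning_rate=0.05, n_iter=200, random_state=0)
rbm.fit(ratings)

# Hidden-unit activations act as learned "taste" features for each user.
print("hidden features:\n", rbm.transform(ratings))

# One Gibbs sampling step gives a (noisy) reconstruction of the visible units,
# which can be read as the model's guess at what a user might also like.
print("sampled reconstruction of user 0:", rbm.gibbs(ratings[:1]).astype(int))
```
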
Self-Organising Maps (SOMs)

Self-Organising Maps, or SOMs, work on unlabelled data and help reduce the number of random variables in a model (dimensionality reduction).

The output of a self-organising map is always two-dimensional; a small from-scratch sketch follows the usage list below.

[Figure: Model of a self-organising map]

When to use the SOMs:

  • When the data does not have an output or Y column          
  • Creative projects like music, text, videos, etc. produced by artificial intelligence
  • Dimensionality reduction for feature detection
  • Exploring the projects to understand the framework behind the dataset
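
Here is a small from-scratch sketch of a self-organising map written with NumPy (no dedicated SOM library assumed); it maps the four-dimensional iris rows onto an 8x8 grid, i.e. a two-dimensional output.

```python
# A minimal self-organising map implemented directly in NumPy.
import numpy as np
from sklearn.datasets import load_iris

X = load_iris().data
X = (X - X.mean(axis=0)) / X.std(axis=0)          # standardise the features

rng = np.random.default_rng(0)
grid_h, grid_w = 8, 8
weights = rng.normal(size=(grid_h, grid_w, X.shape[1]))  # one weight vector per cell
rows, cols = np.indices((grid_h, grid_w))

n_steps = 3000
for step in range(n_steps):
    x = X[rng.integers(len(X))]
    # Best matching unit: the cell whose weights are closest to the sample.
    bmu = np.unravel_index(np.linalg.norm(weights - x, axis=2).argmin(), (grid_h, grid_w))
    # Learning rate and neighbourhood radius both shrink over time.
    lr = 0.5 * (1 - step / n_steps)
    radius = 3.0 * (1 - step / n_steps) + 0.5
    grid_dist = np.sqrt((rows - bmu[0]) ** 2 + (cols - bmu[1]) ** 2)
    influence = np.exp(-(grid_dist ** 2) / (2 * radius ** 2))[..., None]
    weights += lr * influence * (x - weights)      # pull the neighbourhood toward x

def map_position(x):
    """2-D grid coordinates of the best matching cell for a sample."""
    return np.unravel_index(np.linalg.norm(weights - x, axis=2).argmin(), (grid_h, grid_w))

print("map coordinates of the first five samples:", [map_position(x) for x in X[:5]])
```
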
AutoEncoders

Autoencoders work by encoding the input data into a compressed representation, passing it through activation functions, and then decoding it back into an output that reconstructs the input; a minimal sketch follows the usage list below.

[Figure: Autoencoder]

When to use the AutoEncoders:

  • Feature detection or dimensionality reduction        
  • Building powerful recommendation systems       
  • Performing encoding on massive datasets.
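
A minimal autoencoder sketch follows, assuming TensorFlow/Keras and using scikit-learn's small digits dataset: the encoder compresses each 8x8 image into eight numbers and the decoder reconstructs the image from them.

```python
# A tiny autoencoder: encode 64 pixel values down to 8, then decode them back.
import numpy as np
from sklearn.datasets import load_digits
from tensorflow import keras
from tensorflow.keras import layers

X = load_digits().data / 16.0          # 64 pixel values per image, scaled to [0, 1]

encoder = keras.Sequential([keras.Input(shape=(64,)),
                            layers.Dense(8, activation="relu")])
decoder = keras.Sequential([keras.Input(shape=(8,)),
                            layers.Dense(64, activation="sigmoid")])
autoencoder = keras.Sequential([encoder, decoder])

autoencoder.compile(optimizer="adam", loss="mse")
autoencoder.fit(X, X, epochs=30, batch_size=64, verbose=0)   # target = input

codes = encoder.predict(X, verbose=0)                # the 8-dimensional encoding
reconstructions = autoencoder.predict(X, verbose=0)
print("code shape:", codes.shape,
      "reconstruction error:", np.mean((X - reconstructions) ** 2))
```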

Deep learning Applications

Deep Learning, a subset of machine learning, has become a buzzword in the field of artificial intelligence.

It enables the computers to learn from past experiences and examples, helping them to solve complicated problems without human involvement.

What exactly are the problems being tackled by deep learning?

Following are a few major examples where deep learning plays an important role:

Healthcare: Deep learning is a fast-growing trend in the healthcare industry. Sensors and devices that provide real-time patient data, such as overall health condition, heart rate, and blood sugar level, rely on deep learning. Apart from this, pharmaceutical companies also use these algorithms for disease detection, medical image segmentation, and more.

Virtual Assistant: Virtual assistants have various applications nowadays. They act as chatbots, online training instructors, etc. Their main areas of application are speech recognition, speech-to-text and text-to-speech conversion, and natural language processing. All of this is possible due to deep learning. Siri, Alexa, Cortana, Google Assistant, etc. are some of the most popular virtual assistants.

Social Media: Deep learning helps Twitter to enhance its performance. These models access and analyse a lot of data in order to learn about user preferences. Not only this, Facebook uses it to improve its user experience by recommending relevant pages, posts, friends etc. In addition to this, Instagram uses its models to prevent cyberbullying and eliminate controversial comments.

Chatbots: Chatbots help in solving customer problems in just a few seconds using Artificial Intelligence to chat via text or text to speech. Chatbots help in consumer interaction, marketing on social media platforms, and instant response to clients. They use machine learning and deep learning models to generate various types of reactions.

Self-driving cars: Self-driving cars operate using machine learning and deep learning algorithms.

“Self-driving cars are the natural extension of active safety and obviously something we should do.” - Elon Musk

They are able to detect objects near the car, understand traffic signals, measure the distance between the car and other vehicles, etc. Tesla is the most renowned name in self-driving cars on the market.

Limitations and Challenges

Although deep learning is an expanding technology across various domains, it comes with a number of limitations and challenges:

  • A large amount of data is required to train the models to achieve accurate results.
  • Training the deep learning models is a bit expensive since high quality GPUs and hundreds of powerful machines are required.
  • There is no predefined framework to help in selecting the relevant deep learning tools. As a result, adopting deep learning skills becomes difficult.
  • The data needs to be cleaned before applying any algorithm on it. Irrespective of how efficient the model is, without data cleansing it will deliver inaccurate results.

Conclusion

With the growth of big data, deep neural network architectures, and computational power, conventional predictive models have improved in both accuracy and efficiency.

The number of organizations adopting big data and advanced technologies like artificial intelligence, machine learning, the Internet of Things, etc. has grown and will continue to grow in the near future.


Data Cleaning in a Nutshell

“Better data beats fancier algorithms.”

“Garbage in, garbage out” is the principle to keep in mind when building an accurate machine learning model.

If the data under analysis is not accurate, then it is not useful. Irrespective of how accurate your model is, without data cleaning, it will deliver biased and inaccurate results.

Thus, data cleaning, also called data cleansing or data scrubbing, is one of the most crucial parts of machine learning.

What is data cleaning?

Data cleansing can be understood as a process of making the data ready for analysis.

Eliminating null records and unnecessary columns, fixing the outliers (junk values), restructuring the data to enhance its readability, etc. are some of the components of data cleaning.

Data cleaning also focuses on increasing the accuracy of the dataset by rectifying the existing information, instead of just removing chunks of useless data.

Steps involved in data cleaning

There is no single fixed procedure for data cleaning; it varies from one dataset to another. However, having a roadmap is essential to keep you on the right track.

Given below are the basic steps which can be followed to create a template for your data cleaning process.

Eliminating duplicates and irrelevant observations

  • Duplicate or redundant records hurt the efficiency of the model to a large extent. Because the same observation is counted more than once, it can tilt the results in either direction and produce biased outcomes.
  • Irrelevant data does not add any value to the dataset and should be dropped to save resources like memory and processing time (see the sketch below).
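
A minimal pandas sketch of this step follows; the toy DataFrame and column names are made up purely for illustration.

```python
# Dropping duplicate rows and an irrelevant column with pandas.
import pandas as pd

df = pd.DataFrame({
    "customer_id": [1, 2, 2, 3],
    "age": [25, 31, 31, 47],
    "purchased": [1, 0, 0, 1],
    "internal_note": ["a", "b", "b", "c"],   # adds nothing to the analysis
})

df = df.drop_duplicates()                    # remove repeated observations
df = df.drop(columns=["internal_note"])      # remove the irrelevant column
print(df)
```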

Rectifying structural errors

  • Structural errors include inconsistencies in naming conventions, typos, and wrong capitalization. These typographical errors result in mislabeled classes or categories. 
  • For instance, the model might treat “NA” and “Not Applicable” as two different categories, though they represent the same value. These structural variations make the algorithms inefficient and produce unreliable results (see the sketch below).
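
Here is a minimal pandas sketch for this step; the category values are invented for illustration.

```python
# Fixing whitespace, capitalisation, and equivalent labels with pandas.
import pandas as pd

df = pd.DataFrame({"status": ["NA", "Not Applicable", "approved", "Approved ", "REJECTED"]})

df["status"] = (df["status"]
                .str.strip()                          # remove stray whitespace
                .str.lower()                          # unify capitalisation
                .replace({"na": "not applicable"}))   # merge equivalent labels
print(df["status"].value_counts())
```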

Filtering out irrelevant outliers

  • Outliers are the values that do not fit in the dataset under observation. These values can be understood as the noise in the dataset.
  • Outliers often arise from manual errors or data-entry mistakes. They are not always incorrect, however, so they should not be dropped unless there is a valid reason (a simple detection sketch follows).
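
Below is a minimal pandas sketch that flags candidate outliers with the common 1.5 x IQR rule; the income values are made up, and flagged rows are only reported, not dropped.

```python
# Flagging (not dropping) outliers with the 1.5 * IQR rule.
import pandas as pd

df = pd.DataFrame({"income": [39_000, 42_000, 48_000, 51_000, 1_200_000]})

q1 = df["income"].quantile(0.25)
q3 = df["income"].quantile(0.75)
iqr = q3 - q1
mask = (df["income"] < q1 - 1.5 * iqr) | (df["income"] > q3 + 1.5 * iqr)

print("flagged for review:")
print(df[mask])
```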

Handling missing data

Handling missing values is the trickiest step in the data cleaning process. The missing values can’t be ignored or eliminated since they can represent something crucial. 

Following are a couple of the most common methods to deal with the missing data:

  • Removing the observations that have missing values, which may mean losing useful information.
  • Imputing the missing values based on the other observations. Since imputed values are assumptions rather than actual measurements, they add no new information and may weaken data integrity (both options are sketched below).
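
A minimal pandas sketch of both options follows; the values and column names are made up for illustration, and the median is just one common choice of imputation value.

```python
# Option 1: drop rows with missing values. Option 2: impute with column medians.
import pandas as pd

df = pd.DataFrame({
    "age": [25, None, 47, 31],
    "income": [42_000, 51_000, None, 39_000],
})

dropped = df.dropna()                 # option 1: lose the incomplete rows
imputed = df.fillna(df.median())      # option 2: fill gaps with column medians

print(dropped)
print(imputed)
```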

Some data cleansing tools

Data cleaning is the most important step in machine learning to get accuracy and efficiency.

Performing data cleansing manually on huge volumes of data is tedious and error-prone.

This is where data cleaning tools come in, since they help keep large amounts of data clean and consistent.

OpenRefine, TIBCO Clarity, Trifacta Wrangler, IBM InfoSphere, Cloudingo, QualityStage, etc. are some of the most popular data cleaning tools.

Conclusion

Working with clean data comes with a lot of advantages like improved efficiency, reduced error margin, accuracy, consistency, better decision making, and many more.

Thus, the data should be cleansed before fitting any model with it.

If you want to invest in data cleaning, you can learn it by practising with Python or R.


Machine Learning: let us get started!

Machine learning is one of the most popular domains that new-age application developers and companies are cashing in on. At its core, it is another field of computer science, one that leverages the applied practice of mathematics and statistics.

Why has it created such a buzz?

Because it reduces the need for intensive hand-written logic when processing massive quantities of data (generally known as big data), and the results are promising when it comes to finding patterns in the data that lead to better business-oriented decisions.

Now, as a beginner, the concept of machine learning can be overwhelming, as there is plenty of scattered information available across the web, including various theoretical courses and proprietary documentation.

So here I will try to give you a simple flow for how, as a beginner, you can familiarize yourself with the machine learning domain and where you should start looking in the first place.

The formal definition could be:

Machine learning (ML) is a field of computer science concerned with programs that learn, i.e. with the question of how to construct computer programs that automatically improve with experience.

Now you might also be thinking about how artificial intelligence is different from machine learning, so here is a big picture for you.

Here you can see that machine learning is a subset, or in fact a more specialized form, of artificial intelligence, and that it in turn supports the deep learning domain for more intensive and intelligent applications.

[Figure: relationship between artificial intelligence, machine learning, and deep learning]

Now, the next point to understand is why we want computer programs to improve with experience. It's because:

we have huge data and we want to make decisions or predictions from it

AND

we want computers to learn to identify patterns without being explicitly programmed to

And as they say, DATA is the new currency of this digital world and is priceless. Therefore, it is essential to utilize it to unlock the unique potential of your business.

Great, you know why it is essential for computers to improve.

Now, as a programmer, what should you know so that this automation can be achieved?

Types of machine learning

Broadly, there are three:

Supervised Learning

This is the simplest to implement, and it primarily solves problems related to regression and classification. Most importantly, the data available for analysis is structured with minimal anomalies, and even where anomalies are present they can be rectified using statistical measures.

General use cases implemented under this: image classification, fraud detection, weather/market forecasting, etc. You can simply infer that wherever straightforward predictions are needed, supervised learning applies.
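
As a minimal illustration, assuming scikit-learn is installed, the sketch below trains a classifier on labelled examples and then checks it against examples it has not seen; the built-in breast cancer dataset is only a convenient stand-in for any labelled, structured dataset.

```python
# Supervised learning in a nutshell: fit on labelled data, score on held-out data.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)           # features + known labels
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)                             # learn from the answer key
print("held-out accuracy:", clf.score(X_test, y_test))
```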

Unsupervised Learning

This again works toward the same objective of prediction, but the complexity increases because the data available for analysis is either minimally structured or totally unstructured. Therefore, an added process of clustering or dimensionality reduction needs to be performed before predictions can be made.

This requires more insight into the workings of statistical procedures and is the next stage of learning in ML. General use cases include customer segmentation, recommender systems, feature discovery, etc.
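
Here is a minimal clustering sketch in the spirit of customer segmentation, assuming scikit-learn; the two "customer" features (annual spend and visit count) are synthetic and exist only to illustrate the idea.

```python
# Unsupervised learning in a nutshell: k-means groups unlabelled points.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Synthetic annual spend and visit counts for three loose customer groups.
customers = np.vstack([
    rng.normal([200, 5], [50, 2], size=(100, 2)),
    rng.normal([1500, 20], [300, 5], size=(100, 2)),
    rng.normal([600, 40], [150, 8], size=(100, 2)),
])

scaled = StandardScaler().fit_transform(customers)    # put features on one scale
segments = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(scaled)
print("customers per segment:", np.bincount(segments))
```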

Reinforcement Learning

This essentially leverages the power of both supervised and unsupervised procedures, with the added factor of iterative learning whenever errors (mispredictions) occur in the data interpretation.

The procedures (algorithms) implemented in such a system are designed so that they can tune their attributes/parameters (variables), test them against a variety of values, and find the best combinations. For example, neural networks have a variety of parameters such as the number of layers, the number of neurons in each layer, the connection density between neurons, the weights, etc.

The general use cases for such types of implementations are Robot navigation, learning tasks, game AI, self-driving cars, etc.
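
Reinforcement learning is harder to compress into a few lines, but the tiny tabular Q-learning sketch below (NumPy only, with a toy corridor environment invented for illustration) shows the core loop: act, observe a reward, and nudge the value estimates toward better decisions.

```python
# Tabular Q-learning on a toy corridor: the agent learns to walk right to the goal.
import numpy as np

n_states, n_actions = 6, 2            # states 0..5; actions: 0 = left, 1 = right
goal = n_states - 1
Q = np.zeros((n_states, n_actions))   # estimated value of each action in each state
alpha, gamma, epsilon = 0.1, 0.9, 0.2
rng = np.random.default_rng(0)

for episode in range(500):
    state = 0
    while state != goal:
        # Explore occasionally, otherwise act greedily on the current estimates.
        if rng.random() < epsilon:
            action = int(rng.integers(n_actions))
        else:
            action = int(Q[state].argmax())
        next_state = max(0, state - 1) if action == 0 else min(goal, state + 1)
        reward = 1.0 if next_state == goal else 0.0
        # Move the estimate toward the reward plus the discounted future value.
        Q[state, action] += alpha * (reward + gamma * Q[next_state].max() - Q[state, action])
        state = next_state

print("learned policy (0 = left, 1 = right):", Q.argmax(axis=1)[:goal])
```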

The interesting point is that, for each type of learning, plenty of algorithms have been published as APIs under various open-source ML libraries such as scikit-learn, Keras, TensorFlow, etc., and for data management in working memory (RAM) the primary libraries used are pandas and NumPy.

Here is a webinar discussion on the machine learning types and relevant stuff

So, as a programmer, it has become very easy to implement your use cases, provided you know what problem you are trying to solve, what data you will use, which algorithm you are going to apply, and which library supports it.

Machine Learning implementation steps

  1. Defining your problem statement
  2. Getting data from various sources and pre-processing it for feeding to the selected algorithm(s).
  3. Model building: select the right ML algorithm and test it with data.
  4. Optimize and improve (this may require repeating steps 2 and 3 until satisfactory results are produced).
  5. Summarize the results / tell a story using various data visualizations (a minimal code sketch of these steps follows this list).
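
A minimal end-to-end sketch of these steps, assuming scikit-learn is installed and using the built-in iris dataset as a stand-in for step 2's data gathering:

```python
# The five steps above, compressed into a small scikit-learn example.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Step 1: problem statement -- classify iris flowers from four measurements.
# Step 2: get the data and pre-process it (here: scaling, train/test split).
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# Step 3: build a model with a chosen algorithm and test it with data.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

# Step 4: optimise and improve (in practice: revisit steps 2-3, tune parameters).
# Step 5: summarise the results / tell the story.
print(classification_report(y_test, model.predict(X_test)))
```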

That's it: if you follow these steps, you are through with your ML implementation work.

Now the next question is: which library should I look into, and which language should I learn, so that the implementation can be hassle-free?

Possible Machine learning track

  • Choose a programming language: Python or R. For a beginner I would prefer Python, as it is easy to follow and many of the libraries supported by the ML community are written in Python. Apart from this, you should have CRUD skills in SQL. You are not required to be an expert programmer; that will come as you practise.
  • Practice your data processing/wrangling using pandas and NumPy. You should also practice with the Matplotlib library to familiarise yourself with data visualization using various charts.
  • Once you are through with the first two stages, it is time to spread your wings and get your hands dirty with algorithms from the scikit-learn/Keras libraries, or any others relevant to your problem statement. Take your time working on various small implementations: start with regression-based algorithms, then classification, clustering, and so on. Spend some good time practising these, as this will lay the foundation for your enterprise career.
  • Finally, it is time to move on to the enterprise solutions used by industry for processing real-time data, such as Presto, Hive, Hadoop, AWS ML toolkits, Spark, etc.

Moreover, apart from everything mentioned above, each cloud service provider has its own service stack to support machine learning within its platform. Depending on your preferred provider, you can additionally learn their platform-specific tools over and above what we have discussed.

In case you have a different take or something to discuss, feel free to start a discussion thread below. I would love to join in.

Who am I to teach you about machine learning?

Well, I have been working intensively in ML to solve my Ph.D. Research problem and have been through various ML projects to test out multiple hypotheses.

Apart from this, I have been mentoring the budding researchers working on finding solutions to complex problems in the cloud computing domain.

You may read my brief career progress on the About page or check my LinkedIn.

Look forward to having you in the webinar and have a great discussion.

Cheers!
Anupinder