|
|
@ -1,18 +1,22 @@
|
|
|
|
## Emotions detection with Deep Learning
|
|
|
|
## Emotions detection with Deep Learning
|
|
|
|
|
|
|
|
|
|
|
|
Cameras are everywhere. Videos and images have become one of the most interesting data sets for artificial intelligence.
|
|
|
|
Cameras are everywhere. Videos and images have become one of the most interesting data sets for artificial intelligence.
|
|
|
|
Image processing is quite a broad research area that goes beyond filtering, compression, and enhancement. We are also interested in the question “what is in images?”, i.e., content analysis of visual inputs, which is a central task of computer vision.
|
|
|
|
Image processing is quite a broad research area that goes beyond filtering, compression, and enhancement.
|
|
|
|
The study of computer vision makes possible tasks such as 3D reconstruction of scenes, motion capture, and object recognition, which are crucial for higher-level intelligence such as image, video, and motion understanding.
|
|
|
|
|
|
|
|
For this two-month project we will
|
|
|
|
|
|
|
|
focus on two tasks:
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
- emotion classification
|
|
|
|
We are also interested in the question “what is in images?”, i.e., content analysis of visual inputs, which is a central task of computer vision.
|
|
|
|
- face tracking
|
|
|
|
|
|
|
|
|
|
|
|
The study of computer vision makes possible tasks such as 3D reconstruction of scenes, motion capture, and object recognition, which are crucial for higher-level intelligence such as image, video, and motion understanding.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
For this project we will focus on two tasks:
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
- Emotion classification
|
|
|
|
|
|
|
|
- Face tracking
|
|
|
|
|
|
|
|
|
|
|
|
As computing power has grown exponentially, the computer vision field has developed rapidly. This matters because that computing power makes it much easier to use a type of neural network that is very powerful on images:
|
|
|
|
As computing power has grown exponentially, the computer vision field has developed rapidly. This matters because that computing power makes it much easier to use a type of neural network that is very powerful on images:
|
|
|
|
CNNs (Convolutional Neural Networks). Before CNNs became widely accessible, the algorithms relied heavily on human analysis to extract features, which was obviously time-consuming and unreliable. If you're interested in the "old
|
|
|
|
|
|
|
|
school methodology" [this article](https://towardsdatascience.com/classifying-facial-emotions-via-machine-learning-5aac111932d3) explains it.
|
|
|
|
- CNNs (Convolutional Neural Networks). Before CNNs became widely accessible, the algorithms relied heavily on human analysis to extract features, which was obviously time-consuming and unreliable. If you're interested in the "old school methodology", [this article](https://towardsdatascience.com/classifying-facial-emotions-via-machine-learning-5aac111932d3) explains it.
|
|
|
|
The history behind this field is fascinating! [Here](https://kapernikov.com/basic-introduction-to-computer-vision/) is a short summary of its history.
|
|
|
|
|
|
|
|
|
|
|
|
- The history behind this field is fascinating! [Here](https://kapernikov.com/basic-introduction-to-computer-vision/) is a short summary of its history.
|
|
|
|
|
|
|
|
|
|
|
|
### Project goal and suggested timeline
|
|
|
|
### Project goal and suggested timeline
|
|
|
|
|
|
|
|
|
|
|
@ -31,15 +35,18 @@ The two steps are detailed below.
|
|
|
|
### Preliminary:
|
|
|
|
### Preliminary:
|
|
|
|
|
|
|
|
|
|
|
|
- Take [this course](https://www.coursera.org/learn/convolutional-neural-networks). This course is a reference for many reasons, one of which is its creator: **Andrew Ng**. He explains the basics of CNNs as well as more advanced topics such as transfer learning, siamese networks, etc.
|
|
|
|
- Take [this course](https://www.coursera.org/learn/convolutional-neural-networks). This course is a reference for many reasons, one of which is its creator: **Andrew Ng**. He explains the basics of CNNs as well as more advanced topics such as transfer learning, siamese networks, etc.
|
|
|
|
I suggest focusing on Weeks 1 and 2 and spending less time on Weeks 3 and 4. Don't worry: the time estimates for such MOOCs are conservative. You can attend the lessons for free!
|
|
|
|
- I suggest focusing on Weeks 1 and 2 and spending less time on Weeks 3 and 4. Don't worry: the time estimates for such MOOCs are conservative. You can attend the lessons for free!
|
|
|
|
|
|
|
|
|
|
|
|
- Participate in [this challenge](https://www.kaggle.com/c/digit-recognizer/code). The MNIST dataset is a reference in computer vision. Researchers use it as a benchmark to compare their models.
|
|
|
|
- Participate in [this challenge](https://www.kaggle.com/c/digit-recognizer/code). The MNIST dataset is a reference in computer vision. Researchers use it as a benchmark to compare their models.
|
|
|
|
Start with a logistic regression to understand how to handle images in Python, then train your first CNN on this data set.
|
|
|
|
|
|
|
|
|
|
|
|
- Start with a logistic regression to understand how to handle images in Python, then train your first CNN on this data set.
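
As a first step toward handling images in Python, the sketch below turns one CSV row into a label and a normalized 28×28 grid. It assumes the Kaggle digit-recognizer layout (a `label` column followed by 784 pixel columns with values from 0 to 255); the function name `row_to_image` is hypothetical, and the small in-memory CSV only stands in for the real `train.csv`.

```python
import csv
import io

IMAGE_SIDE = 28  # assumed: MNIST digits are 28x28 grayscale images


def row_to_image(row):
    """Split one CSV row into (label, image), with pixels scaled to [0, 1]."""
    label = int(row[0])
    pixels = [int(value) / 255.0 for value in row[1:]]
    expected = IMAGE_SIDE * IMAGE_SIDE
    if len(pixels) != expected:
        raise ValueError(f"expected {expected} pixels, got {len(pixels)}")
    # Rebuild the 2D grid from the flat list, one image row at a time.
    image = [pixels[i:i + IMAGE_SIDE] for i in range(0, expected, IMAGE_SIDE)]
    return label, image


# Illustrative stand-in for train.csv: a header plus one all-white image labeled 7.
header = "label," + ",".join(f"pixel{i}" for i in range(784))
data_row = "7," + ",".join(["255"] * 784)
fake_csv = io.StringIO(header + "\n" + data_row + "\n")

reader = csv.reader(fake_csv)
next(reader)  # skip the header line
label, image = row_to_image(next(reader))
print(label, len(image), len(image[0]), image[0][0])
```

A logistic regression would typically consume the flat pixel list, while a CNN expects the 2D grid (usually with an extra channel dimension).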
|
|
|
|
|
|
|
|
|
|
|
|
### Face emotions classification
|
|
|
|
### Face emotions classification
|
|
|
|
|
|
|
|
|
|
|
|
Emotion detection is one of the most researched topics in modern machine learning. The ability to accurately detect and identify an emotion opens up numerous doors for advanced human-computer interaction.
|
|
|
|
Emotion detection is one of the most researched topics in modern machine learning. The ability to accurately detect and identify an emotion opens up numerous doors for advanced human-computer interaction.
|
|
|
|
The aim of this project is to detect up to seven distinct facial emotions in real time. The project relies on a Convolutional Neural Network (CNN) built in Python with Keras, using TensorFlow as the backend.
|
|
|
|
The aim of this project is to detect up to seven distinct facial emotions in real time.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
The project relies on a Convolutional Neural Network (CNN) built in Python with Keras, using TensorFlow as the backend.
|
|
|
|
The facial emotions that can be detected and classified by this system are Happy, Sad, Angry, Surprise, Fear, Disgust and Neutral.
|
|
|
|
The facial emotions that can be detected and classified by this system are Happy, Sad, Angry, Surprise, Fear, Disgust and Neutral.
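
To make the seven classes concrete, here is a minimal sketch that maps a vector of class scores (for example, the output of a softmax layer) to an emotion name. The index order is an assumption, following the labeling commonly documented for the FER2013 data; check it against the label column of your own dataset. `decode_emotion` is a hypothetical helper name.

```python
# Assumed index order; verify it against the labels in your dataset.
EMOTIONS = ["Angry", "Disgust", "Fear", "Happy", "Sad", "Surprise", "Neutral"]


def decode_emotion(scores):
    """Return (emotion name, score) for the highest-scoring class."""
    if len(scores) != len(EMOTIONS):
        raise ValueError(f"expected {len(EMOTIONS)} scores, got {len(scores)}")
    # Index of the largest score, i.e. a plain-Python argmax.
    best = max(range(len(scores)), key=lambda i: scores[i])
    return EMOTIONS[best], scores[best]


# Made-up scores for illustration only.
example_scores = [0.05, 0.01, 0.04, 0.70, 0.10, 0.05, 0.05]
print(decode_emotion(example_scores))
```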
|
|
|
|
|
|
|
|
|
|
|
|
Your goal is to implement a program that takes as input a video stream that contains a person's face and that predicts the emotion of the person.
|
|
|
|
Your goal is to implement a program that takes as input a video stream that contains a person's face and that predicts the emotion of the person.
|
|
|
@ -49,10 +56,10 @@ Your goal is to implement a program that takes as input a video stream that cont
|
|
|
|
- Download and unzip the [data here](https://assets.01-edu.org/ai-branch/project3/emotions-detector.zip).
|
|
|
|
- Download and unzip the [data here](https://assets.01-edu.org/ai-branch/project3/emotions-detector.zip).
|
|
|
|
This dataset was provided for this past [Kaggle challenge](https://www.kaggle.com/competitions/challenges-in-representation-learning-facial-expression-recognition-challenge/overview).
|
|
|
|
This dataset was provided for this past [Kaggle challenge](https://www.kaggle.com/competitions/challenges-in-representation-learning-facial-expression-recognition-challenge/overview).
|
|
|
|
More information about it is available on the challenge page. Train a CNN on the dataset `train.csv`. Here is an [example of architecture](https://www.quora.com/What-is-the-VGG-neural-network) you can implement.
|
|
|
|
More information about it is available on the challenge page. Train a CNN on the dataset `train.csv`. Here is an [example of architecture](https://www.quora.com/What-is-the-VGG-neural-network) you can implement.
|
|
|
|
**The CNN has to reach more than 70% accuracy on the test set**. You can use the `test_with_emotions.csv` file for this. You will see that CNNs take a lot of time to train.
|
|
|
|
**The CNN has to reach more than 60% accuracy on the test set**. You can use the `test_with_emotions.csv` file for this. You will see that CNNs take a lot of time to train.
|
|
|
|
You don't want to overfit the neural network. I strongly suggest using early stopping and callbacks, and monitoring the training with `TensorBoard`.
|
|
|
|
You don't want to overfit the neural network. I strongly suggest using early stopping and callbacks, and monitoring the training with `TensorBoard`.
|
|
|
|
|
|
|
|
|
|
|
|
You have to save the trained model in `my_own_model.pkl` and explain the chosen architecture in `my_own_model_architecture.txt`. Use `model.summary()` to print the architecture.
|
|
|
|
You have to save the trained model in `final_emotion_model.keras` and explain the chosen architecture in `final_emotion_model_arch.txt`. Use `model.summary()` to print the architecture.
|
|
|
|
It is also expected that you explain the iterations and how you ended up choosing your final architecture. Save a screenshot of `TensorBoard` while the model is training in `tensorboard.png`, and save a plot of the learning curves, showing the training stopping BEFORE the model starts overfitting, in `learning_curves.png`.
|
|
|
|
It is also expected that you explain the iterations and how you ended up choosing your final architecture. Save a screenshot of `TensorBoard` while the model is training in `tensorboard.png`, and save a plot of the learning curves, showing the training stopping BEFORE the model starts overfitting, in `learning_curves.png`.
|
|
|
|
|
|
|
|
|
|
|
|
- Optional: Use a pre-trained CNN to improve the accuracy. You will find some huge CNN architectures that perform well. The issue is that it is expensive to train them from scratch.
|
|
|
|
- Optional: Use a pre-trained CNN to improve the accuracy. You will find some huge CNN architectures that perform well. The issue is that it is expensive to train them from scratch.
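
The training advice in this list (early stopping, callbacks, `TensorBoard`) can be sketched with Keras as follows. This is one possible setup, not a prescribed one: the `patience` value, the `logs` directory, and the helper name `make_callbacks` are assumptions. TensorFlow is imported inside the function so the snippet can be loaded even where it is not installed.

```python
def make_callbacks(model_path="final_emotion_model.keras", log_dir="logs"):
    """Build a list of Keras callbacks; the values are illustrative defaults."""
    from tensorflow import keras  # lazy import: calling this requires TensorFlow

    return [
        # Stop once validation loss has not improved for `patience` epochs,
        # and roll the model back to its best weights.
        keras.callbacks.EarlyStopping(
            monitor="val_loss", patience=5, restore_best_weights=True
        ),
        # Keep only the best model seen so far on disk.
        keras.callbacks.ModelCheckpoint(
            model_path, monitor="val_loss", save_best_only=True
        ),
        # Write logs that TensorBoard can display during training.
        keras.callbacks.TensorBoard(log_dir=log_dir),
    ]
```

The list would be passed as `model.fit(..., validation_data=..., callbacks=make_callbacks())`, and running `tensorboard --logdir logs` in another terminal shows the curves while training runs.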
|
|
|
@ -86,13 +93,10 @@ project
|
|
|
|
├── environment.yml
|
|
|
|
├── environment.yml
|
|
|
|
├── README.md
|
|
|
|
├── README.md
|
|
|
|
├── results
|
|
|
|
├── results
|
|
|
|
│ ├── hack_cnn
|
|
|
|
|
|
|
|
│ │ ├── hacked_image.png
|
|
|
|
|
|
|
|
│ │ └── input_image.png
|
|
|
|
|
|
|
|
│ ├── model
|
|
|
|
│ ├── model
|
|
|
|
│ │ ├── learning_curves.png
|
|
|
|
│ │ ├── learning_curves.png
|
|
|
|
│ │ ├── my_own_model_architecture.txt
|
|
|
|
│ │ ├── final_emotion_model_arch.txt
|
|
|
|
│ │ ├── my_own_model.pkl
|
|
|
|
│ │ ├── final_emotion_model.keras
|
|
|
|
│ │ ├── pre_trained_model_architecture.txt
|
|
|
|
│ │ ├── pre_trained_model_architecture.txt
|
|
|
|
│ │ └── pre_trained_model.pkl
|
|
|
|
│ │ └── pre_trained_model.pkl
|
|
|
|
│ └── preprocessing_test
|
|
|
|
│ └── preprocessing_test
|
|
|
@ -101,7 +105,7 @@ project
|
|
|
|
│ ├── image_n.png
|
|
|
|
│ ├── image_n.png
|
|
|
|
│ └── input_video.mp4
|
|
|
|
│ └── input_video.mp4
|
|
|
|
└── scripts
|
|
|
|
└── scripts
|
|
|
|
├── hack_the_cnn.py
|
|
|
|
├── validation_loss_accuracy.py
|
|
|
|
├── predict_live_stream.py
|
|
|
|
├── predict_live_stream.py
|
|
|
|
├── predict.py
|
|
|
|
├── predict.py
|
|
|
|
├── preprocess.py
|
|
|
|
├── preprocess.py
|
|
|
@ -114,7 +118,7 @@ project
|
|
|
|
```prompt
|
|
|
|
```prompt
|
|
|
|
python ./scripts/predict.py
|
|
|
|
python ./scripts/predict.py
|
|
|
|
|
|
|
|
|
|
|
|
Accuracy on test set: 72%
|
|
|
|
Accuracy on test set: 62%
|
|
|
|
|
|
|
|
|
|
|
|
```
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|