Compare commits


14 Commits

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Xavier Lavayssière | bfa57df116 | Merge 038722d4ab into 2f7977a95a | 2024-09-17 12:27:51 +01:00 |
| Oumaima Fisaoui | 2f7977a95a | Chore(AI): Fix sp500 subject and audit | 2024-09-09 09:58:11 +01:00 |
| Oumaima Fisaoui | 1d34ea0a71 | Chore(DPxAI): Fix format | 2024-09-06 11:18:31 +01:00 |
| Oumaima Fisaoui | 75472c0ed6 | Chore(DPxAI): Fix format | 2024-09-06 11:18:31 +01:00 |
| Oumaima Fisaoui | cccab05477 | Chore(DPxAI): Fix format | 2024-09-06 11:18:31 +01:00 |
| Oumaima Fisaoui | aa54ab1e66 | Chore(DPxAI): Fix format | 2024-09-06 11:18:31 +01:00 |
| Oumaima Fisaoui | 62486ed720 | Chore(DPxAI): Fix the accuracy on test set | 2024-09-06 11:18:31 +01:00 |
| Oumaima Fisaoui | 659074232f | Chore(AI): Fix emotions detector | 2024-09-06 11:18:31 +01:00 |
| oumaimafisaoui | 9c9adb1c88 | Fix(Pipeline): Fix irradiat attribute values | 2024-09-05 14:49:00 +01:00 |
| oumaimafisaoui | 00813d29e9 | Fix(Pipeline): fix formatting | 2024-09-05 14:49:00 +01:00 |
| oumaimafisaoui | fe5f82edcf | Fix(Pipeline): fix datafile data info and example do not match | 2024-09-05 14:49:00 +01:00 |
| Harry | f26da6368e | feat(template): question / potential-issue | 2024-09-04 19:53:01 +01:00 |
| Harry | 4a8287754d | chore(template): change bug emoji | 2024-09-04 19:36:24 +01:00 |
| xalava | 038722d4ab | Clarified instructions for local node info (issue #2411) | 2024-02-16 15:50:30 +01:00 |
10 changed files with 105 additions and 67 deletions

View File

@@ -1,8 +1,8 @@
---
name: 🐛 Bug report
name: 🐞 Bug report
about: Create a report to help us improve
title: "[BUG] "
labels: "🐛 bug"
labels: "🐞 bug"
assignees: ""
---

View File

@@ -0,0 +1,26 @@
---
name: 🙋 Question / Potential Issue
about: Ask a question or report a potential issue that isn't clearly a bug or a feature request
title: "[QUESTION] "
labels: "🙋 question"
assignees: ""
---

**Describe your question or potential issue**
A clear and concise description of your question or the potential issue you have encountered.

**Context & Use Case**
Provide the context or the scenario in which this question or issue arises. Explain why this is important to understand or address.

**Steps taken**
List any steps you have taken to try and resolve the issue or answer the question:

1. Checked the documentation/readme...
2. Tried to reproduce the issue...
3. Searched for similar questions...

**Attachments**
If applicable, add any screenshots, logs, or additional information that could help explain your question or potential issue.

**Additional context**
Add any other details or context that might be relevant, including links to related issues or documentation.

View File

@@ -1,18 +1,22 @@
## Emotions detection with Deep Learning
Cameras are everywhere. Videos and images have become one of the most interesting data sets for artificial intelligence.
Image processing is quite a broad research area, not just filtering, compression, and enhancement. Besides, we are also interested in the question, “what is in images?”, i.e., content analysis of visual inputs, which is part of the main task of computer vision.
The study of computer vision could make possible such tasks as 3D reconstruction of scenes, motion capturing, and object recognition, which are crucial for even higher-level intelligence such as image and video understanding, and motion understanding.
For this two-month project we will focus on two tasks:
Image processing is quite a broad research area, not just filtering, compression, and enhancement.
- emotion classification
- face tracking
Besides, we are also interested in the question, “what is in images?”, i.e., content analysis of visual inputs, which is part of the main task of computer vision.
The study of computer vision could make possible such tasks as 3D reconstruction of scenes, motion capturing, and object recognition, which are crucial for even higher-level intelligence such as image and video understanding, and motion understanding.
For this project we will focus on two tasks:
- Emotion classification
- Face tracking
With computing power increasing exponentially, the computer vision field has been developing rapidly. This is a key element because that computing power makes it easier to use a type of neural network that is very powerful on images:
CNNs (Convolutional Neural Networks). Before CNNs were democratized, the algorithms used relied heavily on human analysis to extract features, which was obviously time-consuming and not reliable. If you're interested in the "old
school methodology" [this article](https://towardsdatascience.com/classifying-facial-emotions-via-machine-learning-5aac111932d3) explains it.
The history behind this field is fascinating! [Here](https://kapernikov.com/basic-introduction-to-computer-vision/) is a short summary.
- CNNs (Convolutional Neural Networks). Before CNNs were democratized, the algorithms used relied heavily on human analysis to extract features, which was obviously time-consuming and not reliable. If you're interested in the "old school methodology" [this article](https://towardsdatascience.com/classifying-facial-emotions-via-machine-learning-5aac111932d3) explains it.
- The history behind this field is fascinating! [Here](https://kapernikov.com/basic-introduction-to-computer-vision/) is a short summary.
### Project goal and suggested timeline
@@ -31,15 +35,18 @@ The two steps are detailed below.
### Preliminary:
- Take [this course](https://www.coursera.org/learn/convolutional-neural-networks). This course is a reference for many reasons, and one of them is the creator: **Andrew Ng**. He explains the basics of CNNs but also some more advanced topics, such as transfer learning, Siamese networks, etc.
I suggest focusing on Weeks 1 and 2 and spending less time on Weeks 3 and 4. Don't worry, the time scoping of such MOOCs is conservative. You can attend the lessons for free!
- I suggest focusing on Weeks 1 and 2 and spending less time on Weeks 3 and 4. Don't worry, the time scoping of such MOOCs is conservative. You can attend the lessons for free!
- Participate in [this challenge](https://www.kaggle.com/c/digit-recognizer/code). The MNIST dataset is a reference in computer vision. Researchers use it as a benchmark to compare their models.
Start with a logistic regression to understand how to handle images in Python, and then train your first CNN on this dataset.
- Start with a logistic regression to understand how to handle images in Python, and then train your first CNN on this dataset.
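To make that progression concrete, here is a minimal sketch, assuming TensorFlow/Keras and the built-in MNIST loader rather than the Kaggle CSVs. A logistic regression is just a single softmax layer, so the jump to a CNN is easy to see; typically the baseline lands around 92% accuracy and even a small CNN well above 98%, which is the point of the comparison.

```python
# Minimal sketch (not the official solution): a logistic-regression baseline
# followed by a first CNN, using Keras and the built-in MNIST loader.
from tensorflow import keras

(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0  # scale pixels to [0, 1]

# Baseline: logistic regression is just a single dense softmax layer.
logreg = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),
    keras.layers.Dense(10, activation="softmax"),
])
logreg.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
               metrics=["accuracy"])
logreg.fit(x_train, y_train, epochs=5, validation_split=0.1)

# A first small CNN for comparison.
cnn = keras.Sequential([
    keras.layers.Reshape((28, 28, 1), input_shape=(28, 28)),
    keras.layers.Conv2D(32, 3, activation="relu"),
    keras.layers.MaxPooling2D(),
    keras.layers.Conv2D(64, 3, activation="relu"),
    keras.layers.MaxPooling2D(),
    keras.layers.Flatten(),
    keras.layers.Dense(10, activation="softmax"),
])
cnn.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
            metrics=["accuracy"])
cnn.fit(x_train, y_train, epochs=5, validation_split=0.1)
print("CNN test accuracy:", cnn.evaluate(x_test, y_test)[1])
```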
### Face emotions classification
Emotion detection is one of the most researched topics in modern machine learning. The ability to accurately detect and identify an emotion opens up numerous doors for advanced human-computer interaction.
The aim of this project is to detect up to seven distinct facial emotions in real time. This project runs on top of a Convolutional Neural Network (CNN) that is built with Keras, whose backend is TensorFlow, in Python.
The aim of this project is to detect up to seven distinct facial emotions in real time.
This project runs on top of a Convolutional Neural Network (CNN) that is built with Keras, whose backend is TensorFlow, in Python.
The facial emotions that can be detected and classified by this system are Happy, Sad, Angry, Surprise, Fear, Disgust and Neutral.
Your goal is to implement a program that takes as input a video stream containing a person's face and predicts the emotion of the person.
@@ -49,10 +56,10 @@ Your goal is to implement a program that takes as input a video stream that cont
- Download and unzip the [data here](https://assets.01-edu.org/ai-branch/project3/emotions-detector.zip).
This dataset was provided for this past [Kaggle challenge](https://www.kaggle.com/competitions/challenges-in-representation-learning-facial-expression-recognition-challenge/overview).
It is possible to find more information about it on the challenge page. Train a CNN on the dataset `train.csv`. Here is an [example of architecture](https://www.quora.com/What-is-the-VGG-neural-network) you can implement.
**The CNN has to achieve more than 70% accuracy on the test set**. You can use the `test_with_emotions.csv` file for this. You will see that CNNs take a lot of time to train.
**The CNN has to achieve more than 60% accuracy on the test set**. You can use the `test_with_emotions.csv` file for this. You will see that CNNs take a lot of time to train.
You don't want to overfit the neural network. I strongly suggest using early stopping, callbacks, and monitoring the training with `TensorBoard`.
You have to save the trained model in `my_own_model.pkl` and explain the chosen architecture in `my_own_model_architecture.txt`. Use `model.summary()` to print the architecture.
You have to save the trained model in `final_emotion_model.keras` and explain the chosen architecture in `final_emotion_model_arch.txt`. Use `model.summary()` to print the architecture.
It is also expected that you explain the iterations and how you ended up choosing your final architecture. Save a screenshot of `TensorBoard` while the model is training in `tensorboard.png`, and save a plot with the learning curves showing the model training and stopping BEFORE it starts overfitting in `learning_curves.png`.
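As a sketch of how those pieces fit together: the architecture below is only a placeholder, and `x_train`/`y_train`/`x_val`/`y_val` are assumed to come from your own preprocessing of the CSVs.

```python
# Illustrative sketch: train a CNN with early stopping and TensorBoard
# monitoring, then save the model and its architecture as required above.
from tensorflow import keras

# Placeholder CNN for 48x48 grayscale faces; the real architecture is yours
# to design and iterate on.
model = keras.Sequential([
    keras.layers.Conv2D(32, 3, activation="relu", input_shape=(48, 48, 1)),
    keras.layers.MaxPooling2D(),
    keras.layers.Flatten(),
    keras.layers.Dense(7, activation="softmax"),  # 7 emotion classes
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

callbacks = [
    keras.callbacks.EarlyStopping(monitor="val_loss", patience=5,
                                  restore_best_weights=True),
    keras.callbacks.TensorBoard(log_dir="logs"),  # view with: tensorboard --logdir logs
]
# x_train, y_train, x_val, y_val are assumed to come from your preprocessing.
history = model.fit(x_train, y_train, validation_data=(x_val, y_val),
                    epochs=100, callbacks=callbacks)

model.save("results/model/final_emotion_model.keras")
with open("results/model/final_emotion_model_arch.txt", "w") as f:
    model.summary(print_fn=lambda line: f.write(line + "\n"))
```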
- Optional: Use a pre-trained CNN to improve the accuracy. You will find some huge CNN architectures that perform well. The issue is that they are expensive to train from scratch.
@@ -86,13 +93,10 @@ project
├── environment.yml
├── README.md
├── results
│   ├── hack_cnn
│   │   ├── hacked_image.png
│   │   └── input_image.png
│   ├── model
│   │   ├── learning_curves.png
│   │   ├── my_own_model_architecture.txt
│   │   ├── my_own_model.pkl
│   │   ├── final_emotion_model_arch.txt
│   │   ├── final_emotion_model.keras
│   │   ├── pre_trained_model_architecture.txt
│   │   └── pre_trained_model.pkl
│   └── preprocessing_test
@@ -101,7 +105,7 @@ project
│   ├── image_n.png
│   └── input_video.mp4
└── scripts
├── hack_the_cnn.py
├── validation_loss_accuracy.py
├── predict_live_stream.py
├── predict.py
├── preprocess.py
@@ -114,7 +118,7 @@ project
```prompt
python ./scripts/predict.py
Accuracy on test set: 72%
Accuracy on test set: 62%
```
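For illustration, `predict.py` could look something like the sketch below. The CSV layout assumed here, an `emotion` label column plus a space-separated 48x48 `pixels` column, is the one used by the linked Kaggle challenge; adapt it to your own preprocessing.

```python
# Illustrative sketch of what predict.py could do: load the saved model and
# report accuracy on the labeled test file. The `emotion`/`pixels` CSV layout
# follows the linked Kaggle challenge and is an assumption here.
import numpy as np
import pandas as pd
from tensorflow import keras

model = keras.models.load_model("results/model/final_emotion_model.keras")

df = pd.read_csv("test_with_emotions.csv")
x = np.stack([np.asarray(p.split(), dtype=np.float32).reshape(48, 48, 1)
              for p in df["pixels"]]) / 255.0
y = df["emotion"].to_numpy()

preds = model.predict(x).argmax(axis=1)
print(f"Accuracy on test set: {100 * (preds == y).mean():.0f}%")
```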

View File

@@ -24,12 +24,12 @@
###### Does the text document explain why the architecture was chosen, and what were the previous iterations?
###### Does the following command `python ./scripts/predict.py` run without any error and return an accuracy greater than 70%?
###### Does the following command `python ./scripts/predict.py` run without any error and return an accuracy greater than 60%?
```prompt
python ./scripts/predict.py
Accuracy on test set: 72%
Accuracy on test set: 62%
```

View File

@@ -241,8 +241,10 @@ breast: One Hot
breast-quad: One Hot
['right_low' 'left_low' 'left_up' 'central' 'right_up']
irradiat: One Hot
['yes' 'no']
Class: Target (One Hot)
['recurrence-events' 'no-recurrence-events']
```
@@ -259,16 +261,16 @@ input: ohe.transform(X_test[ohe_cols])[:10]
output:
array([[1., 0., 1., 0., 0., 1., 0., 0., 0., 1., 0.],
[1., 0., 1., 0., 0., 1., 0., 0., 0., 1., 0.],
[0., 1., 1., 0., 0., 1., 0., 0., 0., 0., 1.],
[0., 1., 1., 0., 0., 1., 0., 0., 0., 0., 1.],
[1., 0., 1., 0., 0., 0., 1., 0., 0., 1., 0.],
[1., 0., 1., 0., 0., 0., 0., 1., 0., 1., 0.],
[1., 0., 0., 1., 0., 0., 0., 0., 1., 1., 0.],
[1., 0., 0., 1., 0., 1., 0., 0., 0., 1., 0.],
[1., 0., 0., 1., 0., 0., 0., 1., 0., 1., 0.],
[1., 0., 0., 1., 0., 0., 1., 0., 0., 1., 0.],
[1., 0., 0., 1., 0., 0., 1., 0., 0., 1., 0.],
[1., 0., 0., 1., 0., 1., 0., 0., 0., 1., 0.],
[1., 0., 0., 1., 0., 0., 1., 0., 0., 1., 0.],
[0., 1., 1., 0., 0., 0., 1., 0., 0., 0., 1.]])
[1., 0., 1., 0., 0., 0., 0., 1., 0., 0., 1.],
[1., 0., 0., 1., 0., 1., 0., 0., 0., 1., 0.]])
input: ohe.get_feature_names(ohe_cols)
input: ohe.get_feature_names_out(ohe_cols)
output:
array(['node-caps_no', 'node-caps_yes', 'breast_left', 'breast_right',
'breast-quad_central', 'breast-quad_left_low',
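For reference, output like the above can be reproduced with a sketch along these lines. `X_train`/`X_test` and the exact column list (inferred from the 11 one-hot columns shown) are assumptions; note the diff reflects the scikit-learn 1.0 rename of `get_feature_names` to `get_feature_names_out`.

```python
# Minimal sketch, not the graded solution. Assumes scikit-learn >= 1.2
# (for sparse_output) and DataFrames X_train / X_test with the breast
# cancer columns below (list inferred from the 11-column output above).
from sklearn.preprocessing import OneHotEncoder

ohe_cols = ["node-caps", "breast", "breast-quad", "irradiat"]
ohe = OneHotEncoder(sparse_output=False, handle_unknown="ignore")
ohe.fit(X_train[ohe_cols])

print(ohe.transform(X_test[ohe_cols])[:10])
print(ohe.get_feature_names_out(ohe_cols))
```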

View File

@@ -146,14 +146,14 @@ dtype: int64
array([[1., 0., 1., 0., 0., 1., 0., 0., 0., 1., 0.],
[1., 0., 1., 0., 0., 1., 0., 0., 0., 1., 0.],
[0., 1., 1., 0., 0., 1., 0., 0., 0., 0., 1.],
[0., 1., 1., 0., 0., 1., 0., 0., 0., 0., 1.],
[1., 0., 1., 0., 0., 0., 1., 0., 0., 1., 0.],
[1., 0., 1., 0., 0., 0., 0., 1., 0., 1., 0.],
[1., 0., 0., 1., 0., 0., 0., 0., 1., 1., 0.],
[1., 0., 0., 1., 0., 1., 0., 0., 0., 1., 0.],
[1., 0., 0., 1., 0., 0., 0., 1., 0., 1., 0.],
[1., 0., 0., 1., 0., 0., 1., 0., 0., 1., 0.],
[1., 0., 0., 1., 0., 0., 1., 0., 0., 1., 0.],
[1., 0., 0., 1., 0., 1., 0., 0., 0., 1., 0.],
[1., 0., 0., 1., 0., 0., 1., 0., 0., 1., 0.],
[0., 1., 1., 0., 0., 0., 1., 0., 0., 0., 1.]])
[1., 0., 1., 0., 0., 0., 0., 1., 0., 0., 1.],
[1., 0., 0., 1., 0., 1., 0., 0., 0., 1., 0.]])
```
@@ -162,16 +162,16 @@ array([[1., 2., 5., 0., 1.,
```console
#First 10 rows:
array([[1., 2., 5., 0., 1.],
[1., 3., 4., 0., 1.],
[1., 2., 4., 0., 1.],
[1., 3., 2., 0., 1.],
[1., 4., 3., 0., 1.],
[1., 4., 5., 0., 0.],
[2., 5., 4., 0., 1.],
[2., 5., 8., 0., 1.],
[0., 2., 3., 0., 2.],
[1., 3., 6., 4., 2.]])
array([[2., 5., 2., 0., 1.],
[2., 5., 2., 0., 0.],
[2., 5., 4., 5., 2.],
[1., 4., 5., 1., 1.],
[2., 5., 5., 0., 2.],
[1., 2., 1., 0., 1.],
[1., 2., 8., 0., 1.],
[2., 5., 2., 0., 0.],
[2., 5., 5., 0., 2.],
[1., 2., 3., 0., 0.]])
```
@@ -180,8 +180,8 @@ array([[1., 2., 5., 0., 1.],
```console
# First 2 rows:
array([[1., 0., 1., 0., 0., 1., 0., 0., 0., 1., 0., 1., 2., 5., 0., 1.],
[1., 0., 1., 0., 0., 1., 0., 0., 0., 1., 0., 1., 3., 4., 0., 1.]])
array([[1., 0., 1., 0., 0., 1., 0., 0., 0., 1., 0., 2., 5., 2., 0., 1.],
[1., 0., 1., 0., 0., 1., 0., 0., 0., 1., 0., 2., 5., 2., 0., 0.]])
```
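The 16-column rows above are simply the 11 one-hot columns stacked next to the 5 remaining encoded columns. A hedged sketch of that step, with `ohe`, `oe` (e.g., an `OrdinalEncoder`), and the column lists assumed from earlier steps:

```python
# Minimal sketch: concatenate the encoded blocks into one design matrix.
# `ohe`, `oe`, `ohe_cols`, `oe_cols`, and X_test are assumed from earlier steps.
import numpy as np

X_test_encoded = np.hstack([
    ohe.transform(X_test[ohe_cols]),  # 11 one-hot columns
    oe.transform(X_test[oe_cols]),    # 5 remaining encoded columns
])
print(X_test_encoded[:2])  # first 2 rows, 16 columns each
```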
---

View File

@@ -1,10 +1,12 @@
## Financial strategies on the SP500
In this project we will apply machine learning to finance. You are a Quant/Data Scientist and your goal is to create a financial strategy based on a signal outputted by a machine learning model that outperforms the [SP500](https://en.wikipedia.org/wiki/S%26P_500).
In this project, you'll apply machine learning to finance. Your goal as a Quant/Data Scientist is to create a financial strategy that uses a signal generated by a machine learning model to outperform the [SP500](https://en.wikipedia.org/wiki/S%26P_500).
The Standard & Poor's 500 Index is a collection of stocks intended to reflect the overall return characteristics of the stock market as a whole. The stocks that make up the S&P 500 are selected by market capitalization, liquidity, and industry. Companies to be included in the S&P are selected by the S&P 500 Index Committee, which consists of a group of analysts employed by Standard & Poor's.
The S&P 500 Index originally began in 1926 as the "composite index", composed of only 90 stocks. According to historical records, the average annual return from its inception in 1926 through 2018 is approximately 10% to 11%. The average annual return from the adoption of 500 stocks into the index in 1957 through 2018 is roughly 8%.
As a Quant Researcher, you may beat the SP500 one year or a few years. The real challenge, though, is to beat the SP500 consistently over decades. That's what most hedge funds in the world are trying to do.
The S&P 500 Index is a collection of 500 stocks that represent the overall performance of the U.S. stock market. The stocks in the S&P 500 are chosen based on factors like market value, liquidity, and industry. These selections are made by the S&P 500 Index Committee, which is a group of analysts from Standard & Poor's.
The S&P 500 started in 1926 with only 90 stocks and has grown to include 500 stocks since 1957. Historically, the average annual return of the S&P 500 has been about 10-11% since 1926, and around 8% since 1957.
As a Quantitative Researcher, your challenge is to develop a strategy that can consistently outperform the S&P 500, not just in one year, but over many years. This is a difficult task and is the primary goal of many hedge funds around the world.
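To make "a strategy based on a signal" concrete before the parts begin, here is a toy, self-contained sketch on synthetic data; every name here is an assumption, and the real pipeline is defined later in the subject.

```python
# Toy sketch: turn a model's daily signal into long/flat positions and compare
# the strategy's cumulative return with the index. Synthetic data throughout.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
dates = pd.date_range("2020-01-01", periods=252, freq="B")
index_returns = pd.Series(rng.normal(0.0004, 0.01, len(dates)), index=dates)

# Stand-in for a model output: a noisy preview of the next day's return.
signal = index_returns.shift(-1).fillna(0.0) * 0.3 \
         + rng.normal(0.0, 0.01, len(dates))

position = (signal > 0).astype(int)  # long when the signal is positive, else flat
strategy_returns = position.shift(1).fillna(0) * index_returns  # act the next day

print("index cumulative return:   ", round((1 + index_returns).prod() - 1, 4))
print("strategy cumulative return:", round((1 + strategy_returns).prod() - 1, 4))
```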
The project is divided into parts:
@@ -199,4 +201,5 @@ Note: `features_engineering.py` can be used in `gridsearch.py`
### Files for this project
You can find the data required for this project in this [link](https://assets.01-edu.org/ai-branch/project4/project04-20221031T173034Z-001.zip)
You can find the data required for this project here:
[link](https://assets.01-edu.org/ai-branch/project4/project04-20221031T173034Z-001.zip)

View File

@@ -1,23 +1,27 @@
## Local Node Info
To start, we will create a simple page that displays basic information from our local node.
To get started, we will create a simple web page that displays basic information from our local node.
### Instructions
Create a web page, `localNodeInfo.html`, that loads an Ethereum library, connects to a local node at `http://localhost:8545`, and displays basic information:
Create a web page called `localNodeInfo.html` that does the following:
- In an element with (`id`=`chainId`), the number ID of the current network
- In an element with `blockNumber` as `id` the number of blocks in the chain
1. Loads an Ethereum library, such as `ethers.js` or `web3.js`.
2. Connects to a local Ethereum node at `http://localhost:8545`.
3. Displays the following information on the page:
- The ID of the current network in an element with `chainId` as `id`.
- The number of blocks in the chain in an element with `blockNumber` as `id`.
![image](networkInfo.png)
![image](network-infos.png)
### Hint
You can use any library such as `ethers.js` or `web3.js` to connect to your local node.
🚫 Please be aware that the test environment restricts internet access for security reasons. Therefore, you need to download the library and import it locally.
Automated tests check for elements with specific IDs, the design is up to you.
🎨 Automated tests only check for the content of elements with specific IDs; the rest of the design is up to you.
Minimal structure:
🎁 Here is a minimal example structure for the HTML file:
```HTML
<!DOCTYPE html>
@@ -25,7 +29,7 @@ Minimal structure:
<span id="chainId"></span>
<span id="blockNumber"></span>
<script src="XXX"></script>
<script src="./XXX"></script>
<script type="module">
// Your code
</script>
@@ -34,6 +38,5 @@ Minimal structure:
```
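If you want to sanity-check the two values outside the browser first, here is a minimal sketch with Python's web3.py; using Python here is an assumption for illustration only, since the page itself must use a browser library such as `ethers.js` or `web3.js`.

```python
# Minimal sketch using web3.py (pip install web3) against the local node.
# The graded page performs the same two queries in the browser instead.
from web3 import Web3

w3 = Web3(Web3.HTTPProvider("http://localhost:8545"))
print("chainId:", w3.eth.chain_id)          # JSON-RPC eth_chainId
print("blockNumber:", w3.eth.block_number)  # JSON-RPC eth_blockNumber
```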
### Notions
- [ethers Provider transaction-methods](https://docs.ethers.io/v5/api/providers/provider/#Provider--network-methods)
- [web3](https://web3js.readthedocs.io/en/v1.3.4/web3-eth.html)
- [web3 providers](https://docs.web3js.org/guides/web3_providers_guide/)

Binary file not shown (after: 22 KiB).

Binary file not shown (before: 9.0 KiB).