feat(nlp-scraper): fix broken link and remove optional part with broken URL

2024-03-14 12:27:24 +00:00 · 2024-03-14 12:27:24 +00:00 · 187ca1884b
parent 3f22fcf06b
commit 187ca1884b
1 changed files with 1 additions and 6 deletions
--- a/subjects/ai/nlp-scraper/README.md
+++ b/subjects/ai/nlp-scraper/README.md
@@ -56,7 +56,7 @@ SpaCy](https://towardsdatascience.com/named-entity-recognition-with-nltk-and-spa

 The goal is to detect what the article is dealing with: Tech, Sport, Business,
 Entertainment or Politics. To do so, a labelled dataset is provided: [training
-data](bbc_news_train.csv) and [test data](bbc_news_test.csv). From this
+data](bbc_news_train.csv) and [test data](bbc_news_tests.csv). From this
 dataset, build a classifier that learns to detect the right topic in the
 article. Save the training process to a python file because the audit requires
 the auditor to test the model.
@@ -68,11 +68,6 @@ that the model is trained correctly and not overfitted.

 - Learning constraints: **Score on test: > 95%**

-- **Optional**: If you want to train a news' topic classifier based on a more
-  challenging dataset, you can use the
-  [following](https://www.kaggle.com/rmisra/news-category-dataset) which is
-  based on 200k news headlines.
-
 #### **3. Sentiment analysis:**

 The goal is to detect the sentiment (positive, negative or neutral) of the news