Thumbs Up News
Using machine learning to filter RSS news feeds and return only positive articles
Train classifier with new dataset and get an accuracy of 77% \o/
Add 57 new RSS feeds to scrape news from
Speed-test classifier on the 450k dataset; it works very well
Use nltk's SentimentIntensityAnalyzer to add another layer to the classify method
Update the classifier code to use the old Twitter one since the new one is way too slow!
Create helper function to check whether an article's date is today's date and return a bool
Test new classifier with a JSON file and check how quickly it classifies compared with the current one
Replace classifier with old Twitter classifier - got good results and it’s blazing fast!
Improve classification speed by moving loading of the vocabulary and classifier into the classifier class
Add logic to discard articles from dates that are not today
Create script to run all the scrapers at once and get the data into the same file
Add CNN, Fox News and Sky News spiders to scrape their RSS feeds
Finish off BBC RSS scraper
Start working on news articles crawler
Initial work on the scraper
Improve code that classifies news headlines and saves the results to the file
Get classifier up to 85% accuracy
Start training new classifier
Move dataset sentences to csv format
Rewrite the new classifier for
Write new classifier to see if we can get a better accuracy than 75%
Finish training classifier for news headlines
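
The entries above about checking whether an article is from today's date and discarding the rest could look roughly like the sketch below. It is not the project's actual helper: the function name `is_from_today`, the optional `today` parameter, and the assumption that dates arrive as RFC 822 strings (the usual RSS `pubDate` format) compared in UTC are all illustrative.

```python
from datetime import date, datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def is_from_today(pub_date: str, today: Optional[date] = None) -> bool:
    """Return True if an RFC 822 date string falls on today's date (UTC).

    `today` is a hypothetical parameter that lets callers and tests pin
    the reference date; it defaults to the current UTC date.
    """
    try:
        parsed = parsedate_to_datetime(pub_date)
    except (TypeError, ValueError):
        # Unparseable dates are treated as "not today" (an assumption).
        return False
    if parsed.tzinfo is None:
        # Assume UTC when the feed gives no usable timezone.
        parsed = parsed.replace(tzinfo=timezone.utc)
    reference = today or datetime.now(timezone.utc).date()
    return parsed.astimezone(timezone.utc).date() == reference
```

A scraper could then keep only today's items with something like `[a for a in articles if is_from_today(a["published"])]`, where `articles` and the `"published"` key are likewise hypothetical.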