Stock Market Prediction
This project uses text classification and sentiment analysis to process financial news and predict whether the price of a stock will go up or down.
For reading and saving data, I use libraries such as xlrd, pickle and codecs. For tokenization, I chose Jieba.
To achieve higher accuracy, I added a financial dictionary to Jieba and removed stop words from the tokenized word list. For feature extraction, I use both a positive and a negative word dictionary, and I consider only the most common words in the news in order to reduce the feature dimension.
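The preprocessing and feature extraction steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the sample tokens, the stop-word set, the two sentiment dictionaries, and the function names are all hypothetical stand-ins. In the real pipeline the token lists would come from Jieba (for example `jieba.lcut(text)` after loading a custom dictionary with `jieba.load_userdict`), which is left out here so the sketch runs with the standard library only.

```python
from collections import Counter

# Hypothetical sample data: each news item is already tokenized
# (in the project this would be the output of Jieba).
tokenized_news = [
    ["利润", "增长", "的", "公司", "股价", "上涨"],
    ["公司", "亏损", "的", "风险", "股价", "下跌"],
    ["利润", "增长", "的", "公司", "扩张"],
]

# Hypothetical stand-ins for the stop-word list and sentiment dictionaries.
STOP_WORDS = {"的"}
POSITIVE_WORDS = {"增长", "上涨", "利润"}
NEGATIVE_WORDS = {"亏损", "下跌", "风险"}


def remove_stop_words(tokens):
    """Drop stop words from an already tokenized word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def build_vocabulary(docs, top_n):
    """Keep only the top_n most common words to limit the feature dimension."""
    counts = Counter(t for doc in docs for t in doc)
    return [word for word, _ in counts.most_common(top_n)]


def extract_features(tokens, vocabulary):
    """Build a feature dict: word presence plus sentiment word counts."""
    token_set = set(tokens)
    features = {f"contains({w})": (w in token_set) for w in vocabulary}
    features["positive_count"] = sum(t in POSITIVE_WORDS for t in tokens)
    features["negative_count"] = sum(t in NEGATIVE_WORDS for t in tokens)
    return features


cleaned = [remove_stop_words(doc) for doc in tokenized_news]
vocabulary = build_vocabulary(cleaned, top_n=3)
feature_sets = [extract_features(doc, vocabulary) for doc in cleaned]
```

Feature dicts of this shape can be passed to nltk classifiers directly; for sklearn they would first need to be converted to a numeric matrix.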
For training and testing models, I divided the Development Set into a Training Set and a Dev-Test Set, and used cross-validation to find the best classifier among Naive Bayes, Decision Tree and Maximum Entropy from nltk, and Bernoulli NB, Logistic Regression, SVC, Linear SVC and NuSVC from sklearn. The best accuracy, 69.5%, was achieved with SVM.
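A model comparison of this kind might look like the sketch below. It covers only the sklearn classifiers named above and runs on synthetic data from `make_classification`, since the project's news features are not reproduced here. The split ratio, the number of folds, and the hyperparameters are assumptions for illustration, so the scores it produces say nothing about the 69.5% result.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.naive_bayes import BernoulliNB
from sklearn.svm import SVC, LinearSVC, NuSVC

# Synthetic stand-in for the development set (assumption: binary up/down labels).
X, y = make_classification(n_samples=200, n_features=20, random_state=0)

# Split the development set into a training set and a dev-test set.
X_train, X_devtest, y_train, y_devtest = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Candidate classifiers, mostly with default settings.
candidates = {
    "BernoulliNB": BernoulliNB(),
    "LogisticRegression": LogisticRegression(max_iter=1000),
    "SVC": SVC(),
    "LinearSVC": LinearSVC(max_iter=10000),
    "NuSVC": NuSVC(),
}

# Mean 5-fold cross-validation accuracy on the training set for each candidate.
cv_scores = {
    name: cross_val_score(clf, X_train, y_train, cv=5).mean()
    for name, clf in candidates.items()
}

# Pick the best candidate, refit it on the full training set,
# and check it once on the held-out dev-test set.
best_name = max(cv_scores, key=cv_scores.get)
best_clf = candidates[best_name].fit(X_train, y_train)
devtest_accuracy = best_clf.score(X_devtest, y_devtest)

print(best_name, round(devtest_accuracy, 3))
```

Keeping the dev-test set out of the cross-validation loop means the final accuracy is measured on data that played no part in choosing the classifier.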
The report for this project is here.