Performance Evaluation of Machine Learning Classifiers for Stock Market Prediction in Big Data Environment


Sneh Kalra, Sachin Gupta, Jay Shankar Prasad



Supervised learning, Product reviews, Google Cloud, Big data, Apache Spark




Applying machine learning models to stock market big data has emerged as a
component of algorithmic trading systems. This paper proposes a hybrid stock
prediction model based on qualitative and quantitative data collected for
particular stocks. In addition to tweets and news data, product reviews of specific
companies traded on the National Stock Exchange are considered in order to analyze
their effect on stock movements. Historical prices are integrated with sentiment
values generated from tweets, news and product reviews to construct the combined
model, using Apache Spark for processing and HDFS for storage of large data. The
proposed model has been implemented on Google Cloud Platform with different cluster
configurations. The paper compares prediction accuracy across the types of input
data provided to the model, using several popular machine learning algorithms.
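
To make the integration step concrete, the following is a minimal sketch in plain Python, not the paper's Spark implementation. It averages sentiment scores per day and per source, joins them to daily closing prices, and attaches an up/down label. The function names, the sample data, the neutral fill value of 0.0 for days with no sentiment, and the next-day labelling rule are all assumptions made for illustration; the abstract does not specify them.

```python
from collections import defaultdict
from statistics import mean

# The three qualitative sources named in the abstract.
SOURCES = ("tweets", "news", "reviews")


def daily_sentiment(records):
    """Average sentiment per (date, source) from (date, source, score) tuples."""
    buckets = defaultdict(list)
    for date, source, score in records:
        buckets[(date, source)].append(score)
    return {key: mean(scores) for key, scores in buckets.items()}


def build_rows(prices, sentiment_records):
    """Join daily closes with per-source sentiment and add an up/down label.

    prices: list of (date, close) tuples sorted by date.
    The label is 1 if the next close is higher, else 0 (an assumed rule).
    """
    sentiment = daily_sentiment(sentiment_records)
    rows = []
    for (date, close), (_, next_close) in zip(prices, prices[1:]):
        row = {"date": date, "close": close}
        for source in SOURCES:
            # Assumption: treat a day with no data for a source as neutral.
            row[source] = sentiment.get((date, source), 0.0)
        row["label"] = 1 if next_close > close else 0
        rows.append(row)
    return rows


# Made-up sample data, for illustration only.
prices = [("2019-01-01", 100.0), ("2019-01-02", 102.0), ("2019-01-03", 101.0)]
sentiment = [
    ("2019-01-01", "tweets", 0.5),
    ("2019-01-01", "tweets", 0.25),
    ("2019-01-01", "news", -0.5),
    ("2019-01-02", "reviews", 0.75),
]
rows = build_rows(prices, sentiment)
for row in rows:
    print(row)
```

Dropping one or more of the sentiment columns before training would give the kind of input-type comparison the abstract describes. At the scale the paper targets, the same join would be expressed over distributed data rather than in-memory lists.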
