ABSTRACT

The price of oil is difficult to predict because it is affected by global demand and supply, geopolitical events, and market sentiment. Accurate predictions nevertheless have far-reaching implications for supply chain performance, portfolio management, and expected stock market returns. This paper contributes to the oil price prediction literature by evaluating the predictive impact of the US President's communication on Twitter, while benchmarking various Natural Language Processing (NLP) techniques, including Term Frequency-Inverse Document Frequency (TF-IDF), Word2Vec, Doc2Vec, Global Vectors for Word Representation (GloVe), and Bidirectional Encoder Representations from Transformers (BERT). These techniques are combined with a Long Short-Term Memory (LSTM) deep neural network architecture using a five-day lag for both the oil price and the textual Twitter data. The data were collected during the term of US President Donald Trump, resulting in 1449 days of crude oil price prediction and a total of 16,457 tweets. The study is validated for the Brent and West Texas Intermediate (WTI) blends, using the daily price of a barrel of crude oil as the target variable. The results confirm that including the US President's tweets significantly increases the predictive power of oil price prediction models, and that an LSTM architecture with BERT as the NLP technique performs best.

KEYWORDS: Analytics; Oil price prediction; LSTM; BERT; NLP; US president

Disclosure statement

No potential conflict of interest was reported by the author(s).

Data availability statement

The structured data (daily Brent and WTI crude oil prices) can be obtained from the US Energy Information Administration (https://www.eia.gov/dnav/pet/PET_PRI_SPT_S1_D.htm). The textual data (the tweets of former US president Donald Trump) can be obtained from the Trump Twitter Archive (https://www.thetrumparchive.com/).
Alternatively, the data can be retrieved from the authors.

Additional information

Notes on contributors

Stephanie Beyer Díaz

Stephanie Beyer Díaz is a PhD student at IÉSEG School of Management, Catholic University of Lille, and a member of the research laboratory LEM (UMR CNRS 9221). She has professional experience in banking and e-commerce, and previously earned her MSc degree in Big Data Analytics at IÉSEG School of Management. Her thesis topic is Data-Driven Innovation in the Financial Services Sector, for which she is collaborating with an international financial services provider based in Lille, implementing Deep Learning models for different customer-centric tasks.

Kristof Coussement

Prof. Dr. Kristof Coussement is Professor of Business Analytics at IÉSEG School of Management in Lille, France. He founded and chairs the IÉSEG School of Management Excellence Center for Marketing Analytics (ICMA), a research centre focussing on developing innovation trajectories in data science with companies. His academic work has been published in international peer-reviewed journals such as Decision Support Systems, Information & Management, International Journal of Information Management, European Journal of Operational Research, Annals of Operations Research, International Journal of Production Research, Sensors, Computers in Human Behavior, International Journal of Forecasting, Data Mining and Knowledge Discovery, Computational Statistics & Data Analysis, Expert Systems with Applications, Knowledge-based Systems, Research Policy, Journal of Product Innovation Management, Journal of Business Research, Journal of Advertising Research, Industrial Marketing Management, Journal of Marketing Management, and European Journal of Marketing, among others.
His main research interests cover all aspects of business analytics, with a focus on embedding textual data in predictive modelling contexts.

Arno De Caigny

Arno De Caigny (PhD) is Associate Professor of Business Analytics at IÉSEG School of Management, Catholic University of Lille, and a member of the research laboratory LEM (UMR CNRS 9221). Before starting his academic career, he worked as an analytical consultant for Deloitte. His research focuses on improving decision-making in companies using data and quantitative methods. He has extensive experience in applying machine learning to solve challenges in the broad marketing domain. He has published in internationally renowned, peer-reviewed journals such as European Journal of Operational Research, Decision Support Systems, International Journal of Forecasting, Industrial Marketing Management, and Annals of Operations Research.

Luis Fernando Pérez

Luis Fernando Pérez is a Teaching and Research Assistant at IÉSEG, currently completing his PhD in Economics and Management Science at the University of Lille. He specialises in applied quantum computing for operations research. Prior to pursuing his PhD, Luis worked as a project manager in the Oil & Gas industry. He has also gained teaching experience in project management and coding during his doctoral studies.

Stefan Creemers

Stefan Creemers is a Professor at IÉSEG, a visiting professor at KU Leuven, and a board member of PICS Belgium. He received his PhD in Operations Management from KU Leuven in 2009, and has published award-winning research in the fields of project management, healthcare operations, supply chain management, and quantum computing. Stefan is also the Editor-in-Chief of INFORMS Transactions on Education, and an Associate Editor of INFORMS Journal on Applied Analytics.
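
Illustrative note on the five-day lag

The abstract states that the LSTM uses a five-day lag for both the oil price and the textual Twitter data. The sketch below shows one plausible way such lagged inputs could be arranged, as (samples, timesteps, features) windows with the next day's price as the target. It is not the authors' preprocessing: the function name `make_lagged_windows`, the per-day text vectors, and all toy numbers are assumptions made purely for illustration.

```python
from typing import List, Sequence, Tuple


def make_lagged_windows(
    prices: Sequence[float],
    text_features: Sequence[Sequence[float]],
    lag: int = 5,
) -> Tuple[List[List[List[float]]], List[float]]:
    """Build (samples, timesteps, features) windows and next-day targets.

    Hypothetical helper: each timestep holds that day's price followed by
    that day's text feature vector (e.g. an aggregated tweet embedding).
    """
    if len(prices) != len(text_features):
        raise ValueError("prices and text_features must cover the same days")
    if lag < 1:
        raise ValueError("lag must be at least 1")

    windows: List[List[List[float]]] = []
    targets: List[float] = []
    for t in range(lag, len(prices)):
        # The window covers days t-lag .. t-1; the target is the price on day t.
        window = [
            [float(prices[d]), *(float(v) for v in text_features[d])]
            for d in range(t - lag, t)
        ]
        windows.append(window)
        targets.append(float(prices[t]))
    return windows, targets


# Toy data, invented for illustration only (not taken from the study).
toy_prices = [60.0, 61.5, 59.8, 62.1, 63.0, 62.4, 64.2]
toy_text = [
    [0.1, 0.0], [0.0, 0.3], [0.2, 0.1], [0.0, 0.0],
    [0.4, 0.2], [0.1, 0.1], [0.3, 0.0],
]

X, y = make_lagged_windows(toy_prices, toy_text, lag=5)
print(len(X), len(X[0]), len(X[0][0]))  # 2 5 3
```

With seven days of toy data and a lag of five, two training windows result, each with five timesteps of three features (one price plus a two-dimensional text vector). In practice the text vector would come from whichever NLP technique is being benchmarked.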