Calum Robertson: Enabling Sophisticated Financial Text Mining


Presentation slides

Extended abstract PDF

Authors

Calum Robertson (Sirca)

Abstract

A popular theory is that financial markets are efficient: all available information is factored into the price of an asset. Academics, commercial researchers, and investors devote considerable time to analysing markets in the hope of improving efficiency and/or exploiting inefficiencies for profit. The sheer volume of real-time information makes it difficult for an individual to keep abreast of all information related to a single asset, and impossible to keep abreast of all information related to all available assets.

Information commonly used to analyse market efficiency can be broadly categorised as numerical or textual. Trading data (e.g., the time series of a share price) is numerical information that is commonly used to analyse market efficiency and is consumed by algorithmic trading models, so that a computer can trade automatically when certain conditions are met. News, such as macroeconomic and company announcements, is a source of textual information that can have a significant impact on the price of an asset.

Recent years have seen increasing interest in the role of news in financial markets. The increasing availability of significant news datasets and advances in text mining technologies have helped drive this research. The future is bright for financial text mining, though there are several obstacles which must be overcome to promote research in this field. In this paper we describe common data sources and research strategies in this domain, and address some of the ways we are overcoming obstacles in this field.
 

About the speaker

Calum Robertson has a background in data mining, having completed a master's degree by research and a doctorate in this field. Extensive experience in the finance industry led to his interest in the effect of news on financial markets, which was the topic of his doctorate. At Sirca he is helping develop tools which make a large dataset of news easily accessible to academics and thereby promote sophisticated financial text mining.