Financial Analysis, Data and the Use of AI

Rhodri Preece, Senior Head of Research at the CFA Institute, says emerging technologies can help investment professionals draw insights from unstructured ESG data.

Data is being generated at an exponential rate, and the technology powering the algorithms used to parse it is growing just as fast, opening up both new opportunities for investing and innovative ways to leverage alternative data. Investment professionals are now navigating a landscape supplemented by unstructured, alternative, and open-source data. A survey on alternative and unstructured data conducted by CFA Institute in July 2023 revealed that more than half of investment professionals are incorporating unstructured data into their workflow, and 64% indicated using alternative data. This shift has prompted a reevaluation of analytical methodologies and frameworks within the industry.

Over the past few decades, the predominant approach to financial analysis has centred on structured, numerical data. As the digital revolution continued, new alternative data providers sprang up, capitalising on the notion of data being the ‘new oil’. The exponential growth of unstructured data boosted demand for methods to process it and extract valuable insights, leading data science to emerge as a highly sought-after domain of expertise within investment firms.

Understanding data in financial analysis

The first level of distinction in defining the data used in investment decision-making is the generator of the data. Generators include companies, governments, individuals, and satellites and sensors.

Company data include, for example, financial statements, operational metrics, strategic plans, and data that arise when individuals or entities interact with the company’s products and services. Examples of such interaction data include credit card transactions, app download statistics, and email receipts. Government data include economic statistics on the health, performance, and status of a country’s economy, while government interaction data are generated from the day-to-day functions of government, including business permits, patents granted, and public service usage, such as transport ridership and facility utilisation. Individuals generate data through their online activities, such as social media engagement, consumer reviews, and search engine queries. Lastly, technologies such as satellites and sensors generate data in the form of geolocation information, satellite imagery, and readings from internet of things (IoT) devices, such as manufacturing equipment usage patterns.

The second level of distinction is the type of data, which refers to whether the data is traditional or non-traditional. Non-traditional, or alternative, data is defined as any data that differs from traditional investment sources such as financial statements, market data and economic indicators. In a July 2023 CFA Institute survey, the most commonly used alternative data were publicly available government data (9%), news and media sentiment data (8%), employment data (7%), web-scraped data (7%), and ESG data (7%).

The last level of distinction is the data form. Unstructured data lacks a specific format or organisation, making it harder to analyse using traditional data processing tools. Examples of unstructured data include free-text social media posts, consumer reviews, satellite images, and raw sensor data from IoT devices. Unstructured data is characterised by its non-tabular and non-relational nature. Structured data, in contrast, is well-organised and easily searchable. There is also the semi-structured data form, such as email receipts and JSON (JavaScript Object Notation) files. These files have some level of organisation but are not as rigidly structured as databases or spreadsheets.
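To make the semi-structured case concrete, here is a minimal sketch of how a nested JSON email receipt could be flattened into structured, table-like rows. The receipt, its field names, and the flatten_receipt helper are all hypothetical and exist only for illustration. Real receipts vary widely in layout.

```python
import json

# Hypothetical email receipt in JSON form (all field names are illustrative).
raw_receipt = """
{
  "merchant": "Example Coffee Ltd",
  "date": "2023-07-14",
  "currency": "GBP",
  "items": [
    {"name": "Flat white", "qty": 2, "unit_price": 3.20},
    {"name": "Croissant", "qty": 1, "unit_price": 2.50}
  ]
}
"""


def flatten_receipt(text):
    """Turn one nested receipt into flat rows, one per purchased item."""
    receipt = json.loads(text)
    rows = []
    for item in receipt.get("items", []):
        qty = item.get("qty", 0)
        unit_price = item.get("unit_price", 0.0)
        rows.append({
            "merchant": receipt.get("merchant"),
            "date": receipt.get("date"),
            "currency": receipt.get("currency"),
            "item": item.get("name"),
            "qty": qty,
            "line_total": round(qty * unit_price, 2),
        })
    return rows


rows = flatten_receipt(raw_receipt)
for row in rows:
    print(row)
```

Each resulting row has the same fixed set of columns, which is what allows it to be loaded into a spreadsheet or database table for conventional analysis.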

The following table breaks down the data types and structures in a matrix, using an earnings release as the data-generating event, to help illustrate these concepts.





Traditional data, structured:
Tabular financial statements

Traditional data, unstructured:
PDF financial statement
Conference call transcript: used to extract performance metrics or management guidance

Alternative data, structured:
Vendor sourced: Earnings sentiment score
Vendor sourced: Financial statement language complexity score

Alternative data, unstructured:
Financial statement textual analysis: using machine learning to detect year-on-year language consistency in MD&A
Conference call recording: using machine learning to detect tone of voice patterns related to earnings confidence


Using NLP to analyse unstructured ESG data

ESG data in particular presents a dynamic domain for investors to navigate because much of it is still narrative and qualitative in nature. Emerging standards and regulations are likely to make certain ESG data more quantitative and standardised in the future, but the process will be long and sporadic, and regional differences will likely remain. At the same time, companies continue to generate new sorts of unstructured information, such as climate transition plans. Thus, natural language processing (NLP) and machine learning (ML) will be increasingly important tools in financial analysts’ toolkits.

Investment professionals’ ability to extract valuable insights from unstructured data (such as ESG data, but also other types) has greatly improved with advances in NLP algorithms and with the proliferation of open-source tools, some of which are freely available. For example, GitHub, pandas, BeautifulSoup, and scikit-learn are popular open-source tools that investment professionals use for a variety of analytical needs. For investment firms, having in-house capabilities in parsing unstructured data will become increasingly important as the barriers to using these tools continue to decline.
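As a rough illustration of this kind of text analysis, the sketch below scores how similar the wording of two disclosure passages is, using a simple bag-of-words cosine similarity. This is an assumed, deliberately simplified stand-in and not a method described by the author or CFA Institute. The two text snippets are made up, and the function names are hypothetical. A library such as scikit-learn offers richer text features, and the Python standard library is used here only to keep the example self-contained.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine similarity of two texts' word-count vectors (0.0 to 1.0)."""
    counts_a = Counter(tokenize(text_a))
    counts_b = Counter(tokenize(text_b))
    shared_words = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[w] * counts_b[w] for w in shared_words)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # at least one text has no word tokens
    return dot / (norm_a * norm_b)


# Made-up disclosure snippets, for illustration only.
last_year = "We aim to reduce emissions and expect stable demand."
this_year = "We aim to reduce emissions but face uncertain demand."

score = cosine_similarity(last_year, this_year)
print(f"Year-on-year language similarity: {score:.2f}")
```

A score near 1.0 suggests the wording is largely unchanged from one year to the next, while a lower score flags passages an analyst may want to read more closely. Word counts alone ignore meaning and word order, so a production approach would likely need more sophisticated features.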

For investment professionals, staying abreast of technological trends, mastering programming languages for parsing complex datasets, and being keenly aware of the tools that augment our workflow are necessities that will propel us forward in an increasingly technical finance domain.

Copyright © 2024 ESG Investor Ltd. Company No. 12893343. ESG Investor Ltd, Fox Court, 14 Grays Inn Road, London, WC1X 8HN
