Patterns in the Words: The Evolution of Machine-Readable Data
Unstructured data is on the road to extinction. Though it will always be generated, and certain forms, like the spoken and written word, will continue to defy structure, improved recording and digital character recognition technologies will capture these most basic forms of human communication in a way that allows for automated analysis. In the meantime, more of what used to be locked inside analog containers (like paper, recording tape and other “uncooperative” storage mechanisms) is becoming more easily navigable, searchable and minable. We can now harvest more of the informational content of data, its “nutritional value,” for timely decision support, and at unprecedented scale. Two things make this possible: first, the application of data standards (like XML, or eXtensible Markup Language) that package data in formats that are simultaneously human-readable and machine-readable; and second, the growing adoption of storage/retrieval solutions for diverse, multi-structured datasets (like NoSQL, or Not Only SQL), which matter particularly for more computationally intensive analyses.
These developments amount to a journey toward finding patterns in the words, and using those new patterns to enhance the use of older, more commoditized patterns already found in the numbers. Conversion of unstructured data into an expanding universe of structured data will soon become ubiquitous. Trading strategies, risk analytics, fraud detection, and all sorts of decision support in global capital markets will eventually incorporate this converted data. Though early use cases of machine-readable data have already made headlines, we are most certainly only at the beginning of this journey.
This TABB Group Focus Note, Patterns in the Words: The Evolution of Machine-Readable Data, explores the role of low-latency machine-readable news (MRN) as the initial phase of a longer-term journey towards much broader use of converted unstructured data in trading strategies, risk analytics, surveillance, and many other capital markets use cases. Detailing strategy examples, event types (and geographical breakdowns), user demographics, and market sizing, this analysis offers a close look at how the high-frequency trading community has been using MRN and, more importantly, how new use cases involving a broader spectrum of machine-readable data (MRD) are being adopted by a larger community of hedge funds and traditional asset managers.