Trends and Applications
Real-Time Data is on the Rise

By Brian Sentance

Data and its analysis have become an important economic battleground for many industries, and nowhere is this more apparent than in the financial industry. Regulation is mandating greater data transparency across firms and trading practices. The increase in automated trading and the continuing search for new trading opportunities have led to exponential increases in the amount of data that must be captured, cleaned, managed and analyzed within a financial institution. To give you some idea of the size of the problem, the Options Price Reporting Authority (OPRA) in the U.S. is anticipating peak data volumes of around one million messages per second by mid-2008. Processing real-time data and storing it for historic analysis have become particular pressure points for many investment banks, asset managers and hedge funds.

Vast volumes of real-time data may seem very specific to financial markets, but the importance and use of real-time data are growing in other industries too. Consider a supermarket: not exactly a hotbed of financial dealings (although it may seem just as tense, especially on a Saturday morning), but large chains are increasingly concerned with real-time inventory management to ensure the shelves are always stocked. Although this is not a large concern for an individual store, management across hundreds of stores in a chain is more challenging and interesting. The data is also analyzed to see what is selling well or poorly, which promotions are working and whether the latest store layout is bringing in more business. The trend toward real-time business intelligence is challenging end-of-day business intelligence solutions built upon more traditional data warehousing technology.

Data Getting Ahead of Technology?

Despite the obvious increase in processing power over the past 20 years, it is still frustrating that some technologies don’t seem to change. For example, loading a spreadsheet still seems to take as much time now as it did in the early 1990s, if not more. Something of real value in an application does not seem to have improved much at all, yet new versions of applications are constantly released that claim improvements. The discussion over “software bloat” is ongoing, and the charge has some truth to it, in that software engineers find it easy to use up processing power on new features that users may not perceive as benefits.

In contrast, the financial markets are in an unusual period, with the speed of data temporarily overtaking the speed of technology. Traditional database technology struggles to keep up with the data update rates, especially in conjunction with the need to query data in real time. Faced with these challenges, the financial industry is adopting a number of approaches:

  • Hosting conventional databases in memory, rather than on disk
  • Installing specialist, high-performance (but proprietary) database technology
  • Using high-performance computing (HPC) to distribute analysis load
  • Using distributed data caching, also known as “data fabrics”

Currently, the most common solution to storing large amounts of historical real-time data seems to be to split the problem across two databases - one held in memory for data in use and the other on disk for historical data not currently in use. In this scenario, data is bulk-copied from the in-memory database to the historic database on disk. Hence, the issue of disk input/output is avoided when capturing the real-time updates, but at the cost of having to join across two databases if a desired query involves analyzing real-time data against historic trends.
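To make the two-database split concrete, here is a minimal sketch in Python. It is an illustration only, not a description of any particular product: it uses SQLite as a stand-in for both stores, and the table name `ticks`, the schema alias `hist` and the helper functions are all hypothetical. It shows the three steps described above: capturing updates in memory, bulk-copying older rows to disk, and running one query that has to span both stores.

```python
import os
import sqlite3
import tempfile


def open_stores(history_path):
    """Open an in-memory database and attach an on-disk one as 'hist'."""
    conn = sqlite3.connect(":memory:")
    conn.execute("ATTACH DATABASE ? AS hist", (history_path,))
    for schema in ("main", "hist"):
        conn.execute(
            f"CREATE TABLE IF NOT EXISTS {schema}.ticks "
            "(ts INTEGER, symbol TEXT, price REAL)"
        )
    return conn


def capture(conn, ticks):
    """Real-time updates go to the in-memory store only (no disk I/O)."""
    conn.executemany("INSERT INTO main.ticks VALUES (?, ?, ?)", ticks)
    conn.commit()


def archive(conn, cutoff_ts):
    """Bulk-copy ticks older than cutoff_ts to disk, then drop them from memory."""
    with conn:  # both statements commit together, or roll back together
        conn.execute(
            "INSERT INTO hist.ticks SELECT * FROM main.ticks WHERE ts < ?",
            (cutoff_ts,),
        )
        conn.execute("DELETE FROM main.ticks WHERE ts < ?", (cutoff_ts,))


def average_price(conn, symbol):
    """A query over real-time and historic data must read both stores."""
    row = conn.execute(
        "SELECT AVG(price) FROM ("
        " SELECT price FROM main.ticks WHERE symbol = ?"
        " UNION ALL"
        " SELECT price FROM hist.ticks WHERE symbol = ?)",
        (symbol, symbol),
    ).fetchone()
    return row[0]


def demo():
    """Capture three made-up ticks, archive two, then query across both stores."""
    path = os.path.join(tempfile.mkdtemp(), "history.db")
    conn = open_stores(path)
    capture(conn, [(1, "XYZ", 10.0), (2, "XYZ", 12.0), (3, "XYZ", 14.0)])
    archive(conn, cutoff_ts=3)
    in_memory = conn.execute("SELECT COUNT(*) FROM main.ticks").fetchone()[0]
    on_disk = conn.execute("SELECT COUNT(*) FROM hist.ticks").fetchone()[0]
    avg = average_price(conn, "XYZ")
    conn.close()
    return in_memory, on_disk, avg


if __name__ == "__main__":
    print(demo())
```

In this toy setup a single connection can see both stores, so the combined query is just a `UNION ALL`. With two separate database products, the same question would typically mean querying each one and merging the results in application code, which is the cost the split imposes.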

Many of the database products in this area are proprietary in nature, having been designed explicitly for historic storage to optimize update and retrieval times (at speeds several orders of magnitude greater than mainstream relational solutions). The proprietary solutions are high performance, but often at the cost of usability, ease of replication and other everyday capabilities that technologists know and expect from more mainstream database technologies.

Not quite so mature is the usage of HPC (grid/clustering) in conjunction with high-performance database technology, in order to achieve real-time parallelization of data/calculation load across a group of computers. Here I am talking about real-time distribution of ad hoc query loads, and not the traditional HPC usage of huge-scale batch processing of a mathematical problem. Strongly related to HPC usage is the increasing popularity of “data fabrics,” in-memory caches of data that are used to ensure that a cluster does not become “data-bound,” with its overall performance inhibited by the slow provision of data to each node in the cluster.

Data Getting Ahead of Users?

So while real-time data is keeping the technologists busy, spare a thought for the consumers of both the data and the software - the end-users. In finance, as in other fields, a level of distrust often exists between the end-users and the technology staff delivering systems. This tension is exacerbated by the profits at stake, short deadlines and market expertise that users have and IT staff often only partially understand.

As a result of this disconnect between offices, users resort to using Excel spreadsheets to store, analyze and report on data. Excel acts as a pressure-relief valve for end-users, working around the fact that IT delivery timelines do not meet business needs. It is not necessary to be a technologist to use Excel, and as such, it has become the definitive tool of “end-user” computing.

Traders have been using spreadsheets for years, but in the same way that technologists have been challenged by rising data volumes, so has this traditional tool of the end-user. The latest version of Excel supports around one million rows per worksheet, but even this is not enough to hold more than a few days of market data for particular financial instruments. In addition, traders may have to look across many thousands of instruments spread across many markets. Even if technologists can develop systems to handle the data volumes, the end-users cannot use their traditional analysis tool to get a complete view of the data and the opportunities available.
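A rough back-of-the-envelope calculation shows how quickly a worksheet fills up. The sketch below is illustrative only: the row limit is the Excel 2007 worksheet limit of 1,048,576 rows, but the update rate (10 updates per second for a single instrument) and the 6.5-hour trading session are assumptions chosen for the example, not figures from any market.

```python
EXCEL_MAX_ROWS = 1_048_576  # rows per worksheet in Excel 2007


def trading_days_until_full(updates_per_second, session_hours=6.5,
                            max_rows=EXCEL_MAX_ROWS):
    """Days of one-row-per-update data that fit in a single worksheet.

    updates_per_second and session_hours are assumed inputs for illustration.
    """
    rows_per_day = updates_per_second * session_hours * 3600
    return max_rows / rows_per_day


if __name__ == "__main__":
    # Assumed: one instrument updating 10 times per second.
    days = trading_days_until_full(updates_per_second=10)
    print(f"Worksheet full after about {days:.1f} trading days")
```

Under these assumptions one instrument generates 234,000 rows per session, so a single worksheet is full after roughly four and a half trading days, and that is before considering the thousands of instruments a trader may need to watch.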

Time for Visualization?

Software engineers must continue to address the direct technological challenges of rising real-time data volumes through an optimized combination of all four approaches outlined earlier. A greater but more subtle challenge will be how technologists can best present this data to end-users, particularly when the end-users’ favorite tool - the spreadsheet - has been outgrown by the amount of data that needs to be analyzed. For around 20 years now, visualization software vendors have been trying to sell into the financial services industry, with very limited success. Perhaps now is the time when data visualization can finally come of age, bringing new levels of data transparency and understanding to hard-pressed users of real-time data.

About the Author:

Brian Sentance is CEO of Xenomorph and one of its founding directors. He is responsible for setting and managing company strategy, and is involved in gathering client needs within the field of analytics and data management in financial markets. Prior to joining Xenomorph in 1995, Sentance headed the pricing models development team in the equity derivatives group at JP Morgan, London. This role involved bridging the requirements of trading, quant and software development staff to deliver new financial products to market. For information about Xenomorph, go to www.xenomorph.com.
