So, the old saying “wine gets better with age” has an expiration date. A wine’s flavors, aromas and textures appear and fade over time, rather than in unison. The value of data, like wine, also fades over time, as do the use cases it serves, which come and go with a firm’s strategic focus, management changes and market conditions.
Behind the scenes of the wine industry, the vast majority of wine is not aged, and even wine that is aged rarely stays so for more than five years, since most people do not wait long to consume it. Similarly, in our experience, financial services firms consume a large percentage of their data during its first year and the vast majority within five years. Therefore, a robust data management strategy should consider the full data lifecycle: creation, integration, consumption, archiving and removal. Firms that address this early and often will reap the most benefits while data is most meaningful and relevant to internal and external customers.
Aging to maturation: A lost data management art and science
Financial firms have typically not accounted for dynamic data and changing business demands in their core systems. Coupled with the many changes caused by COVID-19, the issue becomes even more complicated. Finally, add the lack of fine-grained policies for data creation, integration, consumption, archiving and removal, plus the exponential growth in data from applying advanced forms of artificial intelligence (AI) and machine learning (ML), and you get the “perfect storm.” Many firms lack a robust system for dealing with aging data and segregating higher-value assets from older, irrelevant data.
Fermenting the digital enterprise with modernized data varietals
Traditional processes used to develop and modify data management systems have not leveraged modern delivery methods such as Agile, DevOps, DataOps and MLOps to optimize and simplify workflows.
To modernize and achieve finely aged data, financial services firms must:
- Be agile and utilitarian. Data architecture must consider on-demand, self-service, crowdsourcing and AI/ML-enabled capabilities. The use of cloud to modernize data with proper cleansing, normalization, quality and consolidation to a data lakehouse will enable financial services firms to scale up and down quickly for emerging business needs. Additionally, this adaptability will make it possible to add on-the-spot computing power for additional use cases and apply AI/ML to the data to get to more of an information-rich environment. This will help these firms to incorporate additional third-party data sources and drive more insights, thus making data and subsequent analyses inherently more valuable.
- Provide open access. Platforms have three layers of data — a raw data layer, a curated layer and a consumption layer. Traditional data architectures typically grant access only to the consumption layer. However, analysts and data scientists want access to raw data to find overlooked elements that may be useful to generate additional insights. Firms typically want to integrate new data sources into analytics, AI/ML and applications in an automated way. We see that most financial services firms are currently producing and consuming data with manual processes. These firms can use machine learning to detect changes in the schema and structure of the incoming data and auto-adjust the integration patterns.
- Invest in a data-rich library to get the full impact of AI/ML. Data scientists develop features to transform data into more consumable forms for AI/ML algorithm training: for example, calculating the time between transactions. A feature library collects these features into a standardized ontology that data scientists can apply more readily to their AI/ML models. Since features are the inputs to learning systems, the more that are available at the front end, the better: models can then select the best-performing features, and data scientists spend less time searching for better models. The goal of an endless feature library is to generate a limitless number of features from prior work and auto-calculations, capturing every feature that could arise in a given data set.
- Enable a unified data security and classification model. Firms often rely on complex, hybrid environments that blend cloud-based and on-premises services with data scattered in various locations and used by myriad individuals and systems. These firms should scan and separate redundant, outdated, trivial, confidential and classified data using AI/ML to help protect data and information more closely. We recommend a unified data security and classification model using AI/ML to enable employees to focus on using the data in new and interesting ways, rather than worrying about finding workarounds and expending significant effort to complete the same analyses.
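The schema-change detection described in the open-access point above can start with something as simple as comparing each incoming batch against the last known schema; the sketch below is a minimal rule-based version (a production system might use ML to classify the drift and pick an integration pattern), and the field names are hypothetical.

```python
def detect_schema_drift(known_schema, incoming_record):
    """Compare an incoming record's fields and types against the known schema.

    Returns added fields, removed fields, and fields whose type changed,
    so downstream integration patterns can be adjusted automatically.
    """
    incoming_schema = {k: type(v).__name__ for k, v in incoming_record.items()}
    added = {k: t for k, t in incoming_schema.items() if k not in known_schema}
    removed = {k: t for k, t in known_schema.items() if k not in incoming_schema}
    changed = {k: (known_schema[k], incoming_schema[k])
               for k in known_schema.keys() & incoming_schema.keys()
               if known_schema[k] != incoming_schema[k]}
    return {"added": added, "removed": removed, "changed": changed}

# Hypothetical feed: "amount" arrives as a string, "channel" is new,
# and "currency" has disappeared.
known = {"account_id": "str", "amount": "float", "currency": "str"}
record = {"account_id": "A-1001", "amount": "250.00", "channel": "mobile"}
drift = detect_schema_drift(known, record)
print(drift)
```

Each category of drift can then be routed to a different auto-adjustment, for example casting a changed type or extending the target table for an added field.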
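As a concrete illustration of the feature engineering mentioned in the feature-library point, the sketch below computes the "time between transactions" feature in plain Python; the function name and sample timestamps are hypothetical.

```python
from datetime import datetime

def time_between_transactions(timestamps):
    """Compute the gap in seconds between consecutive transactions.

    `timestamps` is a chronologically sorted list of datetimes; the first
    transaction has no predecessor, so its gap is None.
    """
    gaps = [None]
    for prev, curr in zip(timestamps, timestamps[1:]):
        gaps.append((curr - prev).total_seconds())
    return gaps

txns = [
    datetime(2023, 1, 1, 9, 0, 0),
    datetime(2023, 1, 1, 9, 5, 30),
    datetime(2023, 1, 1, 10, 0, 0),
]
print(time_between_transactions(txns))  # [None, 330.0, 3270.0]
```

In a feature library, a derivation like this would be registered once under a standard name so every model team computes it the same way.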
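The scan-and-separate step in the unified classification model above can be sketched with simple rules standing in for the AI/ML classifier; the thresholds, field names and tags below are assumptions for illustration only.

```python
from datetime import datetime

# Hypothetical list of field names that mark a record as confidential.
CONFIDENTIAL_FIELDS = {"ssn", "account_number", "tax_id"}

def classify_record(record, seen_fingerprints, now, max_age_days=5 * 365):
    """Tag a record as redundant, outdated and/or confidential, else active."""
    tags = []
    fingerprint = tuple(sorted((k, str(v)) for k, v in record.items()))
    if fingerprint in seen_fingerprints:
        tags.append("redundant")  # exact duplicate of a record already seen
    seen_fingerprints.add(fingerprint)
    if (now - record["last_accessed"]).days > max_age_days:
        tags.append("outdated")   # untouched past the retention horizon
    if CONFIDENTIAL_FIELDS & set(record):
        tags.append("confidential")
    return tags or ["active"]

now = datetime(2024, 1, 1)
records = [
    {"account_number": "123", "last_accessed": datetime(2023, 6, 1)},
    {"note": "hello", "last_accessed": datetime(2015, 1, 1)},
]
seen = set()
tags = [classify_record(r, seen, now) for r in records]
print(tags)  # [['confidential'], ['outdated']]
```

A real deployment would replace these rules with trained classifiers, but the output shape is the same: per-record tags that drive protection, archiving or removal policies.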
Lineage and governance to achieve full data maturity