A raft of regulatory initiatives is putting data – and its management and governance – front and centre of the operations of buy-side and sell-side firms. The revised Markets in Financial Instruments Directive (Mifid), the European Market Infrastructure Regulation (Emir) and global regulations such as the Basel Committee on Banking Supervision’s BCBS 239, the General Data Protection Regulation (GDPR) and the US Federal Reserve’s CFO Attestation all touch on data.

Cian Ó Braonáin, global lead of Sapient Global Markets’ regulatory reporting practice, says the regulatory pipeline has slowed somewhat recently, and firms now need to ensure that they understand where data comes from and what they do with it. This will drive the increased accountability that many of these regulations require. “The Basel Committee on Banking Supervision’s BCBS 239 Risk Data Aggregation and Reporting Principles set the standard for what regulators will want to see in terms of reporting,” he says. “In our view, data management is everything because regulators require senior management to be accountable and that means knowing the data lineage – how it flows through the organisation and who does what to it.”

This knowledge is applicable to any regulation, he adds. “If a firm has completely accurate and timely information about data that is well documented, this will provide a clear and good structure for data governance, which will make data management easier. If that is in place for Mifid, adding projects or data elements will be quite straightforward for other regulatory requirements.”

BCBS 239, GDPR and CFO Attestation require banks to be accountable for their data sets and to strengthen their risk data aggregation capabilities and reporting. In response, financial institutions are evaluating and implementing data lineage tools that will enable them to track data throughout the trade lifecycle more effectively.

But tracking data and ensuring its quality is a challenging undertaking, and one that has long troubled financial institutions. Says Varun Singhal, senior vice-president and product manager at Axiom SL: “At an enterprise level, firms use a combination of various in-house and vendor solutions, meaning that no one technology system hosts the complete data tracking capabilities. Hence, getting data lineage right and in an automated fashion is challenging work. Firms are manually extracting the relevant data lineage information from the various sub-systems to achieve data tracking.”

Richard Hogg, global GDPR specialist at IBM, says the company’s clients are working on their information governance programmes around data. This means compiling a central inventory and policy catalogue as the key framework for information governance. “Organisations are recognising that it is key to tie defensible policies to whatever the legal citations are, across all compliance regulations in all the jurisdictions in which they operate.”

Historically, this was known as the records, or retention, schedule, but it covers far more than just retention, including data residency, privacy and security, through to data-breach reporting obligations. “Most recently, the impending GDPR has put an even greater focus on data lineage along with data minimisation (retention), data protection (security/cyber) and processing activities,” adds Hogg. “In parallel, it is key to manage and use a common business language or terminology, visualising the data flow and use across the business, how it is connected across the business, and tracking where it came from and where it ends up.”

There has been a ‘continuous drumbeat’ from regulators, rather than a specific set of regulations, driving firms to improve their data lineage practices, says Arnold Wachs, principal and practice lead for data management at US-based buy-side consultancy Cutter Associates.

The handling of data lineage and data tracking in general comes under data governance functions. Any data that goes out to regulators or clients must come under data governance rules, including being documented and ensuring someone is responsible for data quality at all stages of the trade lifecycle.

Singhal adds that as bank executives are responsible for submitting reports to regulators and attesting to the numbers therein, senior management needs to trust their governance processes. “The person responsible for reporting should have confidence that the numbers and positions are correct.”

Wachs says there is nothing “magical” in solving the challenges of data tracking and data governance: it requires hard work. Approaches to data lineage differ. One large Swiss bank, he says, is putting in place data governance rules that require personnel to obtain data from a specific source. This idea of using an authorised, mandated source of data is gaining traction, he says, as firms tire of the challenges inherent in data inconsistencies caused by disparate data sources. “The bigger firms are doing this, and the approach is beginning to trickle down to the medium-sized and smaller buy-side firms.”

Since the same financial instrument – a loan, for example – can flow into different regulatory reports, firms cannot solve the data lineage problem in isolation for each report, says Singhal. Rather, they need to take a holistic approach. “Firms are working towards building a more comprehensive solution with the ability to track the financial instrument from the point of origin where the instrument was booked to the final stage where it is reported to the regulator.”

A governance catalogue is the key framework, says Hogg. This provides the shared business terms and an inventory of what each type of information, or data, is: where it comes from, how it is used or processed, which policies apply and where it ends up. Intimately linked to this is the policy catalogue. For metadata discovery and data actions, organisations should use open interchange formats and application programming interfaces. Aligning data, policies and automation is a key cultural essential for success, he adds. “With this in place, for GDPR or any other regulation, you then have a defensible, authentic source of data across all the stakeholders, regardless of their function and differing needs. This lets everyone, up to the chief executive, answer the who, where, when, why – and, most importantly, what – of the purpose and processing of the data, depending on its lineage, business value, usage and policy decisions, transparently.”
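The catalogue Hogg describes can be pictured as a simple registry mapping each business term to its source, applicable policies and downstream uses. The following is a minimal sketch only – the class and field names are invented for illustration and do not reflect any vendor’s actual schema:

```python
from dataclasses import dataclass, field

# Illustrative governance catalogue entry: shared business term, authorised
# source, applicable policies and downstream consumers. All names are
# hypothetical, chosen only to mirror the questions a catalogue answers.
@dataclass
class CatalogueEntry:
    term: str                                       # shared business term
    source: str                                     # authorised system of origin
    policies: list = field(default_factory=list)    # applicable policy IDs
    consumers: list = field(default_factory=list)   # downstream reports/uses

class GovernanceCatalogue:
    def __init__(self):
        self._entries = {}

    def register(self, entry: CatalogueEntry):
        self._entries[entry.term] = entry

    def where_used(self, term: str):
        """Answer the 'where does it end up?' question for a business term."""
        entry = self._entries.get(term)
        return entry.consumers if entry else []

catalogue = GovernanceCatalogue()
catalogue.register(CatalogueEntry(
    term="counterparty LEI",
    source="trade_capture",
    policies=["GDPR-Art30", "MiFID-RTS22"],
    consumers=["transaction_report", "client_statement"],
))
print(catalogue.where_used("counterparty LEI"))
```

In practice such a catalogue would sit behind the open interchange formats and APIs Hogg mentions, rather than an in-memory dictionary, but the lookup it supports is the same.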

Many firms have outsourced data management and Wachs says the third parties that provide services are having to “step up their game”. There is more push from clients for third-party providers to deliver clean, scrubbed data and to show the lineage of that data. He believes suppliers such as State Street, BNY Mellon and JP Morgan are responding to the demand from clients and are increasing transparency of their underlying data structure.

David Pagliaro, EMEA head of State Street’s data analytics arm, Global Exchange, says a hybrid approach will emerge among buy-side firms. Certain data and content sets will be outsourced to firms such as his, while other, more sensitive, data will remain in-house. Under GDPR rules, there will be some content that service providers may not want to take on because of the potential fines for violating information privacy rules. More standard data can be moved to managed services, where a firm will benefit from cost and scale. This would include operational data related to trade management and fund accounting.

The key to a hybrid model will be how the two parts work together, he says. “In the past, firms had very structured data warehouses and the cost of managing data was high. Now, as we are moving to the cloud, the cost of data management is very much lower. A potential model is that the sensitive information sits within the client’s internal cloud, while standard data sets reside within the managed services cloud. There could be cross-pollination between the two, but with a firewall that prevents the third party from seeing the sensitive data.”

Singhal says that understanding the origin of the data, its flows and the transformations it undergoes provides an organisation with a complete picture that ensures data integrity. For regulatory disclosure, it is critical to be able to respond quickly to enquiries about the numbers on regulator-facing reports. Complying with constantly changing regulations and technology improvements requires tools that demonstrate the impact of those changes across all the systems and reports that use them.

“A successful data lineage solution provides the end user – including business users, such as a report owner, and technology users, such as a system owner – the ability to track the lifecycle of a data element from the point of origin to the end report,” he says.
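The tracking Singhal describes is often modelled as a directed graph of systems and transformations, so that a data element can be traced from its point of origin to the end report. The sketch below illustrates that idea under stated assumptions – the system names and transformation labels are invented, and this is not AxiomSL’s or any other vendor’s implementation:

```python
from collections import defaultdict

# Hypothetical lineage graph: nodes are systems, edges record the
# transformation applied as data flows downstream.
class LineageGraph:
    def __init__(self):
        self._edges = defaultdict(list)  # node -> [(downstream node, transformation)]

    def add_flow(self, source, target, transformation=None):
        self._edges[source].append((target, transformation))

    def trace(self, origin):
        """Depth-first walk yielding every path from origin to a terminal node."""
        def walk(node, path):
            downstream = self._edges.get(node, [])
            if not downstream:
                yield path
                return
            for target, transform in downstream:
                yield from walk(target, path + [(target, transform)])
        return list(walk(origin, [(origin, None)]))

# Invented example: a loan booking flowing through to a regulatory report.
g = LineageGraph()
g.add_flow("loan_booking", "risk_engine", "aggregate by counterparty")
g.add_flow("risk_engine", "regulatory_report", "map to template")
for path in g.trace("loan_booking"):
    print(" -> ".join(node for node, _ in path))
```

A real lineage tool would populate such a graph automatically from system metadata rather than hand-coded edges, but the trace from origin to report is the capability both business and technology users are after.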

Wachs says many firms now realise that getting data ‘right’ for regulatory reporting also benefits client reporting. “Firms are now documenting the data lineage of individual fields; this has never happened before. It represents a huge improvement and means firms have people who really understand the data they have and can start to use that data to address day-to-day business problems.”

A recent report from analyst firm Aite Group suggests that regulatory compliance need not be a burden for the data management function; it can be a benefit as well as a challenge.

The report, Data Management Technology Trends: Law and Reorder, by senior analyst Virginie O’Shea, says that if senior management fosters a strategic view of data requirements across regulations, compliance can be a source of funding for the data management function. However, if a firm’s executive team approaches compliance as “tactical firefighting”, such projects can divert resources that might otherwise be spent on operational efficiency.

The buy-side faces far fewer direct regulatory reporting requirements than does the sell-side, so it is no surprise, says Aite Group, that eight of the respondent asset management firms felt that no specific regulations had directly affected their routine tasks. This does not mean, however, that the post-crisis regulatory (and by extension the capital markets industry) focus on transparency has failed to affect the buy-side.

More than half of the 24 financial institutions studied in the Aite Group report had a data governance programme in place, two were establishing such a programme (in 2016, when the survey was conducted) and three were considering one. Those with no plans were smaller asset managers and tier-two brokers at an earlier stage of implementing securities reference data management programmes of work.

Firms face significant regulatory requirements, says O’Shea, which she characterises as posing an “unsustainable amount of work”. This is best illustrated by the report’s findings on chief data officers (CDOs). Just over a third of the world’s top 100 investment banks and brokerage firms have a CDO; however, the average tenure of these executives is two years and three months. A handful of these banks have more than one CDO, reflecting the regionally or operationally siloed nature of these firms and the data fiefdoms that have grown up around the various business lines. On the buy-side the figures are even lower, with only 24% of the top 50 asset management firms having a CDO. Just under a third of Aite Group interview respondents had a CDO in charge of data management. The short tenure of CDOs, says O’Shea, is due to the huge amount of work they face and the lack of support in terms of funding and time. “CDOs soon realise they have no money, no power and no one really cares about data within the firm,” she says. “There’s an element of panic among CDOs as they come to realise they cannot hope to deliver given the lack of a technology budget or enough time to develop data management and governance strategies.”

Active data management and quality assurance are required if firms are to steer clear of regulator-imposed financial penalties and the reputational damage caused by inaccurate reporting inputs, says the Aite Group report. Data aggregation, which effective risk management and reporting entail, requires a consistent way of storing and managing data over time. This is a very difficult task for firms to manage across their entrenched operational silos and functionally specific data fiefdoms.

“The establishment of data governance programs and the installation of C-level executives in charge of communicating and driving data management strategy are two ways in which both sell-side and buy-side firms have approached this challenge, but these teams have an incredibly difficult set of goals to achieve,” says the report. “Every data chief has and needs a code – a consistent taxonomy and ontology across silos, or at least a method of cross-referencing the existing data sets.”

Dermot Harriss, senior vice-president at OneMarketData, says many buy-side firms are not storing data at all. “The regulations require them to go from zero to up to seven years of storage of transaction data – and that includes all parts of the order cycle,” he says. “Firms need to be able to reconstruct a trade, and that requires storing context information, which can include non-digital aspects such as telephone calls.”

There is no black box solution to the problem firms face, he adds. Rather, any approach to data management and governance must be strategic, and the questions to ask include how data should be stored, how much that will cost, whether data storage can be outsourced, and how much context information also needs to be saved. “All of the transparency regulations are pushing in the direction of requiring the storage of more detailed information. A solution is required that stores everything once, along with a mechanism that enables firms to ask questions of the data – and that anticipates the questions regulators might want answered in the future.”

Some of the large sell-side firms are building central repositories, and Harriss believes the initial focus for firms should be on storage. “Many firms are creating data lakes because they aren’t sure what format to use, but realise they need to store the data and will try to figure out the best approach later. More sophisticated firms are looking at time series databases, which use time stamps that will help when reconstructing trades.”
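The value of time stamps for trade reconstruction can be shown with a minimal sketch: if every order-lifecycle event is stored with a timestamp, replaying a trade is a matter of filtering and sorting. The event shapes and field names below are assumptions for illustration, not any particular time series database’s schema:

```python
from datetime import datetime

# Hypothetical store of time-stamped order-lifecycle events (deliberately
# out of order, as events often arrive from multiple systems).
events = [
    {"ts": datetime(2017, 6, 1, 9, 30, 5), "order_id": "A1", "event": "fill"},
    {"ts": datetime(2017, 6, 1, 9, 30, 0), "order_id": "A1", "event": "new_order"},
    {"ts": datetime(2017, 6, 1, 9, 30, 2), "order_id": "A1", "event": "amend"},
]

def reconstruct(order_id, store):
    """Return the full lifecycle of one order, sorted into time order."""
    return sorted((e for e in store if e["order_id"] == order_id),
                  key=lambda e: e["ts"])

lifecycle = [e["event"] for e in reconstruct("A1", events)]
print(lifecycle)  # ['new_order', 'amend', 'fill']
```

A production time series database indexes on the timestamp so this replay is efficient at scale, but the principle – order events by time to reconstruct the trade – is the same.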

Harriss says firms must focus on understanding the structure of data from the very beginning. “Regulatory compliance can be a revenue opportunity. As soon as these transparency regulations really take hold, customers can evaluate a firm’s business more accurately. There will be standard metrics, and clients can quantify the performance of firms. There are ways firms can use regulatory metrics to gain a competitive edge.”

Peter Moss, chief executive of SmartStream’s Reference Data Utility, agrees that there should be a focus on getting data right at the very beginning of any data management project. Good quality securities master data will be “absolutely critical” for meeting regulators’ requirements, he adds. “We think having a good quality source of reference data is a critical part of getting data governance right,” he says. Increasingly, the financial industry is recognising the value of the utility approach to complex issues such as reference data.