Our Amex credit card was recently compromised. Someone got hold of the card information and Petro Canada charges started to rack up. Amex spotted this suspicious pattern and immediately initiated a fraud alert through multiple touchpoints.
What does your credit card company know about you? A lot…maybe more than your spouse. A study of how customers of Canadian Tire were using the company’s credit cards found that 2200 of 100,000 cardholders who used their card at drinking places missed four payments within the next 12 months. By contrast, only 530 of the cardholders who used their card at the dentist missed four payments within the next 12 months. So drinking is a predictor of credit risk.
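To make the comparison concrete, here is a minimal sketch of the arithmetic behind those two figures. It assumes the dentist group is also measured per 100,000 cardholders, which the text implies but does not state outright; the variable and function names are illustrative only.

```python
# Figures quoted in the text. The size of the dentist group is not stated;
# this sketch ASSUMES it is also 100,000 cardholders.
GROUP_SIZE = 100_000
MISSED_AFTER_DRINKING = 2_200
MISSED_AFTER_DENTIST = 530


def miss_rate(missed: int, total: int) -> float:
    """Share of cardholders who missed four payments within 12 months."""
    return missed / total


drinking_rate = miss_rate(MISSED_AFTER_DRINKING, GROUP_SIZE)
dentist_rate = miss_rate(MISSED_AFTER_DENTIST, GROUP_SIZE)

# How many times more likely the drinking-places group was to miss payments.
relative_risk = drinking_rate / dentist_rate

print(f"Drinking places: {drinking_rate:.2%}")
print(f"Dentist:         {dentist_rate:.2%}")
print(f"Relative risk:   {relative_risk:.1f}x")
```

Under that assumption, the drinking-places group missed payments at roughly four times the rate of the dentist group. That gap is a correlation a lender can score on, not proof that one behavior causes the other.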
Predictive analytics is not a fad. It’s not a trend. In a real-time world, analytics is a core business capability. However, many organizations flounder in their efforts not because they lack analytics capability but because they lack clear objectives. So the first question is: what do you want to achieve?
Analytics so far has largely been a departmental, ad hoc activity. Even at the most sophisticated corporations, data analytics is a cumbersome affair. Information accumulates in “data warehouses,” and if a user has a question about some trend, they ask the “data priests” (the analysts) to tease the answers out of costly, fragile systems. The result is analytics done while looking in the rearview mirror: hypothesis testing to find out what happened six months ago.
Today it’s possible to gather huge volumes of data and analyze it in near real time. A retailer such as Macy’s that once pored over last season’s sales information could shift to looking instantly at how an e-mail coupon affects sales in different regions. Moving to a real-time model, and building an enterprise-level “shared services” model alongside it, is going to be the next big wave of activity.
The financial crisis of 2007–2011 is driving widespread changes in the U.S. regulatory system. The Dodd-Frank Act addresses the “too big to fail” problem by tightening capital requirements and supervision of large financial firms and hedge funds. It also creates an “orderly liquidation authority” so the government can wind down a failing institution without market chaos.
Financial institutions will be spending billions to strengthen, streamline and automate their recordkeeping, risk management KPIs and dashboard systems. The implications for data retention and archiving, disaster recovery and continuity planning have been well covered. But leveraging business analytics to manage and monitor risk and compliance, both proactively and reactively, is an emerging frontier.
We believe that business analytics and real-time data management are poised to play a huge role in the next generation of risk and compliance management in the financial services industry (FSI). In this posting, we examine the strategic and structural challenges, the dashboards and KPIs that provide feedback, and what an effective execution roadmap needs to look like for every organization.
In the ancient Indian parable of the elephant, six blind men touch an elephant and report six very different views of the same animal. Compare this scenario to a data warehouse that is getting data from six different sources. “Harry Potter and the Sorcerer’s Stone” as a field in a database can be written as “HP and the Sorcerer’s Stone” or as “Harry Potter I” or simply – “Sorcerer’s Stone”. In the data warehouse these are four separate movie titles. For a Harry Potter fan, they are the same movie. Now increase the number of movies to cover the entire Harry Potter series and further include fifty languages. You now have a set of titles which may perplex even a real Harry Potter aficionado.
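The title problem above is essentially one of record matching. A minimal sketch of one way to approach it follows, assuming a hand-maintained alias table; the table, the `normalize` rules and the function names are hypothetical and only cover the four spellings mentioned in the text.

```python
import re

CANONICAL = "Harry Potter and the Sorcerer's Stone"


def normalize(title: str) -> str:
    """Lowercase, unify apostrophes, drop other punctuation, collapse spaces."""
    t = title.lower().replace("’", "'")
    t = re.sub(r"[^a-z0-9' ]+", " ", t)
    return " ".join(t.split())


# Hypothetical alias table: each known spelling maps to one canonical title.
ALIASES = {
    normalize(variant): CANONICAL
    for variant in (
        "Harry Potter and the Sorcerer’s Stone",
        "HP and the Sorcerer’s Stone",
        "Harry Potter I",
        "Sorcerer’s Stone",
    )
}


def canonical_title(raw: str) -> str:
    """Return the canonical title, or the raw value if no alias is known."""
    return ALIASES.get(normalize(raw), raw)


# Titles as they might arrive from different source systems.
records = [
    "HP and the Sorcerer’s Stone",
    "harry potter i",
    "  Sorcerer's Stone ",
    "Harry Potter and the Sorcerer’s Stone",
]
distinct = {canonical_title(r) for r in records}
print(len(distinct))  # the four spellings collapse to a single title
```

An exact-match alias table like this only handles spellings someone has already seen. Covering a whole series across fifty languages usually calls for fuzzy matching or a shared title identifier, which this sketch does not attempt.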
What does this have to do with data analytics?
Data growth curve: Terabytes -> Petabytes -> Exabytes -> Zettabytes -> Yottabytes -> Brontobytes -> Geopbytes. It is getting more interesting.
Analytical Infrastructure curve: Databases -> Datamarts -> Operational Data Stores (ODS) -> Enterprise Data Warehouses -> Data Appliances -> In-Memory Appliances -> NoSQL Databases -> Hadoop Clusters
In most enterprises, public or private, there is typically a mountain of structured and unstructured data that contains potential insights about how to serve customers better, how to engage with them better and how to make processes run more efficiently. Consider this:
- Online firms–including Facebook, Visa, Zynga–use Big Data technologies like Hadoop to analyze massive amounts of business transactions, machine-generated data and application data.
- Wall Street investment banks, hedge funds, algorithmic and low-latency traders are leveraging data appliances such as EMC Greenplum hardware with Hadoop software to do advanced analytics in a “massively scalable” architecture.
- Retailers use HP Vertica or Cloudera to analyze massive amounts of data simply, quickly and reliably, resulting in “just-in-time” business intelligence.
- New public and private “data cloud” software startups capable of handling petascale problems are emerging to create a new category - Cloudera, Hortonworks, Northscale, Splunk, Palantir, Factual, Datameer, Aster Data, TellApart.
Data is seen as a resource that can be extracted, refined and turned into something powerful. It takes a certain amount of computing power to analyze the data and pull out and use those insights. That is where new tools like Hadoop, NoSQL, in-memory analytics and other enablers come in.
What business problems are being targeted?
Why are some companies in retail, insurance, financial services and healthcare racing to position themselves in Big Data and in-memory data clouds while others don’t seem to care?
World-class companies are targeting a new set of business problems that were hard to solve before – Modeling true risk, customer churn analysis, flexible supply chains, loyalty pricing, recommendation engines, ad targeting, precision targeting, PoS transaction analysis, threat analysis, trade surveillance, search quality fine tuning, and mashups such as location + ad targeting.
To address these petascale problems, the industry is converging on an elastic, adaptive infrastructure for data warehousing and analytics that can do three things:
- Analyze transactional, structured and unstructured data on a single platform
- Use low-latency in-memory or solid-state devices (SSDs) for super-high-volume web and real-time apps
- Scale out with low-cost commodity hardware, distributing processing and workloads
As a result, a new BI and Analytics framework is emerging to support public and private cloud deployments.
The “Raw Data -> Aggregated Data -> Intelligence -> Insights -> Decisions” sequence is a differentiating causal chain in business today. To service this “data->decision” chain, a very large industry is emerging.
Business Intelligence, Performance Management and Data Analytics is a large, confusing software category with multiple sub-categories: mega-vendors (full stack), niche vendors, data discovery, visualization, data appliances, Open Source, Cloud (SaaS), Data Integration, Data Quality, Mobile BI, Services and Custom Analytics.
But the interest in BI and analytics is surging. Arnab Gupta, CEO of Opera, explains why analytics is taking center stage: “We live in a world where computers, not people, are in the driver’s seat. In banking, virtually 100% of the credit decisions are made by machines. In marketing, advanced algorithms determine messages, sales channels, and products for each consumer. Online, more and more volume is spurred by sophisticated recommender engines. At Amazon.com, 40% of business comes from its ‘other people like you bought…’ program.” (Businessweek, September 29, 2009).
Here is a list of vendors who participate in this marketspace:
However, it took until the 1980s for decision support systems (DSS) to become popular, and until the mid-1990s for BI to emerge as an umbrella term covering software-enabled innovations in performance management, planning, reporting, querying, analytics, online analytical processing, integration with operational systems, predictive analytics and related areas.