The Oracle BI stack illustrates the landscape changes taking place from hardware to mobile BI apps.
As I see it, there are two clusters of “parallel” innovation: (1) technology/infrastructure-centric and (2) business/problem-centric.
The interesting thing on the technology/infrastructure-centric side is the multiple paths of innovation taking place along the different technology stacks shown below. The disruptive innovation is happening in parallel along four different fronts.
Consider this: eBay’s “Singularity” Teradata warehouse exceeds 40 petabytes. According to eBay, the company ingests 50+ terabytes of new incremental data per day, processes 50+ petabytes and tens of millions of queries per day, and maintains 99.98% availability across more than 50 petabytes of online storage.
Data is valuable. Data is plentiful. Data is complex. Data is in flux. Data is fast moving. Capturing and managing data is challenging.
So, if you are a senior leader in a Fortune 2000 company, how do you structure your group to deliver effective BI, Analytics or Big Data projects? Do you have the right structure, toolset, dataset, skillset and mindset for analytics and Big Data?
Organizing for effective BI, Analytics and Big Data is becoming a hot topic in corporations. In 2012, business users are exerting significant influence over BI, Analytics and Big Data decisions, often choosing analytics and visualization platforms and products in addition to, or as alternatives to, traditional BI platforms (reporting and visualization tools).
Interested in slicing, dicing, measuring, and analyzing data for customer and business insights?
According to a recent survey by Bloomberg, 97% of companies with revenues of more than $100 million are using some form of business analytics, up from 90% just two years ago.
While businesses have embraced the idea of fact-based decision-making, a steep learning curve remains. Only one in four organizations believes its use of business analytics has been “very effective” in helping to make decisions. In many organizations, data is not just ignored but often discarded, because business users can’t figure out how to extract signal from the noise.
Do you have the right toolset, dataset, skillset and mindset for analytics? Do you want to enable end users to get access to their data without having to go through intermediaries?
The challenge facing managers in every industry is not trivial… how do you effectively derive insights from the deluge of data? How do you structure and execute analytics programs (Infrastructure + Applications + Business Insights) with limited budgets?
The exploding demand for analytics professionals has exceeded all expectations, and is driven by the Big Data tidal wave. Big data is a term commonly applied to large data sets where volume, variety, velocity, or multi-structured data complexity are beyond the ability of commonly used software tools to efficiently capture, manage, and process.
To get value from big data, ‘quants’ or data scientists are becoming analytic innovators who create tremendous business value within an organization, quickly exploring and uncovering game-changing insights from vast volumes of data, as opposed to merely accessing transactional data for operational reporting.
This EMC infographic summarizing their Data Scientist study supports my hypothesis: data is becoming the new oil, and we need a new category of professionals to handle the downstream and upstream aspects of drilling, refining and distribution. Data is one of the most valuable assets within an organization, and with business process automation, the amount of data being generated, stored and analyzed is exploding.
Following up on our previous blog post – Are you one of these: Data Scientist, Analytics Guru, Math Geek or Quant Jock? – I am convinced that future jobs are going to be centered around the “Raw Data -> Aggregate Data -> Intelligence -> Insight -> Decisions” data chain. We are simply industrializing the chain as machines and automation take over the lower end of the spectrum. Web 2.0 and Social Media are also creating an interesting data feedback loop: users contribute to the products they use via likes, comments, etc.
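As a toy illustration of that data chain (all records below are invented), each stage can be seen as a transformation of the previous one, from raw events to an aggregate, to a metric, to a statement a decision-maker can act on:

```python
# Hypothetical sketch of the "Raw Data -> Aggregate Data -> Intelligence -> Insight"
# chain, using made-up click-event records.
from collections import Counter

raw_events = [
    {"user": "u1", "action": "like"},
    {"user": "u2", "action": "comment"},
    {"user": "u1", "action": "like"},
    {"user": "u3", "action": "like"},
]

# Aggregate: count events per action type.
aggregate = Counter(e["action"] for e in raw_events)

# Intelligence: derive a simple metric from the aggregate.
like_share = aggregate["like"] / len(raw_events)

# Insight: a human-readable statement that can drive a decision.
insight = "engagement skews toward likes" if like_share > 0.5 else "mixed engagement"
print(aggregate, insight)
```

Real pipelines replace each stage with industrial-strength machinery (ETL, warehouses, models), but the shape of the chain is the same.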
CIOs are faced with the daunting task of unlocking the value of their data efficiently, in the time-frame required to make accurate decisions. To support CIOs, companies like IBM are attempting to become a one-stop shop through a rapid-fire, $14-billion-plus acquisition strategy: Cognos, Netezza, SPSS, ILog, Solid, CoreMetrics, Algorithmics, Unica, Datacap, OpenPages, Clarity Systems, Emptoris and DemandTec (for retail). IBM also has other information management assets like Ascential, FileNet, Watson and DB2. They are building a formidable ecosystem around data, and they see this as a $20 billion-per-year opportunity in managing the data, understanding the data and then acting on the data.
Fidelity Investments put out an interesting analysis of Big Data as a macro investment theme for clients. Since everyone has an underperforming investment portfolio in the current market, I have reproduced the article here to generate some ideas.
- New types of large data sets have emerged because of advances in technology, including mobile computing, and these data are being examined to generate new revenue streams.
- More traditional types of business data have also expanded exponentially, and companies increasingly want and need to analyze this information visually and in real time.
- Big data will be driven by providers of Internet media platforms, data amalgamation applications, and integrated business software and hardware systems.
Investment Theme – Big Data
The concept of “big data” generally refers to two concurrent developments. First, the pace of data accumulation has accelerated as a wider array of devices collect a variety of information about more activities: website clicks, online transactions, social media posts, and even high-definition surveillance videos.
A key driver of this flood of information has been the proliferation of mobile computing devices, such as smartphones and tablets. Mobile data alone are expected to grow at a cumulative annualized rate of 92% between 2010 and 2015 (see Exhibit 1, below).
“Running a company is an endless quest to find out things you don’t know”
– Jeff Immelt, CEO GE
What will 2012 bring? Recently, I attended the CIO Executive Leadership Summit in Greenwich, Connecticut. I was particularly intrigued by the presentation by the new CIO of IBM, Jeanette Horan where she presented the projects she was tackling and how IBM is thinking about business analytics.
IBM is making a bet that “true leaders” will develop the capabilities required for making good and timely decisions in unpredictable and stressful environments.
IBM is adapting to this new data analytics reality through a rapid-fire acquisition strategy: Cognos, Netezza, SPSS, ILog, CoreMetrics, Algorithmics, OpenPages, Clarity Systems, Emptoris and DemandTec (for retail). IBM also has other information management assets like Watson and DB2. They are building a formidable capability around the value chain: “Raw Data -> Aggregate Data -> Intelligence -> Insight -> Decisions”. They see this as a $20 billion opportunity.
Apple, with its iCloud offering, is attacking the consumer-facing digital content Big Data problem. Big Data is challenging on many fronts, from the insightful (e.g., analytics and query optimization) to the practical (e.g., horizontal scaling) to the mundane (e.g., backup and recovery).
On June 6, 2011, Apple Inc. launched iCloud, a purpose-built digital locker service for its 225 million iTunes accounts that frees end users from the tyranny of the device. iCloud allows users to store digital files such as photos, MP3 music, videos and documents in the cloud and access them from Internet-connected devices like iPhones, iPads, iPods and iMacs.
So, what’s the big deal? Apple is addressing a classic BI data management problem: how to free up data trapped in “device and application jails” in a user-friendly way. The “scan and match” concept is quite applicable to large-scale Enterprise Data Warehouses, which suffer from data integrity issues as edge data capture and consumption devices proliferate.
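The core of “scan and match” can be sketched as a dedup step: fingerprint each local item, and only ship the ones the central catalog doesn’t already have. This is a simplified, hash-based analogy (Apple’s actual iTunes matching works differently), with all names invented:

```python
# Hash-based "scan and match" analogy: items already in the master catalog
# are matched in place; only unmatched items need to be uploaded.
# (Invented example data; not Apple's actual matching algorithm.)
import hashlib

# Central catalog keyed by content fingerprint.
master_catalog = {hashlib.sha256(b"track-a").hexdigest(): "track-a"}

def scan_and_match(local_files):
    matched, to_upload = [], []
    for name, content in local_files.items():
        digest = hashlib.sha256(content).hexdigest()
        (matched if digest in master_catalog else to_upload).append(name)
    return matched, to_upload

matched, to_upload = scan_and_match({"song1": b"track-a", "song2": b"track-b"})
print(matched, to_upload)
```

The same pattern applies in an enterprise warehouse: fingerprint edge-captured records against the system of record before ingesting, instead of blindly copying everything.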
Data ingestion, governance and management is a huge problem facing large organizations. As data volumes double every year, not having a basic data management strategy will become an Achilles heel. Most organizations unfortunately don’t know what data assets they have, where these assets are, how they are organized and how well they are secured. Apple shows a neat way to address the Big Data problem in personal cloud management.
Data growth curve: Terabytes -> Petabytes -> Exabytes -> Zettabytes -> Yottabytes -> Brontobytes -> Geopbytes. It is getting more interesting.
Analytical Infrastructure curve: Databases -> Datamarts -> Operational Data Stores (ODS) -> Enterprise Data Warehouses -> Data Appliances -> In-Memory Appliances -> NoSQL Databases -> Hadoop Clusters
In most enterprises, public or private, there is typically a mountain of data, structured and unstructured, that contains potential insights about how to serve customers better, engage with them more effectively and make processes run more efficiently. Consider this:
- Online firms, including Facebook, Visa and Zynga, use Big Data technologies like Hadoop to analyze massive amounts of business transaction, machine-generated and application data.
- Wall Street investment banks, hedge funds, and algorithmic and low-latency traders are leveraging data appliances such as EMC Greenplum hardware with Hadoop software to do advanced analytics on a “massively scalable” architecture.
- Retailers use HP Vertica or Cloudera to analyze massive amounts of data simply, quickly and reliably, resulting in “just-in-time” business intelligence.
- New public and private “data cloud” software startups capable of handling petascale problems are emerging to create a new category: Cloudera, Hortonworks, Northscale, Splunk, Palantir, Factual, Datameer, Aster Data and TellApart.
Data is seen as a resource that can be extracted, refined and turned into something powerful. It takes a certain amount of computing power to analyze the data and pull out those insights. That’s where new tools like Hadoop, NoSQL, in-memory analytics and other enablers come in.
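To make the Hadoop-style enabler concrete, here is a minimal map/reduce word count in plain Python, the same split-then-aggregate pattern a Hadoop cluster runs across thousands of machines (the log lines are invented):

```python
# Minimal map/reduce sketch of the model Hadoop popularized (single process,
# no cluster). Map emits (key, 1) pairs; reduce sums counts per key.
from itertools import groupby

def map_phase(records):
    for line in records:
        for word in line.split():
            yield (word, 1)

def reduce_phase(pairs):
    # Sorting groups identical keys together, standing in for Hadoop's shuffle.
    for key, group in groupby(sorted(pairs), key=lambda kv: kv[0]):
        yield (key, sum(v for _, v in group))

logs = ["checkout error", "checkout ok", "login ok"]
counts = dict(reduce_phase(map_phase(logs)))
print(counts)
```

On a real cluster the map and reduce phases run in parallel over partitions of the data, but the programming model is exactly this small.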
What business problems are being targeted?
Why are some companies in retail, insurance, financial services and healthcare racing to position themselves in Big Data, in-memory data clouds while others don’t seem to care?
World-class companies are targeting a new set of business problems that were hard to solve before: modeling true risk, customer churn analysis, flexible supply chains, loyalty pricing, recommendation engines, ad targeting, precision targeting, PoS transaction analysis, threat analysis, trade surveillance, search-quality fine tuning, and mashups such as location + ad targeting.
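One of these problems, customer churn analysis, can be sketched at its simplest as rule-based risk scoring. The fields, thresholds and weights below are invented for illustration; production systems use statistical models trained on historical churn:

```python
# Toy churn-risk scoring over invented customer records. Real churn models
# are learned from data; this only illustrates the shape of the problem.
customers = [
    {"id": "c1", "days_since_last_purchase": 90, "support_tickets": 3},
    {"id": "c2", "days_since_last_purchase": 5,  "support_tickets": 0},
]

def churn_risk(customer):
    score = 0.0
    if customer["days_since_last_purchase"] > 60:  # long inactivity
        score += 0.5
    if customer["support_tickets"] >= 2:           # repeated friction
        score += 0.3
    return score

at_risk = [c["id"] for c in customers if churn_risk(c) >= 0.5]
print(at_risk)
```

The Big Data angle is scale: scoring hundreds of millions of customers against richer behavioral signals is what pushes this off a single machine.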
To address these petascale problems, an elastic/adaptive infrastructure for data warehousing and analytics is converging around three capabilities:
- the ability to analyze transactional, structured and unstructured data on a single platform
- low-latency in-memory or Solid State Device (SSD) storage for super-high-volume web and real-time apps
- scale-out on low-cost commodity hardware, distributing processing and workloads
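The third capability, scale-out, can be illustrated in miniature: shard the data, process the shards in parallel, then merge the partial results. Here threads stand in for commodity nodes (a deliberate simplification; real systems distribute across machines):

```python
# Toy "scale out" sketch: partition work across parallel workers and merge
# partial results, the way a commodity cluster partitions a large dataset.
from concurrent.futures import ThreadPoolExecutor

def partial_sum(shard):
    # Each "node" computes its local aggregate independently.
    return sum(shard)

data = list(range(1_000_000))
shards = [data[i::4] for i in range(4)]  # distribute rows across 4 "nodes"

with ThreadPoolExecutor(max_workers=4) as pool:
    total = sum(pool.map(partial_sum, shards))

print(total)
```

Because the merge step is a simple sum, adding more shards (nodes) grows capacity without changing the answer, which is the essence of the commodity scale-out model.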
As a result, a new BI and Analytics framework is emerging to support public and private cloud deployments.