
Posts tagged ‘HBase’

2 Jun

The NoSQL and Spark Ecosystem: A C-Level Guide


[Figure: Evolution of DBMS]

New Technologies | New Possibilities

As a C-level executive, it’s becoming clear to me that NoSQL databases and machine-learning toolsets like Spark are going to play an increasingly large role in data-driven business models, low-latency architecture, and rapid application development (projects that can be done in 8-12 weeks, not years).

The best-practice firms are making this technology shift as decreasing storage costs have led to an explosion of big data. Commodity cluster software like Hadoop has made it 10-20x cheaper to store large datasets.

After spending two days at MongoDB World in NYC, the annual event of the leading NoSQL provider, I was pleasantly surprised by the amount of innovation and the size of the user community around document-centric databases like MongoDB.

Data Driven Insight Economy

It doesn’t take a genius to realize that data-driven business models, high-volume data feeds, mobile-first customer engagement, and cloud are creating new distributed-database requirements. Today’s modern online and mobile applications need continuous availability, cost-effective scalability, and high-speed analytics to deliver an engaging customer experience.

We know instinctively that there is value in all the data being captured in the world around us. The question is no longer whether there is value, but how to extract that value and apply it to the business to make a difference.

Legacy relational databases fail to meet the requirements of digital and online applications for the following reasons:

Read more »

15 Jan

Big Data Fatigue and Company Shakeout?


[Figure: hype cycle]

Big Data is the latest “next big thing” transforming all areas of business, but amid the hype there remains confusion about what it all means and how to create business value.

Usually when there is so much hype, there is an inevitable boom-bust-boom cycle. Hence my question: is a Big Data shakeout inevitable?

Are we in a big data tech bubble? If you are an enterprise customer, how do you prepare for this? What strategies do you adopt to take advantage of the situation? Can you move from lab experiments to production deployments with confidence?

The sheer number of companies chasing “the pot of big data gold” is astounding (see below). While innovation has accelerated, the ability of the typical Fortune 1000 enterprise to absorb and assimilate it has not; these enterprises tend to be 5-10 years behind the curve. As a result, many big data startups are either running out of cash or being folded by VCs into other firms. This boom-bust cycle is a typical pattern in innovation.

http://www.bigdata-startups.com/open-source-tools/

[Figure: Big Data Universe]

Source: Big Data Universe v3, Matt Turck, Sutian Dong & FirstMark Capital

The Case of Drawn to Scale

Drawn to Scale, the four-year-old startup behind Spire, shut down recently. Co-founder and CEO Bradford Stephens announced the news in a blog post. Drawn to Scale had raised $0.93M in seed funding.

Spire is a real-time database solution for HBase that lets data scientists query Hadoop clusters using SQL. According to Stephens, the system has been deployed by American Express, Orange, Flurry, and four other companies.
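To give a rough sense of what Spire’s SQL-on-Hadoop capability means in practice, here is a hypothetical sketch. It uses Python’s sqlite3 as a local stand-in for a SQL-on-Hadoop engine (the table name, columns, and data are invented for illustration), since the point is the SQL interface, not the storage layer:

```python
import sqlite3

# sqlite3 stands in for a SQL-on-Hadoop engine such as Spire or Impala,
# which expose standard SQL over data stored in HBase/HDFS.
# Table and columns below are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE clicks (user_id TEXT, page TEXT, ts INTEGER)")
conn.executemany(
    "INSERT INTO clicks VALUES (?, ?, ?)",
    [("u1", "/home", 1), ("u1", "/buy", 2), ("u2", "/home", 3)],
)

# The kind of aggregate a data scientist would run against a cluster:
rows = conn.execute(
    "SELECT page, COUNT(*) AS hits FROM clicks GROUP BY page ORDER BY hits DESC"
).fetchall()
print(rows)  # [('/home', 2), ('/buy', 1)]
```

The value proposition is exactly this familiarity: analysts keep writing ordinary SQL while the engine distributes the scan and aggregation across the cluster.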

Drawn to Scale showed that its technology was viable in enterprise environments and established a “presence against competitors who raised 10-100x more cash,” but even that wasn’t enough to save the startup from its financial woes.

As Hadoop evolves and different layers of the data-analytics stack get commoditized, specialized vendors like Drawn to Scale will have trouble surviving. SQL-on-Hadoop was once a unique feature set, but over time it has become a must-have that is being embedded in the stack (e.g., Impala in the Cloudera CDH stack). As a result, firms like Drawn to Scale find their once-unique functionality difficult to monetize.

Startup to Viable Ventures

The Big Data ecosystem is exploding with exciting start-ups, new divisions, and new initiatives from established vendors. Everyone wants to be the vendor/platform of choice in helping firms deal with the data deluge (data growth curve: Terabytes -> Petabytes -> Exabytes -> Zettabytes -> Yottabytes -> Brontobytes -> Geopbytes) and translate data into information and insight.

In both the U.S. and Europe, several billion dollars of venture money has been invested in the past three years alone, across more than 300 firms. Firms like Splunk have had spectacular IPOs. Others, like Cloudera and MapR, have raised gobs of money. In the MongoDB space alone – a small market of less than $100M in total revenue right now – over $2 billion is said to have been invested in the past few years.

Read more »

15 May

New Tools for New Times – Primer on Big Data, Hadoop and “In-memory” Data Clouds


Data growth curve:  Terabytes -> Petabytes -> Exabytes -> Zettabytes -> Yottabytes -> Brontobytes -> Geopbytes.  It is getting more interesting.
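For concreteness, the growth curve above can be written out numerically. In decimal (SI) units each step is 1,000x the previous one; note that “brontobyte” and “geopbyte” are informal names sometimes used for 10^27 and 10^30 bytes, not standardized SI prefixes:

```python
# Byte counts for each step on the data growth curve (decimal/SI units).
# "brontobyte" and "geopbyte" are informal, non-SI names.
names = ["terabyte", "petabyte", "exabyte", "zettabyte",
         "yottabyte", "brontobyte", "geopbyte"]
bytes_per = {name: 10 ** (12 + 3 * i) for i, name in enumerate(names)}

# Each step is 1,000x the one before it.
assert bytes_per["petabyte"] // bytes_per["terabyte"] == 1000
print(bytes_per["zettabyte"])  # 1000000000000000000000
```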

Analytical Infrastructure curve: Databases -> Datamarts -> Operational Data Stores (ODS) -> Enterprise Data Warehouses -> Data Appliances -> In-Memory Appliances -> NoSQL Databases -> Hadoop Clusters

———————

In most enterprises, public or private, there is typically a mountain of structured and unstructured data that contains potential insights about how to serve customers better, engage with them better, and make processes run more efficiently. Consider this:

  • Online firms (including Facebook, Visa, and Zynga) use Big Data technologies like Hadoop to analyze massive amounts of business-transaction, machine-generated, and application data.
  • Wall Street investment banks, hedge funds, and algorithmic and low-latency traders are leveraging data appliances such as EMC Greenplum hardware with Hadoop software to do advanced analytics on a “massively scalable” architecture.
  • Retailers use HP Vertica or Cloudera to analyze massive amounts of data simply, quickly, and reliably, resulting in “just-in-time” business intelligence.
  • New public and private “data cloud” software startups capable of handling petascale problems are emerging to create a new category: Cloudera, Hortonworks, Northscale, Splunk, Palantir, Factual, Datameer, Aster Data, TellApart.

Data is seen as a resource that can be extracted, refined, and turned into something powerful. It takes a certain amount of computing power to analyze the data and pull out those insights. That’s where new tools like Hadoop, NoSQL, in-memory analytics, and other enablers come in.

What business problems are being targeted?

Why are some companies in retail, insurance, financial services and healthcare racing to position themselves in Big Data, in-memory data clouds while others don’t seem to care?

World-class companies are targeting a new set of business problems that were hard to solve before: modeling true risk, customer churn analysis, flexible supply chains, loyalty pricing, recommendation engines, ad targeting, precision targeting, PoS transaction analysis, threat analysis, trade surveillance, search-quality fine tuning, and mashups such as location + ad targeting.
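To make one of these concrete: at its simplest, customer churn analysis reduces to flagging customers by recency and frequency of activity. A toy sketch, with field names and thresholds invented purely for illustration:

```python
# Toy churn flag: customers who haven't ordered recently AND order rarely
# are marked at-risk. All names and thresholds here are illustrative only.
customers = [
    {"id": "c1", "days_since_last_order": 5,   "orders_per_year": 24},
    {"id": "c2", "days_since_last_order": 180, "orders_per_year": 2},
    {"id": "c3", "days_since_last_order": 90,  "orders_per_year": 6},
]

def churn_risk(c, max_idle_days=120, min_orders=4):
    """Flag a customer as at-risk of churning."""
    return (c["days_since_last_order"] > max_idle_days
            or c["orders_per_year"] < min_orders)

at_risk = [c["id"] for c in customers if churn_risk(c)]
print(at_risk)  # ['c2']
```

The hard part at enterprise scale is not the rule itself but running it over billions of transaction rows, which is exactly what the Hadoop/NoSQL stack is built for.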

To address these petascale problems, an elastic, adaptive infrastructure for data warehousing and analytics is converging, one capable of three things:

  • the ability to analyze transactional, structured, and unstructured data on a single platform
  • low-latency in-memory or solid-state device (SSD) storage for super-high-volume web and real-time apps
  • scale-out on low-cost commodity hardware, distributing processing and workloads
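The scale-out point in the list above can be pictured with a toy, single-process sketch of the split/map/merge pattern that Hadoop-style systems run across commodity nodes (the function names here are mine, not a Hadoop API):

```python
from collections import Counter
from functools import reduce

log_lines = ["GET /home", "GET /buy", "POST /buy", "GET /home", "GET /home"]

def split(data, n_chunks):
    """Partition the work, as a cluster scheduler would across nodes."""
    return [data[i::n_chunks] for i in range(n_chunks)]

def map_chunk(chunk):
    """Per-node work: count page hits within one chunk."""
    return Counter(line.split()[1] for line in chunk)

def merge(a, b):
    """Combine partial results from two nodes."""
    return a + b

partials = [map_chunk(c) for c in split(log_lines, 3)]  # in parallel on a cluster
totals = reduce(merge, partials)
print(totals.most_common(1))  # [('/home', 3)]
```

Because each chunk is processed independently, adding cheap nodes adds throughput almost linearly; that is the economic argument behind commodity scale-out versus ever-larger single machines.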

As a result, a new BI and analytics framework is emerging to support public and private cloud deployments.

Read more »