Business Intelligence

THE CHALLENGE OF BIG DATA

Big data: data sets whose volume is beyond the ability of a typical DBMS to capture, store, and analyze

Billions to trillions of records, all from different sources

Businesses are interested in big data because they can reveal more patterns and interesting anomalies than smaller data sets, with the potential to provide new insights into customer behavior, weather patterns, financial market activity, or other phenomena

To derive business value from these data, organizations need new technologies and tools capable of managing and analyzing nontraditional data along with their traditional enterprise data

Analytical tools: relationships, patterns, trends

Online analytical processing (OLAP)

Supports multidimensional data analysis

Viewing data using multiple dimensions

Each aspect of information (product, pricing, cost, region, time period) is a different dimension

A company would use either a specialized multidimensional database or a tool that creates multidimensional views of data in relational databases

OLAP enables rapid, online answers to ad hoc queries
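As a rough illustration of viewing the same data along multiple dimensions, the sketch below rolls up a small fact table by whichever dimensions are requested. The sales rows, the `DIMENSIONS` tuple, and the `roll_up` helper are all hypothetical and written in plain Python; a real OLAP tool would run this kind of aggregation against a multidimensional or relational database.

```python
from collections import defaultdict

# Hypothetical sales facts: (product, region, quarter, units_sold)
SALES = [
    ("nuts", "East", "Q1", 50),
    ("nuts", "West", "Q1", 30),
    ("bolts", "East", "Q1", 20),
    ("bolts", "West", "Q2", 40),
    ("nuts", "East", "Q2", 60),
]
DIMENSIONS = ("product", "region", "quarter")


def roll_up(facts, *dims):
    """Sum units sold, grouped by the chosen dimensions."""
    positions = [DIMENSIONS.index(d) for d in dims]
    totals = defaultdict(int)
    for row in facts:
        key = tuple(row[i] for i in positions)
        totals[key] += row[-1]  # last column holds the measure
    return dict(totals)


# The same facts viewed along one dimension, then along two.
by_product = roll_up(SALES, "product")
by_product_region = roll_up(SALES, "product", "region")
print(by_product)
print(by_product_region)
```

Changing the arguments to `roll_up` answers a different ad hoc question (for example, units by region and quarter) without changing the underlying data.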

Data mining

More discovery driven than OLAP

Finds hidden patterns, relationships in large databases and infers rules to predict future behavior

E.g., Finding patterns in customer data for one-to-one marketing campaigns or to identify profitable customers

Types of information obtainable from data mining

Associations

Classification

Clustering

Forecasting

Sequences
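To make one of these types concrete, the sketch below illustrates associations: it counts how often pairs of items appear together in the same purchase and reports each pair's support (the share of baskets containing both items). The baskets and the `pair_support` helper are made up for illustration; production data mining tools use more scalable algorithms than this exhaustive pair count.

```python
from collections import Counter
from itertools import combinations

# Hypothetical market baskets, one set of items per purchase.
BASKETS = [
    {"chips", "cola"},
    {"chips", "cola", "salsa"},
    {"chips", "salsa"},
    {"cola", "bread"},
]


def pair_support(baskets):
    """Return the fraction of baskets that contain each item pair."""
    counts = Counter()
    for basket in baskets:
        # Sort so each pair has one canonical (alphabetical) form.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n / len(baskets) for pair, n in counts.items()}


support = pair_support(BASKETS)
for pair, share in sorted(support.items()):
    print(pair, share)
```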

Text mining

Extracts key elements from large unstructured data sets

Stored e-mails

Call center transcripts

Legal cases

Patent descriptions

Service reports, and so on

Sentiment analysis software

Mines e-mails, blogs, social media to detect opinions
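A very simplified sketch of the idea behind sentiment analysis: score a piece of text by counting words from a positive list and a negative list. The word lists and the `sentiment_score` function are hypothetical; commercial sentiment analysis software relies on far richer language models than a fixed lexicon.

```python
import re

# Hypothetical, deliberately tiny opinion lexicons.
POSITIVE = {"great", "love", "fast", "helpful"}
NEGATIVE = {"slow", "broken", "rude", "refund"}


def sentiment_score(text):
    """Positive word count minus negative word count."""
    words = re.findall(r"[a-z']+", text.lower())
    positives = sum(word in POSITIVE for word in words)
    negatives = sum(word in NEGATIVE for word in words)
    return positives - negatives


print(sentiment_score("Great service, fast and helpful!"))
print(sentiment_score("Slow delivery and rude staff"))
```

A score above zero suggests a favorable opinion and a score below zero an unfavorable one; a score of zero means the lexicon found no signal either way.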

Web mining

Discovery and analysis of useful patterns and information from the Web

Understand customer behavior

Evaluate effectiveness of Web site, and so on

Web content mining

Mines content of Web pages

Web structure mining

Analyzes links to and from Web page

Web usage mining

Mines user interaction data recorded by Web server
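Web usage mining can be sketched as counting page requests in a Web server log. The log lines below are invented and assume the common access-log layout in which the request appears in double quotes; the `page_hits` helper is hypothetical.

```python
from collections import Counter

# Hypothetical access-log lines (addresses are documentation examples).
LOG_LINES = [
    '203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET /home HTTP/1.1" 200 512',
    '203.0.113.5 - - [10/Oct/2023:13:55:40 +0000] "GET /products HTTP/1.1" 200 2048',
    '198.51.100.7 - - [10/Oct/2023:13:56:02 +0000] "GET /home HTTP/1.1" 200 512',
    'malformed line without a quoted request',
]


def page_hits(lines):
    """Count requests per page path, skipping lines that do not parse."""
    hits = Counter()
    for line in lines:
        try:
            request = line.split('"')[1]          # e.g. GET /home HTTP/1.1
            _method, path, _protocol = request.split()
        except (IndexError, ValueError):
            continue                              # ignore unparseable lines
        hits[path] += 1
    return hits


hits = page_hits(LOG_LINES)
print(hits.most_common())
```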

Contemporary tools

Data Warehouses

A data warehouse is a large store of data accumulated from a wide range of sources within a company and used to guide management decisions

A data warehouse is a collection of data drawn from other databases used by the business

It is a database that stores current and historical data of potential interest to decision makers throughout the company

Supports reporting and query tools

Stores current and historical data

Consolidates data for management analysis and decision making

Improved and easy accessibility to information

Ability to model and remodel the data

Data marts

The data mart is a subset of the data warehouse and is usually oriented to a specific business line or team.

A data mart represents the specific data from a data warehouse which a user needs

It is a subset of data warehouse in which a summarized or highly focused portion of the organization’s data is placed in a separate database for a specified function or group of users
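The relationship between a warehouse and a data mart can be sketched as a filter plus a summary. The `WAREHOUSE` rows and the `build_data_mart` helper below are hypothetical; in practice the warehouse and the mart would be separate databases rather than Python lists.

```python
# Hypothetical warehouse rows consolidated from several source systems.
WAREHOUSE = [
    {"dept": "sales", "region": "East", "year": 2022, "revenue": 100},
    {"dept": "sales", "region": "West", "year": 2022, "revenue": 150},
    {"dept": "sales", "region": "East", "year": 2023, "revenue": 120},
    {"dept": "service", "region": "East", "year": 2023, "revenue": 40},
]


def build_data_mart(warehouse, dept):
    """Keep one department's rows and summarize revenue by year."""
    totals = {}
    for row in warehouse:
        if row["dept"] == dept:
            totals[row["year"]] = totals.get(row["year"], 0) + row["revenue"]
    return totals


sales_mart = build_data_mart(WAREHOUSE, "sales")
print(sales_mart)
```

The sales team sees only a focused, summarized portion of the data, while the full detail stays in the warehouse.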

Hadoop

Enables distributed parallel processing of big data across inexpensive computers

Key services

Hadoop Distributed File System (HDFS): data storage

MapReduce: breaks the processing of huge data sets into smaller tasks and assigns the work to the nodes in a cluster

HBase: NoSQL database

Used by Facebook, Yahoo, NextBio
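The MapReduce idea can be sketched in plain Python as a word count with three phases: map each chunk of text to (word, 1) pairs, shuffle the pairs into groups by word, and reduce each group to a total. This is only a single-machine illustration with hypothetical function names, not Hadoop's API; in Hadoop the chunks would live in HDFS and the map and reduce tasks would run in parallel on different nodes.

```python
from collections import defaultdict


def map_phase(chunk):
    """Emit a (word, 1) pair for every word in one chunk of text."""
    return [(word, 1) for word in chunk.lower().split()]


def shuffle(pairs):
    """Group emitted values by key so each word's counts sit together."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Collapse each word's list of counts into a single total."""
    return {key: sum(values) for key, values in groups.items()}


# Hypothetical input, already split into chunks (one per worker).
CHUNKS = ["big data big insights", "data mining finds patterns", "big patterns"]

mapped = [pair for chunk in CHUNKS for pair in map_phase(chunk)]
word_counts = reduce_phase(shuffle(mapped))
print(word_counts)
```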

In-memory computing

Used in big data analysis

Uses the computer's main memory (RAM) for data storage to avoid delays in retrieving data from disk storage

Can reduce hours/days of processing to seconds

Requires optimized hardware

Analytical platforms

High-speed platforms using both relational and non-relational tools optimized for large datasets

Analytical information based on current data records

Tightly integrated database, server, and storage components that handle complex analytic queries 10 to 100 times faster than traditional systems

Business intelligence infrastructure

Tools for obtaining useful information from all the different types of data used by businesses today, including semi-structured and unstructured big data in vast quantities

Consolidating, analyzing, and providing access to vast amounts of data to help users make better business decisions

E.g., Harrah’s Entertainment analyzes customers to develop gambling profiles and identify most profitable customers