Thursday, March 13, 2014

Business Intelligence Tool Selection

In the 1960s, Business Intelligence (BI) software gained considerable traction as "decision support systems." Nowadays, there are over 100 BI software companies competing for market share.

As we know, BI software helps organizations organize and analyze data to make better decisions. The BI market is growing so fast because of the tremendous amount of data that is now being generated. One of the biggest trends in the market is now focused more on designing user-friendly interfaces, rather than just the software architecture.

To find the best option, we must make sure the BI tool you select matches your needs. BI software can be divided into three broad categories: data management tools, data discovery applications and reporting tools (including dashboards and visualization software). What tool you choose depends on how your data is currently manged and how you plan to analyze it.

If you have data that lies in disparate places and particularly if they are "dirty," you would need a data management tool to perform ETL and check for data quality. If, however, your issue is mainly sifting through good, clean data and draw meaningful conclusions, then you would want to consider a data discovery application, which uses data mining and OLAP methods and technologies to help you do so. Finally, if all you really need is generating reports based on existing findings, then you would want to consider getting a reporting tool, which helps create dashboard and visualizations to help you present your findings.

Check out this list of the newest BI tools available on the market: http://www.softwareadvice.com/bi/. Before you make your decision though, be sure to consider your business needs, as well as your budget and feasibility to implement these solutions.

Best of luck!

Reference: http://www.softwareadvice.com/bi/

Thursday, February 27, 2014

BI and the Airline Industry: Case Study

I am a huge air transportation enthusiast, so naturally, our Cloud Airlines example in class inspired me to look for a BI solution that has been implemented in the real world. I will introduce a case that I have found in this blog.

An Asia-Pacific airline company determined that it needed a business intelligence solution that would serve as a single information source and integrate all systems across its enterprise. The company tried to implement a solution on their own but found that they lacked the internal resources. So, they reached out to Accenture to help them do so. They chose Accenture because of its more than 30 years of experience working with SAP existing relationship with the company and previous work in this area for another major airline.

Accenture used SAP Business Objects Enterprise for reporting and Dashboard Design for for the dashboard. This was designed to replace the existing, manual process of creating reports, which helped the managers focus more on analysis instead. After Accenture helped the company design their new enterprise airline data warehouse model, the next stages included:
  • Building and implementing the data warehouse and dashboard
  • Creating BI reports for airline operations, passenger revenue and profitability analysis
  • Preparing the client's team for further development and improvement
The implementation covered the entire business, from flight operations to sales to customer service. With the new solution the airline company's employees no longer needed to spend large amounts of time generating reports for top management. The automation process also reduced the risk of reporting errors that had previously existed when the reporting was performed manually. 

To learn more, visit here: http://www.accenture.com/us-en/Pages/success-asia-pacific-airline-data-warehouse-solution.aspx. 

Thanks for reading!

Thursday, February 13, 2014

Gathering Requirements

We talked about best practices for gathering business requirements in class, and I thought I would expand upon what we discussed.

Source Quality

It is important to point out that we should take some time to think about exactly whom to gather requirements from, because depending on the source, the answers we get may have varying degrees of value to us. For example, in some cases, the marketing department's input may be more helpful, while in others, the accounting department may provide more valuable information for our BI/DW work.

Key Techniques

There are specific ways via which we can gather requirements, and we may get different results at different degrees of quality and specificity. If there are no clearly defined questions, a brainstorming section, which allows free exchange (i.e., anything goes), may be the most helpful. If there are specific questions you are ready to ask, then one-on-one interviews may be better. If you are unable to find a time that both you and your clients can meet, perhaps a questionnaire is the way to go. Finally, in the case of system upgrades especially, studying the existing system may be a good starting point, because you can gather a lot of the same requirements that way without having to talk to the clients.

Selecting a Technique

You may wonder how we can select the best technique to gather requirements. Below is an interesting framework that may provide some guidance:



 Based on where your individual situation is in the graph above, you can select the right technique as follows:

  • Catch-up: Interviews, work in target environment
  • Fuzzy: Brainstorming, workshops
  • Mature: Questionnaires, workshops, prototypes
  • Selling/Teaching: prototypes
I hope that helped in shaping how you approach the critical requirement gathering process in BI/DW design!

Reference: http://epf.eclipse.org/wikis/openup/core.tech.common.extend_supp/guidances/guidelines/req_gathering_techniques_8CB8E44C.html

Tuesday, February 4, 2014

The Four-Step Dimensional Design Process

Kimball and Ross laid out a systematic way to approach dimensional modeling. I will review the four key steps below:


Step 1: Select the Business Process

A business process is a low-level activity performed by an organization. This can be taking orders, processing claims or registering students. To identify these, look for the following common characteristics:

  • Expressed as action verbs
  • Supported by an operational system
  • Generate key performance indicators
  • Triggered by an input and result in output metrics
Be careful, though, when interviewing the users in an organization, as they often tend to talk in terms of broad strategic business initiatives. These would not serve well for dimensional modeling purposes. Make sure to dig deeper and seek out the specific underlying business processes that support the analytic requirements of the initiatives. 


Step 2: Declare the Grain

Declaring the grain means specifying exactly what an individual fact table row represents. This step is crucial in the modeling process, yet many professionals tend to overlook its importance. In the past, failing to declare the grain has been the most frequent mistake that professionals make. Although the grain is ultimately equivalent to the primary key, it should still be described using business terms for clarity. Below are a few examples:
  • One row per individual boarding pass scanned at an airport gate
  • One row per bank account each month
  • One row per line item on a bill from a doctor

Step 3: Identify the Dimensions

Once the grain has been properly declared, the dimensions typically can easily be identified as they represent the “who, what, where, when, why, and how” associated with the event. A robust set of dimensions representing all possible descriptions should be identified. The following are some examples:
  • Date
  • Customer
  • Employee
  • Facility

Step 4: Identify the Facts

The facts are the performance metrics that business users are concerned about. These must be appropriately defined in accordance with the declared grain. If not, they should be placed in a different fact table. Usually, facts are numerical data, such as total cost or order quantity. However, in some case, they are not - we would call them "factless" facts. 

The four-step process must take into account not only the data sources, but more importantly, the business requirements. Although gathering user input may involve more complexity than relying on the data alone, it is essential for a sound dimensional design.

Reference: "The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling, 3rd Edition" by Ralph Kimball and Margy Ross

Tuesday, January 28, 2014

Dimensional Modeling vs. Relational Modeling

As someone with a background and experience in relational modeling, I often compare what I learn in our course to my existing knowledge of how to model data in a standard way. The following are the main differences between dimensional modeling and relational modeling, according to my learning:

1. There are two distinct groups of people who support the two approaches. There are the “Kimballites,” who believe in Ralph Kimball’s approach in which it is stated that the data warehouse should be modeled using a dimensional model/star schema. Then, there are the “Inmonites,” who follow Bill Inmon's approach in which it is stated that the data warehouse should be modeled using an E-R model/normalized model.

2. In dimensional modeling, data is partitioned into "facts," which generally refer to numeric transaction data, and "dimensions," which provide context to the "facts." For example, a sales transaction can be broken up into "facts" such as the number of products ordered and the price paid for the products, and into "dimensions" such as order date, customer name, product number, order ship-to and bill-to locations, and salesperson responsible for receiving the order.

In relational modeling, however, data is stored as "entities," which have some kind of physical or abstract existence in the real world and hold distinct attributes or properties that define what or who they are. In addition, the "relationships" between the "entities" are modeled to link them together for querying and analysis.

3. Because of the de-normalized form of a dimensional model, it tends to be easier to understand than the relational model. In addition, it reduces the time for data retrieval, since it does not require linking potential hundreds of tables together to run a query. Nonetheless, because the dimensional model is de-normalized, it is more difficult to maintain data integrity and more complicated to load data from different operational systems.

On the other hand, the relational model allows for a straightforward data entry process, but it makes it tougher for users who do not have a precise understanding of the data source and data warehouse structure.

4. With all that said, the two approaches are not mutually exclusive; there are hybrid approaches and dimensional models can be normalized to a certain degree without losing its core advantage of quick processing.

Check out the comparison table below for a few other key differences between the two methods:

Sources: