10 Cornerstones of Business Intelligence

Originally published 1 August 2009

Few companies, given the choice, would not do their data warehouse project over again, knowing what they learned in the process. Experience with various data warehouse implementations shows that most pitfalls teach recurring lessons for those embarking on a project.

This article outlines 10 cornerstones for business intelligence (BI) and data warehouse (DW) programs. By understanding these lessons learned, you can hopefully avoid the pitfalls altogether, or at least decrease their effects when you are confronted with them.

1. Governance
Data warehouses are long-term programmes that include a number of projects and a significant amount of maintenance. The decision to deploy a data warehouse commits the business to a program that needs to be controlled to ensure that it provides business benefit and value for the cost.

Business growth, demand, external change (e.g., source system change), maintenance, risk (regulatory policies), issues (support), etc. will affect the data warehouse business requirements, technical infrastructure, configuration management, test management, data stewardship, service level agreement (SLA) and support model. Hence, it is necessary to have sufficient organisational structure in place to control the changes and the communication of those changes. The role of governance is to provide the policies, processes and procedures necessary to ensure that the programme of work is effective.

Governance should be well defined and should include:

  1. Executive sponsor – A BI project will change the way the business users do their work. The business users may become hesitant, maybe even hostile, toward the new technology and application. Hence, you need an executive who is willing to be the first person to adopt the new environment and sponsor the program.
  2. Steering committee – This group monitors the programme to ensure it delivers the right projects at the right time and at fair value.
  3. Programme management – Co-ordinated management of a portfolio of projects to achieve a set of business objectives.
  4. Exploitation team – Access to the actual users of your application is necessary for success. These are the people who can ensure that the business is extracting the most value from the solution.
  5. Implementation teams (also includes vendor teams) – Teams (those who do the actual work) need to agree up front about how to control the scope, how many additional requirements (in some cases data attributes) are acceptable and how to manage the complexity of the validation rules, etc. Basically, well-defined scope control should be in place, and teams must be aware of the commercially agreed scope.

Successful governance will:

  1. Get the project going and keep it moving by adhering to the scope and deadlines for delivery. This avoids analysis paralysis and ensures that there is time to take corrective actions if necessary.
  2. Monitor and measure the ROI at each stage of the project. Define methods for determining the cost and value of any action. Some apparently low-cost (often called quick-win or tactical) solutions deliver little value, and their true cost is disproportionately high because they are discontinued after a short time span.
  3. Have a clear, transparent process for deciding on the priorities and then stick to them.
  4. Review the processes regularly to add required ones and remove unnecessary ones.
  5. Develop and enforce clear standards for all aspects of the warehouse, such as naming conventions for code and documents, data ownership, data cleaning, access to data, service levels and performance. At the same time, take care that this work does not lengthen the project.
  6. Ensure that there are clear, formal processes for handling change requests, risks, scope control and issues.

2. Stakeholder commitment to and involvement in the BI project

IT plays a key role in interfacing between the BI project team and the actual business users. However, getting the actual requirements in place for a successful BI implementation requires more than IT alone providing input to the project.
 
A service provider who has signed a contract to put the BI project into production definitely has ownership of the delivery. That being said, the delivery cannot be a success without active participation from the client and end users at each stage of the entire life cycle. Assuming that the service provider owns everything about the successful delivery of the project is one of the biggest mistakes in BI projects. Service providers are specialized consultants who can give you options and best practices to implement the BI project as you require.

Data warehousing is not "an IS project." It is an enterprise project that will be used by the business side of the company. Several factors are key to making a data warehouse successful. Project success depends mainly on critical review and use by its stakeholders at each stage of the project. Waiting until the project is complete to use the product, and a lack of business review along the way, will severely impact the quality of the final product.

In some instances, the service provider is responsible for the design and development of the solution, while the customer plays the role of reviewer and approver. Take care to avoid sign-off delays caused by deadlock between the two parties over the approach, solution or review timelines, by changing priorities, or by lack of interest from either party. Both parties should agree on the sign-off timelines and stick to them.

If the service provider executes the project in an offshore model, the commitment between the two teams to delivering the project objectives and handling the knowledge transition is key to the success of the project.

3. People
Behind every success or failure are people. People are the only differentiators. Having the right people in the different roles of a BI project is very important.

Project manager. BI project management requires different techniques and methods to succeed. The breakthroughs in work process and methodology that form the foundation of data warehouse delivery include such concepts as iterations and phased delivery. From a non-data warehouse perspective, it is hard to appreciate how truly revolutionary and critical these concepts are for successful BI delivery.

Most often, it is very challenging to convince a non-DW project manager that the analysis and design phases (development, testing and usage) in DW projects run side by side and not one after the other as in traditional project delivery. If this important aspect is ignored, the schedule and budget will suffer, because DW projects always encounter changing requirements. Additionally, the fallout results in arguments and politics rather than a focus on technical solutions.

Solution architect. A solution architect is one who can play a key role in integration, who can bridge the gap between business and IT people and bring them to the common table, who can define the interface/methodology for various BI processes and who can envision the future state of the solution and provide end-to-end guidance to various teams and stakeholders.

Changing solution architects halfway into the project and assuming that the new architect will magically make up for all the deficiencies is a very common scenario and one of the main mistakes. An architect cannot just walk in and fix things.

The best time to involve the solution architect is from the analysis stage onward. The architect’s expertise and skill in data warehousing and business intelligence bring a wealth of knowledge of reference architectures and similar implementations, which can reduce the cost and time to implement technology solutions.

People evaluating BI tools. The success of a BI initiative also depends on people’s skill in evaluating the right tool or product for each function (ETL, ELT, reporting, data warehousing, analytics) performed in the BI project.

Vendors’ claims about their products and their performance are good and exciting. However, using the correct product to address a specific problem is very important. For example, most ELT products claim to be RAD tools. They are good at data integration, but not so good at building summary databases, which require multilayer staging or complex business logic.

For example, if a director whose area of expertise is the infrastructure domain is given the authority to decide on an analytic tool, the outcome will not be impressive. Hence, it is key for the people who evaluate the tools to understand the problem statement and do some homework about the product rather than blindly accepting vendor presentations and claims.

Team. Having the right team with the right skill set and experience is another key to the success of any project. For BI projects to succeed, associates should possess multiple skill sets, such as good report design and efficient SQL writing skills, and should understand the various features of the reporting tool, the underlying database and the data model.

With an inexperienced team, an inordinate amount of person-time may be taken up by numerous meetings to bridge the skill gap. This deprives the team, both on- and offshore, of the time necessary to progress on a very tight schedule. In addition, senior people need to spend time supporting the developers in project implementation/testing and interfacing with different modules in the project. This takes up the lion's share of the senior people’s bandwidth, and they may end up losing focus on their actual job.

4. Right data model
The data model is the heart of the data warehouse and determines all other aspects, such as performance, ease of reporting and scalability. The data modelling should be done by data modellers who specialise in the appropriate pattern (dimensional/normalised/de-normalised/galaxy/hybrid) to best meet the business requirements, customer environment and technology stack.

Business user inputs are key to defining the data model; however, letting users drive the model definition is another of the biggest mistakes and can result in a failed BI initiative.

One BI initiative used rapid application development (RAD) to define and develop the data mart. In RAD, the BI team provided the requirements in terms of the data attributes required in each table and insisted on having particular attributes in specific tables. During this exercise, the team tried to please the business users and ended up building a set of snapshot tables and complex ETL to populate the data mart. Because of the complex ETL and inflexible data model that resulted, it became very hard to support the model. Even fixing defects or adding simple fields became harder to do, and this contributed to failure in the long run.

Business users should play a key role in defining data completeness and in linking the various business areas/subject areas. The data modeller, with business inputs, should take ownership of designing the schema by applying the right modelling rules.

5. Right tools
A number of important tools are connected with data warehouses: multidimensional analysis tools to view data from a wide variety of angles, query tools to send SQL and view the results, data mining tools to look for patterns, data aggregation tools to build summary data, data integration (EAI, ETL, ELT, EII) tools to integrate data from various source systems, data visualization tools to view trends, metadata tools to understand the changes within the database, and data quality tools to assess the quality of data.

For example, you can store data at the level of each transaction, or you can store it as a summary. The choice is a matter of data aggregation. When data is summarized, queries run much faster. However, some of the detail is lost in the summary and is no longer available to a query, and this information may be important for solving a certain problem. Once you have carried out the aggregation, you need to rebuild the warehouse in order to undo it.
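
The trade-off can be illustrated with a minimal Python sketch. The rows, column names and function names below are hypothetical and are not taken from any particular warehouse; the point is only that a daily summary answers "sales per day" cheaply but can no longer answer "sales per store".

```python
from collections import defaultdict

# Hypothetical transaction-level rows: (sale_date, store, amount)
transactions = [
    ("2009-07-01", "north", 120.0),
    ("2009-07-01", "north", 80.0),
    ("2009-07-01", "south", 50.0),
    ("2009-07-02", "north", 200.0),
]


def summarise_by_day(rows):
    """Collapse transaction rows into one total per day (the store is dropped)."""
    totals = defaultdict(float)
    for sale_date, _store, amount in rows:
        totals[sale_date] += amount
    return dict(totals)


def sales_for_store(rows, store):
    """Per-store total; this needs the detail rows, not the daily summary."""
    return sum(amount for _date, s, amount in rows if s == store)


daily_summary = summarise_by_day(transactions)

# Answerable from the small summary table: total sales per day.
print(daily_summary)

# Only answerable from the transaction-level data: sales for one store.
print(sales_for_store(transactions, "north"))
```

If only `daily_summary` had been stored, the per-store question could be answered only by reloading the detail data, which is the rebuild cost mentioned above.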
 
Another example is using ELT instead of ETL to integrate data between the data warehouse and external systems. However, using ELT in place of ETL to handle multistaging creates more complexity and makes maintenance difficult.

In addition to choosing the right tool, you have to use the right feature of that tool. For example, either a server job or a parallel job can be used to implement any given functionality in DataStage ETL; however, the performance of the job will vary with the data volume and the source/target partitioning strategy.

The lesson is that it is very important to weigh your options carefully before deciding which tool to use.

6. Time to market
Because of changes in the business, mergers or acquisitions, all parts of the business will view the DW differently. Clear communication will help people understand what is available to them, what the priorities are going forward and how to access the information.

Development and release should be well planned and timely. Missing release dates leads to additional requirements to keep up with the changing business, which in turn cascades into additional cost and time. Create realistic time goals. Neither rush a project nor delay the rollout.

When establishing time estimates on that first project, allow for the learning curve. Implementation time should be in three- to six-month ranges. If it takes longer, users will lose interest. You can maintain the interest level of your users if you quickly deliver smaller components of the data warehouse.

7. Environment and hardware
To ensure that the DW gets optimal performance and scales as the data set grows, you need to get the hardware configuration correct. Regardless of the design or implementation of a data warehouse, the initial key to good performance lies in the hardware configuration used. Many DW operations are based upon large table scans and other I/O-intensive operations, which perform vast quantities of random I/Os. In order to achieve optimal performance, the hardware configuration must be sized end to end to sustain this level of throughput.

Another important aspect of a successful BI initiative is having the right environment for development and testing. Because of the nature of BI projects and the phased implementation approach, there is a need for different development and test environments with the right hardware, software and test data. With an offshore delivery model, things are further complicated because it requires additional environments at offshore sites and synchronization between onshore and offshore environments.

Test data plays a key role in the success of BI project development. Not having the right data during development increases the integration time by contributing to more integration defects; not having the right test data for system testing can stretch the testing time to twice what it should be. Poor performance, inefficient resource utilization, an increased number of defects and time slippage are some of the issues caused by poor test data.

There should be a strong release management process in place to support multienvironment, multirelease BI initiatives.

8. Source system data quality
As we speak about the maturity of the data warehouse today, there is still a lot to be explored and learned about the severity of the data quality problem. The quality of the data has to be checked and cleansed at the source or at least before it enters the data warehouse. It is inappropriate to do any quality checks in the data warehouse itself.
 
In one of my BI initiatives, it was assumed that the data quality of the source system was reasonable (confirmed by IT), and the data warehouse was built. However, when users started looking at the data, they discovered that 20% of the data was defective. To deliver accurate information, it took 70% of the total project time to fix the source system defects in the data warehouse. The cost (time and budget) of fixing defects in the data warehouse is really high.

Assuming data quality can be managed “somehow” is a mistake; data quality has to be addressed before the DW program starts. The source system data quality initiative need not start as a big bang or with the purchase of an expensive tool. It could start as a step-by-step approach that can be automated with a tool once the initiative has matured in the organization. Data governance is another key to success for any data quality initiative.
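
As an illustration of a step-by-step start that needs no expensive tool, the following Python sketch checks source records against a couple of simple rules before they are loaded and reports the defect rate. The field names (`customer_id`, `amount`) and the rules themselves are assumptions made for the example; real rules would come from the business and the data stewards.

```python
def check_row(row):
    """Return the list of rule violations for one source record."""
    problems = []
    if not row.get("customer_id"):
        problems.append("missing customer_id")
    amount = row.get("amount")
    if not isinstance(amount, (int, float)) or amount < 0:
        problems.append("amount missing or negative")
    return problems


def split_by_quality(rows):
    """Separate records that pass every rule from those that fail any rule."""
    clean, rejected = [], []
    for row in rows:
        problems = check_row(row)
        if problems:
            rejected.append((row, problems))
        else:
            clean.append(row)
    return clean, rejected


# Hypothetical extract from a source system.
source_rows = [
    {"customer_id": "C001", "amount": 100.0},
    {"customer_id": "C002", "amount": 25.5},
    {"customer_id": "", "amount": 40.0},       # fails: no customer_id
    {"customer_id": "C004", "amount": 12.0},
    {"customer_id": "C005", "amount": 300.0},
]

clean, rejected = split_by_quality(source_rows)
defect_rate = len(rejected) / len(source_rows)

print(f"{len(clean)} rows ready to load, {len(rejected)} rejected")
for row, problems in rejected:
    print(row, problems)
```

Rejected records go back to the source system owners for correction instead of being repaired inside the warehouse, and measuring the defect rate before the build starts avoids relying on an untested assumption about source quality.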

9. Customer satisfaction
The client sponsoring the DW project and the end users have to accept the solution that is being built by the implementation team; there is no doubt about this fact. At the end of the day, for the solution to be successful, it has to be liked and should demonstrate added value to the clients. The time and effort spent on a particular initiative should demonstrate value for money.

But a word of caution. In this process, the implementation team, which is most often a service provider company, offering offshore support as well, should not get into the “pleasing” mode with the clients and users. It might not be practically possible to implement the client’s entire wish list. This should be communicated in a strong but polite way. The requirements driving the DW initiative should be validated very critically so that the best solution can be built. What cannot be done should be communicated as clearly as you communicate what can be done.

10. Data migration
Data migration is a key element to consider when adopting any new system or when data is brought from a legacy system into a data warehouse. Data migration can be a fairly time-consuming process and can take from a few hours to weeks, depending on the volume of data and the processes involved. When the migration process takes weeks, careful thought should be given to catching up with source system data that keeps changing in the meantime.
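
One common way to catch up is to record a timestamp when the bulk copy starts and then apply only the rows changed after it. The sketch below is a minimal illustration of that idea and is not a method prescribed by this article; it assumes, hypothetically, that every source row carries an `id` and a reliable `modified_at` timestamp, and it does not handle rows deleted in the source.

```python
from datetime import datetime


def catch_up(source_rows, target, watermark):
    """Upsert rows modified after `watermark` and return the new watermark."""
    delta = [r for r in source_rows if r["modified_at"] > watermark]
    for row in delta:
        target[row["id"]] = dict(row)  # insert new rows, overwrite changed ones
    return max((r["modified_at"] for r in delta), default=watermark)


# Hypothetical source table at the time the bulk migration starts.
source = [
    {"id": 1, "name": "Acme", "modified_at": datetime(2009, 7, 1)},
    {"id": 2, "name": "Globex", "modified_at": datetime(2009, 7, 2)},
]
bulk_started = datetime(2009, 7, 3)
target = {r["id"]: dict(r) for r in source}  # the long-running bulk copy

# While the bulk copy runs, the source system keeps changing.
source[1] = {"id": 2, "name": "Globex Ltd", "modified_at": datetime(2009, 7, 10)}
source.append({"id": 3, "name": "Initech", "modified_at": datetime(2009, 7, 12)})

# Catch-up pass: apply only what changed since the bulk copy started.
watermark = catch_up(source, target, bulk_started)
print(sorted(target), watermark)
```

The catch-up pass can be repeated with the returned watermark until the remaining delta is small enough to apply within the planned application downtime.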

Data volume, value (the value the existing data provides versus its value in the new system), source and target structures, performance, storage media, multivendor (different source and target) hardware, application downtime and migration process (manual versus automatic) should all be considered in the careful design and development of the migration strategy. Careful thought should also be given to whether history needs to be built, rather than just a simple migration.

Data migration is not a simple task, and if it is not given due consideration early in the process of developing or purchasing a new system, it could become very expensive.

 

  • Narasimha Murthy
    Narasimha Murthy has 11 years of experience with Tata Consultancy Services, Ltd. For the last eight years he has worked in the Business Intelligence and Performance Management Practice supporting various customers.
