Business Intelligence Introduction
1. What is Business Intelligence (BI)?
1) General observation
Companies use a myriad of applications and tools to manage their day-to-day business (CRM, ERP, office suite). While each of these applications can store, analyse or modify certain types of data, they are not necessarily compatible with each other. In addition, each service, team or department may use its own range of applications, sometimes different from those used by other entities in the company. The volume of data collected or created, the lack of standardisation and the multiplicity of applications make it difficult for the company’s decision-makers to exploit and analyse the data as a whole.
Business Intelligence (BI) refers to the means, tools and methods used to collect, consolidate, model and present a company’s data, whether tangible or intangible, in order to support decision-making and give decision-makers an overall view of the activity concerned.
After a presentation of the different aspects of BI, we will show how the V-cycle methodology and Agility can be applied to a BI project.
2. The decision-making information chain and its components
1) The decision-making information chain
The decision-making information chain consists of different phases:
1. The gathering phase
The gathering phase consists of detecting, selecting, extracting, transforming, and loading into a data warehouse (DWH) all the raw data from the various information storage sources (databases, flat files, business applications, etc.).
This phase is generally carried out using an ETL (Extract, Transform, Load) tool. Using connectors, the ETL can extract large sets of data of different types; using transformers, it then manipulates the data to aggregate it and make it mutually consistent. The features and advantages of ETLs are covered in more detail later.
2. The modelling phase
Once the data has been centralised, the modelling phase consists of storing and structuring the data in a unified space (the data warehouse) so that it is available for decision-making use. This phase is also carried out using ETL tools via connectors that allow writing to the data warehouse.
The data can again be filtered and transformed to ensure the consistency of the data warehouse as a whole. Finally, during this phase the stored data can be pre-processed through calculations or aggregations, so that analysis tools can access it more easily.
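As an illustration of the pre-processing described above, here is a minimal sketch of a pre-computed aggregate. It assumes a SQLite database standing in for the data warehouse; the table names (fact_sales, agg_sales_monthly), the columns and the figures are all hypothetical.

```python
import sqlite3

# Hypothetical detail-level table; names and figures are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fact_sales (sale_date TEXT, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO fact_sales VALUES (?, ?, ?)",
    [
        ("2020-01-15", "North", 100.0),
        ("2020-01-20", "North", 50.0),
        ("2020-01-22", "South", 70.0),
        ("2020-02-03", "South", 30.0),
    ],
)

# Pre-compute a monthly aggregate once, so that analysis tools can read a
# small summary table instead of scanning every detail row.
conn.execute(
    """
    CREATE TABLE agg_sales_monthly AS
    SELECT substr(sale_date, 1, 7) AS month, region, SUM(amount) AS total_amount
    FROM fact_sales
    GROUP BY substr(sale_date, 1, 7), region
    """
)

monthly_totals = conn.execute(
    "SELECT month, region, total_amount FROM agg_sales_monthly ORDER BY month, region"
).fetchall()
```

A real warehouse would refresh such a table as part of the loading process; the sketch only shows the principle of trading storage for faster reads.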
3. The restitution phase
The restitution phase aims to make the data available to users, taking into account their profile and their business needs. Direct access to the data warehouse is not allowed, since the objective is to segment and distribute the collected data so that it matches the user’s profile and is easy to use.
During this phase, new calculations can be made on the data to meet specific user needs. Many different tools are available for the restitution phase, including reporting tools, portals for accessing dashboards, tools for navigating OLAP cubes (or hypercubes) and statistical tools.
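The idea of segmenting data by user profile, rather than exposing the warehouse directly, can be sketched as follows. This is only one possible approach, assuming SQLite views; the profiles, view names and the fetch_for_profile helper are hypothetical and not taken from any particular BI tool.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dwh_sales (region TEXT, product TEXT, amount REAL, margin REAL)")
conn.executemany(
    "INSERT INTO dwh_sales VALUES (?, ?, ?, ?)",
    [("North", "A", 100.0, 20.0), ("North", "B", 50.0, 5.0), ("South", "A", 70.0, 14.0)],
)

# One view per user profile: each exposes only the columns and rows that
# profile needs, instead of giving direct access to the warehouse table.
conn.execute(
    "CREATE VIEW v_sales_rep_north AS "
    "SELECT product, amount FROM dwh_sales WHERE region = 'North'"
)
conn.execute(
    "CREATE VIEW v_finance AS "
    "SELECT region, SUM(amount) AS revenue, SUM(margin) AS margin "
    "FROM dwh_sales GROUP BY region"
)

# Hypothetical mapping from user profile to the view that profile may read.
PROFILE_VIEWS = {"sales_rep_north": "v_sales_rep_north", "finance": "v_finance"}

def fetch_for_profile(profile):
    """Return the rows visible to a profile; unknown profiles are rejected."""
    view = PROFILE_VIEWS.get(profile)
    if view is None:
        raise PermissionError(f"no restitution view for profile {profile!r}")
    # The view name comes from the fixed mapping above, never from user input.
    return conn.execute(f"SELECT * FROM {view}").fetchall()
```

Here the finance profile sees totals per region, including margin, while the sales profile sees only the product lines of its own region.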
4. The analysis phase
In the analysis phase, end users analyse the information provided to them. Usually, the data is queried to build dashboards or reports with business intelligence tools (Power BI, Tableau, QlikView, etc.).
The aim of this phase is to provide the user with the best possible assistance in analysing the information provided and making decisions. This includes controlling access to reports, supporting queries and visualising results.
2) Focus on ETL (Extract, Transform, Load)
As presented in the decision-making information chain, the data processed can come from many different sources (ERP, social networks, CRM, emailing software, connected objects, etc.). The constant increase in the amount of data collected can of course be an opportunity for companies, provided that they know how to exploit it. ETL tools meet this need by:
- Extracting data from different information sources, using connectors or APIs. Note that this concerns only data that already exists in the company’s information system (IS): ETLs are not intended to collect new data. Business rules can also be used to extract only part of the data.
- Transforming this data: the multitude of data sources and formats explains why companies find the data difficult to exploit. The ETL transforms the raw data collected in order to clean, conform, standardise, document, correct and, if necessary, deduplicate it. This crucial step makes all the data mutually compatible and conformant to the target format (defined according to the use cases that the company wants to implement in the DWH). In practice, this operation often consists of combining data tables to create “super-tables”.
- Loading the transformed data into the DWH: once the data has been transformed, it is sent to the DWH. Unlike a data lake, which holds raw, unaggregated and often unstructured data, the data loaded into the DWH is organised and structured on the basis of a common language.
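The three steps can be sketched in a few lines of Python. This is a deliberately small illustration, not how a commercial ETL tool is implemented: the two sources (CRM_CSV, ERP_ROWS), their field names and the dim_customer target table are all invented for the example, and an in-memory SQLite database stands in for the DWH.

```python
import csv
import io
import sqlite3

# Two hypothetical sources with different formats (illustrative data only).
CRM_CSV = "customer_id,name,country\n1, Alice ,fr\n2,Bob,FR\n2,Bob,FR\n"
ERP_ROWS = [{"id": "3", "full_name": "Chloe", "country_code": "de"}]

def extract():
    """Extract: read the raw records from each source as they are."""
    crm_rows = list(csv.DictReader(io.StringIO(CRM_CSV)))
    return crm_rows, ERP_ROWS

def transform(crm_rows, erp_rows):
    """Transform: conform both sources to one target format and clean them."""
    conformed = [
        (int(r["customer_id"]), r["name"].strip(), r["country"].strip().upper())
        for r in crm_rows
    ] + [
        (int(r["id"]), r["full_name"].strip(), r["country_code"].strip().upper())
        for r in erp_rows
    ]
    # dict.fromkeys drops exact duplicates while keeping first-seen order.
    return list(dict.fromkeys(conformed))

def load(rows, conn):
    """Load: write the conformed rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS dim_customer "
        "(customer_id INTEGER PRIMARY KEY, name TEXT, country TEXT)"
    )
    conn.executemany("INSERT INTO dim_customer VALUES (?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(transform(*extract()), conn)
customers = conn.execute(
    "SELECT customer_id, name, country FROM dim_customer ORDER BY customer_id"
).fetchall()
```

After the run, the duplicated CRM row appears only once, names are trimmed and country codes share one format, whichever source they came from.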
The main advantages of using an ETL:
- Better exploitation of company data, even in complex environments with multiple data sources
- An exhaustive view of all the data available to the company, when the ETL is coupled with a DWH
- Greater efficiency for IT teams, as ETL processes feed the DWH automatically, in some cases in near real time (less hand-written scripting, thanks to code generation)
- Greater agility of the IT architecture, as ETLs can interface with a wide range of data sources
- Increased responsiveness, thanks to the additional services offered with ETL solutions, such as sending alerts to warn administrators of problems
- Ease of use, thanks to the graphical representation of the scripts used
3) Focus on the data warehouse and the data mart
Data warehouse: the place where all the data used by the decision support information system is stored. It gives decision support applications a source of homogeneous, common, standardised and reliable information. In addition, the data warehouse keeps the operational system and the decision support system strictly separate, and therefore reduces the risk of decision support tools affecting the performance of the operational system in place. The data warehouse must follow several key principles:
- Business-oriented: the structure of the data warehouse must be designed according to the needs of the users.
- Non-volatile: data should never be rewritten or deleted; it is static, and users have read-only access.
- Integrated: the data warehouse contains most, if not all, of the company’s data, which must be reliable and mutually consistent.
- Historical: all additions and changes in the data warehouse must be recorded and dated.
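The non-volatile and historical principles can be illustrated with an append-only table in which every change is stored as a new dated row. The sketch below assumes SQLite and uses its trigger mechanism to reject updates and deletes; the customer_history table and the helper functions are hypothetical, and real warehouses usually enforce these rules through permissions and loading conventions rather than triggers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer_history (
        customer_id INTEGER,
        city        TEXT,
        recorded_at TEXT      -- date on which this version was loaded
    );
    -- Non-volatile: reject any attempt to rewrite or delete stored rows.
    CREATE TRIGGER no_update BEFORE UPDATE ON customer_history
    BEGIN SELECT RAISE(ABORT, 'warehouse rows are read-only'); END;
    CREATE TRIGGER no_delete BEFORE DELETE ON customer_history
    BEGIN SELECT RAISE(ABORT, 'warehouse rows are read-only'); END;
    """
)

def record_version(customer_id, city, recorded_at):
    """Historical: a change is stored as a new dated row, never an overwrite."""
    conn.execute(
        "INSERT INTO customer_history VALUES (?, ?, ?)",
        (customer_id, city, recorded_at),
    )

record_version(1, "Paris", "2019-03-01")
record_version(1, "Lyon", "2020-06-15")  # the customer moved; both versions are kept

def city_as_of(customer_id, date):
    """Return the city recorded for the customer on or before the given date."""
    row = conn.execute(
        "SELECT city FROM customer_history "
        "WHERE customer_id = ? AND recorded_at <= ? "
        "ORDER BY recorded_at DESC LIMIT 1",
        (customer_id, date),
    ).fetchone()
    return row[0] if row else None
```

Because earlier versions are never overwritten, a report can be reproduced as it would have looked on any past date.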
Data mart: a small “shop” of data; together, the data marts form the data warehouse. A data mart is a subset of the data warehouse and therefore follows the same key principles. The difference between the two is that a data mart meets a more specific business need than the data warehouse as a whole.
There are two main approaches to data warehouse modelling:
- The Kimball method: a so-called bottom-up approach in which data marts are first built according to the activities or entities of the company. There could therefore be one data mart for finance, one for sales and another for human resources. The information within these data marts is not normalised (it is kept in a denormalised form). The data warehouse is then designed as the combination of the different data marts.
- The Inmon method: a top-down approach in which the data warehouse is built first, with all the company’s available data. The data marts are then designed according to the company’s areas of activity or entities.
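The difference in direction between the two approaches can be shown with a highly simplified sketch, again assuming SQLite. All table and view names are hypothetical, and the example ignores what makes each method work in practice (in particular how the data is modelled inside the marts and the warehouse); it only shows which object is derived from which.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Top-down (Inmon-style): the warehouse table exists first ...
conn.execute("CREATE TABLE dwh_facts (department TEXT, metric TEXT, value REAL)")
conn.executemany(
    "INSERT INTO dwh_facts VALUES (?, ?, ?)",
    [("finance", "revenue", 500.0), ("sales", "orders", 42.0), ("hr", "headcount", 12.0)],
)
# ... and each data mart is then derived from it as a subset.
conn.execute(
    "CREATE VIEW mart_finance AS "
    "SELECT metric, value FROM dwh_facts WHERE department = 'finance'"
)

# Bottom-up (Kimball-style): the data marts are built first ...
conn.execute("CREATE TABLE mart_sales (metric TEXT, value REAL)")
conn.execute("CREATE TABLE mart_hr (metric TEXT, value REAL)")
conn.execute("INSERT INTO mart_sales VALUES ('orders', 42.0)")
conn.execute("INSERT INTO mart_hr VALUES ('headcount', 12.0)")
# ... and the warehouse is then defined as their combination.
conn.execute(
    """
    CREATE VIEW dwh_from_marts AS
    SELECT 'sales' AS department, metric, value FROM mart_sales
    UNION ALL
    SELECT 'hr' AS department, metric, value FROM mart_hr
    """
)
```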
Alongside his work on data warehouse modelling, Kimball also proposed a BI project management approach in the book The Data Warehouse Lifecycle Toolkit (1998). It puts forward three concepts that will be developed later in the context of the V-cycle and agile methods.
4) Focus on data visualisation (dataviz)
Data visualisation consists of visually representing raw data, in an interactive approach, with the aim of simplifying its handling, understanding and control. This approach marks a desire for openness: data (and its analysis) is no longer a ‘closed’ domain, exclusively reserved for experts (data scientists, data analysts, etc.), but is open to a wider range of professions, which are less familiar with it and more distant from its inherent complexity.
The ever-increasing volumes and variety of data available to companies and the demand for more flexible and rapid analysis mean that the data management strategy, and consequently the tools made available, must be reviewed.
Dataviz tools are part of this reflection because they aim to simplify the accessibility, understanding and interpretation of data arriving at the end of the decision-making chain.
These tools allow a self-service approach to analysis: users choose the angle from which they want to study the data, and are given autonomy and flexibility over the measures, dimensions and form of the representations (centred on a visual, schematic presentation of the data), which they can adjust as they wish.
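The notion of choosing measures and dimensions freely can be made concrete with a small sketch. It assumes the data arrives as a list of rows; the ROWS sample, its field names and the summarise helper are hypothetical, and a dataviz tool would of course render the result as a chart rather than a dictionary.

```python
from collections import defaultdict

# Hypothetical rows as they might arrive at the end of the decision-making chain.
ROWS = [
    {"region": "North", "product": "A", "revenue": 100.0, "units": 4},
    {"region": "North", "product": "B", "revenue": 50.0, "units": 1},
    {"region": "South", "product": "A", "revenue": 70.0, "units": 3},
]

def summarise(rows, dimension, measure):
    """Total the chosen measure for each value of the chosen dimension."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[dimension]] += row[measure]
    return dict(totals)

# The same data studied from two angles, chosen by the user.
by_region = summarise(ROWS, "region", "revenue")
by_product = summarise(ROWS, "product", "units")
```

Switching angle only means changing the two arguments; no new extraction or modelling work is needed.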
In short, data visualisation tools make it possible to:
- Centralise the display of data in one place (accessibility)
- Combine and cross-reference data from various sources
- Understand the data simply and quickly by making sense of it
- Create and share customised dashboards
With the aim of:
- Measuring performance and identifying notable trends
- Helping decision-makers in their strategic orientation and fostering innovation
- Optimising organisations and turnover
Data visualisation is an essential component in the decision-making information chain. It is an integral part of a company’s strategy for making data available and analysing it. Companies with a well thought-out and lasting ambition around data are keen to use data visualisation tools.
Software publishers have shown clear enthusiasm, and data visualisation solutions of varying maturity have multiplied on the market. Most of the long-established data/BI publishers (IBM, Microsoft, etc.) are now also rounding out their offerings with data visualisation tools.
This methodological approach, beyond the tools, is perfectly in tune with the times and contributes to the democratisation of data.
3. Why is it called a Business Intelligence (BI) project?
1) Specific features of a BI project
Most of the time, BI projects are carried out within organisations’ IT departments. It is also common for companies that run such projects to have a team dedicated to BI. Two characteristics define a BI project: its cross-functional nature and the heterogeneity of the needs it serves. While these characteristics may be found in other IT projects, they are always present in BI projects.
Cross-functional and global nature of the project: one of the specific features of a BI project is that it cuts across the organisation. BI is used in all the company’s business lines, from finance to logistics, sales and human resources. To succeed, a BI project must therefore be run globally, with all the business lines concerned, so that all their needs are taken into account. The project team must have a very good knowledge of business processes and needs, and collaboration between the IS department and the business is one of the keys to the success of BI projects.
Heterogeneity of needs: as seen previously, and unlike some IT projects, BI affects all the company’s business lines. However, each business line has its own data and information needs. One of the difficulties of BI projects is that business needs are numerous and heterogeneous. This implies a dependence on different external data sources, which the BI project manager will have to manage.
2) Key success factors for a BI project: The GIMSI method
As we will see later, the GIMSI method is compatible with both the V-cycle and Agility: it is a methodological framework that formalises the conditions for a successful BI project. The acronym GIMSI stands for:
- Generalisation: the method can be applied to different areas (administration, marketing, production) and by all types of organisation (SME, association, large company)
- Information: access to the ‘right’ information is the pillar of decision-making
- Method and Measurement: GIMSI is a method whose underlying principle is Measurement
- System and Systemic: the objective of the method is to set up a Decision Support Information System integrated into the company’s Information System
- Individuality and Initiative: the method encourages the autonomy of individuals to take the initiative
The GIMSI method is based on 4 phases:
- Identification: What is the context? In this phase, the following are studied:
  - The external environment of the company: competitive environment, politics, economy, technology
  - The internal environment: company strategy, organisation, and the processes or activities to be included in the project (logistics, sales, HR, etc.)
- Design: What needs to be done? In this phase, all the solution design activities are carried out:
  - The objectives of the BI project: what is the purpose of the project, and what are the expected benefits?
  - Information structure: what information will be available for each selected process, and in what format?
  - Data requirements: identification and collection of the data required to provide the information defined above
  - Overall coherence: does the overall design achieve the objectives? Are there redundancies in the data collected?
- Implementation: How to do it? In this phase, the solution is implemented with:
  - The choice of BI software to support the project
  - Deployment of the software in the company
- Continuous improvement: Does the system still meet expectations? Following deployment, the system and new user needs are regularly analysed to ensure that the solution implemented meets these new needs.
4. Conclusion
Business Intelligence (BI) provides companies with tools to facilitate strategic or operational decision-making. It uses data analysis as a basic principle.
The data is extracted, transformed and loaded using a specific tool, the ETL. It is then stored in a data warehouse, and the resulting information is presented via restitution tools. This is what is known as the decision-making information chain.
Decision-making projects are characterised by their cross-functionality, as they involve several business lines and departments within the company. The data used is often heterogeneous according to user needs, which implies a dependence on external sources.
The Headmind Partners “Data Solution” consultancy service helps customers with their digital transformation around business intelligence. Our consultants provide their expertise to implement BI tools adapted to our customers’ environment and business problems.
References
Eiden, F. “BI project management: Agile methods or V-cycle?” BI is winning you over! 2012. [Accessed 05/08/20]. Available at: https://fleid.net/tag/cycle-en-v/
Eiden, F. “BI project management: keep your users close to you!” BI is winning you over! 2012. [Accessed 05/08/20]. Available at: https://fleid.net/2011/12/20/gestion-de-projet-decisionnel-gardez-vos-utilisateurs-proches-de-vous/
Boyer, C. “The V-cycle” SupInfo International University. 2017. [Accessed 05/08/20]. Available at: https://www.supinfo.com/articles/single/6278-cycle-v
Rigaud, L. “BI Project Management” Course, Business Intelligence Option, CY TECH (EISTI). 2012.
Kimball, R. “The Data Warehouse Lifecycle Toolkit” 1998.
Deltil, E., & Pereira, G. “BI – Business Intelligence: Presentation” Université de Marne La Vallée. 2017. Available at: https://www-igm.univ-mlv.fr/~dr/XPOSE2006/DELTIL_PEREIRA/presentation.html#need
Cartelis. “Comprehensive ETL Software Comparison: Cloud vs. On-Premise vs. Open Source” Cartelis. 2018. Available at: https://www.cartelis.com/blog/comparatif-logiciels-etl/
Supinfo School of Computer Science. “Understanding the steps of a BI process” Supinfo. 2017. Available at: https://www.supinfo.com/articles/single/3548-comprendre-etapes-processus-bi
Fernandez, A. “The GIMSI method” Piloter.org. 2018. Available at: https://www.piloter.org/mesurer/methode/fondamentaux_gimsi.htm
Fernandez, A. “The new managerial dashboards: the whole decision-making project”. 2011.
Wikiversity. “Strategic project management: Decision-making project” Wikiversity. Available at: https://fr.wikiversity.org/wiki/Gestion_de_projet_strat%C3%A9gique/Projet_d%C3%A9cisionnel
Lau, S., & Sabatier, J. “Business Intelligence: The place of BI and the management of BI projects in large French organisations” CIGREF. 2009. Available at: https://www.celge.fr/wp-content/uploads/2015/09/Business_Intelligence_CIGREF_2009.pdf
Sodifrance. “Stages and concepts of a BI project” Sodifrance. 2009. Available at: https://blog.sodifrance.fr/les-etapes-et-notions-dun-projet-bi-2/
Deltil, E., & Pereira, G. “BI – Business Intelligence: The steps of the process” Université de Marne La Vallée. 2017. Available at: https://www-igm.univ-mlv.fr/~dr/XPOSE2006/DELTIL_PEREIRA/process.html#collection
Lecomte, S. “BI project: 4 steps to success” Alphalyr. 2019. Available at: https://alphalyr.fr/blog/projet-bi-etapes-pour-reussir/
Cartelis. “Data Warehouse Architecture – Traditional vs. Cloud Approaches” Cartelis. 2018. Available at: https://www.cartelis.com/blog/architecture-data-warehouse/
Tehreem, N. “Data warehouse concepts: Kimball vs. Inmon approach” Astera. 2020. Available at: https://www.astera.com/fr/type/blog/data-warehouse-concepts/