Designing BI Solutions in the Era of Big Data
3. Proposed Methodology
3.2. Guideline
This section gives the detailed guideline that describes the proposed model.
Step 1 EA fulfilment: According to the structure defined in The Zachman Framework: The Official Concise Definition, the Zachman EA must be completed row by row, where each row represents a higher level than the row that follows it. Nevertheless, there are strong dependencies among the elements of the columns. Table 1 shows the proposed dependencies between cells. The order in which the cells must be filled in depends on the relationships between them.
Fig. 1. ELTA Proposed Model
Step 2 Extract and Load Processes: Based on the information defined in the EA in Step 1, all the information the business needs is extracted from heterogeneous data sources and loaded into a big data storage. This step should be implemented by the IT users.
Table 1. Proposed rules to fulfil Zachman EA.
| | What | How | Where | Who | When |
|---|---|---|---|---|---|
| Scope Contents | A1 | B1 | C1 | D1 | E1 |
| Business Concepts | A2=(A1) | B2=(B1+A2) | C2=(C1+B2) | D2=(D1+B2+C2) | E2=(E1+A2+C2) |
| System Logic | A3=(A2+B2+F2) | B3=(B2+F2) | C3=(C2+A3+B3) | D3=(D2+F2+B3) | E3=(E2+B3+C3) |
| Technology Physics | A4=(A3) | B4=(B3+A4) | C4=(C3+A4+B4) | D4=(D3+A4+B4) | E4=(E3+D4) |
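Since the rules in Table 1 form a dependency graph, one valid fill order can be derived with a topological sort. The following is a minimal sketch of that idea, not part of the proposed methodology itself. The function name `fill_order` is hypothetical, and F2, which the rules reference but which has no cell in Table 1 as shown, is assumed here to be an external input.

```python
from graphlib import TopologicalSorter

# Dependencies transcribed from Table 1: cell -> cells it depends on.
# F2 is referenced by the rules but has no cell in Table 1, so this
# sketch treats it as an external input with no dependencies (assumption).
DEPENDENCIES = {
    "A1": [], "B1": [], "C1": [], "D1": [], "E1": [],
    "A2": ["A1"],
    "B2": ["B1", "A2"],
    "C2": ["C1", "B2"],
    "D2": ["D1", "B2", "C2"],
    "E2": ["E1", "A2", "C2"],
    "A3": ["A2", "B2", "F2"],
    "B3": ["B2", "F2"],
    "C3": ["C2", "A3", "B3"],
    "D3": ["D2", "F2", "B3"],
    "E3": ["E2", "B3", "C3"],
    "A4": ["A3"],
    "B4": ["B3", "A4"],
    "C4": ["C3", "A4", "B4"],
    "D4": ["D3", "A4", "B4"],
    "E4": ["E3", "D4"],
}


def fill_order(dependencies):
    """Return one valid order in which the cells can be filled in."""
    # TopologicalSorter expects node -> predecessors and raises
    # graphlib.CycleError if the rules contain a circular dependency.
    return list(TopologicalSorter(dependencies).static_order())


order = fill_order(DEPENDENCIES)
print(order)
```

Several orders can satisfy the same rules; the sort returns only one of them, and a `CycleError` would indicate that the rules cannot be satisfied as written.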
Step 3 Management Control Tool: The main goal of this step is to define all the information needed for the decision-making process. The idea is to use data from the big data storage to create a new global indicator for the Balanced Scorecard perspective, as described in Compensatory Fuzzy Logic Uses in Business Indicators Design. This reduces the gap between the strategic and tactical levels, because it shows how the indicators from the different management levels are linked, and it improves enterprise knowledge. The methodology includes a Principal Component Analysis (PCA) step to discover the correlations among all the indicators. This step should be performed by the business users.
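The PCA step can be illustrated with a small sketch that computes the correlation matrix of a few indicators and extracts its first principal component. The indicator names and values below are invented for illustration, and power iteration is used only to keep the example dependency-free; a production analysis would more likely rely on a numerical library.

```python
import math

# Hypothetical indicator series over six periods (illustrative values only).
indicators = {
    "revenue": [10, 12, 13, 15, 16, 18],
    "satisfaction": [0.60, 0.65, 0.64, 0.70, 0.74, 0.78],
    "defect_rate": [0.09, 0.08, 0.08, 0.06, 0.05, 0.04],
}


def standardize(columns):
    """Scale each indicator to zero mean and unit sample variance."""
    result = []
    for col in columns:
        mean = sum(col) / len(col)
        std = math.sqrt(sum((v - mean) ** 2 for v in col) / (len(col) - 1))
        result.append([(v - mean) / std for v in col])
    return result


def correlation_matrix(columns):
    """Pearson correlation between every pair of indicators."""
    z = standardize(columns)
    n = len(z[0])
    return [[sum(a * b for a, b in zip(zi, zj)) / (n - 1) for zj in z] for zi in z]


def first_component(matrix, iterations=200):
    """Dominant eigenvector and eigenvalue of a symmetric matrix (power iteration)."""
    # An uneven start vector avoids starting exactly on a minor eigenvector.
    vec = [1.0 / (i + 1) for i in range(len(matrix))]
    for _ in range(iterations):
        nxt = [sum(m * v for m, v in zip(row, vec)) for row in matrix]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    # Rayleigh quotient of the unit vector gives the eigenvalue.
    eigenvalue = sum(v * sum(m * w for m, w in zip(row, vec)) for v, row in zip(vec, matrix))
    return vec, eigenvalue


corr = correlation_matrix(list(indicators.values()))
loadings, eigenvalue = first_component(corr)
# For a correlation matrix the eigenvalues sum to the number of indicators.
explained_share = eigenvalue / len(indicators)

for name, loading in zip(indicators, loadings):
    print(f"{name}: loading {loading:+.3f}")
print(f"share of variance explained by the first component: {explained_share:.3f}")
```

Indicators whose loadings on the same component have the same sign move together, and opposite signs indicate an inverse relationship, which is the kind of link between indicators this step aims to expose.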
Step 4 Transformation Process: The main goal of this step is to transform the data according to the information needs of the decision-making process. Based on the big data storage from Step 2 of this guideline and the indicators defined in Step 3, it is possible to determine which transformations are necessary to support all the business report requirements. This step should be implemented by the IT users.
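The ordering of Steps 2 and 4, where data is loaded untransformed and transformed only once the indicators are known, can be sketched as follows. This is an illustration under stated assumptions: an in-memory SQLite database stands in for the big data storage, and the sources, table names, and the "revenue per period" indicator are all hypothetical.

```python
import sqlite3

# Raw records as they might arrive from two hypothetical sources.
crm_rows = [("2024-01", "north", "1200.50"), ("2024-01", "south", "800.00")]
erp_rows = [("2024-02", "north", "1500.00"), ("2024-02", "south", "n/a")]

conn = sqlite3.connect(":memory:")  # stand-in for the big data storage

# Step 2 (extract and load): store the data as-is, tagged with its source.
conn.execute(
    "CREATE TABLE raw_sales (source TEXT, period TEXT, region TEXT, amount TEXT)"
)
conn.executemany("INSERT INTO raw_sales VALUES ('crm', ?, ?, ?)", crm_rows)
conn.executemany("INSERT INTO raw_sales VALUES ('erp', ?, ?, ?)", erp_rows)

# Step 4 (transform): apply only what the indicator needs, inside the
# storage itself: cast amounts to numbers and skip non-numeric values.
conn.execute("""
    CREATE TABLE revenue_by_period AS
    SELECT period, SUM(CAST(amount AS REAL)) AS revenue
    FROM raw_sales
    WHERE amount GLOB '[0-9]*'
    GROUP BY period
""")

revenue = conn.execute(
    "SELECT period, revenue FROM revenue_by_period ORDER BY period"
).fetchall()
print(revenue)
```

Because the raw table is kept unchanged, a new or redefined indicator only requires another transformation query rather than a new extraction from the sources.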
Step 5 Virtual Data Mart Layer: The main goal of this step is to define several virtual data marts in accordance with the business report requirements. An in-memory approach is used to accelerate the creation and use of the data marts. Because the data marts are held in memory, this solution brings more flexibility and better performance.
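One possible way to realise a virtual data mart is as a view over in-memory data, so that the mart stores a query rather than a copy of the rows. The text does not say how the virtual data marts are implemented, so the sketch below is only an assumption, with an in-memory SQLite database and hypothetical table and view names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory database, nothing written to disk
conn.execute(
    "CREATE TABLE sales (period TEXT, region TEXT, product TEXT, revenue REAL)"
)
conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", [
    ("2024-01", "north", "widget", 1200.0),
    ("2024-01", "south", "widget", 800.0),
    ("2024-02", "north", "gadget", 1500.0),
])

# The view stores only the query; its rows are computed when it is read.
conn.execute("""
    CREATE VIEW mart_revenue_by_region AS
    SELECT region, SUM(revenue) AS revenue
    FROM sales
    GROUP BY region
""")

query = "SELECT region, revenue FROM mart_revenue_by_region"
before = dict(conn.execute(query))

# New data is visible through the mart immediately, without a reload.
conn.execute("INSERT INTO sales VALUES ('2024-02', 'south', 'gadget', 700.0)")
after = dict(conn.execute(query))

print(before)
print(after)
```

Defining a further data mart for another report requirement then amounts to adding another view, which is where the flexibility of a virtual layer comes from.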
Step 6 Develop BI System: The main goal of this step is to develop a BI system. Based on the structure of the data marts, it is necessary to define the OLAP schema and the reports specified by the business users. A wide variety of tools is available for building BI solutions. One of the most popular is Pentaho BI Suite, owing to its BI features and licensing policies. In the authors' experience, however, greater flexibility can be achieved in a BI solution by combining Pentaho BI Suite with other tools, such as Birt Report.
Step 7 Analysis: The main goals of this step are to analyse most of the available information in order to support the decision-making process, and to discover new patterns in the business by using data mining techniques. This helps in (a) redefining the indicators in the Balanced Scorecard, if necessary, and (b) supporting the decision-making process. For this step, either a third-party tool such as Weka or the analytical facilities integrated into the big data platform can be used.