Temporal Data Update Methodologies for Data Warehousing

Businesses and institutions must collect and store temporal data for accountability and traceability. This paper presents an approach to transaction lineage that covers how data can be stored at different timestamp granularities and how data warehouses can be refreshed with time-varying data in batch cycles. Identify three ways transaction lineage can be used and explain how each is relevant to temporal data. In which industries do you think transaction lineage will always be relevant, and how?

Abstract

Data warehouse applications and business intelligence (BI) communities increasingly need to collect temporal data for accountability and traceability. This requires the ability to capture and view transaction lineage in data warehouses based on timestamps. Data as of a point in time can then be obtained across multiple tables or multiple subject areas, which resolves consistency and synchronization issues. In this article we discuss implementation topics such as temporal data update methodologies, the coexistence of load and query operations against the same table, the performance of load and report queries, and the maintenance of views on top of tables that hold temporal data. We show how to pull data from heterogeneous sources, how to perform join operations with inequality predicates across more than one source table, and how to load the results into analytical subject area tables while maintaining transaction lineage. We propose several business views based on timestamp filters for use by different applications.
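
The point-in-time retrieval and inequality-predicate join mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the article's implementation: the table contents, the column names (`eff_ts`, `exp_ts`), the `HIGH_TS` open-ended marker, and the helper functions `as_of` and `temporal_join` are all assumptions made for illustration. Each row version is assumed to carry the half-open interval during which it was current.

```python
from datetime import datetime

# Hypothetical marker for "still current" row versions (an assumption, not from the article).
HIGH_TS = datetime(9999, 12, 31)

# Illustrative source tables: each row version is valid for [eff_ts, exp_ts).
customer = [
    {"cust_id": 1, "segment": "retail",
     "eff_ts": datetime(2024, 1, 1), "exp_ts": datetime(2024, 3, 1)},
    {"cust_id": 1, "segment": "premium",
     "eff_ts": datetime(2024, 3, 1), "exp_ts": HIGH_TS},
]
account = [
    {"acct_id": 10, "cust_id": 1, "status": "open",
     "eff_ts": datetime(2024, 2, 1), "exp_ts": datetime(2024, 4, 1)},
    {"acct_id": 10, "cust_id": 1, "status": "closed",
     "eff_ts": datetime(2024, 4, 1), "exp_ts": HIGH_TS},
]


def as_of(rows, ts):
    """Timestamp filter: keep the row versions that were current at ts."""
    return [r for r in rows if r["eff_ts"] <= ts < r["exp_ts"]]


def temporal_join(left, right, key):
    """Join on key plus inequality predicates on the validity intervals.

    Two row versions match when their intervals overlap, that is when
    left.eff_ts < right.exp_ts and right.eff_ts < left.exp_ts. The joined
    row is valid only for the overlapping part of the two intervals.
    """
    joined = []
    for l in left:
        for r in right:
            if l[key] != r[key]:
                continue
            start = max(l["eff_ts"], r["eff_ts"])
            end = min(l["exp_ts"], r["exp_ts"])
            if start < end:  # the intervals overlap
                joined.append({**l, **r, "eff_ts": start, "exp_ts": end})
    return joined


# Load a combined "subject area" table that keeps the lineage of both sources.
customer_account = temporal_join(customer, account, "cust_id")

# A consistent view of both sources as of one point in time.
snapshot = as_of(customer_account, datetime(2024, 3, 15))
for row in snapshot:
    print(row["cust_id"], row["segment"], row["status"])
```

Because the joined table stores the overlapping interval for every combination of source versions, a single timestamp filter returns a state that is consistent across both sources, which is the kind of synchronization the abstract refers to.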


Source: Nayem Rahman, https://quod.lib.umich.edu/j/jsais/11880084.0002.103/--temporal-data-update-methodologies-for-data-warehousing?rgn=main;view=fulltext
This work is licensed under a Creative Commons Attribution 3.0 License.