by Prabu Varatharaju
What is Snapshot data?
Snapshot data is a copy of business data preserved at some point-in-time. In some cases we may not be able to recover older snapshot data from the current data sets because we need to preserve snapshots at regular (scheduled) intervals for historical analysis.
Some of the sample snapshot tables in the Oracle BI Applications Financials module:
- W_AP_AGING_INVOICE_A
- W_AP_AGING_SUPPLIER_A
- W_AR_AGING_CUSTOMER_A
- W_AR_AGING_INVOICE_A
These are key tables that are used to generate some important reports in the financials such as:
- Payables Aging Reports
- Receivables aging reports
- Trend Reports
- Invoice details by aging buckets
When do we go for fresh full loads in the data warehouse?
There are certain scenarios when the need arises to perform fresh full loads on an already loaded data warehouse. That could be due to any of the following reasons listed below
- Transaction source application upgrade
- Data warehouse application upgrade
- Data fixes or data consolidation in the transaction source system
- Extending data warehouse application(adding more modules)
What is the problem in restoring historical snapshots in Data Warehouse during full load?
In a transactional environment (OLTP), it’s pretty easy to restore snapshot data from backup after any system upgrade or revamp. However in a data warehouse environment (OLAP), it’s not as simple or straight forward to restore the snapshot data after a full load because all the old dimension keys preserved in the snapshot data become obsolete after a full load.
The problem is illustrated below using sample customer aging snapshot data:
Before Full Load
Customer Dim
Customer Aging Snapshot
After Full Load
Customer Dim
Note that after the full load, the key for Customer ‘C1’ is changed to 101 from 1. However, our snapshot backup still points to Cust Key ‘1’. Therefore the conventional data backup and restore concept may not work for snapshot data. To get this work, we can employ the following strategy:
Backup Snapshot Data With Native Codes
Create a backup table for snapshot data by converting Dim keys to native codes. We can use these native codes for lookup to get the dimension key while restoring snapshots.
Example:
Customer Aging Snapshot Backup
Extract-Transform-Load (ETL) Steps
The ETL step of creating snapshot backup tables is exactly the reverse process of populating the snapshot tables. All Dim keys from snapshot tables are converted back to native codes by lookups into the dimension table, comparing dim keys and pulling out native codes from the lookup table. See the screenshot below for more information:
Restore Snapshot Data After Full Load
After a full data warehouse load we can recover all the snapshot data from the backup table by reverting back the native codes with new dim keys. The screenshot below illustrates this process...
Customer Dim (After Full Load)
Customer Aging Snapshot (Recovered)
Conclusion
The Importance of this type of design is not recognized in the beginning of the project because the need arises only when we plan to reload the entire data warehouse. Engaging in this activity at a later stage increases the complexity because detailed analysis, design, development & re-testing of all the data and impacted reports needs to be conducted a second time. However, with proper planning and impact analysis, this can be carried out successfully.
Prabu Varatharaju is a Senior Consultant and Informatica ETL expert at KPI Partners who specializes in Oracle Business Intelligence Enterprise Edition (OBIEE), Oracle BI Applications. Check out Prabu's blog at KPIPartners.com. |