Foreign Key in Oracle Data Warehouse - Best Practice

My last article "Why should we create Primary and Foreign Keys (Constraints) in Database when data consistency is guaranteed by Application?" was well taken by many developers/DBAs, but some data-warehouse developers/DBAs did not receive it well. They argued that in Data-Warehouse some times it is impossible (very difficult) to create Foreign Keys on the Fact tables.

There were mainly following 2 reasons against Foreign Keys:
- Validation of Foreign Keys in Fact table is an overhead because ETL process has already verified the referential integrity of fact tables
- Foreign Keys in fact tables, that dimension tables need to be loaded first and then we can load fact tables. It restrict ETL process to load both Dimension and Fact tables in parallel.

Above both points are valid in-terms of ETL loads, but without Foreign key Oracle Optimizer will miss vital information. Foreign Keys enforce data consistency and validate fact tables has no other data than of dimension table. Foreign Keys helps Oracle Optimizer to generate best execution plan. It enables CBO to perform query transformation like "Join Elimination" and "Query Rewrite" on Materialized Views. Last but not the least, Foreign Keys are self documentation tool and having Foreign Keys enables developers to easily find tables relation.

So now the Million Dollar Question is, Does Oracle provides any solution where we can get all benefits on Foreign Keys and can avoid the restrictions like slowing down the ETL load?

Again the answer was already there in Oracle Documentation (Data Warehousing Guide):
https://docs.oracle.com/database/121/DWHSG/schemas.htm#DWHSG9053

There can be following states of any constraints
- DEFFERED / NOT DEFFERED - constraint checking can be deferred until the end of the transaction (COMMIT)
- ENABLE / DISABLE - An enabled constraint ensures that all new data modifications upon table satisfy the conditions of the constraints.
- VALIDATE / NOVALIDATE - Validated constraint ensure that all data that currently resides in the table satisfies the constraint.
- RELY / NORELY - The RELY state provides you with a mechanism to tell Oracle that a given constraint is believed to be true, even if it is in DISABLE and NOVALIDATE state.

RELY Constraints in a Data Warehouse are not used for data validation, but they:
- Enable more sophisticated Query Rewrites for materialized views.
- Enable Optimizer Query transformation (Join Elimination) by CBO.
- Enable tools to retrieve information regarding table relations directly from the Oracle.

Creating a RELY constraint is inexpensive and does not impose any overhead during ETL load. If Foreign Key is in RELY state, the Primary Key referred must also be in RELY state.

Now let's come to the practical part. Lets assume we have following basic Star Schema, each dimension table has Primary Key defined. Lets modify our tables as per suggestion of Oracle Data Warehousing Guide.


Step 1: Modify Primary Keys of all Dimension tables to enable RELY
nimish@garg> alter table d_date modify primary key rely;
Table altered

nimish@garg> alter table d_product modify primary key rely;
Table altered

nimish@garg> alter table d_supplier modify primary key rely;
Table altered

Step 2: Create Foreign Keys on Fact table with RELY disable novalidate
nimish@garg> alter table f_sales add FOREIGN KEY(d_date_id)
  2  REFERENCES d_date(d_date_id)
  3  RELY disable novalidate;
Table altered

nimish@garg> alter table f_sales add FOREIGN KEY(d_supplier_id)
  2  REFERENCES d_supplier(d_supplier_id)
  3  RELY disable novalidate;
Table altered

nimish@garg> alter table f_sales add FOREIGN KEY(d_product_id)
  2  REFERENCES d_product(d_product_id)
  3  RELY disable novalidate;
Table altered

The last thing to note here is to enable Query Rewrite and Join Elimination we need to set query_rewrite_enabled = TRUE and query_rewrite_integrity = TRUSTED for session or at system level. To conclude this article I must say RELY constraints are very powerful and unique feature of Oracle Database and is there from Oracle Database 8i (or before) but still very unknown. RELY constraints enables us to achieve all benefits on Foreign Keys and can avoid the restrictions which slows down the ETL processing.


Related Posts:
- Complex View Merging Transformation and ORA-00979 in Oracle 12c
- VIEW PUSHED PREDICATE - A Powerful Optimizer Transformation feature
- SQL Tuning: Partition Pruning and HASH JOIN - Real Life Example
- How to Get Execution Plan and Statistics of SQL Query
- Top 15 new features of Oracle Database 12.2 for developers

12 comments:

  1. Hi Nimish,

    Fisrt i would like to thank you to share this knowledge.
    I have query
    "How to Run Two different procedure inside different package at a time without DBMS_SCHEDULER job run."
    Please provide your suggestion .

    ReplyDelete
    Replies
    1. begin
      dbms_scheduler.create_job
      (
      .
      .
      job_type => 'PLSQL_BLOCK',
      job_action => 'begin pkg1.proc; pkg2.proc; end;',
      .
      .
      );
      end;
      /

      Delete
    2. Been on several projects where there were NO FK constraints. Mantra from java guys, "we handle integrity in the application(s), the db is just a data store ..." guess what? orphan records everywhere, data was a mess ... unreliable results from queries ...yada, yada, yada

      Delete
    3. as i mentioned in my last post: never trust application [developer] ;)

      Delete
  2. There is three models in data warehouse : ROLAP,MOLAP,HOLAP. for Relational OLAP(ROLAP) and Hybrid OLAP structure in Relational Databases you need relational model between fact tables and dimension tables as primary keys exists in dimension tables and foreign keys exists in fact tables. There is 3 Relational model : start schema , snow flake ,galaxy. These models are relational and relations between facts and dimensions is necessary so that use appropriate index for fields that used in relation are very important at Effitioncy and performance in Data warehouse environment

    ReplyDelete
  3. Data Warehouse Dimension tables should be used for ETL validation but take care permitting Foreign Keys on shared Dimension tables. If DDL column changes are required on those tables then the Foreign Keys constraints will have to be dropped before the table can be truncated and loaded. You can't truncate a table with Enabled Foreign Keys

    ReplyDelete
  4. When you say start schema the 'snowflake' way you are referring to a Star schema in the Data Warehouse terminology.

    ReplyDelete
  5. Very useful info. Hope to see more posts soon!. warehouse racking

    ReplyDelete
  6. Your work is very good and I appreciate you and hopping for some more informative posts. Thank you for sharing great information to us. facility management

    ReplyDelete
  7. I think this is an informative post and it is very useful and knowledgeable. therefore, I would like to thank you for the efforts you have made in writing this article. custom embroidered keychains

    ReplyDelete