Blog

Oracle Tips and Tricks to Enhance ETL Loads

Alex Franjie
March 22, 2022

One of the biggest problems I ran into as an ETL developer was loading large tables. When a nightly load or a full refresh of a large table runs long, it can spill into core business hours, leaving the data unavailable to customers.

The first step I would consider is optimizing both the select and the insert statements, and the /*+ PARALLEL */ hint lets Oracle read and write in parallel. Be careful with that hint, though: push the degree of parallelism past what your CPUs can handle and the query slows down instead of speeding up, which is why understanding the environment is crucial. Then, to further optimize the insert, the /*+ APPEND */ hint is vital; it requests a direct-path write and can save significant time writing to the table.
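
To make that concrete, a parallel direct-path load might look like the sketch below. The table names and the degree of parallelism of 8 are placeholders for illustration; tune the degree to what your CPUs can actually absorb.

    -- Allow the insert itself to run in parallel in this session
    ALTER SESSION ENABLE PARALLEL DML;

    -- Direct-path (APPEND) insert, reading and writing with parallel degree 8
    INSERT /*+ APPEND PARALLEL(t, 8) */ INTO orders_target t
    SELECT /*+ PARALLEL(s, 8) */ s.*
    FROM   orders_source s
    WHERE  s.load_date >= TRUNC(SYSDATE) - 1;

    COMMIT;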

Indexing the fields being joined on in source tables, along with any columns used to filter the data being ETL'ed, can significantly enhance the performance of the select. Disabling indexes, triggers, and constraints on the target tables and re-enabling them after the ETL completes can improve the performance of the writes; however, this is not always appropriate, and the indexes need to be rebuilt periodically. Partitioning the target tables allows you to run multiple inserts into different partitions simultaneously, and you can run those loads asynchronously in Oracle with DBMS_SCHEDULER.
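
As a rough sketch of that pattern (all object names here are hypothetical), you might mark an index unusable and disable a constraint before the load, rebuild and re-enable them afterwards, and use a DBMS_SCHEDULER job to kick off the load for each partition in the background:

    -- Make the writes cheaper: disable constraints and mark indexes unusable
    ALTER TABLE orders_target DISABLE CONSTRAINT orders_target_fk1;
    ALTER INDEX orders_target_ix1 UNUSABLE;
    ALTER SESSION SET skip_unusable_indexes = TRUE;

    -- ... run the ETL inserts here ...

    -- Rebuild and re-enable once the load completes
    ALTER INDEX orders_target_ix1 REBUILD PARALLEL 8 NOLOGGING;
    ALTER TABLE orders_target ENABLE CONSTRAINT orders_target_fk1;

    -- Kick off one partition's load asynchronously
    -- (etl_pkg.load_partition is a hypothetical procedure that loads a single partition)
    BEGIN
      DBMS_SCHEDULER.CREATE_JOB(
        job_name   => 'LOAD_ORDERS_P2022_03',
        job_type   => 'PLSQL_BLOCK',
        job_action => 'BEGIN etl_pkg.load_partition(''P2022_03''); END;',
        enabled    => TRUE);
    END;
    /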

Change Data Capture (CDC), a technology many developers are not familiar with, is a handy tool. CDC is achieved in Oracle using Oracle LogMiner, which reads the transaction logs and returns every insert, update, and delete on each table since the last time the logs were read, so users can keep tables updated in close to real time. There is a catch, though: if the system performs a lot of bulk transactions, the REDO logs are too small or too few, and/or the logs are not mined often enough, you can miss transactions because the REDO logs get overwritten, and an on-demand or weekly data refresh is then needed to pick up what was missed. An alternative is Oracle's exchange partition, which lets you load the data into a separate table and swap that table in for the target partition; the refreshed data becomes visible almost instantaneously without affecting the user community.
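
A minimal sketch of the exchange-partition approach, again with invented table and partition names:

    -- Build an empty staging table with the same shape as the target
    CREATE TABLE orders_stage AS
    SELECT * FROM orders_target WHERE 1 = 0;

    -- Load and index orders_stage at whatever pace is needed, then swap it in
    -- for the target partition in a single, near-instant DDL step
    ALTER TABLE orders_target
      EXCHANGE PARTITION p_2022_03 WITH TABLE orders_stage
      INCLUDING INDEXES WITHOUT VALIDATION;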

Final hints for ETL developers to consider: estimate how large the target table is going to be and how fast it will grow, then create the tables with INITIAL and NEXT extents that match your predictions. If a table has to extend its storage multiple times during an Oracle load, it slows things down. For example, if I anticipate that the first load is 2 GB and the table will grow 100 MB nightly, my INITIAL extent would be 2 GB and my NEXT would be 100 MB. Also ask your DBAs to increase the REDO log size and the database block size, especially for data warehouse databases.
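
For the 2 GB / 100 MB example, the storage clause would look something like the sketch below. The column definitions are placeholders, and keep in mind that locally managed tablespaces may treat NEXT only as a guideline, so check how your tablespaces manage extents.

    -- Pre-size extents to match the expected volume:
    -- 2 GB for the first load, roughly 100 MB of nightly growth
    CREATE TABLE orders_target (
      order_id    NUMBER,
      order_date  DATE,
      amount      NUMBER(12, 2)
    )
    STORAGE (INITIAL 2G NEXT 100M);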

 

About the Author 

Alex Franjie is a Section Manager and Senior Data Architect at BigBear.ai with 10 years of IT experience designing and implementing innovative business solutions. He has a wide range of skills and experience in enterprise IT solution design for on-prem, cloud, and hybrid environments, and he specializes in database design, including building data warehouses.
