IDC predicts that worldwide data will grow 61% to 175 zettabytes by 2025, with as much of that data residing in the cloud as in data centers.
Data retention policies based on value, risk, and compliance requirements are gaining significance as emerging technologies such as Big Data, analytics, and IoT gain prominence in the enterprise. The sheer volume of aged data, combined with rapidly growing new data from a variety of sources stored both on-premises and in the cloud, creates severe governance, security, and management challenges.
“Storing everything” is not an effective information governance practice. It may well be the reason your database application performance is slowing down, your costs are rising, and maintenance activities such as patching, backup and recovery, and cloning are taking longer.
Storage cleanup is key to the success of your EBS upgrade and data migration to the cloud. Over time, enterprises inevitably accumulate a growing mountain of information assets and unnecessary data or code, much of which is no longer of value to the business. Whether hosted on-premises or on an as-a-service platform, this growth degrades application performance, lengthens backup times, and drives up licensing costs for infrastructure, databases, and subscriptions. In some cases, organizations even have to purchase additional storage to accommodate this seemingly never-ending data growth.
Benefits of Archive and Purge
Storage cleanup can be effectively achieved through data archiving and purging, which moves or deletes obsolete data to maximize the utilization of existing resources and to limit maintenance activities to relevant data only. (A minimal sketch of the basic pattern follows the list of benefits below.)
- Reduce downtime of the EBS applications during the production upgrade
- Purging results in fewer records in key tables, thus the upgrade scripts run faster
- Purging results in a smaller database, so other tasks that occur during the upgrade such as periodic backups take less time
- Improved user experience after the upgrade of Oracle EBS applications
- Improved application performance
- Improved performance of concurrent programs
- Potentially better results from user testing
- Close business periods faster
- Reports potentially have less obsolete data
- Potentially remove duplicate records
- Increased system availability and reliability
- Improved maintainability during and after the upgrade of Oracle EBS applications
- Faster cloning
- Faster backups
- Faster patching
- Removes obsolete and unnecessary files from the file system
- Removes obsolete and unnecessary objects from the database
- The scope of the exercise can include partitioning large tables that cannot be purged, which in turn improves performance and potentially reduces downtime of the EBS applications during the production upgrade
- Reduces storage consumption
- Reduces costs
- Potentially allows for additional instances to support the upgrade project
- Potentially reduces liability by removing obsolete or inactive sensitive historical records which are not required for regulatory compliance
- Eliminates the need to remediate obsolete CEMLI code
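At its core, archive and purge is a simple two-step pattern: copy rows that have passed their retention period into an archive table, then delete them from the source. The sketch below is only a minimal illustration; the table names (xx_audit_log, xx_audit_log_arch), the creation_date column, and the 36-month retention period are all hypothetical.

```sql
-- Minimal archive-and-purge sketch (hypothetical names and retention period).
-- TRUNC(SYSDATE) keeps the cutoff identical for both statements within the day.
INSERT INTO xx_audit_log_arch
  SELECT *
    FROM xx_audit_log
   WHERE creation_date < ADD_MONTHS(TRUNC(SYSDATE), -36);

DELETE FROM xx_audit_log
 WHERE creation_date < ADD_MONTHS(TRUNC(SYSDATE), -36);

COMMIT;   -- commit both steps together so archived rows match deleted rows
```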
Best Practices for Running Archive and Purge on your Growing Data
1. Determine the Retention Period Based on Business Requirements, Rules, and Policies
A data retention policy is part of an organization’s overall data management. The policy matters because data piles up dramatically, so it is crucial to define how long the organization needs to hold on to specific data. If this important piece of your archive strategy is missing, you can download the free “Toolkit: Sample Data Retention Policy” and begin there.
Different data types should have different retention periods. How long should data be kept? A good rule of thumb is to keep data only as long as it is useful to the business and as long as laws and regulations require, and no longer. Retaining data beyond that point consumes unnecessary storage, adds cost, and degrades the performance of your application systems.
Data retention requirements are defined by legal and legislative needs, internal audit needs, and functional business needs. The data owners are the ultimate decision-makers, so they must understand these needs and address them.
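As a simple illustration, the agreed retention rules can be captured in a small reference table that the archive and purge routines consult. The table name, columns, and sample entries below are hypothetical, not seeded EBS objects.

```sql
-- Hypothetical reference table recording the retention policy per data category.
CREATE TABLE xx_retention_policy (
  data_category    VARCHAR2(60)  NOT NULL,   -- e.g. 'Concurrent Request Logs'
  retention_months NUMBER        NOT NULL,   -- how long the data stays online
  action           VARCHAR2(10)  NOT NULL
                   CHECK (action IN ('ARCHIVE', 'PURGE')),
  legal_basis      VARCHAR2(240),            -- regulation or business rule behind the period
  data_owner       VARCHAR2(80)              -- accountable decision-maker
);

-- Example entries agreed with the data owners (illustrative values).
INSERT INTO xx_retention_policy VALUES ('Concurrent Request Logs',  3, 'PURGE',   'Internal IT policy', 'IT Operations');
INSERT INTO xx_retention_policy VALUES ('AP Invoices',             84, 'ARCHIVE', 'Tax regulations',    'Finance');
COMMIT;
```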
2. Determine Whether to Purge or Archive
The policy should also state who is responsible for each category of data and whether data that is no longer needed should be archived or deleted. The frequency of the exercise is determined by the relevance of the data to business and legal needs and by the data growth rate; IT also has to factor in the storage capacity and processing speed of the infrastructure in use. An experienced DBA can execute this activity, but if the data is mission-critical it is highly recommended to engage an expert database vendor to help you assess and plan the archive and purge strategy.
3. Make Sure Replication Is Not Enabled During the Window When These Scripts Are Run
A replication rule can be in an enabled or disabled state. When a rule is disabled, the data to which it is assigned is not replicated, and disabling replication for a client node causes the server to skip that node during replication processing. Pausing replication for the archive and purge window prevents the high volume of deletes from being propagated while the scripts run.
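In an Oracle environment where replication or refresh is driven by scheduled database jobs, one way to pause it for the window is through DBMS_SCHEDULER. This is only a sketch; the job name below is hypothetical, and your replication technology may be controlled elsewhere.

```sql
-- List enabled scheduler jobs that might replicate or refresh data during the window.
SELECT owner, job_name, repeat_interval
  FROM dba_scheduler_jobs
 WHERE enabled = 'TRUE';

-- Temporarily disable a (hypothetical) refresh job for the duration of the window ...
BEGIN
  DBMS_SCHEDULER.DISABLE('APPS.XX_MV_REFRESH_JOB');
END;
/

-- ... run the archive and purge scripts, then re-enable the job afterwards.
BEGIN
  DBMS_SCHEDULER.ENABLE('APPS.XX_MV_REFRESH_JOB');
END;
/
```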
4. Run These Scripts During Off-Peak Load Hours, as Oracle Recommends
Archive and purge can be resource intensive (CPU, for example). The most basic performance tuning strategy for archive processing is to reduce competition from normal application processing, so for better performance run archive processing during off-peak or maintenance hours. You can further speed up archive and delete processing by increasing the number of database connections. Where there is no competition for the database, you can also remove unnecessary constraints. Another way to improve efficiency is to partition your database tables so that rows can be processed more easily.
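As a minimal sketch of scheduling the run outside business hours, the purge can be wrapped in a database job that fires only during a nightly maintenance window. The job name and the xx_purge_pkg.run_purge procedure are hypothetical.

```sql
-- Schedule a (hypothetical) purge procedure to run daily at 01:00, outside peak hours.
BEGIN
  DBMS_SCHEDULER.CREATE_JOB(
    job_name        => 'XX_NIGHTLY_PURGE_JOB',
    job_type        => 'PLSQL_BLOCK',
    job_action      => 'BEGIN xx_purge_pkg.run_purge; END;',   -- hypothetical purge package
    start_date      => SYSTIMESTAMP,
    repeat_interval => 'FREQ=DAILY; BYHOUR=1; BYMINUTE=0',
    enabled         => TRUE,
    comments        => 'Archive and purge run during off-peak hours');
END;
/
```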
5. If Archiving is Required, Make Sure there is Enough Disk Space Available on the Database Server
Because data is moved to archive tables rather than simply purged, the space allocated for the archive should be equal to or greater than the storage currently used by the source tables.
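A quick way to size the requirement is to compare how much space the source tables occupy with the free space available where the archive tables will live. The table names and the XX_ARCHIVE_DATA tablespace below are hypothetical.

```sql
-- Space currently used by the tables to be archived (hypothetical names).
SELECT owner, segment_name, ROUND(SUM(bytes) / 1024 / 1024 / 1024, 2) AS size_gb
  FROM dba_segments
 WHERE segment_name IN ('XX_AUDIT_LOG', 'XX_USER_ACTIVITY_LOG')
 GROUP BY owner, segment_name;

-- Free space in the tablespace that will hold the archive tables (hypothetical name).
SELECT tablespace_name, ROUND(SUM(bytes) / 1024 / 1024 / 1024, 2) AS free_gb
  FROM dba_free_space
 WHERE tablespace_name = 'XX_ARCHIVE_DATA'
 GROUP BY tablespace_name;
```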
6. Plan Your Purging Strategy Since Purging Requires a Significant Amount of Time
If millions of rows need to be deleted or copied from the database, the archive and purge process will take a significant amount of time to complete. Plan significant time for the preparatory work, the execution itself, and the post-cleanup checks that validate the accuracy and integrity of your existing system and reports, and make sure the appropriate DBA skills, or an expert vendor, are available. A database expert vendor can bring the necessary skills and tools and can advise on cost-effective methods to accelerate the process.
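For very large volumes, a common technique is to perform the delete in bounded batches, committing between batches so that undo usage and lock duration stay manageable. The sketch below reuses the hypothetical names from the earlier example and assumes the rows past the cutoff have already been copied to the archive table.

```sql
-- Batched purge sketch: delete already-archived rows in chunks of 10,000.
DECLARE
  l_cutoff DATE := ADD_MONTHS(TRUNC(SYSDATE), -36);   -- fix the cutoff once for the run
BEGIN
  LOOP
    DELETE FROM xx_audit_log
     WHERE creation_date < l_cutoff
       AND ROWNUM <= 10000;
    EXIT WHEN SQL%ROWCOUNT = 0;   -- stop when no rows remain past the cutoff
    COMMIT;                       -- commit per batch to bound undo and lock time
  END LOOP;
END;
/
```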
7. In a Multi-Data Center, it is Recommended that You Run Purges at Low Data Flow
In a multi-data center setup, run purges during periods of low data flow, since the data in the tables is replicated and the deletes will be propagated across centers. If you have multidimensional clustering (MDC) set up and require purging, consult your database administrator.
8. Custom Purging Scripts Should Only Include the Tables Used by the Standard Purging Scripts, as Oracle Recommends
The alterations to the provided purge scripts can include parameterization for the user ID. Such alterations should be thoroughly tested before being used in production to ensure they function as expected.
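As an illustration of such a parameterization, a custom wrapper might take the user ID and the cutoff date as arguments rather than hard-coding them. The procedure, table, and column names below are hypothetical and are not Oracle-seeded purge objects.

```sql
-- Hypothetical custom purge wrapper, parameterized by user ID and cutoff date.
CREATE OR REPLACE PROCEDURE xx_purge_user_activity (
  p_user_id IN NUMBER,
  p_cutoff  IN DATE
) AS
BEGIN
  DELETE FROM xx_user_activity_log
   WHERE created_by    = p_user_id
     AND creation_date < p_cutoff;
  COMMIT;
END xx_purge_user_activity;
/

-- Exercise the wrapper in a non-production instance first (illustrative values).
BEGIN
  xx_purge_user_activity(p_user_id => 1234,
                         p_cutoff  => ADD_MONTHS(TRUNC(SYSDATE), -24));
END;
/
```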