There has been much chatter in the NonStop space over the years regarding whether auditing Enscribe files and SQL/MP tables 1) improves overall application and file system performance and 2) improves overall file system data integrity and reliability. The good news? The answer to both is a resounding YES! On top of that, auditing allows your application data changes to be recorded in the TMF audit trail, where they become the source for HPE Shadowbase replication. We suggest that skeptics visit our website to learn more about the benefits that TMF auditing can provide for you and your organization. For more information, please reference our white paper: Only the Truth: Debunking TMF NonStop Data Protection Myths.
Question: How much more audit trail capacity will I need if I change all of my tables and Enscribe files to not use audit compression?
Answer: Audit compression is a file label attribute that activates a TMF/file system method to reduce the size of UPDATE events stored in the audit trail. With Enscribe, for example, only the changed bytes are saved; without compression, the entire before and after images are saved. For SQL/MP, only the columns listed in the UPDATE statement are saved, rather than the before and after images of all columns.
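The size difference described above can be illustrated with a simplified model. The sketch below is not TMF's actual fragment format: it ignores audit record headers and offset bookkeeping, and the function names are hypothetical. It only compares the full before and after images of an Enscribe-style record with the changed bytes alone.

```python
def full_image_audit_bytes(before: bytes, after: bytes) -> int:
    """Uncompressed model: the entire before and after images are logged."""
    return len(before) + len(after)


def compressed_audit_bytes(before: bytes, after: bytes) -> int:
    """Compressed model: only the bytes that differ are logged,
    once for the before image and once for the after image."""
    if len(before) != len(after):
        raise ValueError("this simplified model assumes fixed-length records")
    changed = sum(1 for b, a in zip(before, after) if b != a)
    return 2 * changed


# A 200-byte record in which a 4-byte field is updated.
before_image = bytes(200)
after_image = bytearray(before_image)
after_image[10:14] = b"\x01\x02\x03\x04"
after_image = bytes(after_image)

full_size = full_image_audit_bytes(before_image, after_image)
compressed_size = compressed_audit_bytes(before_image, after_image)
print(full_size, compressed_size)
```

In this model the gap grows with record length and shrinks as updates touch more of the record, which is why the real impact depends on your own update patterns.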
The effect of changing the AUDITCOMPRESSION setting on your audit trail data generation depends on the difference in size between the compressed updates and the uncompressed updates. If the uncompressed updates are so much larger than the compressed updates that audit retention becomes an issue, then you would need to increase the audit capacity on the system until it is no longer an issue. You can run a simple test: measure the total TMF audit generated with your files/tables audit-compressed, then rerun the same test with your files/tables not audit-compressed and compare the two totals. In general, audit compression consumes a small amount of additional CPU for the compression work.
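As a rough illustration of turning the two test measurements into a capacity figure, the sketch below scales the current audit capacity by the measured growth ratio. The function name and all numbers are hypothetical, and the estimate assumes the test workload is representative of production.

```python
def estimate_audit_capacity(current_capacity_gb: float,
                            compressed_test_bytes: int,
                            uncompressed_test_bytes: int) -> float:
    """Scale current audit capacity by the growth ratio measured in the test."""
    if compressed_test_bytes <= 0:
        raise ValueError("compressed test run must have generated audit")
    growth = uncompressed_test_bytes / compressed_test_bytes
    return current_capacity_gb * growth


# Hypothetical measurements: the same test generated 12 GB of audit with
# compression and 15 GB without it, on a system with 400 GB of audit capacity.
GB = 1024 ** 3
needed_gb = estimate_audit_capacity(400.0, 12 * GB, 15 * GB)
print(f"Estimated capacity needed: {needed_gb:.0f} GB")
```

A ratio close to 1.0 suggests that compression can be turned off with little effect on audit retention.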
Note that replicating audit-compressed data can present an issue if you are using certain HPE Shadowbase features, for example the “insert-not-found” (INF) capability. When enabled, INF will effectively convert an UPDATE into an INSERT when the target record or row does not exist. The problem with a compressed update is that the audit trail does not contain all of the record or row data, so the resulting target INSERT will be incomplete. For this reason, when using audit compression with INF, either all target columns must have a DEFAULT clause or audit compression must be turned off.
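The following sketch models why the DEFAULT clause matters. It is a toy apply routine written for illustration, not Shadowbase's implementation; the function, table, and column names are all hypothetical.

```python
def apply_update_with_inf(target, key, changed, all_columns, defaults):
    """Apply a compressed UPDATE; fall back to an INSERT if the row is missing."""
    if key in target:
        target[key].update(changed)
        return "UPDATE"
    # Insert-not-found: build a full row from the partial audit data.
    row = {}
    for column in all_columns:
        if column in changed:
            row[column] = changed[column]      # value carried in the audit trail
        elif column in defaults:
            row[column] = defaults[column]     # filled from the DEFAULT clause
        else:
            raise ValueError(f"no value or DEFAULT for column {column!r}")
    target[key] = row
    return "INSERT"


columns = ["id", "name", "balance"]
target_table = {}  # the target row does not exist yet

# The compressed update carries only the key and the changed column.
outcome = apply_update_with_inf(
    target_table, 7, {"id": 7, "balance": 150}, columns,
    defaults={"name": "", "balance": 0},
)
print(outcome, target_table[7])
```

Note that the inserted row holds the default for `name`, not the source value, which is the inconsistency that the rest of this FAQ discusses.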
Question: Does Shadowbase software pin audit trails?
Answer: No, it does not. We understand that RDF can pin audit trails (for example, if it is behind and the audit is going to scratch). However, the Shadowbase engine and other third-party replication engines cannot pin audit trails at this time. Supporting the pinning of TMF audit trails in Shadowbase software will require the implementation of an HPE RFE. Check with the HPE TMF Product Manager for more information.
Question: What are the dangers of using audit compression with data replication?
Answer: Audit trail/transaction log-based data replication products such as HPE Shadowbase software rely on data change information written into the audit trail (by TMF on HPE NonStop systems). For example, when an application performs a database Update operation, the audit trail contains the before and after images of the updated records. Shadowbase replication uses this information to correctly reflect the Update operation in the target database, keeping the source and target databases consistent.
On HPE NonStop systems, there is an option to compress audit data. When a file/table with audit compression specified is updated, only the changed data is written to the audit trail, in the form of an “audit trail fragment”; the entire record is not written. The rationale is that such fragments should consume less audit storage, thereby saving disk space (on the assumption that disk storage is expensive).
Shadowbase replication can properly replicate compressed audit trail data from a NonStop source to a similar NonStop target database (i.e., Enscribe to Enscribe, SQL/MP to SQL/MP, or SQL/MX to SQL/MX). This is a common architecture for normal business continuity replication. Audit compression issues only occur for various forms of dissimilar (heterogeneous) replication.
However, the use of audit compression may not be beneficial, and it can cause issues when using audit trail-based data replication.
To try to resolve this issue, Shadowbase has a “Fetchsource” feature. When Fetchsource is enabled, and audit compression leaves the audit trail without sufficient information to insert a record or row into the target database, Shadowbase on the target system will communicate across the network with Shadowbase on the source system, which reads the current values of the missing columns and returns that information to the target. Shadowbase on the target system will then use that information to assign values for the missing columns on the Insert operation.
However, the Fetchsource operation doesn’t necessarily resolve the inconsistency either. In the time between the completion of the original transaction which updated the table on the source system, and Shadowbase attempting to apply that transaction on the target system, those missing columns could have been updated by subsequent transactions on the source system such that the column values returned by the Fetchsource operation do not reflect the values of those columns at the time of the original transaction, and thus the databases will remain inconsistent. Of course, when those subsequent source transactions are applied to the target database, the “offending” table will be made consistent with the source database (“eventually consistent”). But in the meantime the incorrect data may have been read on the target system and used for any number of purposes (reports, forwarded to other databases and applications for analysis, etc.), all of which will be erroneous with possible unknown negative consequences. It is also possible that an unplanned outage of the source system may occur before any subsequent “corrective” transactions are even executed, or they are lost in the replication pipeline and not applied to the target database, such that the target system takes over with the data inconsistency unresolved. The target is now running as the active system but with inconsistent data, and further transactions executed using the incorrect data will spread the inconsistency and may cause multiple further errors (which will continue to cascade unless and until the original inconsistency is discovered and corrected – which may require performing a table load, or taking part or all of the database offline).
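The timing window described above can be reproduced with a small simulation. This is a toy model in which dictionaries stand in for the source and target databases; the names are hypothetical, and it does not reflect how Shadowbase implements Fetchsource.

```python
source = {1: {"balance": 100, "status": "OPEN"}}
audit_trail = []  # compressed audit: each entry holds only the changed columns


def source_update(key, changes):
    """Update the source row and log only the changed columns."""
    source[key].update(changes)
    audit_trail.append((key, dict(changes)))


source_update(1, {"balance": 150})      # transaction T1
state_at_t1 = dict(source[1])           # what the source looked like after T1
source_update(1, {"status": "CLOSED"})  # T2 commits before T1 is replayed

target = {}


def apply_with_fetch(key, changes):
    """Replay one audit entry; fetch missing columns from the source on insert."""
    if key not in target:
        row = dict(source[key])  # fetch returns CURRENT source values
        row.update(changes)
        target[key] = row
    else:
        target[key].update(changes)


apply_with_fetch(*audit_trail[0])           # replay T1
inconsistent_after_t1 = target[1] != state_at_t1
apply_with_fetch(*audit_trail[1])           # replay T2
consistent_after_t2 = target[1] == source[1]
print(inconsistent_after_t1, consistent_after_t2)
```

After T1 is replayed, the target row already shows the status written by T2, a combination of values that never matched the source at T1's commit point. The target converges only once T2 is replayed, and it would stay wrong if T2 were lost.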
In summary, while the Fetchsource process may shorten the window during which data inconsistencies can arise between the source and target databases while using audit compression, it certainly does not shorten it enough. Gravic strongly recommends that customers consider removing audit compression if they need to replicate audit-compressed data and have it be automatically converted into INSERTs when the target record or row does not exist.
What is the recommended solution given these various potential issues with audit compression on HPE NonStop systems? Simply turn off audit compression, and observe the impact on both CPU utilization (which is expected to decrease), and audit trail generation (which may not increase by much, if at all). After turning off audit compression, it may become apparent that audit trail disk space usage does not increase significantly, and audit compression can be left off without any significant consequences, thereby avoiding all of the potential issues.
 This issue arises when replicating from an Enscribe source database to a SQL/MP or SQL/MX target database, or between any other combination of dissimilar source and target databases. It does not arise when replicating Enscribe to Enscribe or SQL/MP|MX to SQL/MP|MX where Shadowbase replication can correctly handle audit compressed data. Therefore, it is more of an issue in a heterogeneous data/application integration use of data replication, than in a business continuity environment (where by definition source and target databases are homogeneous, so the situation does not occur).
Question: Does Shadowbase software replicate audit events generated by RDF?
Answer: Shadowbase software is capable of replaying audit events generated by RDF. However, its default behavior is to skip RDF audit events. There is a configuration option to alter the default behavior.
Note that enabling this option will also cause audit events generated by a file/table reload or partition operation (e.g., an online split) to be replicated.
Question: How does Shadowbase software handle database index files, alternate key files, and partitioned tables?
Answer: Index files and alternate key files should be excluded by the replication configuration. Otherwise, unexpected results will occur. Partitioned tables can be handled by specifying the primary partition in the SOURCEFILE of the DBS and enabling the ALLPARTITIONS parameter, which is the safest approach, since it also picks up dynamic partition changes. Alternatively, you can supply a wildcarded DBS SOURCEFILE value (e.g., $VOL*.SUBVOL.FILENAME); however, that may pick up other files that are not part of a specific base file/table definition.
Question: How many threads (i.e., QMGR/CONS or QMGR/CONS->DOCW/TRS) should I configure?
Answer: It depends on the transaction size and rate as well as the network load. Adding more QMGR/CONS or QMGR/CONS -> DOCW/TRS threads can help improve replication performance by distributing the tables or files to be replicated among several threads. If replication seems to lag, particularly during batch jobs, then adding another thread or two may be a viable solution. We generally recommend that the normal thread load be less than 50% of the thread’s maximum performance capacity, to allow for catch up in case a shutdown/restart occurs.
However, there are diminishing returns with adding new threads. At some point, new threads will make negligible impact on replication latency. Further, each additional thread means more load on the network. Testing is the only way for a customer to know the optimal number of replication threads to configure in order to maximize replication efficiency while avoiding an overloaded network. Customers’ system capacity, transaction loads, and network loads are different, which means that an optimal number of threads for one customer will not necessarily be optimal for another customer.
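As a starting point for such testing, the 50% guideline above can be turned into simple arithmetic. The sketch below assumes the load divides evenly across threads and that the per-thread maximum rate has been measured on your own system; the function name and figures are hypothetical.

```python
import math


def threads_needed(peak_events_per_sec: float,
                   max_events_per_thread: float,
                   target_utilization: float = 0.5) -> int:
    """Smallest thread count that keeps each thread at or below the
    target share of its measured maximum rate."""
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")
    usable_rate = max_events_per_thread * target_utilization
    return max(1, math.ceil(peak_events_per_sec / usable_rate))


# Hypothetical figures: 9,000 events/s at peak, 4,000 events/s per thread maximum.
print(threads_needed(9000, 4000))
```

The result is only an initial estimate; real workloads rarely spread evenly, and each added thread also adds network load.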
Question: Is there a tool/wizard to convert my RDF configuration into a Shadowbase configuration?
Answer: In general, the RDF configuration is significantly simpler than an equivalent Shadowbase configuration. (RDF is a simpler product with a specific configuration purpose.) We have guides for the HPE field teams to assist in converting RDF configurations into Shadowbase configurations.
Question: Should I configure replication with TCP/IP or Expand?
Answer: Shadowbase software supports either/both. There are no clear-cut answers, but these are the basic differences. Configuring an Expand replication environment is certainly simpler when using the Shadowbase configuration tools. A TCP/IP configuration requires additional processes per replication thread and specific network-related details; it is also routable where UDP is not. TCP/IP is extremely useful for migrating systems that have the same name, because duplicate system names cannot exist on an Expand network.
Answer: There are a few approaches; however, only the simplest will be discussed here. The Shadowbase DDL to SQL Conversion Utility (SBDDLUTL) tool can convert an Enscribe DDL record definition into a matching SQL/MP table definition and assist with “flattening” (normalizing) the DDL record definition. With NSB 6.700 (Shadowbase version 6.700), SBCREATP is also available for converting SQL/MP table schemas into their corresponding SQL schemas (available via TCD at time of posting*; please periodically check our release page for new releases, or “Subscribe” using the button in the footer).
*Last edited: 7/15/22.
Answer: Yes! Please see Shadowbase Enterprise Manager (SEM). However, we also support, coordinate, and work with management and monitoring solutions providers such as Idelji with WebViewPoint Enterprise.
Answer: By its nature, data replication is a serial activity. It is very important to apply the first data change events first, and the follow-on data change events afterwards. This ordering is required when the events affect the same record/row and the same partition of the source file/table on HPE NonStop.
Preserving the begin-transaction to end-transaction consistency of the entire source transaction also often means that related transaction data must be serialized and applied in order. Therefore, the normal method of using parallelism to boost performance cannot be used, and the architecture is limited from a performance perspective.
The short answer is: if the data is related (same record/row, or source to target transaction consistency needs to be preserved), then performance often cannot be improved by invoking parallelism. Using a single replication thread preserves serialization and guarantees source to target data consistency (meaning source to target transaction boundaries). However, spreading the replication apply load over several replication threads does not.
Replicating nonrelated data is different.
If the programs are updating many files, and the load is spread across separate replication threads (all changes to a given file replicate on the same replication thread, while separate files replicate on separate replication threads), then parallelism can be used to improve replication performance and reduce replication latency. The interfile impact is removed.
For example, if a transaction updates FILE1, FILE2, and FILE3, and the target transaction boundary must match the source transaction boundary, then all three files must be replicated on the same replication thread. In that case, if a large batch transaction (or bulk load) is updating FILE2 and the application is executing many small transactions against FILE1 and FILE3, then the smaller transactions (against FILE1 and FILE3) need to wait behind the longer-running transaction against FILE2. This delay occurs when it is time for replication to transmit, and for the target to replay, the large batch transaction that completed against FILE2.
With Shadowbase software, spreading the load across separate replication threads is an option.
For example, FILE1 change data is replicated on its own replication thread; FILE2 change data is replicated on its own replication thread; and FILE3 change data is replicated on its own replication thread. In this architecture, FILE1, FILE2, and FILE3 data are replayed under separate transactions at the target, but the change data for FILE1 does not impact FILE2 and FILE3 replay; the change data for FILE2 does not impact FILE1 and FILE3 replay; and the change data for FILE3 does not impact FILE1 and FILE2 replay. Shadowbase Support can assist with implementing this replication environment by spreading the files and tables that need to be replicated over multiple replication threads.
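One way to picture the assignment is as a mapping from files to threads, where files that must share a target transaction boundary stay grouped on the same thread. The sketch below is a planning aid only, with hypothetical names and change rates; it is not Shadowbase configuration syntax.

```python
def assign_to_threads(groups, thread_count):
    """Greedily place each group of related files on the least-loaded thread.

    groups: list of (file_names, change_rate) tuples; every file in a group
    lands on the same thread so its changes stay serialized together.
    """
    loads = [0.0] * thread_count
    assignment = {}
    # Place the heaviest groups first so they end up on separate threads.
    for files, rate in sorted(groups, key=lambda g: g[1], reverse=True):
        thread = loads.index(min(loads))
        loads[thread] += rate
        for name in files:
            assignment[name] = thread
    return assignment


# Hypothetical change rates (events/s); FILE2 carries the large batch load.
groups = [(["FILE1"], 300.0), (["FILE2"], 5000.0), (["FILE3"], 200.0)]
assignment = assign_to_threads(groups, thread_count=3)
print(assignment)
```

With three threads, each file gets its own thread, so the batch activity on FILE2 no longer delays replay for FILE1 and FILE3. With fewer threads, the lighter files would share one.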
Note that if the main issue is only BULK LOADING, and not batch processing in the middle of small-transaction OLTP activity, then a separate replication thread can be set up for just the BULK LOAD, and the change data replay activity can be paused for the file being loaded while the load takes place on the BULK LOAD replication thread. Shadowbase Support can assist with implementing these kinds of architectures.
Answer: There is no “one size fits all” answer with regard to system resource utilization, overhead, and performance. In general, customers tell us that RDF seems to use less CPU but sends more data, while Shadowbase software uses a bit more CPU but sends less data. Shadowbase software does not replicate index or alternate key file I/O, whereas RDF must replicate it. The short answer is to perform a proof of concept (POC).
Answer: There are many variables that affect loading throughput. They include application and disk activity on both systems, the number of CPUs on the system and their type, network bandwidth, load throttling settings, and online versus offline loading. The fastest method is to perform offline loading while the applications are down (e.g., BACKUP/RESTORE or similar).
If you must perform an online load (application is active for updating), Shadowbase software allows the audit trail that is generated from active applications to be queued for replaying once the load is complete. Audit can be queued in the standard audit trail files or in Shadowbase queue (QMGR) files.