ArcGIS Archiving for Utility Geodatabases (Part 1)
ArcGIS Archiving provides a product-based method for tracking all changes made to your versioned Geodatabase. Among other important things it can answer the question:
When did this object (gas main, service line, conductor, cable, etc.) get modified or deleted?
This is a particularly important question in today’s GIS record-keeping environment. Without Archiving you’ll need to implement a custom solution to answer it. Yes, the ArcGIS “Editor Tracking” and ArcFM metadata field AutoUpdaters will tell you the “who” and “when” of object creation or modification, but neither process will tell you about deletes.
Our observation, however, is that relatively few utility ArcGIS users take advantage of Archiving. Some may shy away from it, in part because there is little published information on how it works or on its impacts and risks.
Esri provides considerable documentation on the background, setup and use of Geodatabase Archiving. The purpose of this post is to provide another data point that utilities can use in deciding whether Archiving is for them. The post does not describe experience from a production environment, but does provide “clinical” results from a test environment with a utility dataset containing several million features in a geometric network.
Fundamental Archiving Concepts
Historical Tables: When a Geodatabase is enabled for Archiving, a new table called a “History” table is added to the Geodatabase for each registered class. It is named for the original class with the suffix “_H”. The history table is identical to the original, with the following fields added:
GDB_FROM_DATE
GDB_TO_DATE
GDB_ARCHIVE_OID
Historical Table Contents: On enabling a Geodatabase for archiving, all features present in the class within the SDE.DEFAULT version are duplicated in the “_H” table. GDB_FROM_DATE is set to the current date/time and GDB_TO_DATE is set to “12/31/9999”, indicating that the row does not have a retired date. As transactional versions are posted to SDE.DEFAULT, updates are made to the “_H” tables: new objects are added with GDB_FROM_DATE set to the current date, deleted objects have GDB_TO_DATE set to the current date, and updated objects are duplicated (the same OID can exist multiple times with different from/to date ranges). The important point here is that all objects that have ever existed in SDE.DEFAULT since archiving was enabled are present in the “_H” table.
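To make the bookkeeping concrete, here is a minimal sketch in plain Python of how rows in an “_H” table would evolve under the rules described above. It is a toy model, not Esri’s implementation: the ArchiveTable class, the MATERIAL attribute, the OBJECTID value and the dates are all hypothetical, and we assume an update is recorded by retiring the current row and adding a new one with the same OBJECTID.

```python
from datetime import datetime

# Sentinel "not retired" value; the post describes it as 12/31/9999.
OPEN_END = datetime(9999, 12, 31, 23, 59, 59)


class ArchiveTable:
    """Toy stand-in for an "_H" table (hypothetical, for illustration only)."""

    def __init__(self):
        self.rows = []
        self._next_archive_oid = 1

    def _open_row(self, oid):
        # Return the current (un-retired) row for an OBJECTID, if any.
        for row in self.rows:
            if row["OBJECTID"] == oid and row["GDB_TO_DATE"] == OPEN_END:
                return row
        return None

    def insert(self, oid, attrs, when):
        # A new object gets a row that starts now and has no retired date.
        self.rows.append({
            "GDB_ARCHIVE_OID": self._next_archive_oid,
            "OBJECTID": oid,
            "GDB_FROM_DATE": when,
            "GDB_TO_DATE": OPEN_END,
            **attrs,
        })
        self._next_archive_oid += 1

    def delete(self, oid, when):
        # A delete does not remove anything; it stamps the retired date.
        row = self._open_row(oid)
        if row is None:
            raise KeyError(f"no current row for OBJECTID {oid}")
        row["GDB_TO_DATE"] = when

    def update(self, oid, attrs, when):
        # Assumption: an update retires the current row and adds a new one.
        self.delete(oid, when)
        self.insert(oid, attrs, when)


# Hypothetical history of one main: created, updated, then deleted.
t0 = datetime(2015, 1, 1, 8, 0)
t1 = datetime(2015, 2, 1, 9, 30)
t2 = datetime(2015, 3, 1, 14, 0)

table = ArchiveTable()
table.insert(101, {"MATERIAL": "Steel"}, t0)
table.update(101, {"MATERIAL": "PE"}, t1)
table.delete(101, t2)

for row in table.rows:
    print(row["OBJECTID"], row["MATERIAL"], row["GDB_FROM_DATE"], row["GDB_TO_DATE"])
```

In this model the single object ends up with two rows, one per state it passed through, and the delete leaves both rows in place with closed date ranges.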
Transactional Versions and Historical Versions: The traditional versioned geodatabase contains transactional “versions” arranged hierarchically (each version has a parent). A version starts out identical to its parent and differs from it as edits are made to add, change and delete objects. What distinguishes one version from another is its edit transactions. Through the reconcile and post process these edits are merged upward so that edits in child versions find their way to the parent(s). Nothing native to the transactional version process maintains a history of when an edit was made. Historical versions, in contrast, hold all information ever present in the archived class. The “delta” between what was present yesterday and what’s present today is defined by the time stamp fields. A query against a historical version returns the objects whose lifespan includes the specified date and time.
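The “lifespan includes the specified date and time” rule can be sketched as a simple date-range filter. The example below uses an in-memory SQLite table as a stand-in for an “_H” table. The table contents, the MATERIAL column and the as_of helper are hypothetical, and treating the from date as inclusive and the to date as exclusive is our assumption about how the boundary is handled. Dates are stored as ISO text so they compare correctly as strings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE DISTRIBUTIONMAIN_H (
        GDB_ARCHIVE_OID INTEGER PRIMARY KEY,
        OBJECTID        INTEGER,
        MATERIAL        TEXT,
        GDB_FROM_DATE   TEXT,
        GDB_TO_DATE     TEXT
    )
""")

# Hypothetical rows: main 101 was updated once, main 102 was deleted.
rows = [
    (1, 101, "Steel", "2015-01-01 08:00:00", "2015-02-01 09:30:00"),
    (2, 101, "PE",    "2015-02-01 09:30:00", "9999-12-31 23:59:59"),
    (3, 102, "Steel", "2015-01-01 08:00:00", "2015-03-01 14:00:00"),
]
conn.executemany("INSERT INTO DISTRIBUTIONMAIN_H VALUES (?, ?, ?, ?, ?)", rows)


def as_of(connection, moment):
    """Return (OBJECTID, MATERIAL) for rows whose lifespan includes `moment`."""
    sql = """
        SELECT OBJECTID, MATERIAL
        FROM DISTRIBUTIONMAIN_H
        WHERE GDB_FROM_DATE <= ? AND GDB_TO_DATE > ?
        ORDER BY OBJECTID
    """
    return connection.execute(sql, (moment, moment)).fetchall()


january = as_of(conn, "2015-01-15 00:00:00")  # both mains, original attributes
april = as_of(conn, "2015-04-01 00:00:00")    # only main 101, updated attributes
print(january)
print(april)
```

The same table answers both questions; only the moment passed to the filter changes.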
Summary of Findings
Key findings from our preliminary set of tests are as follows:
- The functionality as described in Esri documentation is straightforward to achieve and testing provided expected results. The few exceptions are noted below.
- ArcMap display for transactional versions is not impacted by the presence of historical version tables. Not a surprise, since queries on transactional versions don’t include historical tables. But nice to confirm.
- ArcMap display for historical versions is not slower than display for transactional versions, at least in our preliminary tests. Historical versions are queried very differently from transactional versions, but the queries are not more complex.
- Database table sizes increased as expected.
Basic Functionality
With that let’s review in a simple work flow how archiving works. The following steps illustrate a gas main replacement task and show how archiving can help us keep track of what happened.
Basic Editing
We’ll start with a gas main in the DistributionMain feature class along Sumac Avenue for which we’ll replace a segment. Below on top is the “before” state as it exists in SDE.DEFAULT. Below that is the “after” edited equipment. Specific steps between “before” and “after” were:
- Create a version as a child of SDE.DEFAULT and switch to that version
- Start editing on the new version
- Select and split the main at two locations
- Disconnect the isolated main
- Use the ArcFM Abandon Tools to abandon the selected main to the AbandonedDistributionMain class – which deletes the isolated feature from the DistributionMain class
- Add end-cap fittings at both ends of the abandoned main.
- Disconnect service lines from the abandoned main and connect them to the new replacement main.
- Reconcile and post the edit version to SDE.DEFAULT.
On completion of these steps an ArcFM Gas Pressure System Trace on the newly replaced main equipment returns expected results, as shown below.
Basic Archiving Queries
So far, these are all standard ArcFM workflow steps. However, since we have Archiving enabled, we can use the “Historical” tab on the “Change Version” dialog to change to a historical version just before the time when our gas main replacement version was posted.
Once pointed to this date and time in the past we can see the gas mains, services and other equipment on Sumac Avenue as they existed prior to our post. Further, as shown below, we can again perform an ArcFM Gas Pressure System trace on the features and get expected results with the data present at that time.
Finally, when we add the historical DistributionMain layer to the map we can see ALL distribution main features that exist now or have ever existed in the SDE.DEFAULT version. In this case we see that the main we abandoned in our edit session has a GDB_TO_DATE that’s earlier than the current date and time, and thus we can see EXACTLY when it was deleted.
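The same “when was it deleted?” question can be asked of the history rows directly. The sketch below is plain Python over hypothetical (OBJECTID, GDB_FROM_DATE, GDB_TO_DATE) tuples, not an ArcGIS API call. It assumes an object counts as deleted when none of its rows still carries the 12/31/9999 end date, and it reports the latest GDB_TO_DATE for each such object.

```python
from datetime import datetime

# Sentinel "not retired" value; the post describes it as 12/31/9999.
OPEN_END = datetime(9999, 12, 31, 23, 59, 59)


def deletion_times(history_rows):
    """Map OBJECTID -> deletion time for objects that have no current row."""
    latest_end = {}
    for oid, _from_date, to_date in history_rows:
        # Track the latest end date seen for each object.
        if oid not in latest_end or to_date > latest_end[oid]:
            latest_end[oid] = to_date
    # Objects whose latest row is still open have not been deleted.
    return {oid: end for oid, end in latest_end.items() if end != OPEN_END}


# Hypothetical rows: main 101 was updated and still exists; main 102 was deleted.
history = [
    (101, datetime(2015, 1, 1, 8, 0), datetime(2015, 2, 1, 9, 30)),
    (101, datetime(2015, 2, 1, 9, 30), OPEN_END),
    (102, datetime(2015, 1, 1, 8, 0), datetime(2015, 3, 1, 14, 0)),
]

deleted = deletion_times(history)
print(deleted)
```

Note that the closed row for main 101 is not reported as a delete, because a later row for the same OBJECTID is still open.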
Impact on Display Performance
The first question asked about Archiving is typically “how will it impact my display performance?” We know that the Archiving process introduces new tables into the database schema. Will this add complexity to database queries and result in slower display?
That’s what our next set of tests set out to explore. We started with an SDE Geodatabase with several million features including a utility network and exercised standard displays using Esri’s Performance QA Analyzer to collect display times for standard map extents under the following circumstances:
1. A fully un-versioned, un-archived Geodatabase
2. The same Geodatabase after versioning, with 100 versions (all children of SDE.DEFAULT), a random number of features present, and half of the versions reconciled
3. Versioned Geodatabase from scenario 2 with Archiving enabled and layers pointing to the SDE.DEFAULT version
4. Versioned Geodatabase with Archiving enabled from scenario 3 and layers pointing to the Archive DEFAULT version
5. Versioned Geodatabase with Archiving enabled from scenario 4 with the current version pointed to a date and time in the past – so the display query must filter on date and time
We would expect each scenario to introduce additional complexity to the Geodatabase, and thus to be somewhat slower than the previous one. First we’ll simply look at average display times captured from standard display extents on the un-versioned database and on the versioned database with edits.
Per the chart at the right we see display times on our sample database for standard displays ranging in scale from 1:2,000,00 to 1:12,000. As expected, displays on the un-versioned database (in blue) are consistently faster than displays on the database after it has been versioned and edits applied. This is well-known behavior.
The next chart includes display times with Archiving enabled, but with the map layers still pointing to the SDE.DEFAULT version, as in the previous test. Display performance consistently falls between that of the un-versioned and versioned database.
The final chart includes display times on archive versions. The first of these tests looked at the DEFAULT historical marker, which is the date and time of the last update to the SDE.DEFAULT version. Esri documentation states that connecting to this version “can actually consume fewer database resources than working with the equivalent versioned class.” In our tests we found displays on this version slightly faster than displays on transactional versions. In the second test, with displays pointing to a historical version prior in time to the most recent post to SDE.DEFAULT, display times were similar.
These tests were performed in a controlled, non-production environment. However, it seems that:
1. Introduction of archiving does not impact display performance of transactional versions, and
2. Display of historical versions is comparable to that found with transactional versions
Database Size
As advertised, when archiving is enabled on a class all rows present in the SDE.DEFAULT version are copied into a class named with the suffix “_H”, so if our original class was GASVALVE the archive class will be GASVALVE_H.
As we’ve seen above, the increase in database size will not directly impact display performance, nor should it impact other query performance, either attribute queries or traces, nor should it impact editing performance. Operations that will be impacted will be administrative operations such as backups.
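For planning purposes, the row count of an “_H” table can be roughly projected from the rules described earlier in this post: every row present when archiving is enabled is copied, every posted insert adds a row, every posted update adds a row, and a delete only stamps an existing row. The function below is a back-of-the-envelope sketch under those assumptions. Its name and the edit counts in the example are hypothetical, and it estimates rows only, not bytes on disk.

```python
def estimate_history_rows(initial_rows, inserts, updates, deletes):
    """Rough "_H" row count, assuming the row rules described in this post."""
    if min(initial_rows, inserts, updates, deletes) < 0:
        raise ValueError("counts must be non-negative")
    # Deletes only set GDB_TO_DATE on an existing row, so they add no rows.
    return initial_rows + inserts + updates


# Hypothetical class: 1,000,000 rows at enable time, then a period of posted edits.
projected = estimate_history_rows(
    initial_rows=1_000_000,
    inserts=20_000,
    updates=50_000,
    deletes=5_000,
)
print(projected)
```

Because the history table never shrinks, heavily updated classes will grow faster than rarely edited ones with the same starting row count.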
Observed Functionality Issues
As mentioned, our tests found nearly all operations and behaviors to perform as expected. Here are observations for the few exceptions (though the first is simply an observation):
- Archiving will not replace the need for classes containing “abandoned” equipment, such as abandoned mains, conductors, etc. Typically there are multiple requirements for display and query of abandoned equipment along with existing equipment – for which abandon classes are used.
- There is no way to load data into historical versions. Archiving begins at the time it is enabled in ArcCatalog.
- While the ArcFM Identify Tool operates as expected on layers pointing to a historical version, it does not appear to be aware of layers pointing to the historical archive (added using the “Add Historical Archive” tool).
Summary
This post presented a preliminary assessment of ArcGIS Archiving and with few exceptions found ArcGIS and ArcFM functionality and performance to be as expected. There are many more functions not yet examined, but initial indications confirm that Archiving could provide a very valuable component of a utility ArcGIS implementation.