Microsoft’s Dynamics 365 platform is widely used to implement complex and diverse line-of-business systems. Many of these implementations require the migration of data and functionality from a set of legacy systems and components.
Often clients want all legacy data and relationships between data elements to be preserved in the Dynamics post-implementation for purposes of historical data analysis and reporting, and new features built on top of existing ones.
I recently had a chance to work on a project that required migrating a legacy Ruby on Rails system with PostgreSQL backend to Dynamics. The database contained 30+ tables with about 1,000,000 records.
The unique challenge was to recreate the Postgres data schema in Dynamics quickly: the implementers of the legacy system chose to highly denormalize several tables in Postgres, resulting in tables with 500-600 fields in them. Re-creating this schema configuration in Dynamics by hand was not an option. Instead, I used the Postgres metadata queries and a couple of tools to automate schema creation, and then used Microsoft’s SQL Server Data Tools (SSDT) with Kingswaysoft integration toolkit to implement the data migration and integration package.
Extracting schema from Postgres and recreating it in Dynamics
The XrmToolbox, an open-source collection of tools for Dynamics, has a handy plugin that allows bulk creation, update or delete of attributes for an entity, called Excel to CRM Bulk Attribute Creator.
To generate an Excel file in the format shown above (note that the format provides many more columns for various attribute values such as option set values, restrictions, format types etc.), I’ve used the metadata query to get the list of fields from a given Postgres table, along with some field properties, and convert them into a set of fields that the Attribute Editor requires for its input.
The basic query extracts name and type of each field for a given table by querying information_schema.columns table. The full query text and its analysis is provided below:
The query discards all fields mark as deprecated in the legacy system and then investigates the PosgtreSQL type value of each field to infer desired Dynamics type value. For example, if Postgres type is integer, it would use Whole Number for Dynamics unless the field name contains ‘amount’ or ‘value’ or ‘cost’, in which case we’d want the matching Dynamics attribute to be created as Currency.
This logic is highly dependent on overall logic and conventions used by Postgres database designers. For example, the database I was working with invariably terminated field names that logically represented bit values with a question mark, while being a bit sloppy with actual field types, so these bit fields sometimes carried text type attribute.
The query also creates display names for attributes by replacing underscore in Postgres field name with a space and using INITCAP function to convert the text to proper case.
The query is executed for each table we need to migrate and query’s output, shown below in PGAdmin (a PostgreSQL database management and query tool), is copied to the Excel template file generated by Excel to CRM Bulk Attribute Creator, and then this XrmToolbox plugin is used to add all these legacy attributes (I chose to manually create the entity itself first in Dynamics, without any custom attributes besides its name).
Setting up Dynamics entities for integration
The referential integrity in the source Postgres database is implemented via numeric primary and foreign keys. To be able to recreate faithfully the legacy relationships among new entities in Dynamics and to set up the ability to validate the completeness of the migration/integration steps, I’ve retained all these numeric primary and foreign key values as numeric fields in Dynamics entities and used each legacy primary key to created a related Key object in dynamics.
Using legacy ID as a Dynamics key allows us to guarantee the uniqueness of the record in Dynamics, and to use it a search attribute in integration steps to connect source Postgres with target Dynamics data. It also allows implementation of post-integration data validation steps. For example, to verify the completeness of a given step, we can run a query matching source to target data based on the same key value present in both systems, and ascertain that each record has 1:1 relation between source and target systems.
The entity key will ensure that neither integration step nor erroneous input by a user will lead to a duplicate record. Trying to duplicate key will lead to an error:
Implementing data migration project with SSDT and Kingswaysoft
The Integration Services bundled with Microsoft SQL Server Data Tools (SSDT) provide a highly versatile platform for building complex integration scenarios and implementing them in Visual Studio. While it comes with a plethora of connectors for various systems and data types, the connectors for Dynamics 365 or for PostgreSQL are not included.
Both connectors allow a developer to configure a connection in an Integration Services project and use a query against the connector as a datasource, or use it as a data destination.
Once the query against the Postgres data source is setup, and it is connected to a Dynamics destination, Visual Studio provides a standard mapping dialog to map source fields to destination fields.
The mapping UI allows us also to resolve Dynamics lookups. For example, in the screenshot above the Postgres id field is mapped to Dynamics id field, which is configured as an entity key and is used to resolve Create vs Update operation. The Dynamics connector provides an Upsert operator which will automatically execute correct operation based on key data resolution.
While I could manually set up the mappings as shown above, the Integration Services will automatically map source fields to destinations as long as field names match. This is a great feature when you have to deal with multiple tables containing hundreds of fields in each; it saves a ton of time!
In order to enable this auto-mapping, I wrote another metadata query that reads Postgres table definition and converts it into a Select statement with field names transformed to match the Dynamics attribute schema names.
The query is using STRING_AGG function to concatenate field names while slightly changing each field name to match the Dynamics schema name conventions; we need to add schema prefix to each field and drop underscores. It also processes date fields separately to exclude date values Dynamics doesn’t support, and it ignores deprecated fields.
The integration package consists of several sequential data flows. When the currently executing flow ends, the next flow’s execution begins. Any error stops the integration run.
The flows are organized according to the level of derivation and data dependency: Level 0, for example, contains data elements that do not depend on anything else, Level 1 entities only depend on Level 0 entities, etc.
Within each flow, multiple integration tasks execute in parallel. This level of concurrency proved to be sufficient for my project.
At the end of a data task’s successful execution, the count of processed rows maybe viewed in Visual Studio window and also queried from Dynamics 365 and compared with the PostgreSQL record count:
Metadata discovery and custom SSDT connectors allow us to quickly implement fairly complex data migration scenarios.
Questions: please post your questions / feedback on our blog.