Earlier this month I had the pleasure of participating in several sessions at Convergence 2009 in New Orleans. One of my favorites was a CRMUG session on data migration and integration. I was on a roundtable panel with Tim Thorpe and Leslie Guffey from YRC Logistics and Brendan Peterson from Scribe Software.
There were a lot of great questions asked, and I learned several tips that can really help speed up DTS performance for migrations and integrations with Scribe.
1. Multiple DTS--One thing I've always wondered about is the effect of running multiple instances of the Scribe Workbench on one machine. For example, if you have three data loads to import, is it faster to run them in parallel, or would that take just as long as running them one after another?
Brendan confirmed that running multiple instances of the Workbench simultaneously is supported, and that on a powerful server with multiple processors/cores this will be much faster than running each job individually.
2. Picklist validation--Like it sounds, this is a feature in the Microsoft CRM adapter that checks that picklist values are valid. It also adds some overhead to the DTS. If you know that the picklist values in your source are valid, or if your source does not contain any picklist data, disabling picklist validation can significantly decrease DTS execution time. To disable picklist validation, in Scribe Workbench click Target, then select "Adapter Settings." On the General Settings tab, uncheck "Validate Microsoft Dynamics CRM picklist fields."

3. Increase query batch size--This is the size of the cache used for queries. If you have a beefy server with lots of RAM and processor capacity available, you can increase this value as high as 5,000, and this can improve the performance of your DTS. The query batch size setting is located in the adapter's General Settings (same place as #2).
Your results may vary based on your specific DTS and the available resources in your environment. Have any additional data migration tips (doesn't have to be Scribe related)? Leave a comment or send me a message and I will post a follow-up.
Running multiple instances isn't always supported. Some Scribe adapters only support a single connection from the machine - e.g. the Pivotal adapter :(
Also, regardless of whether the data is different, an update will be performed (and the modifiedon date changed). To avoid unnecessary updates it's best to filter the source data set first. One option is to use a SQL SELECT across two databases - source and destination - and return only changed rows. If you have other systems that are fed changes based on CRM modified records then this is quite important - you wouldn't want to send your entire account or contact list to a remote system every night because all records were being updated regardless!
Posted by: Regan | March 24, 2009 at 10:48 PM
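Regan's suggestion of a SELECT across the source and destination databases can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module with two attached in-memory databases instead of a real CRM or SQL Server instance, and the `account` table, its `id`, `name` and `phone` columns, and the `changed_rows` and `build_demo` helpers are all hypothetical names chosen for the demo.

```python
import sqlite3


def changed_rows(conn):
    """Return source rows that are new or differ from the destination.

    Assumes hypothetical tables src.account and dst.account with the
    same columns (id, name, phone).
    """
    sql = """
        SELECT s.id, s.name, s.phone
        FROM src.account AS s
        LEFT JOIN dst.account AS d ON d.id = s.id
        WHERE d.id IS NULL            -- row does not exist in destination
           OR s.name  IS NOT d.name   -- IS NOT is a NULL-safe comparison in SQLite
           OR s.phone IS NOT d.phone
        ORDER BY s.id
    """
    return conn.execute(sql).fetchall()


def build_demo():
    """Create two attached in-memory databases with sample data (demo only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("ATTACH DATABASE ':memory:' AS src")
    conn.execute("ATTACH DATABASE ':memory:' AS dst")
    for db in ("src", "dst"):
        conn.execute(
            f"CREATE TABLE {db}.account "
            "(id INTEGER PRIMARY KEY, name TEXT, phone TEXT)"
        )
    conn.executemany(
        "INSERT INTO src.account VALUES (?, ?, ?)",
        [(1, "Acme", "555-0100"), (2, "Globex", "555-0200"), (3, "Initech", None)],
    )
    conn.executemany(
        "INSERT INTO dst.account VALUES (?, ?, ?)",
        [(1, "Acme", "555-0100"), (2, "Globex", "555-0199")],
    )
    return conn


if __name__ == "__main__":
    # Only the changed row (id 2) and the new row (id 3) are returned;
    # the unchanged row (id 1) is filtered out before any update runs.
    for row in changed_rows(build_demo()):
        print(row)
```

Feeding only these rows to the DTS would leave the modifiedon date of unchanged records alone. On SQL Server the same idea would typically be written with three-part names (database.schema.table) instead of attached databases, and the column comparison would need its own NULL handling.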
One thing Brendan pointed out to me yesterday was the query polling time in your Source Query Publishers. If you have 10 publishers querying SQL every 10 seconds you can cause a bottleneck. The other tip is to periodically re-index the ScribeShadow table, and set a DTS to periodically delete old ScribeShadow table records (Operation = 'D').
Posted by: Michael Dodd | March 26, 2009 at 09:59 AM
Hi,
I am very interested in CRM data integration. My question is: are the slides or the content of this session available for download somewhere?
Thanks
Jan
Posted by: Jan | March 30, 2009 at 07:49 AM