Saturday 30 August 2008

Child vs Parent Package Configurations Part 1

One of my colleagues - Sam Loud - prompted a mention from the venerable Jamie Thompson on the thorny subject of SSIS configuration (in fact he starts with the problems with logging configuration - something I've also encountered, but the subject for another post I think). In fact we were debating it in the office that very day .

Sam's beef with parent package configurations is that it makes the development of child packages more onerous. This is due to the inability of child packages to use master configurations at run-time, unlike a direct configuration to a config file or database table which the child package can use. An example of this follows:

Firstly create a new SSIS project in BIDS/Visual Studio and create two packages - 'Parent' and 'Child':



We'll create a very simple child package to read the @@VERSION variable in SQL:

Add a connection manager to the package pointing to the master database on a convenient server (in this case my local machine):



Add an Execute SQL Task selecting the connection manager created above and type "SELECT @@VERSION" as the SQLStatement property:



Create a package variable with a 'Child' scope called 'ConnString', and set the connection manager to use the variable through the Property Expressions Editor:



You'll note at this point that you need to type the connection string into the variable to get the package to work. This of course is what Sam is complaining about having to do with Master configurations. However the advantage becomes clearer when adding a second child package.

Finally create a simple configuration using an XML file to set the ConnString variable value:



Executing the package should result in success. This is an example of a direct child configuration. Adding a second child package - Child2 - running through the same steps as for Child you can skip the step typing the connection string into the variable - simply add the variable and the configuration (remembering to click 'Reuse' when prompted). You should be able to execute Child2 successfully with the package picking the connection string up from the config file.

Now for the Parent configuration:

In the Parent package create a variable called 'ConnString' with Parent scope and use the XML configuration created for the Child.dtsx package. Create a File Connection Manager, selecting 'Existing file' in the drop-down and the file location of Child.dtsx (I'm not using MSDB for this example though this might be advisable in practice). Add a Execute Package task using the File Connection Manager created earlier.

Switch to the Child.dtsx package and remove the file configuration created earlier and create a 'Parent package variable' configuration adding 'ConnString' as the Parent variable:



You should be able to execute the Parent package and see BIDS switch to the Child.dtsx package and execute that in turn.

But what happens when you replace the file configuration in Child2.dtsx, with Parent package variable and try and execute it by itself? It will fail - as the ConnString variable in Child2.dtsx is blank and it will not 'look-up' the variable in Parent.dtsx. To get it to work you need to manually type the connection string into the ConnString variable. This is the crux of Sam's complaint. Using parent configuration means typing the connection string every time you create a package rather than just the once. There are associated disadvantages in that you can't test the development configuration until the parent package is run and so on.

The advantages of parent package configuration relate to larger, more complex projects:

1. Configuration is centralised and 'hidden' from child packages:

This makes changing the configuration (from database to environment variable for example)infinitely easier if you have a large number of packages. This might be an issue you can't control the environment your packages are running on, or you have multiple different possible environments to target.

2. Development amongst multiple developers without direct configuration is easier:

In my experience when developing packages in a team of 3 or more people direct configurations can become a pain. This is unless everyone has their machine set up exactly the same way. I've often found myself disabling the direct configurations in order to get a package to use a different database (local rather than server-based for example) or filepath to the one specified in the configuration. Using parent configurations leaves you as a developer free to set the variables as you please.

So in conclusion - because this post has gone on long enough - use child packages with direct configuration when the project is small, the environments you're using are well-defined and the number of packages are low - probably no more than 6-8. Otherwise the safest course of action is parent configuration.

In a future post I'll look at the possible solutions and/or work-arounds for working with parent configurations without typing configuration in for every package.

Thursday 7 August 2008

SSIS and Oracle providers - another warning

As you may have gathered I'm working with Oracle and SSIS a lot at the moment - and here's another gotcha with it:

Colleagues finalising a release encountered this error when testing SSIS packages on a 64-bit version of Windows 2003 R2:
ORA-06413: Connection not open
This is discussed at length in this MSDN thread amongst others.

The upshot is that the Oracle provider can't cope with parentheses (ie brackets) in the filepath of programs connecting to Oracle using the provider. Yes - you read that right.

Those of you who know anything about Windows running on X64 architecture know that 32-bit applications are stored under a seperate programs folder ie:
c:\Program Files (x86)\
The sorts of programs that live under this folder are Business Intelligence Development Studio (ie BIDS) which is 32-bit only, and the 32-bit version of DTExec.exe - which executes SSIS packages. So you can see the problem; any SSIS package that uses the Oracle provider and runs in a 32-bit context will error.

Fine you might say, don't run it under a 32-bit context then. Unfortunately if you're having to extract data from Excel (as we are) you have to - there is no 64-bit provider for Excel. Happily there is a fix and a workaround. The bug fix is patch number 5059238 available from Oracles MetaLink site. However you need a 'Support Identifier' which we haven't got hold of yet. I'll edit this post if we (or when) we try this fix.

The work-around is simply to avoid using parentheses when calling executables. The way this is done is by using the old-school MS-DOS short-name for the file path. Hence
c:\Program Files (x86)\Microsoft SQL Server\90\DTS\Binn\DTExec.exe
becomes
C:\Progra~2\Microsoft SQL Server\90\DTS\Binn\DTExec.exe
This avoids having a parenthesis in the file-path and therefore the Oracle provider won't error. In fact you can launch BIDS/Visual Studio from the command-line in a similar way - so you can check the connection managers work:
C:\Progra~2\Microsoft Visual Studio 8\Common7\IDE\devenv.exe
So to top this post off here is an example of a command-line that will run an SSIS package using the Oracle provider under a 32-bit context on a 64-bit server:
"C:\Progra~2\Microsoft SQL Server\90\DTS\Binn\DTExec.exe” /DTS "\MSDB\[Package Name]" /SERVER [Server Name]

Tuesday 5 August 2008

More on Oracle connectors

A useful wiki here containing a plethora of info on SSIS connectivitity. There is a table here listing the different Oracle Providers. This also includes the points I made in my previous post. Since it appears under my post on a Google search for 'Oracle providers' I thought I'd try and redress the balance a little...