Thursday, 30 September 2010

Next version SSIS

Details of the next version of SSIS have been leaking onto the internets. Some highlights; undo in the designer - which will be a big time-saver, data-flow sequence containers - allowing developers to group and hide transformations, and last but not least - rounded corners!

This is great, and no doubt other improvements will be announced over the next few weeks (presumably Visual Studio 2010 integration is a cast-iron certainty). However other gripes on connect remain open, or have already been closed as 'won't fix', such as case sensitivity on the Lookup transform (forcing additional 'uppercase' columns, or passing the whole lookup to the SQL database), and some of Jamie Thomsons debugging suggestions.

Nit-picking aside its interesting to note that less than 6 months after the release of R2 MS is starting to talk about CTP's of the next version.

Wednesday, 22 September 2010

Data dude build without a Visual Studio installation

The simple answer is that you can't.

I've just run into this issue myself. It strikes me that requiring an additional client license, and a full VS install simply to_build_a project is way over-the-top. The TFS alternative is obviously better (it being a server product) - but potentially still requires additional licensing. Anyway with increasing numbers of shops running continuous integration I think it removes a significant amount of desirable functionality from the Visual Studion database project story.

If you want to try and get database projects built only using MSbuild vote here.

Sunday, 19 September 2010

Data dude erroring on deployment - Connect request

I've created a connect request for the problem described below. Fingers crossed it gets accepted - if you want it added go and vote now!

Saturday, 21 August 2010

Data dude erroring on deployment over-writing database file settings

At MITIE we've started using Visual Studio (VS) 2010 for managing database development. The nice thing about 2010 is that the functionality from Data Dude that used to be in a seperate sku and license is now included in the vanilla editions of VS Pro and Premium (Kimberly Tripp has a good post about it here). Premium has more functionality than Pro, obviously.

I won't cover the advantages of using VS 2010 to develop databases as that is covered in detail elsewhere. However there are some wrinkles that aren't quite so well documented.

I started to get this error whilst trying to deploy a database project from VS 2010, that I'd previously imported from SQL Server:

Error SQL01268: .Net SqlClient Data Provider: Msg 1834, Level 16, State 1, Line 1 The file [file name] cannot be overwritten. It is being used by database [database name].

There is some information on the problem here however the thread is a bit long-winded and the final fix leaves something to be desired...

Essentially my problem boiled down to the fact that the database that I imported had different file settings to the database I was deploying to. In this case the filegrowth settings. Therefore the database project also had different filegrowth settings than the deployment database. In this situation VS/data dude will detect a difference and attempt to alter the database file. Obviously if there are open connections to the database then this won't work. Therefore there are two solutions:

1. Close all connections to the database and then deploy (possibly even putting the database in single user mode). This is a bit inconvenient and assumes you're happy with the settings in the database project. In this case I wasn't.

2. Happily there is an alternative. You can change filegrowth (and other) settings of the database project, however these aren't available through the VS 2010 IDE. Instead you have to edit .sql files found here:


[Project location]\[Database project name]\Schema Objects\Storage\Files


In my case I changed the filegrowth settings from:


FILEGROWTH=1MB




To:


FILEGROWTH=10%




After doing this my deployment problems were solved.

Clearly appropriate filegrowth settings is a whole seperate topic - which could be the subject of at a whole post in itself. On this occasion 10% was better than 1MB.

Unfortunately there is a limited set of database settings available to change through VS 2010 using the database properties > schema settings dialogue - which doesn't include autogrowth let alone sizing. Ideally it'd be nice to be able to amend those settings through VS itself. I think there's probably another Connect request in the works...

Wednesday, 7 July 2010

MERGE and Slowly Changing Dimensions

Blimey - has it really been over 2 months since my last post? A few things have happened since then - not least moving job from Clarity to MITIE. Anyway on with the subject of the post.

Using the MERGE statement is pretty well documented however what about using it to update a dimension that includes Type 1 and Type 2 changes? There are examples on the web but they're either not SQL specific or seem flawed in one way or another. To follow this example you'll need the AdventureWorksDW2008 database available here. The R2 version of AdventureWorksDW might work but as yet I haven't tested it. Firstly lets set up a staging table and a dimension table:



SELECT ProductKey,ProductAlternateKey AS ProductId,EnglishProductName,

       ISNULL(ListPrice, 0) AS ListPrice,HASHBYTES('SHA1', EnglishProductName)

                                         AS Type1CheckSum,

       HASHBYTES('SHA1', CAST(ISNULL(ListPrice, 0) AS VARCHAR(10))) AS

       Type2CheckSum

INTO   ProductStaging

FROM   AdventureWorksDW2008..DimProduct

WHERE  EndDate IS NULL



GO



CREATE TABLE [dbo].[DimProduct]

  (

     [ProductKey]         [INT] IDENTITY(1, 1) NOT NULL,

     ProductId            NVARCHAR(25) NOT NULL,

     [EnglishProductName] [NVARCHAR](50) NOT NULL,

     [ListPrice]          [MONEY] NOT NULL,

     DateUpdated          DATETIME NOT NULL DEFAULT GETDATE(),

     Type1Checksum        VARBINARY(8000) NULL DEFAULT 0,

     Type2Checksum        VARBINARY(8000) NULL DEFAULT 0,

     StartDate            DATETIME NOT NULL DEFAULT GETDATE(),

     EndDate              DATETIME NULL,

     CurrentFlag          BIT NOT NULL DEFAULT 1

     CONSTRAINT [PK_DimProduct_ProductKey] PRIMARY KEY CLUSTERED ( [ProductKey]

     ASC )

  ) 



Note that we're using HASHBYTES on the staging table - this will be used to compare columns. The SHA1 algorithm ensures sensitivity to changes across those columns. If you're using non-character fields you'll need to cast them before hashing them.

Firstly lets add new records (at this point all of them):



MERGE dbo.DimProduct AS dim

USING ProductStaging AS stg

ON dim.ProductId=stg.ProductId



WHEN MATCHED AND dim.Type1Checksum<>stg.Type1Checksum THEN --type1 changes

  UPDATE SET  dim.Type1Checksum=stg.Type1Checksum,

        dim.EnglishProductName=stg.EnglishProductName,

        dim.DateUpdated=GETDATE()



WHEN NOT MATCHED THEN --new records

  INSERT (ProductId, EnglishProductName, ListPrice, Type1Checksum, Type2Checksum)

  VALUES  (stg.ProductId, stg.EnglishProductName, stg.ListPrice, stg.Type1Checksum, stg.Type2Checksum);





You'll note that this statement actually does the Type 1 updates as well - ie where the checksum\hash columns are different.

OK, lets add some changes to the ProductName - which will be handled as Type 1 overwrites, and some changes to ListPrice - which we'll handle as Type 2:



UPDATE ProductStaging

SET    EnglishProductName = 'Fred',

       Type1CheckSum = HASHBYTES('SHA1', 'Fred')

WHERE  ProductId = 'AR-5381';



UPDATE ProductStaging

SET    ListPrice = 9.99,

       Type2CheckSum = HASHBYTES('SHA1', CAST(ISNULL(9.99, 0) AS VARCHAR(10)))

WHERE  ProductId = 'BE-2349';



GO



So now we'll do the Type 1 update using the same MERGE statement as above. We also set the DateUpdated column - for auditing purposes:




MERGE dbo.DimProduct AS dim

USING ProductStaging AS stg

ON dim.ProductId = stg.ProductId

WHEN MATCHED AND dim.Type1Checksum<>stg.Type1Checksum THEN --type1 changes

  UPDATE SET dim.Type1Checksum = stg.Type1Checksum,

             dim.EnglishProductName = stg.EnglishProductName,

             dim.DateUpdated = GETDATE()

WHEN NOT MATCHED THEN --new records

  INSERT (ProductId,

          EnglishProductName,

          ListPrice,

          Type1Checksum,

          Type2Checksum)

  VALUES (stg.ProductId,

          stg.EnglishProductName,

          stg.ListPrice,

          stg.Type1Checksum,

          stg.Type2Checksum); 



Here we're careful to ensure that Type 1 changes apply to all records including the historic ones. Some examples filter for current records only...

Finally the Type 2 INSERTS. The trick here is to use the OUTPUT statement on the UPDATE to create a pseudo-table and then SELECT from that into the DimProduct dimension table to create the new 'current' record. Note that I'm also setting the CurrentFlag and EndDate on the old dimension record - so that its clear this is a historic record. Again all in one statement:



INSERT dbo.DimProduct

       (ProductId,EnglishProductName,ListPrice,DateUpdated,Type1Checksum,

        Type2Checksum,

        CurrentFlag)

SELECT ProductId,EnglishProductName,ListPrice,GETDATE(),Type1Checksum,

       Type2Checksum,1

FROM   ( 

MERGE DimProduct AS dim

USING ProductStaging AS stg

ON dim.ProductId = stg.ProductId

WHEN MATCHED AND dim.Type2Checksum<>stg.Type2Checksum AND dim.CurrentFlag=1 THEN

  UPDATE SET CurrentFlag = 0,

             EndDate = GETDATE()

OUTPUT $Action AS Operation,stg.ProductId,stg.EnglishProductName,stg.ListPrice,

stg.Type1Checksum,stg.Type2Checksum 
) AS MergeOut

WHERE  Operation = 'UPDATE';





So that's it! Hope its um...useful.

Wednesday, 21 April 2010

SQL Server 2008 R2 RTM

MS have issued a press release stating that SQL Server 2008 R2 has been Released To Manufacturing (are they still pressing discs? I guess so...). It also states that it will be available in early May from the likes of us and MS's own channels. Self-service BI gets top billing. I've certainly been impressed with the reporting elements of R2 - not only PowerPivot but also Reporting Services, so I'm looking forward to using it in anger with clients...

Thursday, 18 March 2010

SQL Bits VI Registration Open

Title says it all really. Go here to register and here for the agenda - lots of BI stuff needless to say...