Nov 03

I've been employing SQL Server Integration Services for some years now and find it to be quite an improvement over its predecessor Data Transformation Services. Apart from being an ETL product, it also provides different built-in tasks to manage a SQL Server instance. SSIS is not an enhancement to DTS but rather a new product which has been written from scratch to provide high performance and parallelism and as a result of this it overcomes several limitations of DTS. Not to mention the fact that versions of each package are easilty stored and maintained using TFS, and so plays well with other Visual Studio design elements by your support team.

The best part of SSIS is that it is a component of SQL server.  It comes free with the SQL Server installation and you don't need a separate license for it. Because of this, along with hardcore BI developers, database developers and database administrators are also using  it to transfer and transform data.

Here are some tips when employing SSIS that I've come across over the past couple years...

Avoid asynchronous transformation wherever possible

Before I talk about different kinds of transformations and its impact on performance, let me briefly talk about of how SSIS works internally. The SSIS runtime engine executes the package.  It executes every task other than data flow task in the defined sequence.  Whenever the SSIS runtime engine encounters a data flow task, it hands over the execution of the data flow task to data flow pipeline engine. The data flow pipeline engine breaks the execution of a data flow task into one more execution tree(s) and may execute two or more execution trees in parallel to achieve high performance. Now if you are wondering what an execution tree is, then here is the answer.

An execution tree, as name implies, has a similar structure as a tree.  It starts at a source or an asynchronous transformation and ends at destination or first asynchronous transformation in the hierarchy. Each execution tree has a set of allocated buffer and scope of these buffers are associated the execution tree. Also each execution tree is allocated an OS thread (worker-thread) and unlike buffers this thread may be shared by any other execution tree, in other words an OS thread might execute one or more execution trees. In SSIS 2008, the process of breaking data flow task into an execution tree has been enhanced to create an execution path and sub-path so that your package can take advantage of high-end multi-processor systems.

Synchronous transformations get a record, process it and pass it to the other transformation or destination in the sequence. The processing of a record is not dependent on the other incoming rows. Because synchronous transformations output the same number of records as the input, it does not require new buffers (processing is the done in the same incoming buffers i.e. in the same allocated memory) to be created and because of this it is normally faster. For example, in the Derived column transformation, it adds a new column in the each incoming row, but it does not add any additional records to the output.

Unlike synchronous transformations, the asynchronous transformation might output a different number of records than the input requiring new buffers to be created. Because an output is dependent on one or more records it is called a blocking transformation.  Depending on the types of blocking it can either be partially blocking or a fully blocking transformation.  For example, the Sort Transformation is a fully blocking transformation as it requires all the incoming rows to arrive before processing.

As discussed above, the asynchronous transformation requires addition buffers for its output and does not utilize the incoming input buffers.  It also waits for all incoming rows to arrive for processing, that's the reason the asynchronous transformation performs slower and must be avoided wherever possible. For example, instead of using Sort Transformation you can get sorted results from the source itself by using ORDER BY clause.

DefaultBufferMaxSize and DefaultBufferMaxRows

The number of buffer created is dependent on how many rows fit into a buffer and how many rows fit into a buffer dependent on few other factors. The first consideration is the estimated row size, which is the sum of the maximum sizes of all the columns from the incoming records. The second consideration is the DefaultBufferMaxSize property of the data flow task.  This property specifies the default maximum size of a buffer. The default value is 10 MB and its upper and lower boundaries are constrained by two internal properties of SSIS which are MaxBufferSize (100MB) and MinBufferSize (64 KB). It means the size of a buffer can be as small as 64 KB and as large as 100 MB. The third factor is, DefaultBufferMaxRows which is again a property of data flow task which specifies the default number of rows in a buffer.  Its default value is 10000.

Although SSIS does a good job in tuning for these properties in order to create a optimum number of buffers, if the size exceeds the DefaultBufferMaxSize then it reduces the rows in the buffer. For better buffer performance you can do two things.  First you can remove unwanted columns from the source and set data type in each column appropriately, especially if your source is flat file.  This will enable you to accommodate as many rows as possible in the buffer. Second, if your system has sufficient memory available, you can tune these properties to have a small number of large buffers, which could improve performance. Beware if you change the values of these properties to a point where page spooling begins, it adversely impacts performance.

BufferTempStoragePath and BLOBTempStoragePath

If there is a lack of memory resource i.e. Windows triggers a low memory notification event, memory overflow or memory pressure, the incoming records, except BLOBs, will be spooled to the file system by SSIS. The file system location is set by the BufferTempStoragePath of the data flow task. By default its value is blank, in that case the location will be based on the of value of the TEMP/TMP system variable.

Likewise SSIS may choose to write the BLOB data to the file system before sending it to the destination because BLOB data is typically large and cannot be stored in the SSIS buffer. Once again the file system location for the spooling BLOB data is set by the BLOBTempStoragePath property of the data flow task. By default its value is blank.  In that case the location will be the value of the TEMP/TMP system variable. As I said, if you don't specify the values for these properties, the values of TEMP and TMP system variables will be considered as locations for spooling.  What is important is to change the default values of the BufferTempStoragePath/BLOBTempStoragePath properties and specify locations where the user executing the package (if the package is being executed by SQL Server Job, then SQL Server Agent service account) has access to these locations.  Preferably both locations should refer to separate fast drives (with separate spindles) to maximize I/O throughput and improve performance.

How DelayValidation property can help you

SSIS uses validation to determine if the package could fail at runtime. SSIS uses two types of validation.  First is package validation (early validation) which validates the package and all its components before starting the execution of the package.  Second SSIS uses component validation (late validation), which validates the components of the package once started.

Let's consider a scenario where the first component of the package creates an object i.e. a temporary table, which is being referenced by the second component of the package. During package validation, the first component has not yet executed, so no object has been created causing a package validation failure when validating the second component. SSIS will throw a validation exception and will not start the package execution. So how will you get this package running in this common scenario?

To help you in this scenario, every component has a DelayValidation (default=FALSE) property.  If you set it to TRUE, early validation will be skipped and the component will be validated only at the component level (late validation) which is during package execution.

Tags: |

Comments

lånapengar

Posted on Sunday, 29 November 2009 13:56

And I believed I knew everything there was to know about this subject...well, now I do - solid post. Bye, Carson Cocken ~ lanapengar

hcg girl

Posted on Monday, 30 November 2009 09:23

Your blog is so informative … keep up the good work!!!!

how to get a small business loan

Posted on Wednesday, 23 December 2009 22:20

I subscribed to your blog when is the next post



Regards
Tucker



hotels in san miguel

Posted on Thursday, 21 January 2010 01:16

Yesterday, scientists in the United States revealed that beer contains small traces of female hormones.

To prove their theory, they fed one hundred men twelve pints of beer and observed that 100% of them started talking nonsense and couldn't drive.



Regards
Collins








religious book store

Posted on Thursday, 21 January 2010 09:17

As the manager of our hospital's softball team, I was responsible for returning equipment to the proper owners at the end of the season.

When I walked into the surgery department carrying a bat that belonged to one of the surgeons, I passed several patients and their families in a waiting area.

I heard one man say to his wife, "Look, honey, here comes your anesthesiologist."




Regards
Hunter




dentist ft worth

Posted on Saturday, 23 January 2010 11:10

1. My grandmother started walking five miles a day when she was 60. She's 97 now & we don't know where the heck she is.

2. The only reason I would take up jogging is so that I could hear heavy breathing again.

3. I joined a health club last year, spent about 400 bucks. Haven't lost a pound. Apparently you have to show up.

4. I have to exercise in the morning before my brain figures out what I'm doing.

5. I don't exercise at all. If God meant us to touch our toes, he would have put them further up our body.

6. I like long walks, especially when they are taken by people who annoy me.

7. I have flabby thighs, but fortunately my stomach covers them.

8. The advantage of exercising every day is that you die healthier.

9. If you are going to try cross-country skiing, start with a small country.

10. I don't jog. It makes the ice jump right out of my glass.




Regards
Fritch

Trade show displays

Posted on Saturday, 23 January 2010 13:24

At the cocktail party, one woman said to another, "Aren't you wearing your wedding ring on the wrong finger?"

The other replied, "Yes, I am, I married the wrong man."



Regards
Mcloud

Dental courses

Posted on Saturday, 23 January 2010 14:58

Two cannibals meet one day. The first cannibal says, "You know, I just can't seem to get a tender missionary. I've baked 'em, I've roasted 'em, I've stewed 'em, I've barbequed 'em, I've even tried every sort of marinade. I just cannot seem to get them tender."

The second cannibal asks, "What kind of missionary do you use?"

The other replied, "You know, the ones that hang out at that place at the bend of the river. They have those brown cloaks with a rope around the waist and their sort of bald on top with a funny ring of hair on their heads."

"Ah ha!" he replies. "No wonder.. those are friars!"


Regards
Engstrom

kiosks for sale

Posted on Tuesday, 26 January 2010 15:27

There is obviously a lot to jazz some this. I expect you prefabricated many reputable points in Features also.




Regards
Meredith



breast augmentation oklahoma city

Posted on Tuesday, 26 January 2010 15:33

Please let me know if you are fascinated to line as article writer for me? I can offering $10/article.




Regards
Shelton










Watch House Online

Posted on Wednesday, 27 January 2010 19:09

Although I do concur with your mark, I hump my own reservations.




Regards
Wilkinson





washington state dui attorneys

Posted on Friday, 29 January 2010 01:01

Very interesting message module bookmark your place to checkout if you write more roughly in the futurity.





Regards
Quinn

healthy dog biscuit

Posted on Friday, 12 February 2010 14:44

Wonderful illustrated collection. I impart you most that. No uncertainness it testament be real recyclable for my hereafter projects. Would same to see several separate posts on the like individual!





Regards
Marlowe



Raleigh Painting Services

Posted on Monday, 22 February 2010 23:10

When is the next post comming on this topic.


Regards

Willam

Heating and Air Baltimore

Posted on Monday, 22 February 2010 23:10

I bookmarked your post will read this latter


Regards

Arvin

Spa Chemicals

Posted on Monday, 22 February 2010 23:28

I digged this for more news from you.



Regards and respect
James

passing a drug test

Posted on Wednesday, 31 March 2010 02:48

When is the next post comming on this topic.


Regards

Willam

pass drug test

Posted on Wednesday, 31 March 2010 02:49

When is the next post comming on this topic.


Regards

Willam

Add comment


(Will show your Gravatar icon)

  Country flag

biuquote
  • Comment
  • Preview
Loading