2009-01-07

DTC and Windows Firewall - A Rant

Let me preface this by saying: If you're looking for a solution to getting Microsoft Distributed Transaction Coordinator (MSDTC) working through Windows Firewall, this is not the post you're looking for. No solution exists here.

I'm working on a project that is in three parts — a rich Windows Forms client, a Web Service business and data access layer, and a SQL Server database. The QA guy was having an issue where some web service calls were timing out, causing the application to crash. While I couldn't replicate it, I did notice that the web service calls being made resulted in a spike of read locks on the database. I figured a viable option would be to have that service set a transaction scope with an isolation level of Read Uncommitted. It was a quick and dirty way to get it to read data without locking in a condition where, for whatever reason, it seemed that read locks were colliding. (Doesn't make a lot of sense to me, as nothing was writing to the database at the time, but it's all I had to go on.)

After trying and failing to use the Enterprise Library to create a transaction (it kept insisting the transaction and the command objects belonged to different connections, even when I created both from the same connection), I decided to go with the System.Transactions namespace to manage it. From a code perspective, it would be much, much cleaner anyway.

It did perform much faster, without the spike in database locks, but it revealed another problem. In trying to solve the problem himself, QA guy thought he should try splitting the application up instead of running everything on one machine, so he put the database in a virtual PC. When he got the new web service code, it threw an error about being unable to connect to the DTC service on the database server.

This was a fortunate discovery, as it would've been a problem if we got to a client's environment and found this issue.

I created three VPCs of my own and put each of the three pieces on its own VPC. I spent the better part of the next two days trying to get DTC to work between the web service and the database server.

Everything ran fine when the Windows Firewall was disabled on all machines. First, DTC had to be configured on both the web service server and the database server by enabling "Network DTC Access" and both inbound and outbound "Transaction Manager Communication". The DTC service on the SQL Server machine also had to be started and set to Automatic start. (The web service server would start its own as needed.) And since my VPCs were in a workgroup and not a domain, I had to set DTC to use "No Authentication" rather than "Mutual". Quite a bit of extra work, but so far, not completely outside the realm of what we might possibly have to ask a client to do to their own servers.

Turning on the firewall on the client wasn't an issue. Turning on the firewall on the web service server caused an issue with both the servicing of web service requests and database access. The former was relieved by allowing port 80 through, naturally. The latter was relieved by creating an exception for the process C:\Windows\System32\MSDTC.exe, which was recommended by articles I found describing DTC and Windows Firewall (namely, that all machines participating in distributed transactions should have this exception).

Then there was the firewall on the database server. I had to allow port 1433 for SQL Server access. No problem. Doing that meant non-transactional methods worked without a hitch. But from there, for the life of me, I couldn't get DTC to go through. I tried allowing port 135. I tried adding MSDTC.exe as an exception. Nothing. It just wouldn't work.
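For reference, the exceptions described so far can be written down as a small table and turned into command lines instead of being clicked through by hand. This is only a sketch: it assumes the XP-era `netsh firewall` context (`add portopening` and `add allowedprogram`), and the role keys and rule names are labels invented for illustration. It reproduces the exceptions tried above; it does not make DTC work through the database server's firewall.

```python
# Sketch only: role keys and rule names are illustrative labels, and the
# commands assume the XP-era `netsh firewall` context. Verify the syntax
# with `netsh firewall add portopening ?` before relying on it.
MSDTC_EXE = r"C:\Windows\System32\MSDTC.exe"

# Firewall exceptions tried in the post, keyed by machine role.
EXCEPTIONS = {
    "web-service": [
        ("port", 80, "HTTP"),
        ("program", MSDTC_EXE, "MSDTC"),
    ],
    "database": [
        ("port", 1433, "SQL Server"),
        ("port", 135, "RPC Endpoint Mapper"),
        ("program", MSDTC_EXE, "MSDTC"),
    ],
}


def netsh_command(kind, target, name):
    """Render one firewall exception as a netsh command line."""
    if kind == "port":
        return (
            "netsh firewall add portopening "
            f'protocol=TCP port={target} name="{name}"'
        )
    if kind == "program":
        return (
            "netsh firewall add allowedprogram "
            f'program="{target}" name="{name}" mode=ENABLE'
        )
    raise ValueError(f"unknown exception kind: {kind}")


def commands_for(role):
    """Return the command lines for every exception listed for a role."""
    return [netsh_command(*rule) for rule in EXCEPTIONS[role]]


if __name__ == "__main__":
    for role in EXCEPTIONS:
        print(f"rem {role}")
        for line in commands_for(role):
            print(line)
```

Running the script prints one batch-style block per machine, which could be pasted into a command prompt on the matching VPC.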

One or two sites suggested opening up a range of ports that DCOM uses. However, since Windows Firewall (at least on XP) doesn't accept a port range as an exception, I did not feel that adding a separate exception for every port from 5000 to 5020 was a viable option. Not for me, and certainly not for our clients. Not all articles indicated this was necessary anyway.
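If the per-port route were ever taken, the exceptions could at least be generated rather than typed. The sketch below expands the 5000 to 5020 range mentioned above into one `netsh firewall add portopening` command per port. The function name and the "DCOM" rule-name prefix are hypothetical, and it assumes DCOM/RPC has already been restricted to that range on the machine, which is a separate step not shown here.

```python
def port_range_commands(first, last, prefix="DCOM"):
    """Build one XP-era netsh port exception per port in [first, last].

    Hypothetical helper: `prefix` is only used to label the rules.
    """
    if first > last:
        raise ValueError("first port must not exceed last port")
    return [
        "netsh firewall add portopening "
        f'protocol=TCP port={port} name="{prefix} {port}"'
        for port in range(first, last + 1)
    ]


if __name__ == "__main__":
    # The 5000-5020 range is the one suggested by the sites cited above.
    for line in port_range_commands(5000, 5020):
        print(line)
```

That comes to 21 separate rules per machine, which is the maintenance burden the paragraph above objects to.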

It just. Wouldn't. Work.

Incidentally, the utility DTCPing apparently works fine (once you enable RPC on both machines through the Group Policy Editor, which I had to do since my VPCs are running XP Pro). This would seem to indicate that it's not a networking issue or even a firewall issue, just a DTC-refusing-to-work issue.

A lot of things bug me about this. For one, setting up DTC requires a lot of extra steps (turn on network access, mess with authentication, start service and set startup option on database server, create firewall exceptions) that can't (easily) be done in an install routine. Two, the fact that DTC was even used at all — the code in question created a TransactionScope object, created a DbCommand, and called LoadDataSet on a Database, before disposing of everything (except the DataSet). This, to me, should not be taking more than one connection to the same database, and therefore it should not require a distributed transaction coordinator at all. And of course the typical "this (apparently) fundamental piece of Windows systems is broken thanks to a new 'security' feature" problem.

Since the only solution was to disable the firewall completely on the database server, and since this is not something we are going to ask our clients to do, I rolled back the change that added System.Transactions to the project. Right now, we're going with the assumption that the timeout errors were a problem with QA guy's machine, as they only happen when everything is running on his primary machine. If he moves anything to a VPC (even just breaking it into two pieces, whether the web service ends up with the client or the database), it runs fine; and installing everything on the demo machine (which is actually older, slower hardware) also doesn't have the issue.

There go my plans for cleaning up the code, too.
