However, the test and deployment team still had a significant amount of work to do at this point. We had a completely hands-off build and test system. We could build code and test its logic in one process. We could also test post-build using UI test tools such as WatiN and Selenium. By this point, we were also on the cusp of bringing Quick Test Pro test scripts into the process. There was really only one manual piece of work left to automate in order to have a completely hands-off build/test/deployment.
Now, we had a great many NAnt scripts that could do pretty much anything we wanted. We were building and running multiple tests; the only thing we could not do at the time was release automatically after a successful build. We didn't want this to take place in production, but it would be acceptable for any other environment (including DR). We had a shiny new SCM in place that was fully supported by the build scripts we had created. Right about then, we were also churning out the same amount of code as the main development teams. We worked in C# as well as some other languages like Python and Ruby, whereas the development teams worked exclusively in VB.Net. The last hurdle for us to cross would be automated deployment after a good build.
It wasn't just the code that the backroom teams were developing that was growing. The company had expanded its customer base and we were now providing solutions to other global players in the finance sector. The test and deployment teams had already demonstrated that we could work with the development teams to create a build process exclusively for them. In reality, we just had a bunch of NAnt scripts and tasks that we used dynamically. The only thing that we needed to do next was to appraise our CI software.
Cruisecontrol.Net is a fantastic tool, but if you work in a company that provides different solutions to different clients it can be a bit tricky. As the business at large was my team's customer, we quickly found that although we could put together a CI system for any given team, catering for each spec would mean implementing a Cruisecontrol.Net server per development team. We also found that Cruisecontrol.Net would be a bit of a nightmare to configure for automated deployments. Even though Cruisecontrol.Net could make use of .Net Remoting, the head of infrastructure pointed out (and rightly so) that achieving an automated deployment with this software would be a security disaster. At the time, Cruisecontrol.Net was a complete CI tool in a box, so implementing an automated deployment with it would have meant deploying a Cruisecontrol.Net server to each environment and its selection of application servers. The obvious problem here is that if we had done that, any malicious user could trigger a build out in the environment itself and possibly substitute their code for ours. We looked for a way around this, but we could not come up with a solution that infrastructure could support due to the security issues. We suggested that we could get away with having just NAnt out in the environment to handle deployment. Again, and rightly so, infrastructure pointed out that the same problem regarding malicious builds would still exist - basically they said that we were not to have any build tools out in the environments at all. At this juncture in our brainstorming, I pointed out that MSBUILD was already out there, was comparable to NAnt, and really couldn't be removed as it was needed by .Net! Infrastructure was thankful for this update, but still forbade us from using it for deployments.
Right about now we had two problems on our hands. First, we needed a new CI build tool that could cater for the direction the company was moving in, and secondly, we needed to come up with an automated deployment tool.
We dealt with the CI element first.
We took a look around to see what was on offer. Team Foundation Server could do what we wanted build-wise, but was simply not up to the task as far as our development was concerned. Cruisecontrol.Net was getting left behind with regards to features - the way in which it worked had also started to become obsolete for us. We looked at some other CI tools that were available; both the head of development and I thought we should go for a best-of-breed solution. This principle was already evident in our existing setup - none of the tools or frameworks we were using was entirely dependent on another. From the outset, I wanted each element of the overall system to be as agnostic as possible so that when the time came to change something there would be little impact on our ability to carry out testing and deployments. We looked at other tools out there, such as Anthill Pro. We ended up going for a tool called TeamCity. It pretty much ticked all the boxes we needed, it was free (up to a certain point) and it worked on a server/agent architecture - and it was this feature that decided it for us.
On adopting TeamCity as our main CI tool, we asked for one extra server in the build environment (infrastructure did ask why we needed a fifth machine when we were only using one of our existing four...). The new addition was pretty important. Cruisecontrol.Net is a complete build server in a box, which means that you need to have it installed on each machine you want to build from (or, at least, this is how it worked when we were using it). TeamCity is a little more subtle. It makes use of what are termed 'build agents'. With a server running constantly and listening for build agents, you are able to add as many build machines as you want to cater for each of your projects. For instance, I could run (and have run) a Linux box to build Mono assemblies. This box would be configured for this type of build only. My TeamCity server is running on a Windows 2008 machine on the network, and it has been configured to only allow Mono builds to take place on the box running Linux. With Cruisecontrol.Net, you would have had to configure an instance of Cruisecontrol.Net on each machine you wanted to build apps from. The TeamCity agent/server design is much, much more favourable. It means you can have a relatively cheap box sitting somewhere that is configured to do only one type of build - or, in fact, configured to represent a machine out in an environment. You begin to ensure that the code you are building and testing conforms to the spec laid out for that piece of build, with one server controlling it all and providing valuable MI. When we got to see this in action, we were very pleased. Now we could build for any spec and deliver information on these builds to whoever needed to get feedback.
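To make the agent/server idea concrete, here is a minimal sketch in Python of how a server might decide which agents are allowed to run a given build. It is an illustrative model only, not TeamCity's API or configuration format; the `Agent` and `BuildConfig` classes, the `compatible_agents` function and the capability names are all made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    """A build machine registered with the server (hypothetical model)."""
    name: str
    capabilities: frozenset


@dataclass(frozen=True)
class BuildConfig:
    """A build definition and what it needs from an agent (hypothetical model)."""
    name: str
    requirements: frozenset


def compatible_agents(config, agents):
    """Return the names of agents whose capabilities cover every requirement."""
    return [
        agent.name
        for agent in agents
        if config.requirements <= agent.capabilities
    ]


if __name__ == "__main__":
    # Assumed setup: one Linux box for Mono builds, one Windows box for .Net builds.
    agents = [
        Agent("linux-box", frozenset({"linux", "mono"})),
        Agent("windows-box", frozenset({"windows", "dotnet"})),
    ]
    mono_build = BuildConfig("mono-assemblies", frozenset({"mono"}))
    print(compatible_agents(mono_build, agents))
```

The point of the design is that the build definition states what it needs and the server works out where it can run, so adding a new type of build means adding a cheap, single-purpose agent rather than another full CI server.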
This again made people happy. It made me happy because it meant that all my guys could simply get on with what they needed to do. The guys writing and maintaining the code we needed for this to take place were not bogged down manually checking that everything had not only built correctly, but also matched the spec for that piece of build. It enabled us to simply agree on a build spec for each development team and implement it. In essence, this was a win all over.
As TeamCity uses the server/agent architecture, infrastructure was not averse to us placing a build agent out in the environments to start working on a delivery system if we needed to do this. There was a lot of work that needed to take place prior to this though; the workstream that would enable a build to be deployed to an environment needed to be bashed out first. For instance, one of our requirements was that we wouldn't release a build to an environment until it had:
- Been built successfully.
- Passed its unit tests.
- Passed its UI tests on the build box.
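As a rough illustration, the three requirements above amount to a simple gate. The Python sketch below is hypothetical: the `BuildResult` class and `can_release` function are names invented for this example, and the production check reflects the rule mentioned earlier that automatic releases were acceptable everywhere except production.

```python
from dataclasses import dataclass


@dataclass
class BuildResult:
    """Outcome of one build on the build box (hypothetical structure)."""
    built: bool
    unit_tests_passed: bool
    ui_tests_passed: bool


def can_release(result, environment):
    """Return True only if the build may be released automatically."""
    # Assumption from the post: production is never released to automatically.
    if environment.strip().lower() == "production":
        return False
    # All three requirements must hold before an environment is touched.
    return result.built and result.unit_tests_passed and result.ui_tests_passed
```

For example, `can_release(BuildResult(True, True, False), "DR")` is `False` because the UI tests failed on the build box.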
These were pretty simple requirements for ourselves. The majority of the work left to us was to develop the NAnt tasks needed to carry out environmental testing and to create a deployment platform.
There was a pseudo deployment platform in place already, which worked as part of the existing application. It was primarily used to handle some functions out in an environment, things like pushing assemblies out to application servers and controlling web servers etc. I took the decision to move away from this system - I didn't want the build/test/deployment process to be dependent on the actual product we were making.
So, we came up with a prototype platform that used .Net Remoting and MSBUILD. There were a few other staging areas as well, like TFTP servers and small services to control things like web-server availability and application servers. As the project for automated deployment went on, we encountered new tools like PowerShell. To me, this was a very useful tool: it meant that I could write C# code that could be accessed directly from within a PowerShell script (we already knew we could script .Net code in NAnt, but we didn't like to do this). So, MSBUILD gave way to PowerShell; we liked it and so did infrastructure. With these tools in place, we needed to define the automated release process. It went a little like this:
- Build machine is selected from server farm.
- Build machine clears a workspace.
- Latest code is updated to the build machine.
- Latest code is built.
- If the build was successful, unit testing takes place.
- On successful unit testing the assemblies are packaged for deployment.
- Selected environment begins a countdown to unavailability.
- Once build is packaged and the environment is down for deployment, existing code is uninstalled on target machines.
- Latest build is deployed and activated on target environment.
- Tests are then carried out in the environment, mainly UI testing.
- On successful test completion, all relevant schema and SQL is deployed to that environment's DB servers.
- If the new code fails at all, everything is rolled back and the DB is not touched.
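The steps above can be sketched as a small orchestration function. This is a simplified Python illustration of the control flow only, not the .Net Remoting classes or PowerShell scripts we actually wrote; `run_release` and its parameters are hypothetical names, and each step is assumed to be a callable that returns `True` on success.

```python
def run_release(build_steps, environment_steps, deploy_db, rollback):
    """Run the release in order and return (succeeded, log).

    build_steps and environment_steps are lists of (name, callable) pairs.
    """
    log = []

    # Build-side steps: a failure here stops before any environment is touched.
    for name, step in build_steps:
        if not step():
            log.append(f"failed: {name}")
            return False, log
        log.append(f"ok: {name}")

    # Environment steps: a failure here rolls everything back, DB untouched.
    for name, step in environment_steps:
        if not step():
            log.append(f"failed: {name}")
            rollback()
            log.append("rolled back")
            return False, log
        log.append(f"ok: {name}")

    # Schema and SQL go out only after the in-environment tests have passed.
    deploy_db()
    log.append("ok: database deployed")
    return True, log
```

Keeping the database step last is the design choice that matters here: code can be uninstalled and reinstalled cheaply, so the riskiest change is only made once everything else has been proven in the environment.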
And that was pretty much it. All we needed to do was to come up with some simple .Net classes to handle some of the deployment steps via .Net Remoting. Our PowerShell scripts would then handle the process out in the environment. If it all failed, everything was rolled back prior to releasing to the DB and the team responsible for the build would be notified.
We could have run a test prior to commit on each developer's machine. However, the head of development and I thought that this could be a little tricky to carry out as there were so many developers working on so many work-streams in so many environments.
When it came to deploying to production, the same scripts and classes were used, but with intervention from someone in the release team to guide it all. So the same process was used, but in such a manner that steps were followed manually.
By the time I left that company, we had created a complete build-to-release system in just under a year. The system I designed and created with the other guys in the release team (and ultimately the test team when I got promoted) was a world away from the VB scripts we used when I first joined. Now, each customer of the company has its very own build and test specification as well as an automated deployment specification.
None of this would have been possible without the tools at our disposal. There are earlier posts on my blog on how NAnt can be extended with new tasks. Systems like TeamCity allow you to concisely manage all aspects of a development work-stream from end to end. The only thing I never got around to doing for that company was to implement FitNesse.
I continue to praise CI and Agile to my colleagues. Whenever I start a new contract, I ask what build process they have in place. Directly after working for that company, I joined a small team within a huge organisation that needed to create a mini version of my last CI project. I now get asked by developers and employers to explain just how easy it is to implement for their own projects. It astonishes me sometimes to see very complex pieces of software that do not get even the simplest unit test - and that have to go through a manual build phase with a set specification.