Developer Aspirations

YAPB - Yet Another Programming Blog

Tuesday, 12 July 2011

Ideal Development Process - Part 1: The Build

by Colin Miller, on Building, Continuous Integration, development, Musings

I've been kicking this idea around for a while now. I often comment on ways that I would prefer to develop, or on practices I would like to use or avoid. However, they're often fairly disjointed. "Write tests first" stands OK on its own, but where does it fit into the big picture of a full development workflow? So starting today I'm going to outline the ideal development process I would love to use. There are a lot of aspects to such a process, so it will be divided into multiple posts.

[caption id="" align="aligncenter" width="553" caption="The state you want to avoid when writing software. This team... did not avoid that fate. Now their cubes lie empty and hollow, with a cold echo speaking of unimaginable loneliness and despair. (Actually they're still there, just without the sign.)"]#FAIL[/caption]

The Build

Today we're going to start with the build process. Building an application is something that needs to happen quickly and with confidence. What this means is that creating a build for deployment should be a process that doesn't waste hours, or even minutes, of your time. Ideally it should be nearly instantaneous, as it is with Ruby on Rails, PHP, or other interpreted scripting environments. However, there should still be a process to kick off that verifies the syntactic correctness of the code (via compilation) and exercises the tests.

While a plugin for an IDE to run a build and perform all of the tests would work, I don't like being tied down to a particular editing tool. If you want to run your build process by sshing into your work machine from home, you shouldn't need to bring up a graphical client to do it. There should be one easy command to launch it. For this, something like Maven or Ant can help accomplish the task. If it's a non-compiled language, then a series of scripts for running the unit tests can be fine, as those tests should force the code to be interpreted (and internally compiled) and catch those sorts of errors. Ideally, this whole process should take less than 20 seconds: enough time to collect your thoughts after a change, but not so much that you lose your focus, leave the zone, and feel like you're watching a status bar.
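As a rough sketch of the "one easy command" idea, the script below runs a list of build steps in order, stops at the first failure, and warns when the whole run goes over the 20 second target. The names `run_build`, `BUDGET_SECONDS`, and `EXAMPLE_STEPS` are made up for illustration, and the example steps only start the Python interpreter so that the sketch runs anywhere; in a real project they would be your Maven, Ant, or test-runner commands.

```python
import subprocess
import sys
import time

# Hypothetical budget, taken from the "less than 20 seconds" goal above.
BUDGET_SECONDS = 20.0


def run_build(steps, budget=BUDGET_SECONDS):
    """Run each (name, command) step in order; stop at the first failure.

    Returns a tuple: (succeeded, elapsed_seconds, within_budget).
    """
    start = time.monotonic()
    succeeded = True
    for name, command in steps:
        print(f"==> {name}")
        result = subprocess.run(command)
        if result.returncode != 0:
            print(f"FAILED: {name} (exit code {result.returncode})")
            succeeded = False
            break
    elapsed = time.monotonic() - start
    within_budget = elapsed <= budget
    if not within_budget:
        print(f"warning: build took {elapsed:.1f}s, over the {budget:.0f}s budget")
    return succeeded, elapsed, within_budget


# Placeholder steps (assumptions): each just starts the Python interpreter.
# Replace them with your real compile and test commands.
EXAMPLE_STEPS = [
    ("compile", [sys.executable, "-c", "print('compiling')"]),
    ("unit tests", [sys.executable, "-c", "print('testing')"]),
]
```

Calling `run_build(EXAMPLE_STEPS)` from a small command-line wrapper gives you something you can launch over ssh without any graphical tool.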

The build should be able to produce a working deployable. This can be a packaged distributable such as a Java .war file, an executable binary, or just a folder of flat PHP files that can be copied into an Apache document root. It doesn't matter, as long as the build system can produce something deployable that can be used for integration testing.

Building Should Be Easy On A New Dev Box

This deployable should not be dependent on a specific environment. If it's a .war file for a web application that uses a database, the deployment should not have to assume that the database already exists. The build or deployment process itself should be able to create the database with the needed structures and perhaps some sample testing data. Rails' Active Record migrations are a good candidate for this type of database versioning and creation. Liquibase is another tool for handling these sorts of database changes and setups in a project.
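To make the idea of versioned database creation concrete, here is a minimal sketch using Python's built-in `sqlite3` module. It is not how Active Record or Liquibase work internally; the `MIGRATIONS` list, the `schema_version` table, and the `migrate` function are all assumptions invented for this example. The point is only that the build or deploy step can bring an empty database up to the current schema, and that running it twice is harmless.

```python
import sqlite3

# Hypothetical, ordered schema changes. In a real project these would live in
# versioned files checked in next to the code.
MIGRATIONS = [
    (1, "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"),
    (2, "ALTER TABLE users ADD COLUMN email TEXT"),
    (3, "INSERT INTO users (name, email) VALUES ('sample', 'sample@example.com')"),
]


def migrate(conn, migrations=MIGRATIONS):
    """Apply every migration newer than the recorded schema version.

    Returns the list of version numbers that were applied on this run.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_version (version INTEGER NOT NULL)"
    )
    row = conn.execute("SELECT MAX(version) FROM schema_version").fetchone()
    current = row[0] or 0  # MAX() yields NULL (None) on an empty table
    applied = []
    for version, statement in sorted(migrations):
        if version > current:
            conn.execute(statement)
            conn.execute(
                "INSERT INTO schema_version (version) VALUES (?)", (version,)
            )
            applied.append(version)
    conn.commit()
    return applied
```

On a fresh database, `migrate(sqlite3.connect(":memory:"))` applies all three steps; a second call on the same connection applies none.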

Similar to the database, the system should not require a ton of applications or libraries to be installed (or environment variables to be set up) in order to build. Naturally you'll need to get some of these tools, but time should be taken to write scripts to automate the process of setting up a new development environment from the checked out source code of your project. This could be using wget to download compilers, languages, libraries, and other tools along with automations to set up property files and environment variables. These scripts need to be kept up to date as well, and should be exercised as part of your continuous integration.
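A setup script can start by reporting what is missing before it tries to install anything. The sketch below is one possible shape for that check; `check_environment` and the tool and variable names passed to it are hypothetical, and the downloading and installing described above is deliberately left out.

```python
import os
import shutil


def check_environment(required_tools, required_env, environ=None):
    """Return a list of human-readable problems; an empty list means ready.

    `environ` defaults to the real process environment and can be replaced
    with a plain dict, which keeps the check easy to exercise from CI.
    """
    environ = os.environ if environ is None else environ
    problems = []
    for tool in required_tools:
        # shutil.which looks the tool up on PATH, like the shell would.
        if shutil.which(tool) is None:
            problems.append(f"missing tool on PATH: {tool}")
    for name in required_env:
        if not environ.get(name):
            problems.append(f"environment variable not set: {name}")
    return problems
```

For example, `check_environment(["git", "java"], ["JAVA_HOME"])` returns an empty list on a machine that is ready to build, and a list of messages otherwise. Because it is just a function, the continuous integration run can call it too, which helps keep the setup scripts honest.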

Speaking of which, continuous integration should be taken into consideration with your build system. Whenever code is checked into whatever version control system you're using (I prefer git, but use SOMETHING), those changes should be picked up by a continuous integration server that can run your build. The CI box should be cleaned after each run (ideally started from a blank VM), and the dev scripts that set up the environment should be run on it. There shouldn't be a manual process for setting up the build server. The build server should run the build, including all unit tests, deploy the distributable if it's a web application, and run any integration tests against the deployed version. Emails should be sent for failures, and small red lights should turn on at the developer's desk (I'm building one of those right now).
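The order of operations on the CI box can be sketched as a list of stages that run one after another and stop at the first failure, with a notification hook standing in for the email (or the red light). This is only an illustration, not the API of any real CI server; `run_pipeline`, the stage names, and the `notify` callback are all assumptions.

```python
def run_pipeline(stages, notify):
    """Run (name, callable) stages in order; a stage passes if it returns True.

    Stops at the first failing or crashing stage, calls notify(message) for
    it, and returns the names of the stages that passed before it.
    """
    passed = []
    for name, stage in stages:
        try:
            ok = bool(stage())
            detail = ""
        except Exception as error:  # a crashing stage counts as a failure
            ok = False
            detail = f": {error}"
        if not ok:
            notify(f"build failed at stage '{name}'{detail}")
            break
        passed.append(name)
    return passed
```

A run might be wired up as `run_pipeline([("set up environment", setup), ("build", build), ("unit tests", unit), ("deploy", deploy), ("integration tests", integration)], send_email)`, where each callable is a hypothetical wrapper around your own scripts.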

Large Applications

This all sounds great and dandy for your standard small team project. But what about large projects? What about something like Amazon.com or LinkedIn? These sites are composed of hundreds (possibly thousands) of services that all talk to each other via some sort of transport, with coupling between modules. How do you build an entire site consisting of hundreds of thousands or millions of lines of code and keep your build under 20 seconds? The simple answer is: you don't. You can't really, but you shouldn't have to build everything to work on your part of the application.

Breaking down large problems into smaller problems is the essence of software development. A huge site like LinkedIn is a large problem from a testing and deployment standpoint. However, the section that you're working on is most likely just a small service with a few dependencies on other services. For this, there are two solutions.

The first solution is for every service to have a mocked deployable: something that acts as a stub, so that when you deploy the service you're working on, you can rely on the stubbed services to provide an approximation of a correct answer to your calls. It's almost like a larger version of a mock object that persists for interactive usage. These can be difficult to set up, but are worth the time invested because they can help stabilize your API and be reused for many components. This way, when you deploy your real service, you just use stubs for testing your results. To make setting this up easier, you should standardize on a transport protocol and write a tool that automates the creation of a stub based on an interface.
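Here is a small in-process sketch of generating a stub from an interface. It leaves out the transport entirely, so it shows only the generation step, not a deployable stub service. `ProfileService` is a hypothetical interface invented for the example, and `make_stub` is one possible way to do it, not a standard tool.

```python
import inspect
from abc import ABC, abstractmethod


class ProfileService(ABC):
    """Hypothetical service interface, invented for this example."""

    @abstractmethod
    def get_profile(self, user_id): ...

    @abstractmethod
    def search(self, query): ...


def make_stub(interface, canned):
    """Build a stub object exposing every public method of `interface`.

    Each stub method ignores its arguments and returns the canned value
    registered under its name. A missing canned value raises KeyError right
    away, so a newly added interface method cannot go unnoticed.
    """
    methods = {}
    for name, _ in inspect.getmembers(interface, predicate=inspect.isfunction):
        if name.startswith("_"):
            continue
        if name not in canned:
            raise KeyError(f"no canned response for {interface.__name__}.{name}")

        # Bind the canned value as a default so each method keeps its own.
        def method(self, *args, _value=canned[name], **kwargs):
            return _value

        methods[name] = method
    stub_class = type(f"Stub{interface.__name__}", (interface,), methods)
    return stub_class()
```

With this, `make_stub(ProfileService, {"get_profile": {"id": 1, "name": "Sample"}, "search": []})` gives an object that passes for a `ProfileService` and answers every call with the approximate result you chose. A real tool would put the same idea behind whatever transport protocol you standardized on.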

The other solution is to have each service set up independently in different containers or boxes that can all talk to each other. The whole stack is up all of the time. When you make changes to a service, the continuous integration server builds and deploys that service. All other services can now talk to your new version, and your service talks to real versions of all of the other services. The main problem with this approach is that if you break a service, other services that rely upon it can also break. The fault won't be in the code for these other services (though they should probably handle errors coming from your service better), but it does make them harder to test. However, the results will be more accurate, since they come from actual running services rather than stubs.

Personally, I would prefer the stubbed service solution. Part of creating any new service should be to create a quick, useful stub that can be deployed in place of a real implementation. This stub would also have all of the unit and integration tests associated with it.

Conclusion

In essence, the build process should be fast, easy to automate, and run a full set of tests. The build should create a useful deployable that can act as the finished product. This should be something that you would release at the end of an iteration in the agile development process. The build should allow for a new developer machine to get started quickly, including scripts to acquire and install all required components for the build. It should be highly automated, to the point where you should be able to type in a short command and have everything run. It should be heavily tested and reliable so that the build runs the same way each time.
