What is Continuous integration?
Continuous integration is a computer software development discipline that is valuable for any software development project that has two or more people working on it. It is a basic step to contemplate after investing in source control.
Continuous integration is also known as "CI" for short, also "the CI build" or sometimes just "the build", e.g. "You broke the build".
Continuous integration depends on source control. A modern source control system will support CI in some form. CruiseControl is the most popular CI plug-in for Subversion and other systems. Microsoft Team Foundation Server comes with CI built into the 2008 version. Another CI system is TeamCity from JetBrains, which works with many popular source control systems, including TFS, Subversion, Visual SourceSafe, CVS and Perforce.
So what is it? Continuous integration means that integration happens every time that anything changes. Integration in this sense is building code from different developers into a finished piece of software. The reason for continuous integration is to avoid a lengthy and risky integration process whenever a release is needed.
So what happens? Continuous integration is basically a process on a server that is triggered every time new code is checked into the source control system. At its most basic it compiles the latest version of the whole of the source code and presents the result as a success or failure. If the compilation gives an error, then the last changes checked in are said to have broken the build, and the priority is to fix that first. The result status is often shown as a traffic light: red for failure, green for success, yellow for warnings. Hence when a build is fixed, you could say "the build has gone green". As a result, as long as the build works you are always in a known state: you know that the code builds, and you can even run the latest version if you need to test or demo it.
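As a rough illustration of that most basic form, here is a minimal Python sketch that runs one build command and turns the outcome into a traffic-light colour. The names `build_status` and `run_build` are made up for this example, and the rule that any output mentioning "warning" means yellow is an assumption; a real CI server will have its own, more careful, way of deciding.

```python
import subprocess
from typing import Sequence


def build_status(returncode: int, output: str) -> str:
    """Map the result of a build command to a traffic-light colour."""
    if returncode != 0:
        return "red"  # the command failed: the build is broken
    if "warning" in output.lower():
        return "yellow"  # assumption: any mention of a warning means yellow
    return "green"


def run_build(command: Sequence[str]) -> str:
    """Run one build command and report its traffic-light status."""
    result = subprocess.run(command, capture_output=True, text=True)
    return build_status(result.returncode, result.stdout + result.stderr)
```

A CI server would call something like `run_build` each time a check-in is detected, passing whatever command builds the project, and publish the colour it gets back.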
So if the code compiles, it's bug free and good enough to ship? No, but the build is usually far more than just compilation. Some builds are thorough enough that if the build is successful, it's good to go for final manual testing before shipping.
The main addition is that automated unit tests are run after compilation, and any failure should break the build. Code quality metrics can be run and a code coverage percentage calculated for the unit tests; if these fall short of their thresholds, that can also break the build. Installer packages and documentation can be generated. In some cases larger projects have found it beneficial to run an automated "smoke test" install to a machine set aside for the purpose. Failure at any stage will, of course, break the build.
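The "failure at any stage breaks the build" rule can be sketched as a list of stages run in order, stopping at the first one that fails. This is only an illustration: `run_pipeline` and `coverage_ok` are hypothetical names, and the 80% coverage threshold is an assumed figure, not a recommendation from any particular tool.

```python
from typing import Callable, List, Optional, Tuple

# A stage is a name plus a function that returns True on success.
Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> Optional[str]:
    """Run stages in order; return the name of the first failing stage, or None."""
    for name, step in stages:
        if not step():
            return name  # later stages are not run once the build is broken
    return None


def coverage_ok(percent: float, threshold: float = 80.0) -> bool:
    """Check a coverage figure against a threshold (80% is an assumed default)."""
    return percent >= threshold
```

For example, a pipeline of compile, unit tests, coverage check and packaging would report "unit tests" if a test failed, and the coverage and packaging stages would never start.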
This is all intended to make integration problems and bugs visible as soon as possible, and to ensure that problems are swiftly addressed at source. And to put new builds on tap.
If source control is step one in getting good modern software engineering practices in place, continuous integration is step two or three. You can have automated unit tests before you have continuous integration, but without CI you will have to fire them manually. CI will run them every time a change is made.
There is an agile software development saying "if it hurts, do more of it". Clearly this doesn't apply in all circumstances, e.g. hitting yourself over the head with a hammer, but for things that you are going to have to do more than once, like finding bugs and releasing working software, it helps a lot to grease the wheel and to keep it turning. Doing more of it will encourage automation, which will make it easier in future.
In practice it's not always that easy. Build scripts use their own arcane syntax and call out to a multitude of external tools. This can be fragile and complex, so debugging them is still something of a black art. The debug output from build scripts can be too verbose to read in its entirety (I've seen 5MB of text output from a small project, and 50MB from larger ones) and still manage to be cryptic and obscure when you do find the relevant lines. People with good skills in managing builds are in demand.
Continuous integration comes from Extreme Programming, via Martin Fowler and Kent Beck in the late 1990s. However, the practice will help software development even on more traditionally run projects.
Continuous integration best practices are:
Use a source control system to maintain a single, versioned repository of project source code. This is a separate topic, and is a necessary precondition for CI and for other good practices. All files needed to build the system should be placed in the source control system, except for the basics of common resources, e.g. the standard install of operating system, compiler and tools. Required resources such as third-party libraries or build tools are often placed in source control too, with the goal that on a fairly generic, newly created machine it is possible to perform a get from source control followed by a successful build.
Keep the build fast. Noticing and fixing problems is much easier if feedback is fast. If the build takes too long, a new check-in can occur before the last build has finished, and a backlog can build up. You don't want to wait for hours for your changes to finally build. Nor do you want to be in the position of having to avoid checking in after 4pm because that could break the build at 6:30pm, after you've left the office, and affect co-workers in another time zone. Some build systems allow you to mitigate this by accumulating changes, so that one build rolls up all check-ins since the last build. This stops backlogs building up, but if the build breaks there can be more than one possible cause, so it has drawbacks too. A fast build avoids all this complexity.
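The roll-up behaviour mentioned above can be pictured with a small sketch. `RollupQueue` and the change identifiers are invented for this example and are not taken from any particular build system; the point is only that one build can end up containing several check-ins.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RollupQueue:
    """Collects check-ins that arrive while a build is still running."""

    pending: List[str] = field(default_factory=list)

    def check_in(self, change_id: str) -> None:
        self.pending.append(change_id)

    def next_build(self) -> List[str]:
        """Hand every pending change to a single build and empty the queue."""
        batch, self.pending = self.pending, []
        return batch
```

If `next_build` returns two or more changes and that build goes red, each of them is a suspect, which is the drawback described above.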
How fast does the build need to be? It depends on the project, but 5 minutes is ideal for most projects, and 10 minutes is considered acceptable in general. However some teams adopting CI with an existing project would rejoice if their build time came down to an hour. With slower builds you have to check in less often and more cautiously, so it influences your style for the worse.
Generally the time taken to execute unit tests is a big factor in build time. Often there is more than one kind of build, and the CI build does not execute slow tests. Slow tests are usually ones that depend on external systems such as web sites or databases, or test the whole system top-down, rather than running little pieces of it. These are often considered "integration tests" not unit tests, but that's a different topic.
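One way to keep slow tests out of the CI build is to mark them and let each kind of build choose whether to include them. The sketch below does this with Python's standard `unittest` module. The `slow` marker, the `build_suite` helper and the `PriceTests` class are all hypothetical names for this example; test frameworks usually offer their own tagging or category mechanisms.

```python
import unittest


def slow(test_func):
    """Mark a test method as slow (a made-up marker for this sketch)."""
    test_func.slow = True
    return test_func


class PriceTests(unittest.TestCase):
    def test_total(self):
        # A fast test: runs one small piece of logic in memory.
        self.assertEqual(sum([2, 3]), 5)

    @slow
    def test_against_database(self):
        # Stands in for a test that would talk to an external system.
        self.assertTrue(True)


def build_suite(case, include_slow: bool) -> unittest.TestSuite:
    """Collect the tests of one TestCase class, optionally leaving out slow ones."""
    suite = unittest.TestSuite()
    for name in unittest.TestLoader().getTestCaseNames(case):
        method = getattr(case, name)
        if include_slow or not getattr(method, "slow", False):
            suite.addTest(case(name))
    return suite
```

The CI build would run `build_suite(PriceTests, include_slow=False)`, while a nightly or release build could pass `include_slow=True` to run everything.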
Automate. If it can be automated, it should be automated. Ideally, the build should run all the way through to generating a tested final product (software package, installer, .msi file, CD image or whatever form you use to deliver software to your client). Doing a release from the source should be a one-click process: just start the appropriate build. This may not be the CI build, since the CI build usually includes debug data and the release build often does not.
This removes the opportunity for errors under pressure, and thus facilitates change. The CI build will often result in a debug executable being made available for manual testing with no extra steps; for example, the build can be automatically deployed to a test web server.
Build frequently. Each person should check in frequently, at least daily. Whenever a logical piece is working correctly, that is a good point to check in.
Unit test. The build should test itself. Unit testing is a separate topic, but CI is a huge enabler of unit tests and code analysis tools.
Make the build status visible. If everyone can see the state of the build, everyone knows what's happening as soon as it happens. Openness, visibility, and fast feedback are major agile values. Encourage people to know the state of the build, and to avoid breaking it. But while putting a rubber chicken on the desk of the person who breaks the build may be effective, bear in mind that in some circumstances it could be considered to be public humiliation, which is not always a good management strategy.
Martin Fowler on Continuous Integration: http://martinfowler.com/articles/continuousIntegration.html
Cruise Control: http://cruisecontrol.sourceforge.net
Microsoft TFS: http://msdn.microsoft.com/en-us/vsts2008/products/bb933758.aspx
Joel Spolsky mentions automated builds here http://www.joelonsoftware.com/articles/fog0000000043.html and http://www.joelonsoftware.com/articles/fog0000000023.html