Agile/XP

Agile Integration

sort

develop

practices

integrate

practices

build

practices

test

practices

deliver

practices

Catena 2007

a

Appleton et al. 2007

a

Braithwaite 2007

Appleton, ed. 2007

a

Integrate Button

a

Duvall 2007

a

Duvall 2007, ch. 6

Getting Started

Build Software at Every Change

r

During the build process, we clean derived objects, view integrated code, generate derived code and documents, compile source and tests, rebuild a database with test data, test, inspect, package, label all files used in the build process, deploy, and generate feedback. We do not write code or non-trivially merge. Execute the build script, and commit changes to the repository, from the command line, to avoid dependencies on an IDE or version control system. Using an IDE to run a build is appropriate as long as you can also run the same build without the IDE.

CI process. (1) A developer privately builds while updating code and rebaselining, then commits integrated changes to the version control repository. Meanwhile, the CI server on the integration build machine polls this repository for changes (eg, every few minutes). (2) Soon after a commit occurs, the CI server detects that changes have occurred in the version control repository (cron-scheduled builds are not continuous), retrieves the latest copy of the code from the repository, and executes a build script on the integration build machine. (3) The CI server generates feedback by e-mailing build results to specified project members, and posts results and reports (eg, from tests and inspections) for the latest build. It may also short-message build failures, or syndicate build status. (4) The CI server continues polling.

Build often to find problems earlier. Do all the software components work together? What is my code complexity? Is the team adhering to the established coding standards? How much code is covered by automated tests? Were all the tests successful after the latest change? Does my application still meet the performance requirements? Were there any problems with the last deployment? The reason you want to build continuously is to get rapid feedback so that you can find and fix problems throughout the development lifecycle.
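The four-step polling loop in the notes above can be sketched as a single cycle of a toy CI server. A minimal sketch only: the callables `get_head`, `run_build`, and `notify` are illustrative stand-ins for the version control repository, the automated build script, and the feedback mechanism, not any real CI tool's API.

```python
def run_ci_cycle(get_head, last_built, run_build, notify):
    """One polling cycle of a toy CI server (all names illustrative).

    get_head()   -> current revision in the version control repository
    run_build(r) -> True if the automated build script succeeded for revision r
    notify(msg)  -> feedback mechanism (e-mail, instant message, feed, ...)
    Returns the revision this cycle dealt with.
    """
    head = get_head()
    if head == last_built:
        return last_built            # no commits since the last build; keep polling
    ok = run_build(head)             # step 2: retrieve and build the latest code
    notify(f"build of {head}: {'SUCCESS' if ok else 'FAILURE'}")  # step 3: feedback
    return head                      # step 4: remember what was built, keep polling
```

A scheduler would call `run_ci_cycle` every few minutes; because it compares revisions first, quiet periods cost nothing and produce no noise.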

Features of CI

r

Key features. (1) A connection to a version control repository. (2) An automated build script, run with every change to the version control repository. (3) Some sort of feedback mechanism (eg, e-mail). (4) A process for integrating the source code changes (either manual or a CI server).

Compile. Although dynamic languages (eg, Python, PHP, Ruby) do not generate binaries, many provide the capability to perform strict checking.

Prepare database. We treat the database source code—Data Definition Language (DDL) scripts, Data Manipulation Language (DML) scripts, stored procedure definitions, partitioning, and so on—in the same manner as any other source code in the system. This includes scripts to drop and create tables and the database, apply procedures and triggers, and insert test data.

Test. Without automated unit, component, system, load and performance, security, etc. tests, it is difficult for developers or other project stakeholders to have confidence in software changes.

Inspect. Automated code inspections (eg, static and dynamic analysis) can be used to enhance the quality of the software by enforcing rules. You can use your CI system to run these rules automatically against a code base, enabling continuous monitoring of coding standards and quality metrics.

Deploy. Generate bundled software artifacts with the latest code changes and publish them to a test environment. CI can automatically deploy or install files to the appropriate environment, and may automatically roll back all changes applied in the deployment.

Execute the same automated build, with slightly different parameters, in each (eg, build, test, deploy) environment. Tools can also periodically generate class diagrams and other information, all based on the committed source code in your version control repository.

A critical feature of good CI systems is speed: the essence of a CI system is to provide timely feedback to developers and project stakeholders. It's easy to load so much into a CI system—for the sake of completeness—that it takes an unreasonable amount of time to finish a cycle. A balance must therefore be struck between the breadth and depth of the CI process and the need to provide rapid results. This is especially important when using continuous testing.

Are you using a version control repository (or SCM tool)? Is your project's build process automated and repeatable? Does it occur entirely without intervention? Are you writing and running automated tests? Is the execution of your tests a part of your build process? How do you enforce coding and design standards? Which of your feedback mechanisms are automated? Are you using a separate integration machine to build software?

Introducing CI

Commit Code Frequently

Don't Commit Broken Code

Fix Broken Builds Immediately

Write Automated Developer Tests

All Tests and Inspections Must Pass

Run Private Builds

Avoid Getting Broken Code

Continuous Testing

r

Write unit, component, system, and functional test cases. Group simple, one-assert test cases by run time and dependencies, so that many can run frequently. Mock or otherwise automate significant dependencies (eg, databases) to make tests repeatable. Fix defects by writing tests that assert both the correctness of the code and its designed modes of failure.

Automate Unit Tests

r

Test effectively and frequently to ensure and measure reliability at the lowest level. The key aspect of unit tests is having no reliance on outside dependencies such as databases, which tend to increase the amount of time it takes to set up and run tests. A true unit test should run to completion (successfully) in a fraction of a second. If a unit test takes longer, take a close look at it—it's either broken, or instead of being a unit test, it is really a component-level test. In a CI environment, builds are run any time someone applies a change to the version control repository; therefore, unit tests should be run each time someone checks in code (the commit build). There is little configuration cost, and the resource cost to run them is negligible.
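A minimal sketch of such a unit test in Python's standard `unittest`, under the assumption of a hypothetical `apply_discount` function; the point is that the logic under test touches no database, file system, or network, so the whole class runs in a fraction of a second and is cheap enough for every commit build.

```python
import unittest

def apply_discount(price, percent):
    """Hypothetical pure domain logic: no outside dependencies at all."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (100 - percent) / 100, 2)

class DiscountTest(unittest.TestCase):
    # No setup beyond constructing arguments, so these run in milliseconds.
    def test_half_off(self):
        self.assertEqual(apply_discount(80.0, 50), 40.0)

    def test_rejects_out_of_range_percent(self):
        with self.assertRaises(ValueError):
            apply_discount(80.0, 150)
```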

Automate Component Tests

r

If a dependent object itself depends on an outside entity like a file system or database and isn't mocked, the test becomes a component test. Component or subsystem tests verify portions of a system and may require a fully installed system or some external dependencies, such as databases, file systems, or network endpoints, to name a few. The difference between this type of test and a system test is that integration tests (or component tests or subsystem tests) don't always exercise a publicly accessible API. Component tests have a specific cost to them: dependencies have to be put in place and configured. Any one of these tests may take only a few seconds; in the aggregate, however, this time adds up.
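A sketch of a component test, assuming SQLite as the stand-in database (a deliberately cheap choice; a real project might hit PostgreSQL, Oracle, etc.). Unlike the unit test above, real SQL runs against a real database engine, which is exactly the setup cost that distinguishes the category.

```python
import sqlite3
import unittest

class OrderRepositoryTest(unittest.TestCase):
    """Component test: exercises real SQL against a throwaway database.

    The database dependency is what makes this a component test rather
    than a unit test; each test pays a small setup cost that adds up
    in the aggregate.
    """
    def setUp(self):
        self.db = sqlite3.connect(":memory:")   # in-memory: fast and disposable
        self.db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")

    def tearDown(self):
        self.db.close()

    def test_inserted_order_can_be_read_back(self):
        self.db.execute("INSERT INTO orders (total) VALUES (9.99)")
        (total,) = self.db.execute("SELECT total FROM orders").fetchone()
        self.assertAlmostEqual(total, 9.99)
```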

Automate System Tests

r

System tests exercise a complete software system and therefore require a fully installed system, such as a servlet container and associated database. System tests are fundamentally different from functional tests, which test a system much like a client would use it. Running system tests with every commit build could be a recipe for disaster, but these types of tests are sometimes run with secondary or periodic builds; otherwise, nightly (off-hour) runs are good for them.

Automate Functional Tests

r

Functional tests, as the name implies, test the functionality of an application from the viewpoint of a client, which means the tests themselves mimic clients. These tests are also known as acceptance tests. Table models for testing are highly effective communication mechanisms that someone can author without needing to be a developer. System and functional tests, which require a fully installed system, take the longest to run; additionally, the complexity of configuring a fully functional system occasionally limits the full automation of these tests.
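The table-model idea can be sketched in a few lines of Python: the decision table below is hypothetical (a FIT-style pricing table a non-developer could author and review), and `price_order` stands in for the real application facade.

```python
# Hypothetical FIT-style decision table: rows of
# (quantity, unit_price, expected_total) authored like acceptance criteria.
PRICING_TABLE = [
    (1, 10.00, 10.00),
    (3, 10.00, 30.00),
    (10, 10.00, 90.00),  # assumed business rule: 10% off at ten or more units
]

def price_order(quantity, unit_price):
    """Stand-in for the installed system under test."""
    total = quantity * unit_price
    return total * 0.9 if quantity >= 10 else total

def run_table(table):
    """Drive the system one row at a time, collecting every mismatch."""
    return [(qty, price, want, got)
            for qty, price, want in table
            for got in [price_order(qty, price)]
            if abs(got - want) > 1e-9]
```

An empty result from `run_table` means every row of the acceptance table passed; a non-empty result lists each failing row with what the system actually produced.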

Categorize Developer Tests

r

We need a common understanding that tests are differentiated specifically by the setup they require (seeding databases, etc.), which correlates directly with how long they take to run. Categorizing developer tests into respective buckets (unit tests, component tests, system tests, and even functional tests) helps you run the slower tests after the faster ones. By defining a common manner for categorizing tests, such as through annotations or naming patterns, you are all set to instruct your CI system to run each category when appropriate, and your build times become manageable. This means that tests can be run at regular intervals instead of being abandoned when they take too long to execute. By defining and grouping tests by type—unit, component, and system—development teams can fashion a build process that runs test categories rather than a gigantic test task that runs everything at once.
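One way the annotation approach might look in Python, sketched with `unittest`; the `category` decorator and the ordering convention are assumptions for illustration, not a standard library feature.

```python
import unittest

def category(name):
    """Annotation-style marker tagging a test class with its category."""
    def mark(cls):
        cls.category = name
        return cls
    return mark

@category("unit")
class FastMathTest(unittest.TestCase):
    def test_add(self):
        self.assertEqual(1 + 1, 2)

@category("component")
class SlowDbTest(unittest.TestCase):
    def test_placeholder(self):
        self.assertTrue(True)

def suites_in_order(classes, order=("unit", "component", "system", "functional")):
    """Group test classes by category and return suites, fastest category first."""
    ranked = sorted(classes, key=lambda cls: order.index(cls.category))
    return [unittest.defaultTestLoader.loadTestsFromTestCase(cls)
            for cls in ranked]
```

A build script can then run only the suites appropriate to the current build type: unit suites on every commit, later categories on secondary or periodic builds.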

Run Faster Tests First

r

Typically, the majority of a build's runtime is spent on tests, and the longest tests are those with dependencies on outside objects such as databases, file systems, and Web containers. Unit tests require the least setup (by definition, none), and system tests need the most (everything). Unit tests run most often (with every commit); component tests, system tests, and functional tests can be run with secondary builds or on periodic intervals. Run unit tests every time someone checks code in, as they don't take much time to execute; then schedule periodic intervals (or after commit builds) to run component tests, and another interval scheme for system tests. Those intervals can be increased as iterations come to a close, and you probably want to run them more often in the initial project stages too.

Write Tests for Defects

r

When a defect is discovered, find and isolate the offending code. If the project has a healthy number of test cases, it's probably a good bet that the defect has occurred in some portion of untested code (maybe an unconsidered path)—and most likely in the interaction of components. Start by writing a test case that triggers the same exact behavior that was reported in the defect summary. Remember that we're writing a test to pass on the behavior, not to fail. This methodology, by the way, is slightly different from the prevailing "defect-driven development" approach, which suggests writing a failing test case first and then keeping that test running (while fixing the defect) until it stops failing. The next decision is what differentiates this approach from others—in fixing our test case, we assert the new behavior, by testing that the code fails in the designed manner.
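A small sketch of that last step, built around a hypothetical defect: suppose `parse_age("")` was reported crashing with an unhandled error. After the fix, the regression test passes by asserting the *designed* failure mode (a deliberate `ValueError`), exactly as the note describes.

```python
import unittest

def parse_age(text):
    """Fixed version of a hypothetical function whose blank-input
    defect was reported. Bad input now fails in a designed manner."""
    text = text.strip()
    if not text.isdigit():
        raise ValueError(f"not a valid age: {text!r}")
    return int(text)

class DefectRegressionTest(unittest.TestCase):
    def test_blank_input_fails_in_the_designed_manner(self):
        # Asserts the new behavior: a deliberate ValueError,
        # not the unconsidered crash from the defect report.
        with self.assertRaises(ValueError):
            parse_age("")

    def test_normal_input_still_works(self):
        self.assertEqual(parse_age(" 42 "), 42)
```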

Make Component Tests Repeatable

r

Databases present quite a large dependency for testing, leaving you with two choices: either mock out as much as possible and avoid the database altogether for as long as possible, or pay the price and utilize the database. The latter choice presents a new series of challenges—how do you control the database during testing? Even better, how do you make those tests repeatable? These frameworks abstract a database's data set into XML files and then offer the developer fine-grained control over how this data is seeded into the database during testing. Note, though, that this class assumes the database is located on the same machine on which the test is run. This may be a safe assumption on the developer's workstation, but obviously this configuration can present a challenge in CI environments. This method frees developers from having to provide a path to a file—something especially tricky in different environments.
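The seeding idea can be sketched without any particular framework: here the data set lives in the test module (a dict rather than the XML files the note mentions) and is re-seeded before every test, so each run starts from a known state. SQLite is an assumed stand-in for the real database.

```python
import sqlite3
import unittest

# The controlled data set: re-seeded before every test, in the spirit of
# DbUnit-style frameworks that seed from XML files.
SEED = {"users": [(1, "alice"), (2, "bob")]}

class RepeatableUserTest(unittest.TestCase):
    def setUp(self):
        self.db = sqlite3.connect(":memory:")
        self.db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
        self.db.executemany("INSERT INTO users VALUES (?, ?)", SEED["users"])

    def tearDown(self):
        self.db.close()

    def test_can_delete_a_user_without_poisoning_later_runs(self):
        # Destructive changes are safe: setUp re-seeds before each test,
        # which is what makes the test repeatable.
        self.db.execute("DELETE FROM users WHERE name = 'bob'")
        (count,) = self.db.execute("SELECT COUNT(*) FROM users").fetchone()
        self.assertEqual(count, 1)
```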

Limit Test Cases to One Assert

r

Haphazardness tends to lead to an abundance of assert methods ending up in one test case. If the first assert fails, the whole test case is abandoned from the point of failure, meaning the remaining asserts aren't run during that test run. A more effective practice is to limit each test case to one assert. That way, rather than repeating the three-step process just described any number of times, you can get all your failures without intervention in one test run. This practice, of course, leads to a proliferation of test cases, which is why we have the separate directory structure [to categorize them].
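A minimal illustration of the practice, using a hypothetical `normalize` function: each behavior gets its own one-assert test case, so if trimming breaks, the casing test still runs and one build reports both failures.

```python
import unittest

def normalize(s):
    """Hypothetical function under test: trims and lowercases."""
    return s.strip().lower()

class NormalizeTrimsWhitespace(unittest.TestCase):
    def test(self):
        self.assertEqual(normalize("  hi  "), "hi")

class NormalizeLowercases(unittest.TestCase):
    # A separate case, not a second assert in the test above: it still
    # runs even if the whitespace test fails first.
    def test(self):
        self.assertEqual(normalize("HI"), "hi")
```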

Flowers 2006

a

Richardson 2006

r

s/test/task/g

Test Automation. Another practice that a CI system encourages is test automation. Writing and running tests is a huge milestone for many shops and is one of the hallmarks of a great shop. I think test automation is the core of why a CI system adds such benefit. People who recognize the benefit of automating common tasks tend to be more strategic thinkers. They automate everything possible, including building and testing, and it frees them up for more interesting work. (Of course, this doesn't eliminate manual testing, but that's another topic.)

Leverage Yourself. An automated test is a great way to leverage your experience and expertise. As an expert on your product, you probably know how to test parts of it in a way that few other people can. You can encode that knowledge in a reusable format by creating an automated test. This makes your experience available without diverting your attention. Sometimes a co-worker will run the tests, other times you will. In other cases, a program will run them. Let these bits of your expertise exercise the product while you do other things, like go home on time, or stay late and solve problems that are more interesting. These tests might run while you are coding or at home sleeping, but you are doing something else. Your tests are working in the background.

Fowler 2006

Jeffries et al. 2006

r

Jim Coplien. If you work in the context of all the versions and see them all the time, you get nothing done but trying to anticipate how to accommodate the stuff flying at you. If you ignore the changes, then you end up playing catch-up when the changes do become visible. So you don't even get a chance to ever see how the other person's changes affect your code, unless you're looking at all the changes in the system all the time. The most serious problems relate to logical dependencies. Even though you don't change my module, you interact with it in subtly different ways. Or you change both something that calls me and something that I call in a consistent way, but if I depend on the semantics instead of just passing the results through, I'm screwed. You can't automatically detect these things: it takes architectural dialogue.

Dave Harris. So there are lots of advantages to integrating often. On the other hand, I need to be able to focus on fixing the bugs I just put in, and only those. One thing at a time. I need to eliminate the extraneous changes created by other people. The way I manage this is to get my stuff into a decent state before accepting an update. Then I know the new bugs are due to other people's changes and I can concentrate on what they've done. One thing at a time. I alternate between doing real work and merging in the work done by other people. Updates are frequent, but not truly continuous. They are not allowed to interrupt the real work; nothing is. Real work requires periods of unbroken concentration, between half an hour and a few hours. The exact time varies; when I'm done, I'm done, and then I can look around, see what everyone else is up to, update my code base, read email or whatever. This point will be reached at different times for different programmers, but every programmer should reach it sometime during each day. Frequent, but at a time of his choosing. Another issue is that an update is fairly time consuming. Say 20 minutes in my (C++) environment? For that reason alone you wouldn't want to do it every hour. You can gain a little bit by postponing several updates and doing them all at once, especially if a single file changes in each update. But you don't gain much. I'd expect daily integration provides the best balance of forces for quite a fair range of team sizes.

Ron Jeffries. Even in C3, Continuous Integration isn't continuous. We do try to integrate our individual workspaces after every task, and that can be a very short time, never less often than daily. The idea, however, is that (a) you never want to edit over anyone else's edits, therefore you want to be integrated; (b) you don't want other guys' edits to get in your way as you work; and (c) the longer you wait to integrate, the harder it is.

Brad Appleton. There are a number of issues arising from the sentiment "What if they made their (correct) change, and presto! everyone's computer instantly had that version of the module?" The above is probably not really what you want, nor what you are truly doing. If it was, all you need is to have all of you make your changes in the very same workspace (with some versioning and concurrent editing facilities). Even if you managed to work on separate files all the time, this is still not what you usually want to happen. The problem is that others don't get to choose when they are ready to deal with your code. You may make some changes which impact the coding task I'm working on, even if it doesn't touch the same files, and now you've just broken my code. I was perhaps in the middle of some important train of thought and now I do not have the option of finishing my flow first, and then dealing with your changes.

Appleton et al. 2005-2006

r

http://tech.groups.yahoo.com/group/continuousintegration/message/68
http://tech.groups.yahoo.com/group/extremeprogramming/message/121425
http://tech.groups.yahoo.com/group/continuousintegration/message/112
http://tech.groups.yahoo.com/group/continuousintegration/message/191
http://tech.groups.yahoo.com/group/continuousintegration/message/317
http://tech.groups.yahoo.com/group/continuousintegration/message/376

a

Zawadzki 2004

Zawadzki 2003b

Zawadzki 2003a

Wells 1999

(Beck) 1999

r

Continuous Integration. Code additions and changes are integrated with the baseline after a few hours, a day at most. You can't just leap from task to task. When a task is done, you wait your turn to integrate, then load your changes on top of the current baseline (resolving any conflicts) and run the tests. If you have broken any tests, you must fix them before releasing. If you can't fix them, you discard your code and start over. In the short term, when the code base is small, the system stays in very tight sync. In the long term, you never encounter integration problems, because you have dealt with them daily, even hourly, over the life of the project. Continuous integration makes collective code ownership and refactoring possible without overwhelming numbers of conflicting changes, and the end of an integration makes a natural point to switch partners.

a

McConnell 1996

article libraries

Agile Development Practices

Agile SCM Articles

Continuous Integration

practice relationships

Kobayashi et al. 2006

r

Test-driven development weakly depends on pair programming, mutually depends on refactoring, and enables simple design, collective code ownership, and continuous integration. Continuous integration depends on test-driven development, and enables small releases.

Improve productivity: refactoring reduces the cost of changes. According to the policy of XP's simple design, we must keep the growing system simple. We achieved this by using test-driven development and by refactoring to keep the code simple.

Improve program code quality: by practicing test-driven development, pair programming, refactoring, continuous integration, and collective code ownership, we could keep the quality of the software satisfactory for users. In test-driven development, programmers also create test cases, so it is important to review test cases during pair programming to reduce mistakes and help stop necessary test cases being missed. As a result, pair programming is able to contribute to test-driven development.

Improve requirement quality: in order to keep doing iterations with small releases, we had no time to do integration tests, so we needed continuous integration. By doing customer tests in early iterations, we can get rapid feedback and find defects relating to the quality of requirements. These enabled us [to] smooth out releases.

Test-Driven Development

glossary

baseline

r

Make a private, static copy of a complete, coherent (eg, built), and validated (eg, tested) set of versions of files (configuration).

build

r

From a numbered configuration, generate source, derive and link objects, and package files.

integrate

r

Incorporate private changes with the development team's codebase on an integration branch.

label

r

Mark a version on a branch with a fixed name (ie, not LATEST).

produce

r

Make a build available outside the development organization (eg, company).

rebaseline

r

Incorporate changes from the integration branch made after the workspace's current baseline.

release

r

Make a build available outside the development team (eg, coders).

update

r

Patch a released or produced build.