Source Code

The GeoServer source code is located on GitHub at https://github.com/geoserver/geoserver.

To clone the repository:

% git clone git://github.com/geoserver/geoserver.git geoserver

To list available branches in the repository:

% git branch
   2.1.x
   2.2.x
 * master

To switch to the stable branch:

% git checkout 2.2.x

Git

Those coming from a Subversion or CSV background will find the Git learning curve is a steep one. Luckily there is lots of great documentation around. Before continuing developers should take the time to educate themselves about git. The following are good references:

Committing

In order to commit to the repository the following steps must be taken:

  1. Configure your git client for cross platform projects. See notes below.
  2. Register for commit access as described here.
  3. Fork the canonical GeoServer repository into your github account.
  4. Clone the forked repository to create a local repository
  5. Create a remote reference to the canonical repository using a non-read only URL (git@github.com:geoserver/geoserver.git).

Note

The next section describes how the git repositories are distributed for the project and how to manage local repository remote references.

Repository distribution

Git is a distributed versioning system which means there is strictly no notion of a single central repository, but many distributed ones. For GeoServer these are:

  • The canonical repository located on GitHub that serves as the official authoritative copy of the source code for project
  • Developers’ forked repositories on GitHub. These repositories generally contain everything in the canonical repository, as well any feature or topic branches a developer is working on and wishes to back up or share.
  • Developers’ local repositories on their own systems. This is where development work is actually done.

Even though there are numerous copies of the repository they can all interoperate because they share a common history. This is the magic of git!

In order to interoperate with other repositories hosted on GitHub, a local repository must contain remote references to them. A local repository typically contains the following remote references:

  • A remote called origin that points to the developers’ forked GitHub repository.
  • A remote called upstream that points to the canonical GitHub repository.
  • Optionally, some remotes that point to other developers’ forked repositories on GitHub.

To set up a local repository in this manner:

  1. Clone your fork of the canonical repository (where “bob” is replaced with your GitHub account name):

    % git clone git@github.com:bob/geoserver.git geoserver
    % cd geoserver
  2. Create the upstream remote pointing to the canonical repository:

    % git remote add upstream git@github.com:geoserver/geoserver.git

    Or if your account does not have push access to the canonical repository use the read-only url:

    % git remote add upstream git://github.com/geoserver/geoserver.git
  3. Optionally, create remotes pointing to other developer’s forks. These remotes are typically read-only:

    % git remote add aaime git://github.com/aaime/geoserver.git
    % git remote add jdeolive git://github.com/jdeolive/geoserver.git

Repository structure

A git repository contains a number of branches. These branches fall into three categories:

  1. Primary branches that correspond to major versions of the software
  2. Release branches that are used to manage releases of the primary branches
  3. Feature or topic branches that developers do development on

Primary branches

Primary branches are present in all repositories and correspond to the main release streams of the project. These branches consist of:

  • The master branch that is the current unstable development version of the project
  • The current stable branch that is the current stable development version of the project
  • The branches for previous stable versions

For example at present these branches are:

  • master - The 2.3.x release stream, where unstable development such as major new features take place
  • 2.2.x - The 2.2.x release stream, where stable development such as bug fixing and stable features take place
  • 2.1.x - The 2.1.x release stream, which is at end-of-life and has no active development

Release branches

Release branches are used to manage releases of stable branches. For each stable primary branch there is a corresponding release branch. At present this includes:

  • rel_2.2.x - The stable release branch
  • rel_2.1.x - The previous stable release branch

Release branches are only used during a versioned release of the software. At any given time a release branch corresponds to the exact state of the last release from that branch. During release these branches are tagged.

Release branches are also present in all repositories.

Feature branches

Feature branches are what developers use for day-to-day development. This can include small-scale bug fixes or major new features. Feature branches serve as a staging area for work that allows a developer to freely commit to them without affecting the primary branches. For this reason feature branches generally only live in a developer’s local repository, and possibly their remote forked repository. Feature branches are never pushed up into the canonical repository.

When a developer feels a particular feature is complete enough the feature branch is merged into a primary branch, usually master. If the work is suitable for the current stable branch the changeset can be ported back to the stable branch as well. This is explained in greater detail in the Development workflow section.

Codebase structure

Each branch has the following structure:

build/
doc/
src/
data/
  • build - release and continuous integration scripts
  • doc - sources for the user and developer guides
  • src - java sources for GeoServer itself
  • data - a variety of GeoServer data directories / configurations

Git client configuration

When a repository is shared across different platforms it is necessary to have a strategy in place for dealing with file line endings. In general git is pretty good about dealing this without explicit configuration but to be safe developers should set the core.autocrlf setting to “input”:

% git config --global core.autocrlf input

The value “input” essentially tells git to respect whatever line ending form is present in the git repository.

Note

It is also a good idea, especially for Windows users, to set the core.safecrlf option to “true”:

% git config --global core.safecrlf true

This will basically prevent commits that may potentially modify file line endings.

Some useful reading on this subject:

Development workflow

This section contains examples of workflows a developer will typically use on a daily basis. To follow these examples it is crucial to understand the phases that a changeset goes though in the git workflow. The lifecycle of a single changeset is:

  1. The change is made in a developer’s local repository.
  2. The change is staged for commit.
  3. The staged change is committed.
  4. The committed changed is pushed up to a remote repository

There are many variations on this general workflow. For instance, it is common to make many local commits and then push them all up in batch to a remote repository. Also, for brevity multiple local commits may be squashed into a single final commit.

Updating from canonical

Generally developers always work on a recent version of the official source code. The following example shows how to pull down the latest changes for the master branch from the canonical repository:

% git checkout master
% git pull upstream master

Similarly for the stable branch:

% git checkout 2.2.x
% git pull upstream 2.2.x

Making local changes

As mentioned above, git has a two-phase workflow in which changes are first staged and then committed locally. For example, to change, stage and commit a single file:

% git checkout master
# do some work on file x
% git add x
% git commit -m "commit message" x

Again there are many variations but generally the staging process involves using git add to stage files that have been added or modified, and git rm to stage files that have been deleted. git mv is used to move files and stage the changes in one step.

At any time you can run git status to check what files have been changed in the working area and what has been staged for commit. It also shows the current branch, which is useful when switching frequently between branches.

Pushing changes to canonical

Once a developer has made some local commits they generally will want to push them up to a remote repository. For the primary branches these commits should always be pushed up to the canonical repository. If they are for some reason not suitable to be pushed to the canonical repository then the work should not be done on a primary branch, but on a feature branch.

For example, to push a local bug fix up to the canonical master branch:

% git checkout master
# make a change
% git add/rm/mv ...
% git commit -m "making change x"
% git pull upstream master
% git push upstream master

The example shows the practice of first pulling from canonical before pushing to it. Developers should always do this. In fact, if there are commits in canonical that have not been pulled down, by default git will not allow you to push the change until you have pulled those commits.

Note

A merge commit may occur when one branch is merged with another. A merge commit occurs when two branches are merged and the merge is not a “fast-forward” merge. This happens when the target branch has changed since the commits were created. Fast-forward merges are worth reading about.

An easy way to avoid merge commits is to do a “rebase” when pulling down changes:

% git pull --rebase upstream master

The rebase makes local changes appear in git history after the changes that are pulled down. This allows the following merge to be fast-forward. This is not a required practice since merge commits are fairly harmless, but they should be avoided where possible since they clutter up the commit history and make the git log harder to read.

Working with feature branches

As mentioned before, it is always a good idea to work on a feature branch rather than directly on a primary branch. A classic problem every developer who has used a version control system has run into is when they have worked on a feature locally and made a ton of changes, but then need to switch context to work on some other feature or bug fix. The developer tries to make the fix in the midst of the other changes and ends up committing a file that should not have been changed. Feature branches are the remedy for this problem.

To create a new feature branch off the master branch:

% git checkout -b my_feature master
% # make some changes
% git add/rm, etc...
% git commit -m "first part of my_feature"

Rinse, wash, repeat. The nice about thing about using a feature branch is that it is easy to switch context to work on something else. Just git checkout whatever other branch you need to work on, and then return to the feature branch when ready.

Note

When a branch is checked out, all the files in the working area are modified to reflect the current state of the branch. When using development tools which cache the state of the project (such as Eclipse) it may be necessary to refresh their state to match the file system. If the branch is very different it may even be necessary to perform a rebuild so that build artifacts match the modified source code.

Merging feature branches

Once a developer is done with a feature branch it must be merged into one of the primary branches and pushed up to the canonical repository. The way to do this is with the git merge command:

% git checkout master
% git merge my_feature

It’s as easy as that. After the feature branch has been merged into the primary branch push it up as described before:

% git pull --rebase upstream master
% git push upstream master

Porting changes between primary branches

Often a single change (such as a bug fix) has to be committed to multiple branches. Unfortunately primary branches cannot be merged with the git merge command. Instead we use git cherry-pick.

As an example consider making a change to master:

% git checkout master
% # make the change
% git add/rm/etc...
% git commit -m "fixing bug GEOS-XYZ"
% git pull --rebase upstream master
% git push upstream master

We want to backport the bug fix to the stable branch as well. To do so we have to note the commit id of the change we just made on master. The git log command will provide this. Let’s assume the commit id is “123”. Backporting to the stable branch then becomes:

% git checkout 2.2.x
% git cherry-pick 123
% git pull --rebase upstream 2.2.x
% git push upstream 2.2.x

Cleaning up feature branches

Consider the following situation. A developer has been working on a feature branch and has gone back and forth to and from it making commits here and there. The result is that the feature branch has accumulated a number of commits on it. But all the commits are related, and what we want is really just one commit.

This is easy with git and you have two options:

  1. Do an interactive rebase on the feature branch
  2. Do a merge with squash

Interactive rebase

Rebasing allows us to rewrite the commits on a branch, deleting commits we don’t want, or merging commits that should really be done. You can read more about interactive rebasing here.

Warning

Much care should be taken with rebasing. You should never rebase commits that are public (that is, commits that have been copied outside your local repository). Rebasing public commits changes branch history and results in the inability to merge with other repositories.

The following example shows an interactive rebase on a feature branch:

% git checkout my_feature
% git log

The git log shows the current commit on the branch is commit “123”. We make some changes and commit the result:

% git commit "fixing bug x" # results in commit 456

We realize we forgot to stage a change before committing, so we add the file and commit:

% git commit -m "oops, forgot to commit that file" # results in commit 678

Then we notice a small mistake, so we fix and commit again:

% git commit -m "darn, made a typo" # results in commit #910

At this point we have three commits when what we really want is one. So we rebase, specifying the revision immediately prior to the first commit:

% git rebase -i 123

This invokes an editor that allows indicating which commits should be combined. Git then squashes the commits into an equivalent single commit. After this we can merge the cleaned-up feature branch into master as usual:

% git checkout master
% git merge my_feature

Again, be sure to read up on this feature before attempting to use it. And again, never rebase a public commit.

Merge with squash

The git merge command takes an option --squash that performs the merge against the working area but does not commit the result to the target branch. This squashes all the commits from the feature branch into a single changeset that is staged and ready to be committed:

% git checkout master
% git merge --squash my_feature
% git commit -m "implemented feature x"

More useful reading

The content in this section is not intended to be a comprehensive introduction to git. There are many things not covered that are invaluable to day-to-day work with git. Some more useful info:

Previous: Tools