Developing ExpressionEngine sites with MAMP, Git (Tower) and Beanstalk [Part 2]

This is the second and final instalment in this series and in this part we're going to be looking more closely at using Git with new projects and deploying them to remote servers using Beanstalk. If you haven't read Part 1 which deals more with setting everything up in the first place, take a few minutes to check it out and then come back here. Unless you're already familiar with all that, in which case just read on.

Last time we covered several steps that involved installing and setting up MAMP, Git and ExpressionEngine and creating a base install which we'll be using as the starting point for new sites. The steps that we still have left to cover are:

  8. Create a local Git repository (repo) using Tower
  9. Clone this base EE repo when starting a new project
  10. Work on your project
  11. Create a Beanstalk account
  12. Create a remote Beanstalk repo
  13. Set up a remote database
  14. Set up a Beanstalk deployment environment and deploy your site
  15. Have a nice cup of tea and milk arrowroot biscuit

8. Create a local Git repository (repo) using Tower

The first thing you'll need to do is download the awesome Git Tower. If you've been put off getting into Git because you think it all has to be done from the command line with a whole host of new commands to learn, then be afraid no longer and install this little beauty. As mentioned in Part 1, you should still take the time to learn the commands required to work with Git, but using Tower will not only make the process quicker and easier, it will also help you get your head around what those commands are actually doing. On top of that, it allows you to easily connect to remote hosting and deployment services like GitHub and Beanstalk, which we'll cover further below.

Once you've installed the program and fired her up, the first thing you'll want to do is create a local repository (repo) by clicking on the nice big button on the right that says Create Local Repository. You can then locate the folder on your machine where you've stored your EE installation and give the repo a title.

Screenshot of the window in Git Tower where you locate your local folder and give your repo a name

The important thing to do before you start this process though is to ensure that you've set up your local project with the exact same folder structure that it'll have on a remote server, so it should be inside a public_html or httpdocs folder. And if you're moving your system folder above the webroot, which you should do from a security point of view, then the folder that you'll want to point Tower to for creating your repo will be the one that contains both your system and public_html folders.

After you've hit OK, Tower will create a hidden .git folder in the folder that you specified. The last thing to do in this step is to open up the repo you've just created and, from the Browse tab, click the Stage All button, which will add everything in the repo to the stage ready to be committed, and then click the Commit button from the main toolbar. With every commit you make, you need to add a commit message, and it's a good idea to make it as detailed as possible so you can quickly review what the commit was about if you ever need to look back at it. For this first one, though, there's not much to say other than "First commit". After making this first commit, it'll also make your life easier if you click on the Modified button to the right so that you only see those files which have been updated since the previous commit. If there's nothing to commit and you're all up-to-date, you should see this:

Tower screenshot which shows the 'No Local Changes. Your working directory is clean. There are no modified files.' message
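
If you're curious what this step looks like on the command line, here's a rough sketch. I'm assuming Stage All and Commit map onto git add and git commit, and the folder and file names are hypothetical placeholders created in a temporary directory so nothing on your machine gets touched.

```shell
set -e

# Hypothetical stand-in for the folder that holds system/ and public_html/.
project="$(mktemp -d)/base-ee"
mkdir -p "$project/system" "$project/public_html"
echo "placeholder" > "$project/system/index.html"
echo "<?php // placeholder" > "$project/public_html/index.php"

cd "$project"
git -c init.defaultBranch=master init -q   # creates the hidden .git folder
git config user.name "Your Name"           # identity recorded with each commit
git config user.email "you@example.com"

git add -A                        # roughly what Stage All does
git commit -q -m "First commit"   # the Commit button plus its message
git status --short                # prints nothing when the working directory is clean
```

Running git status at any point is the text equivalent of checking Tower for modified files.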

From now on, whenever you make a change to your base install, like updating or adding a new add-on, you'll want to come back to your repo in Tower and add a new commit. Or if you update an add-on and it doesn't work or starts breaking other stuff, you can roll back to a previous commit by clicking on the History tab, finding the commit you want to go back to, right clicking and selecting Reset HEAD to This Commit… and Git will restore all your files to their previous, working-properly state. This is where one of the real benefits of using Git comes to light: the ability to easily and quickly test stuff out and undo any changes that aren't working for you.
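
For reference, a roll-back like this can be sketched on the command line with git reset. This is only an illustration under my own assumptions: the file name is made up, and I'm using --hard, which throws away any uncommitted changes and takes the later commit off the branch, so treat it with care on real work.

```shell
set -e

repo="$(mktemp -d)"
cd "$repo"
git -c init.defaultBranch=master init -q
git config user.name "Your Name"
git config user.email "you@example.com"

echo "add-on v1" > addon.txt
git add -A
git commit -q -m "Add add-on v1"
good="$(git rev-parse HEAD)"      # remember the known-good commit

echo "add-on v2 (broken)" > addon.txt
git commit -q -am "Update add-on to v2"

git log --oneline                 # the History tab, in text form
git reset -q --hard "$good"       # move HEAD back and restore the files
cat addon.txt                     # the v1 contents are back
```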

Now that we've established this first base repo, we'll be able to use it for future projects.

9. Clone this base EE repo when starting a new project

So now we're ready to start working on a real website project. You might've also received some content from the client, or if it's an agency you're subcontracting for, some artwork or layout comp files too. You don't want all this stuff to end up on the server when you deploy the site, so my process is to create a folder for the project, and then inside that, two separate folders, one for local and one for server. In the local folder goes all the content, artwork and docs associated with the project, and inside server, as the name suggests, is everything that should end up on the remote server. If you wanted to version the contents of the local folder as well, then you'd create a separate Git repo for it.

Screenshot of Tower's Clone Remote Repository button

Rather than creating the server folder from within Finder or Explorer, we'll use Tower to create it. The reason is that Git won't clone into a folder that isn't empty, and as soon as you create a folder via Finder, OS X adds a hidden .DS_Store file to it. So just head straight for the Clone Remote Repository button, browse to your project's folder, give your repo a descriptive name, and in the Remote URL field enter the path to the base EE folder on your system. If you're on a Mac, you can get the path by clicking on the folder and then ⌘ + i to open the Info panel, or on Windows, click into the address bar for the folder window.

Tower will then take a little while to copy the contents of your base EE install into your new project's folder. At this point you might be thinking "Why not just copy everything across using Finder/Explorer?" The reason is that you want Git to be able to track and detect any changes to your base EE install, which makes updating all the projects created from it a snap. This is another of Git's killer features.
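
In command-line terms, cloning from a folder on your own machine is just git clone with a path instead of a URL. Here's a minimal sketch, with made-up folder names (base-ee, client-site) standing in for your base install and new project:

```shell
set -e

work="$(mktemp -d)"

# Hypothetical stand-in for the base EE repo created in step 8.
git -c init.defaultBranch=master init -q "$work/base-ee"
cd "$work/base-ee"
git config user.name "Your Name"
git config user.email "you@example.com"
echo "base install" > README.txt
git add -A
git commit -q -m "First commit"

# New project: local/ holds client files, server/ is created by the clone itself.
mkdir -p "$work/client-site/local"
git clone -q "$work/base-ee" "$work/client-site/server"

cd "$work/client-site/server"
git remote -v    # origin points back at the base install
```

Because the clone remembers where it came from (as the remote called origin), the project can later pull in changes made to the base install.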

Easy add-on updates

Say you want to update an add-on to the latest version. Without using Git, you'd download the add-on, extract its contents, and then copy it across to the project that you're working on (potentially navigating through several levels of folders unless you have symlinks set up). If the changes didn't work out, you'd have to find a copy of the previous version of the add-on that was working and then copy those files back again. After you're happy that the latest changes are working OK, if you wanted to update other sites to the latest version of the add-on, you'd have to go through the same process for each one, navigating around Finder/Explorer and copying files back and forth.

Now imagine that it's EE itself that you're updating to the latest version. Suddenly there's a lot more files to copy and a lot more changes to undo if it turns out the update doesn't work out as planned.

Enter Git and Tower and all that pain goes away. Rather than updating the project you're working on directly, if it's an add-on that you use in lots of projects, or if you're updating EE itself, you update your base EE install first, verify that the changes are working OK, and then do a Git Pull (using the button in the main toolbar) from within Tower to copy all the changes across to the project you're working on. Verify that the changes are working OK on the site you're working on and you're all good. If not, revert to the previous working version in your history.

Screenshot of Tower's Fetch, Pull and Push buttons

It should be pointed out that you don't have to Pull, you can also Fetch. Pull is the same as fetching and then automatically merging those changes into the branch you currently have checked out (master in our case). With Fetch, Git downloads the new commits but leaves your working files alone, so you can review the changes before deciding to merge them into your master branch. From my experience of using Pull with Tower so far, it does a good job of flagging any conflicts between files that you're copying across and ones that already exist in your project. When these conflicts come up, you won't be able to commit the changes without resolving them. Tower gives you a choice of using the remote version, your version, or in some cases, discarding changes from both. There is also a Diff tool at the bottom of the window so you can easily see what's changed in a file.

That might look like a big nasty error has occurred, but Tower will let you fix it up after you click OK
A view of Tower's resolve conflicts window. Use the little cog icons to the right of each row to choose what action to take
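
To make the difference between Fetch and Pull a bit more concrete, here's a small command-line sketch. The repos and the addon.txt file are hypothetical, and there are no conflicts in this example, so the merge goes through cleanly.

```shell
set -e

work="$(mktemp -d)"

# Hypothetical base install with one add-on file.
git -c init.defaultBranch=master init -q "$work/base-ee"
cd "$work/base-ee"
git config user.name "Your Name"
git config user.email "you@example.com"
echo "add-on v1" > addon.txt
git add -A
git commit -q -m "First commit"

# A project cloned from the base install.
git clone -q "$work/base-ee" "$work/project"

# Update the add-on in the base install and commit it there.
echo "add-on v2" > addon.txt
git commit -q -am "Update add-on to v2"

cd "$work/project"
git fetch -q origin                     # download the new commits, touch no files
git diff --stat master origin/master    # review what would change
git merge -q origin/master              # fetch followed by merge is what Pull does in one go
cat addon.txt                           # now shows the v2 contents
```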

10. Work on your project

This step is all down to you. Work on your site as you normally would except as you go along, remember to commit your changes on a regular basis. If you need to update add-ons that are in your base install, do it on your base install first, commit the changes and then pull them into the site you're working on with Git. You don't have to do it that way, you could just copy the files into the project you're working on. But if you decided to then do the same to another project, you'd need to do more file copying, rather than being able to do it with a few simple clicks from within Tower.

If you're including add-ons that will only be specific to the project you're working on, then you'd update the files in the project's third-party folder directly, but would still commit the changes in the project's repo so that you could roll back to a previous state easily enough.

You don't have to commit every five minutes, but you also shouldn't commit once a week either. Make a commit whenever you hit a milestone that you'd like to mark or that you think you might need to roll back to if things go wrong. I have to admit that I sometimes fall into the bad habit of only remembering to commit at the end of the day, but luckily I haven't been hit by a situation yet where I wished I'd remembered to commit earlier.

11. Create a Beanstalk account

You've finished your project (or it's at a state where you want to get it onto a staging server) and you're ready to move it to its new home. Now we're going to say goodbye to the FTP program and use Tower and Beanstalk to do it for us instead. I'm going to talk about Beanstalk here, but you can also use DeployHQ in combination with CodebaseHQ (Codebase for storing your repo and Deploy just for deploying to servers), Springloops or possibly phpfog (although I've never been able to create a repo there because the application keeps returning errors).

The reason for using a service like one of these is because to be able to transfer a Git repo to a remote server, Git also needs to be installed on that server, and installing Git isn't always possible or practical depending on who the server is hosted with. Services like Beanstalk remove the need for Git to be on the server that your site will be deployed to. The way it works is that you create a Git repo with Beanstalk, push your local repo to it and then use Beanstalk to send the files to your server via FTP or sFTP.

"What's the point in not using FTP myself if the files are just going to be transferred by FTP anyway?" I hear you ask. There are a couple of differences:

  1. When deploying via Git, it only sends the files that have been added or modified and will also remove any files that have been deleted from the destination, so it's like using an FTP program that does synchronised transfers. Yes, yes, I'm still talking about things that can be done with an FTP program, but…
  2. When deploying with Beanstalk, it can be set up to automatically deploy your changes, i.e. you push from Tower and it's done; a few minutes later (depending on the size of the changes you've made) and your changes are on the site.
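
I don't know how Beanstalk works out which files to send, but Git itself can list what was added, modified or deleted between two commits, which is the kind of information a synchronised deployment needs. A minimal sketch, with made-up file names:

```shell
set -e

repo="$(mktemp -d)"
cd "$repo"
git -c init.defaultBranch=master init -q
git config user.name "Your Name"
git config user.email "you@example.com"

echo "home" > index.html
echo "old styles" > old.css
git add -A
git commit -q -m "First deployment"
deployed="$(git rev-parse HEAD)"    # pretend this is what the server currently has

echo "home, updated" > index.html   # modify a file
git rm -q old.css                   # delete a file
echo "new styles" > new.css         # add a file
git add -A
git commit -q -m "Restyle home page"

# A = added, M = modified, D = deleted since the last deployment
changes="$(git diff --no-renames --name-status "$deployed" HEAD)"
echo "$changes"
```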

I must admit the first time I did a deployment via Git, it took ages, way longer than zipping up your project, FTPing it to the server and then extracting it. And even now, after having got used to the process and refining it a bit more (using sFTP wherever possible helps), that initial deployment does still take a while. But any subsequent deployments will be much quicker and changes easier to make, and you'll be able to quickly revert to a previous commit and undo any changes that aren't working on the server if necessary.

So with all the explanations out of the way, the only thing left to do is sign up for a free account and…

12. Create a remote Beanstalk repo

Screenshot of Tower showing the buttons for creating GitHub and Beanstalk repos

There's a couple of ways we can do this: either via your Beanstalk account's web interface, or you can conveniently do it from within Tower using the Create Beanstalk Repository button. Once you've created the repo on Beanstalk, you'll want to connect it to your local repo.

Open your repo in Tower and you'll see in the left column headings for Branches, Tags, Remotes and Stashes. Branches will be showing your current working branch, and under Remotes you should have origin for the repo that you cloned originally. Right-click on Remotes and select Add Remote Repository…. In the window that pops up, URL will be the URL that Beanstalk assigns to your repo with them, which will be something like git@your_account_name.beanstalkapp.com:/repo_name.git.

The bit that had me stumped originally because it's not obvious and there doesn't appear to be any mention of it in any docs is how to get the contents of your local repo into the one you've created on Beanstalk. All you have to do is drag your master (HEAD) branch from under Branches onto your remote Beanstalk repo so you should end up with something that looks like:

Tower screenshot showing your repo's branches and remotes
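
On the command line, the same two actions would be adding a remote and pushing your branch to it. In this sketch a local bare repo stands in for the one on Beanstalk (so it can run without an account), and the remote name beanstalk is just my choice; with a real account you'd use the URL Beanstalk gives you instead of the local path.

```shell
set -e

work="$(mktemp -d)"

# Stand-in for the empty repo created on Beanstalk (an assumption for this example).
git init -q --bare "$work/beanstalk.git"

git -c init.defaultBranch=master init -q "$work/project"
cd "$work/project"
git config user.name "Your Name"
git config user.email "you@example.com"
echo "site" > index.html
git add -A
git commit -q -m "First commit"

git remote add beanstalk "$work/beanstalk.git"   # Add Remote Repository…
git push -q beanstalk master                     # dragging master onto the remote
git branch -r                                    # lists the remote branch
```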

You can check to see if your repo was published successfully to Beanstalk by logging into your account's control panel. You'll need to be here in a couple of steps anyway to set up your deployment environment, but first we need to create a database on the remote server.

13. Set up a remote database

The reason to do this step now rather than after deploying to Beanstalk is that you'll want to add the username, database name and password for the remote database to your config bootstrap file. One of the key features of the config bootstrap is setting a different environment variable based on the server from which the page is being called. That way you can have database details (and other variables too) for as many different environments as you like, which means you don't have to make any changes to files after they've been transferred to the server.

So after having exported a copy of the database for the site you've been working on, log in to your remote server's hosting control panel, set up a user, create a database and import the database you just exported. You're now ready for your first deployment.

14. Set up a Beanstalk deployment environment

Select your repo, then the Deployments tab and then the Edit Server Environments button. Apart from giving the environment a name, the only other thing you really need to change here is to set the Deployment Mode to Automatic which means that as soon as you push a commit to Beanstalk, the changes will be transferred automatically to the server. You could leave it in manual deployment mode but that would mean logging into your Beanstalk control panel to push buttons yourself.

Screenshot from Beanstalk showing the first step of setting up a deployment server

After clicking Next Step, on the next page you'll give the server a name and choose the Repository Path which in most cases will just be / and then choose either FTP or sFTP (which tends to be quicker) and enter your server details. The one setting here that has caught me out before is the Remote Path which sometimes is just / and other times is something like /home/username/.

The final step in the process involves setting up Deployment Hooks which I haven't really looked into yet, but can be used for things like clearing database caches after a deployment.

The one slight peeve I have with this process is that you can't create a deployment environment until after you've pushed a first commit to Beanstalk, which means that your first deployment has to be a manual one. This first deployment, because it involves so many files, will take a while, so you can actually go and grab that cup of tea now while you're waiting if you like.

This will transfer all your files to your server, and assuming you've set up your database correctly, you should now be able to see an exact copy of the site you've been working on locally on the remote server.

15. Have a nice cup of tea and milk arrowroot biscuit

Job done, time for… erm, another cup of tea or maybe a beer this time! Depends if you want a biscuit too; milk arrowroot biscuits don't really go with beer. ;)

But what if I need to deploy the site before it’s finished?

The description of the workflow presented here does assume that you'd work on the whole site build locally then deploy it to a live server when it's finished, but of course, not every build will work that way. In some cases, you might have a staging server between local and production, or you might launch a site with some sections that are still ‘coming soon’. In these situations, you're going to end up having more than one database that is currently being worked on, which then brings into play the issue of data synchronisation. I think the general consensus is that this is the weak point in a version-control-focussed production workflow.

There seem to be a few different ideas around, but none of them is perfect:

  • You could simply recreate changes from database to database, but that runs counter to everything else we've been doing so far, which is to avoid repeating ourselves wherever possible.
  • You could use something like Navicat which does database structure and data synchronisation.
  • You could use Git itself to back up and version your database (I'm not sure whether that could be extended to include data synchronisation).
  • After deploying your site you could connect to your remote database locally and just work on the one database.

A thorough discussion of these different approaches would require an article of its own, so I won't attempt one here, but I will talk a little bit about the last option because it's what I do on this site.

Working on a remote database locally

As I've mentioned before, this site is a work in progress and I'm constantly updating templates and styles as I add new posts that contain new types of content. Because the changes I want to make usually involve posts that haven't been published yet, I need a way to work with this new content.

I could write the post, then export a copy of the live site and reimport it into my local database. But as the only things that I'm changing are templates and CSS or JavaScript, I instead connect to the remote database from my local install and use environment conditionals (established with the config bootstrap), e.g. if (NSM_ENV == "local") { do stuff }, to only show the new stuff I'm working on when I'm accessing my local version of the site.

So even though the changes to templates are being made to the database that is on the live server, the environment conditionals mean that only I can see the changes until I'm happy with them and remove the conditionals. I also use the environment conditionals to permanently show me future-dated posts with a non-open status, i.e. ones that haven't been published yet.

This sort of approach, of working with a live database, only has limited application though. If you were adding new features that involved adding tables or changing table structure, or making changes to lots of data, the risk that something might go wrong is too large. But it works fine for the way I'm working on this site at the moment. 

Summary

Being a relative newcomer to Git, I've only really covered the aspects of it that have found their way into the way I work. There's a lot more to it than what I've discussed here, particularly the ability to create new branches that let you develop new, untested features alongside your main branch which you can then later merge back into the main branch, so I encourage you to look through some of the resources mentioned in Part 1.
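
As a small taste of branching, here's a sketch of creating a branch, committing to it and merging it back. The branch and file names are hypothetical, and this is the simplest possible case, where nothing has changed on master in the meantime.

```shell
set -e

repo="$(mktemp -d)"
cd "$repo"
git -c init.defaultBranch=master init -q
git config user.name "Your Name"
git config user.email "you@example.com"

echo "stable" > site.txt
git add -A
git commit -q -m "First commit"

git checkout -q -b new-feature    # create a branch and switch to it
echo "experimental" > feature.txt
git add -A
git commit -q -m "Try out a new feature"

git checkout -q master            # master is untouched: no feature.txt here yet
git merge -q new-feature          # bring the feature in once it's been tested
ls                                # feature.txt now sits alongside site.txt
```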

But to sum up, the main benefits to me of using Git in a web development workflow are:

  • The ability to version control your files which makes it easy to undo changes that don't work out or sometimes to redo changes that somehow get lost or overwritten (including graphics files).
  • It makes it much easier to update add-ons and EE itself. You have to manually copy files over into your base install first (unless you've got things set up so that your base install itself connects to a series of remote repos containing add-ons), but once you've done that, updating other local installations based on that one install is as simple as clicking a few buttons within Tower. In fact, you can update several local EE installations in seconds or minutes (depending on the number of files that change).
  • You can deploy sites to a live server which means you can almost do away with FTP. Once done the first time, updating via Git and Beanstalk is much quicker and easier than by FTP.
  • And because of Git's ability to revert easily to a previous state, undoing changes that go awry on the live server also should be painless.

So that's it! Hopefully some people have found this useful. If not, you probably already knew all this stuff anyway. ;)

I'd also like to send a shout out to @johnwbaxter, @iain and @gunstone for putting up with me badgering them with questions about this topic as I worked my way through it. Thanks guys!