Cobalt Edge

 
Filed under

git

 

Front End Rails Developer Job at DealBase.com

At DealBase, we have an opening for a part-time front end Rails developer. The opening is for US residents only, and for individuals; no agencies or recruiters please. Most likely you'd be working remotely/telecommuting. The job posting, which is posted in several places, such as Rubynow, Working With Rails, and Rubyjobs.in, covers all the details, but I'll relist it here for ease:


DealBase.com, a startup hotel deals site, is looking for a stellar front end web developer who will adapt our current look/feel to new features, leverage JavaScript for useful and fun features, and is eager to apply their skills to enhance the user experience of our site. We're looking for you to share your knowledge and make an impact, be passionate about your work, and up-to-date on the latest technologies. If this is you, and you enjoy working with a small, distributed, agile team, then we'd love to talk with you.

Requirements for this position:



  • Deep knowledge of XHTML and CSS

  • Familiarity with browser capabilities and restrictions for all major browsers

  • Solid JavaScript skills

  • Experience with/demonstrated use of Git

  • You use and demand MacOS X as your primary development environment

  • Comfortable at the command line

  • Basic skills for image editing and optimization for the web

  • Exposure to and basic knowledge of Ruby on Rails

  • Great communication skills

  • Attention to detail

  • Ability to work both independently and on a team

  • Eagerness to share ideas and problem-solve creatively

  • Experience working on consumer oriented web applications/consumer focus

  • Quick learner, and good at digging in to problems

  • Agile development practices

  • You are based in the US.

  • Individuals only (no multi-person firms, agencies, etc.)

Nice to have:



    • jQuery experience

    • GitHub experience

    • MySQL experience

    • Use of test frameworks, TDD, and BDD

    • Linux experience

If you'd like to work with us at DealBase.com and think you're a good fit for this position, send us a resume and sample work, or let us know where we can see your resume and work/code, by emailing jobs@dealbase.com. Please note, we are only considering candidates based in the US.


I'm excited to find a great developer to work with. DealBase has been an awesome company and app to work on, and we're already experiencing great success. We have some pretty cool features planned, and it'd be ideal to get some real CSS and JavaScript ninja skills making those features even better. So, if this is you, please do get in touch, making sure to send email to the right email address as outlined in the job description.

Filed under  //   DealBase   git   JavaScript   Jobs   jQuery   Rails   RubyOnRails  

Renaming a GitHub Account and Forked Repository

Today I finally bit the bullet and renamed the DealBase GitHub account and repository, because they were previously named after an early incarnation of the business name (before we'd actually decided on a name). I had expected this to be a bit tedious. In particular, I had started the original repository under my own GitHub account, and then forked it into the company account. With a private repo, you can't delete the original repository without also deleting any forks (and back in the day ;-) you couldn't even delete repos on GitHub). But, as it turned out, it was pretty easy and didn't take long or involve that much fixing of things that use the code base.

I did ask the GitHub folks for some tips, so these steps factor that info in. Also, before you do any of this, you should of course heed the typical disclaimers, make backups, etc. That said, first, rename the account. You can do this in your Account page on GitHub. Look for the rename option at the bottom of your account page.

You don't even need to re-clone your repo for this. Once you've done the rename, you can simply edit your local .git/config file to fix up the account name. Do this anywhere you have cloned the repo - for example on your continuous integration server, deployment scripts, cached copies of code on your staging and deployment servers, Tracker-GitHub post-receive hook service, etc.
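As a rough illustration of that .git/config fix-up, here is a minimal Ruby sketch that rewrites the account name in GitHub remote URLs. The function name and the account names (oldname, dealbase) are hypothetical, and it only touches URLs of the form github.com:account/ or github.com/account/.

```ruby
# Rewrites GitHub remote URLs in the text of a .git/config file so they
# point at the renamed account. The account names used here are made up.
def rename_github_account(config_text, old_account, new_account)
  pattern = %r{(github\.com[:/])#{Regexp.escape(old_account)}/}
  config_text.gsub(pattern) { "#{Regexp.last_match(1)}#{new_account}/" }
end

sample = <<~CONFIG
  [remote "origin"]
    url = git@github.com:oldname/myproject.git
    fetch = +refs/heads/*:refs/remotes/origin/*
CONFIG

puts rename_github_account(sample, "oldname", "dealbase")

# To apply it to a real clone (after making a backup):
#   path = ".git/config"
#   File.write(path, rename_github_account(File.read(path), "oldname", "dealbase"))
```

Editing the file by hand works just as well; a script is only worth it if you have many clones to fix.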

Now on to renaming a private forked repo. If you just need to rename a repo, you can do that on the Edit page for a repository and then repeat the above steps.



But, if you have forked a private repo, it's slightly more involved, but don't fear!



  1. First, make sure you (and anyone else working on the project) have no work in progress, or that you somehow save off that work outside of the repo.

  2. Do a pull from GitHub so you have the most up to date codebase.

  3. On GitHub, delete the original repository (not your forked copy, but from the location you forked it from). This will cascade and delete your forked copy as well. You'll find this right below where you can rename it on the repo's Edit page.

  4. Optionally rename the directory, on your local machine, of the codebase/repo.

  5. Now create a new repository, with your new choice of name, on GitHub.

  6. Then, follow the instructions to import an existing repository. In doing so, on your local machine, go into your codebase, and use that. This will preserve the full Git history and everything from the repository, pushing it up to GitHub just as it was before, but under the new name and rooted at the [new] account.

  7. Fix up all things that use the GitHub account, as mentioned above, like CI servers, deployment scripts, and so on.
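For steps 5 and 6, the commands end up looking roughly like the ones this small Ruby sketch prints. It is an illustration, not GitHub's own import instructions: the account and repo names are hypothetical, and it assumes your existing clone already has a remote called origin that you want to re-point.

```ruby
# Builds the git commands for pointing an existing clone at a newly created
# GitHub repository and pushing the full history (branches and tags) to it.
# "newaccount" and "newrepo" below are placeholder names.
def import_commands(account, repo, remote: "origin")
  url = "git@github.com:#{account}/#{repo}.git"
  [
    "git remote set-url #{remote} #{url}",
    "git push #{remote} --all",
    "git push #{remote} --tags"
  ]
end

import_commands("newaccount", "newrepo").each { |command| puts command }
```

Run the printed commands from inside your local codebase once the new, empty repository exists on GitHub.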


That's it, you're done. Pretty straightforward and shouldn't take much time. Thanks again to GitHub for making life so much better in source control land!


Filed under  //   DealBase   git  

My Setup and Software

I too read Al3x's interview the other day, and like John Nunemaker, figured I'd share my setup, as I enjoy reading what others use and often can pick up a few interesting tools or tidbits.

Unlike Mr. Nunemaker, my desk is too messy, IMHO, to photograph right now :) However, there are many similarities aside from that. On with it...

I use a 17" MacBook Pro with 4GB RAM as my only machine these days. Like Alex and John, I really like having just a single machine, and I no longer work for a corporation where I'd worry about that. DealBase is cool and wouldn't try to make some wacko claim to some work not relevant (and we've explicitly discussed my use of a single machine, etc.). I have my MBP open on a laptop arm from Ergotron, and then my primary monitor is a 30" Dell. Really love the big monitor. I do my main work on the 30", and then the laptop screen has TweetDeck, iChat, Things, some Fluid apps, and other things that I tend to more glance at, and aren't primary work items.

Further, I use a wireless Apple keyboard, and like John, I just love this thing. I can't tell you how long I'd been looking for a keyboard that was just a keyboard (but with arrow keys). I hate normal keyboards that take up so much extra space on the right side (my mouse side) with stuff I rarely use - which only exacerbates problems with having my arm/elbow cantilevered further out to use the mouse, sometimes causing arm strain after long days of coding. I use a Logitech MX Revolution cordless mouse, which I like quite a lot.

Transitioning to music... I use JBL Creature speakers, and listen to a variety of things, or nothing. Pandora, via a Fluid app, iTunes (my own playlists, or various Ambient "radio" stations), etc. Either that, or we have a whole-house NuVo Concerto audio system, so sometimes I have that on either with XM satellite radio, or to a playlist from the iPod we have hooked into it. The NuVo setup is nice because it fills my office with sound a bit better (via in-ceiling speakers), but I have more variety via the computer.

As with Alex and John, I am absolutely in love with my iPhone 3G. It is even better than expected. It has essentially replaced my 80GB iPod in the car, typically because it's more up to date, and I like its UI better; I can remotely work on servers if I have to via iSSH, play games if I'm bored, use Instapaper to read things I've set for reading later, sync with Address Book and iCal, and of course Twitter, via Tweetie. So, yes, I use Apple's Address Book and iCal, for great sync, simplicity, etc.

Ok, onto dev stuff. My primary work is on Rails-based web-apps, although I dabble with other things as well. DealBase is my day job, and I'm also involved with Bring Light.

Yet again, like Alex and John, I spend the bulk of my time in TextMate, iTerm (a better Terminal, IMHO), and Safari. And actually, I do my development testing in nightly builds of WebKit/Safari, and all my other browsing in standard Safari. I do pull up Firefox for testing, and to use YSlow and sometimes Firebug (although I've been finding the dev tools in WebKit nightlies work well). I've used Emacs - did so for about a year when working with Linux as my desktop. I ditched it back then in favor of Visual SlickEdit, but these days TextMate just rules. I don't get the Emacs passion - why do you want to press two keys for everything, especially the most common things? Yes, I know, you can set up different bindings, etc., but come on, the most basic things like saving, opening, copy, paste, etc. should be "single" key (and by single I mean some meta+key) strokes by default. I do fire up vi all the time at the command line on remote servers, and even occasionally on my MBP for some real quick edit. Also, I spend the bulk of my day in my text editor, so yes, appearance matters, and TextMate kills others. I've also used a lot of IDEs in the past, from IDEA, to Eclipse, to Visual Studio. Visual Studio is actually quite good if you have to suffer in that world, but I find Eclipse just plain crappy. IDEA was great for Java, and their Ruby setup will be something to keep an eye on, but generally, the setup I have now works well.

I have all my code for nearly everything I do (e.g. both private and open source/public) on GitHub, and truly love it. Git has been a huge win, and gives me the best of SVN and Perforce, as well as improving on both. I'm using GitX for most of my commits and history browsing these days.

I use RSpactor for continuously running our RSpec suite, and we also use RSpec stories (but haven't converted to Cucumber yet). I recently added speech output to RSpactor, and that is my preferred notification instead of Growl. We use Pivotal Tracker for tasks/stories/features as well as bug tracking. We used to use Lighthouse, but having it all in one place was nicer, and Tracker wins big time in my opinion. If you want a GitHub post-receive hook for Tracker, I recently whipped that up, and it's been a real nice addition. We too use Hoptoad for exception notification, and really like it. Also, New Relic is in use at DealBase. I also like viewing Google Analytics with Analytics Reporting Suite, a slick AIR app.

I really like Navicat as a GUI for database stuff. It's proprietary/pay software, but honestly, it's worth it to me. I can do all this stuff command line fine, but the GUI simply makes it a heck of a lot faster to view the results, quickly re-sort on a column, mess around with queries, etc. Also, it has great SSH support, so I can tunnel into all my servers' DBs with ease.

I have CruiseControl.rb setups for all my Rails apps, and make use of CCMenu for a nice little status menu item showing me what's going on with those.

I pretty much can't live without LaunchBar. Same goes for 1Password.

Skitch is quite handy for showing, sharing, and annotating screen shots, and we use Google Docs and Gmail. Speaking of email, I am a huge fan of Mailplane, which is a Mac app for Gmail. Integration is superb, and I can quickly switch around my 15 or so Gmail accounts with ease. I find it superior to a Fluid app for Gmail, since the integration is better and it handles multiple accounts.

I host most of my own web apps on Slicehost, and DealBase is at EngineYard.

I also use Backpack some, although not nearly as much as I used to, and access it about 99% of the time via Packrat. MarsEdit is my blog authoring tool of choice. NetNewsWire is my RSS reader.

All of my photography and photo processing, etc. are done in Adobe Lightroom. I use the Flickr plugin for it as well.

Various other bits:



  • TextPander

  • WeatherDock

  • Pukka

  • Flickr

  • Del.icio.us

  • xScope - a great screen ruler app

  • Photoshop CS3 (look for my name in the about box too :)

  • JungleDisk - I do some backups with this

  • SuperDuper! Still my favorite backup, although I use TimeMachine too

  • CSS Edit and XyleScope sometimes

  • Last.fm - is running all the time, but I really don't actually make use of it, kinda silly.

  • Acrobat Pro and Reader

  • XCode (or TextMate) if I'm working on an Objective-C/Cocoa app.

  • iStat menus

  • YouControl Tunes

p.s. One other bit I can't live without, but which really isn't computing hardware/software, is my espresso setup. I use an Expobar Brewtus II machine, Macap MC4 stepless doserless grinder and a variety of cups (mostly Nuova Pointe and Illy). I use only totally fresh beans from a variety of places (favorites include Blue Bottle, Ecco Caffe, PT's, 49th Parallel (unfortunately not often, since shipping from Canada makes it a bit cost prohibitive), etc.). Coffelab tamper and Bumper stand and knock box. My espresso bar is kept clean (unlike my desk). The pictures are a bit older, so they don't show the bottomless portafilter I use these days.

Whew, that's more than plenty. What's your setup?


Filed under  //   ContinuousIntegration   CruiseControl   DealBase   environment   Espresso   git   Gmail   iPhone   laptop   Mac   Nuvo   Office   Pivotal Tracker   Rails   RSpactor   Ruby   TextMate  

GitHub Post-Receive Hook for Pivotal Tracker

Over the holiday, I whipped up a quick GitHub Post-Receive Hook for use with Pivotal Tracker. This is just a small web service, implemented in Sinatra. It was my first time using Sinatra, so any suggestions on improvements are of course welcome (as they are in general; this is open source). I've put the code up on GitHub in the somewhat painfully named tracker_github_hook repo.

The service supports multiple GitHub repos and Tracker projects, so you can run a single service that integrates multiple projects. The service will figure out which commits go to which projects based on a config file on the server that associates a GitHub repo URL (make sure to use the http version of the URL, not https) with a Tracker project ID. For example:


tracker_github_hook:
  github_url: http://github.com/chris/tracker_github_hook
  tracker_api_token: a1234b56789c0defa12b3c4def56a78b
  tracker_project_id: 123
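To make the repo-to-project association concrete, here is a minimal sketch of looking up a project entry by GitHub URL. It assumes the config is YAML with one top-level entry per project, as the example suggests; the project_for helper is hypothetical and is not taken from the tracker_github_hook source.

```ruby
require 'yaml'

# Sample config in the same shape as the example above (assumed to be YAML).
CONFIG_YAML = <<~YAML
  tracker_github_hook:
    github_url: http://github.com/chris/tracker_github_hook
    tracker_api_token: a1234b56789c0defa12b3c4def56a78b
    tracker_project_id: 123
YAML

# Returns the settings hash for the project whose github_url matches the
# given repo URL, or nil when no project is configured for it.
def project_for(config, repo_url)
  config.each_value.find { |project| project["github_url"] == repo_url }
end

config = YAML.safe_load(CONFIG_YAML)
project = project_for(config, "http://github.com/chris/tracker_github_hook")
puts project["tracker_project_id"]
```

Because the match is on the exact URL string, an https URL would not find the entry, which is why the http form matters.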

You will need to take care of running the service within your particular server setup. I'm personally running it via Thin/Rack, behind Nginx. I have it set up on the same server that runs our continuous integration system, so these two are differentiated by subdomain.

It should be noted, I will not claim this thing is secure. You run it at your own risk, etc.

Aside from getting the service running on your own server, you'll need to add the URL to it as a GitHub post-receive hook for each project you want to integrate. To do that, go to the Admin tab of your GitHub repo, and then the Services tab. At the top you'll see where you put the URL in. The URL is just the root of the service. Also see GitHub's docs on post-receive hooks, as they illustrate how I built this, how to set it up, etc.
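For a feel of what such a hook receives, here is a small sketch that pulls the repository URL and commit messages out of a post-receive payload. It assumes the payload is JSON with a repository.url field and a commits array whose entries have a message, which is my understanding of GitHub's format at the time; the sample data and the repo_and_messages helper are made up and this is not the service's actual code.

```ruby
require 'json'

# Extracts the repository URL and the commit messages from a post-receive
# payload. The field names are assumptions about the payload format.
def repo_and_messages(payload_json)
  payload = JSON.parse(payload_json)
  repo_url = payload.fetch("repository").fetch("url")
  messages = payload.fetch("commits", []).map { |commit| commit["message"] }
  [repo_url, messages]
end

# Made-up sample payload for illustration only.
sample_payload = <<~JSON
  {
    "repository": { "url": "http://github.com/chris/tracker_github_hook" },
    "commits": [
      { "id": "abc123", "message": "Fix login redirect" },
      { "id": "def456", "message": "Add deal search filter" }
    ]
  }
JSON

repo_url, messages = repo_and_messages(sample_payload)
puts repo_url
messages.each { |message| puts "  #{message}" }
```

The repository URL is what gets matched against the config file to decide which Tracker project the commits belong to.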

Hopefully others find this useful. Or, what I really hope is that the Pivotal guys get with the GitHub guys and add a standard integration service, where it's automatically configured on the Tracker side, and you just need to turn it on on the GitHub side much like the other service integrations.

Filed under  //   git   Pivotal Tracker   Ruby   Sinatra  

Changelogs and Deployment Notification for Capistrano and Git

Early warning: this is a hack, which doesn't mean it's bad, just that it's not polished. However, I am documenting my solution thus far for myself, and figured others might find it useful...

Update: Added my shell command for doing deploys (see end of this post).

I wanted a way to automate a few things around deployments, and integrate this a bit with my continuous integration server. I use CruiseControl for the CI server, and previously blogged about setting up CC.rb with Git. The goals for this next task, and subject of this blog post are:


  • Tag the code on successful deploys. My CI server already tags the code anytime it does a successful build, but since I didn't cover that previously, I'll mention it here as well.

  • Notify a list of people via email whenever a new deploy happens.

  • Generate a changelog, based on Git commit messages (better make sure they're suitable reading for whoever gets your deploy notices!), and include this changelog in the deploy emails.

  • Have the CI tag I want to deploy as the only required piece of info/parameter when issuing a deploy command.

Tagging


First, I tag the code on any successful CI run. This tag is what I can then use as the Git tag to deploy. Capistrano supports this via the branch variable (set its value to the tag name). As you can guess, you can use pretty much any Git ID/tag/branch name for this. To do this, add a task to your cruise.rake file (or similar - wherever you define your custom CruiseControl command), and then ensure you run that task during a CruiseControl session. Here's my task:

desc "Tag the code on successful CI build"
task :ci_tag do
  timestamp = Time.now.strftime("%Y%m%d%H%M%S")
  tag_name = "CI_#{timestamp}"
  # Create an empty file with our tag name, so we can easily go grab the tagname
  # from the CI output page and do deploys, etc.
  system("touch #{File.join(ENV['CC_BUILD_ARTIFACTS'], tag_name)}")
  system("git tag -a -m 'Successful continuous integration build on #{timestamp}' #{tag_name}")
  system("git push --tags")
end

From the above, you can see that I'll get tags of the form: CI_timestamp. Next up, I want to tag a successful deploy to indicate which commit/tag actually got deployed and when. This is handled via an after task in my Capistrano deploy.rb:


after "deploy:restart", "tag_last_deploy"
task :tag_last_deploy do
  set :timestamp, Time.now
  set :tag_name, "deployed_to_#{rails_env}_#{timestamp.to_i}"
  # Tag whatever was just deployed (the value of the branch variable).
  `git tag -a -m "Tagging deploy to #{rails_env} at #{timestamp}" #{tag_name} #{branch}`
  `git push --tags`
  puts "Tagged release with #{tag_name}."
end

This will create tags like deployed_to_staging_1213223458, and works for both staging and production (or any environment you're targeting - note the use of the rails_env variable - you may need to use something else). One thing to pay particular attention to is that this tag is actually tagging another tag, as defined by the branch variable (mentioned above). In order for this to work, though, you need to ensure that your tags are up to date locally. Thus, somewhere in your workflow you'll need to do a git pull --tags if, like me, your CI server is elsewhere and is generating those tags.

Ok, we're all tagged up, let's move on...

Notification

It turns out there's a nifty new plugin called Cap Gun that will take care of emailing a list of folks on deploy. Setup is covered in their README, but the one bit they don't mention, is that you can include a comment in the email message that goes out. I wanted to include a changelog in these emails, so I tapped into this comment attribute, setting it to the text of my changelog. To use the comment, you can either set it via -s comment="my lovely comment" on your Capistrano deploy command, or you can set the comment variable in your Capistrano deploy.rb or included script. More on that in a minute.

Changelogs


My changelog, so far, is very simple: it just pulls the comments for the Git commits that occurred since the last deploy (for the appropriate target), up to the tag specified (which in this case will be the CI tag you are about to deploy). To handle this, I use a small Ruby script, combined with the great Grit gem that lets one manipulate Git via a nice Ruby API. The script simply spits out a chunk of text that will be what gets put into the comment Capistrano variable for our deployment notifications. This is in particular where the "hack" comes into play. This script is not robust, does essentially no error checking, etc, etc. Use at your own risk! And with that, here it is:

#!/usr/bin/env ruby

require 'rubygems'
require 'mojombo-grit'
include Grit

unless ARGV.length == 2
  puts "Usage: changelog.rb staging|production <commit-or-tag>"
  puts " where commit-or-tag is the commit ID or tag you are planning to deploy"
  exit -1
end

# The script lives in script/, so the repo root is one directory up.
repo_location = File.expand_path(File.dirname(__FILE__) + '/..')
target = ARGV[0]
about_to_deploy_commit = ARGV[1]
repo = Repo.new(repo_location)

# Find the tag for the last deploy to this target
tags = repo.tags.collect {|tag| tag.name }
tags.delete_if {|tag| !(tag =~ /^deployed_to_#{target}_/)}
tags.sort!
last_deployed_tag = tags[-1]

# Collect the commits since that deploy, oldest first
commits_for_changelog = repo.commits_between(last_deployed_tag, about_to_deploy_commit)
commits_for_changelog.reverse!

puts "Changes since last release:"
commits_for_changelog.each do |commit|
  puts " "
  puts " #{commit.message}"
end

To run through it briefly, it takes two parameters (and clearly, you can change this for your own deployment targets, etc.): a deployment target, and a tag (which can actually be a tag, a commit ID, branch, etc.). It sets up a repo variable for your Git repository using Grit, and then proceeds to find the last deployed tag for that deployment target. After that, it gets all the commits between that last deployed tag and the tag you specified as the second script argument, and prints out the commit messages.
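One fragile spot in the tag lookup is the plain string sort, which only orders deploy tags correctly while all the timestamps have the same number of digits. A slightly sturdier variant, sketched here as a standalone helper (the last_deployed_tag method name is mine, and it works on plain tag name strings rather than Grit objects), compares the numeric timestamps instead:

```ruby
# Picks the most recent deploy tag for a target by comparing the numeric
# timestamp at the end of the tag name. Returns nil if nothing matches.
def last_deployed_tag(tag_names, target)
  pattern = /\Adeployed_to_#{Regexp.escape(target)}_(\d+)\z/
  tag_names
    .select { |name| name =~ pattern }
    .max_by { |name| name[pattern, 1].to_i }
end

tags = [
  "CI_20080612052417",
  "deployed_to_staging_999999999",
  "deployed_to_staging_1213223458",
  "deployed_to_production_1213000000"
]

puts last_deployed_tag(tags, "staging")
```

A nil result (no previous deploy for that target) is also a natural place to add the error handling the script currently lacks.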

To integrate this, I added this line to my Capistrano deploy.rb:

set :comment, `script/changelog.rb staging #{branch}`

As you can see, that one is specific to my staging environment, and lives inside my "staging" task in deploy.rb. The same line, appropriately edited, goes in the production task.

Deployment Command


Lastly, I define a simple shell function to do my deploys, which ensures I have done a git pull so I have all the tags, and makes the command easier to remember and get right, etc:

stagemyproject () {
  git pull
  cap -s branch=$1 staging deploy:migrations
}

You would thus have a command line to do a deploy like this:

stagemyproject CI_20080612052417

That's it, and if you've managed to read this far, congrats, and if you've not only managed to read this far, but paid attention and got value out of it, well, cool.

For anyone who uses/adapts this, please do let me know improvements you make, or suggestions, or tweaks/changes, and so on. I've been using this for all of about a half dozen deploys so far. If (more like when) I make improvements, I'll update.

Filed under  //   Capistrano   ContinuousIntegration   git   Rails   RubyOnRails  

Fixing Capistrano 2.3.0 and Git Deploy Problem

If you upgrade to Capistrano 2.3.0, and are doing deploys from a Git repository, you may find that all of a sudden you can no longer deploy. This is the case if you have no tags in your Git repo. Cap 2.3.0 changed one of the Git commands it uses and that apparently doesn't work right if you don't have tags. So, to solve the problem, you can simply create a single tag in your Git repository. The tag does not have to relate to your build at all, you only need one tag in the repo (not one per build or anything like that), etc. Once you create the tag, you can now deploy again.

To create a tag in Git, or, I think the "cooler" kind of tag, an annotated tag, you can do:

git tag -a tag_name

Replace "tag_name" with your tag name of course. The "-a" option says to make it an annotated tag, which lets you enter a comment about the tag. You can put whatever you want in there. I'm liking this potential use with my continuous integration server when it makes tags on successful builds. Lots of possibilities.

Finally, if you deploy from a remote repo, or if you have a remote repo (say on GitHub), you will need to push your tag. This does not automatically occur on a push; you need to add the "--tags" option to git-push to include your tags:

git push --tags

Now you'll have your tags on your remote repo, and listed under the "all tags" tab on GitHub.
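If you want to script the check, the idea can be sketched in Ruby like this: given the output of git tag, decide whether a placeholder tag is needed and, if so, which commands to run. The helper name, the default tag name, and the tag message are all hypothetical.

```ruby
# Given the text printed by `git tag`, returns the commands needed to create
# and publish a single placeholder tag, or an empty list if a tag already
# exists. The default tag name and message are made up for illustration.
def placeholder_tag_commands(git_tag_output, tag_name: "initial")
  return [] unless git_tag_output.strip.empty?

  [
    "git tag -a -m 'Placeholder tag so Capistrano can deploy' #{tag_name}",
    "git push --tags"
  ]
end

# In a real repo you could pass in the output of `git tag`; here an empty
# string stands in for a repo without any tags.
placeholder_tag_commands("").each { |command| puts command }
```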

Filed under  //   Capistrano   git   Rails   RubyOnRails  

Setting up CruiseControl.rb with/for Git Based Projects

[Updated to refer to official ThoughtWorks CC.rb Git repo.]

I have a new Rails project I'm working on and I use Git/GitHub for source control. It was time to set up continuous integration, and my usual weapon of choice for that is CruiseControl.rb. Here's what I did to get my project set up under CruiseControl.rb with Git, on an Ubuntu 7.10 machine...

Setup for accessing GitHub repo


All I needed to do here was generate an SSH key for my account on the host machine, and then add that key to the allowed keys for my GitHub account.

Prerequisites



  • I set up a builder@mydomain.com email address which will get used by CruiseControl for sending build related emails/notifications.

  • You'll need to determine a port you want CruiseControl to run on, and your strategy for accessing it. For example, I run mine on a port other than port 80, and other than the default 3333. I then proxy that via Nginx, and also use Nginx to password protect access to it (since this is not a public project, etc.). This will affect the CC dashboard URL setting specified below. Some notes on this:


    • I did my initial Nginx configuration using err's Nginx config generator. However, this makes a lot of path assumptions, and various other things, so you'll definitely want to go through the resulting file closely. I had a few sites on this server, so it was relatively useful to use this as a base starting point, and then just fix up paths to the access and error logs, and the PID file.

    • Here's a quicky on how to add password protection to an Nginx server (and a specific location).


Install CruiseControl and Do Site Configuration



  1. Cloned the Git version of CruiseControl.rb in location I wanted it (you could also simply download it and expand the tarball): git clone git://github.com/benburkert/cruisecontrolrb.git

  2. The DEPENDENCIES file indicated I needed to have the grit and mime-types gems, so installed those.

  3. Where your projects get stored for CruiseControl.rb is now defined by the CRUISE_DATA_ROOT environment variable, and if you don't set this, it defaults to $HOME/.cruise. I personally changed this to be /var/cruisecontrolrb.
  4. Edit the config/site_configuration.rb (probably need to rename the example version accordingly) to set site-wide settings, such as your email config and so on.

    • For email setup, I use Gmail for domains, so I have a block like this:

      ActionMailer::Base.smtp_settings = {
        :address => "smtp.gmail.com",
        :port => 587,
        :domain => "mydomain.com",
        :authentication => :plain,
        :user_name => "builder@mydomain.com",
        :password => "password"
      }

    • You'll want to specify the Configuration.dashboard_url setting so URL's work properly.

    • There are a variety of other settings available in the file that you may want to tweak.
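The CRUISE_DATA_ROOT behavior from step 3 can be sketched as follows. This mirrors the default described above rather than CruiseControl.rb's actual code, and the helper names are hypothetical; the projects/NAME/work layout is the one used in the project setup steps.

```ruby
# Resolves the CruiseControl.rb data directory the way step 3 describes:
# use CRUISE_DATA_ROOT when set, otherwise fall back to $HOME/.cruise.
def cruise_data_root(env = ENV)
  env.fetch("CRUISE_DATA_ROOT") { File.join(env.fetch("HOME"), ".cruise") }
end

# Path to a project's checked-out working copy under the data root.
def project_work_dir(project_name, env = ENV)
  File.join(cruise_data_root(env), "projects", project_name, "work")
end

puts project_work_dir("MyProjectName", "CRUISE_DATA_ROOT" => "/var/cruisecontrolrb")
```

Passing the environment in as a hash, as in the last line, makes it easy to try different settings without touching your real environment.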

Add Project and Configure



  1. Did the usual cruise add command to add my project, but with the Git variant: ./cruise add MyProjectName --git-url git@github.com:mylogin/myproject.git (modify the Git project URL for your Git repo of course). Note that you can see all the options by doing a ./cruise add.

  2. Create the test database for your project. The easiest way is just to go into $CRUISE_DATA_ROOT/projects/MyProjectName/work and do a rake db:create RAILS_ENV=test. Your first build will have already failed because this hasn't been made; this step hopefully fixes that.

  3. If your log directory isn't in Git, you'll need to go mkdir it, so something like: mkdir $CRUISE_DATA_ROOT/projects/MyProjectName/work/log.

Setup CruiseControl.rb Service/Daemon



  1. Copy the cruisecontrolrb file into /etc/init.d.

  2. I set the port for CruiseControl.rb to run on in the above /etc/init.d/cruisecontrolrb daemon file, by adding "--port 1234" (for example) to the DAEMON_ARGS variable.
  3. Start the CruiseControl.rb daemon as appropriate for your system (e.g. "sudo /etc/init.d/cruisecontrolrb start").

Finally, surf to your cc.rb site on the web and see how your build has done. If you run into build problems, you'll want to look at the cc.rb build logs (if it was your project test/build that failed) which are in the $CRUISE_DATA_ROOT/projects/MyProjectName directory (or rather, the subdirectory in there for the particular build). Enjoy!

Filed under  //   ContinuousIntegration   CruiseControl   git   Ruby  

SVN Externals are Evil; Use Piston or Braid

I've recently spent a considerable amount of time rectifying problems caused by SVN externals. In one of the codebases I work on, it had been developed with a heavy number of Rails plugins as SVN externals. In general, it was a good approach as these were external code, or shared code, etc. This I think is at least better than directly checking the code in, as you have a more precise record of where it's from, etc. I should also note that our externals were all set to specific tags or branches specific to our code (i.e. not to trunk, where you'd be getting updates without your control). Sounds good, what about this "evil"?

The problem comes in when you need to make changes to the code of an external. You might think, well, go change the root code and then adjust your tag, etc. In some cases you can't do that - maybe it's not code you have commit rights to, or you're making a change that's specific to your app and can't be done another way, or, as was often the case for me, we were on a much older version, and the trunk and other tags had major differences that I didn't want to integrate.

Thus, what I needed to do was remove this as an external, and check the code in directly. Another approach would be to branch it from where you were and modify that, etc. I wasn't able to do that due to various Subversion permissions (probably not a common case, but I had no choice). This action itself (remove external, add code) is not a real problem in SVN. But, it IS a problem when you go to update. A simple "svn up" on other machines failed. That is pathetic. Instead, what I had to do was go delete the existing (svn externaled) directories, then do "svn up". This of course broke our continuous integration server, and I also had to go manually fix this up on machines I was deploying to. Crappy, but if that was the end of it, I'd probably not be as unhappy...

When it comes to merging these kinds of changes into branches, watch out! This is where SVN just flails. First, if you happen to use svnmerge.py to manage your branch merging, forget it. It just can't deal with it, and will leave you with a partially complete merge. Doing it manually, even with things like --ignore-ancestry, does not work either. I had to do something similar to the "svn up" fix: I had to go in and delete all the directories that were previously svn externals, and then do my merge. And note, do NOT delete the parent directories. For example, if all of your Rails app's plugins were externals, do not go and nuke "vendor/plugins". SVN will then be totally confused and just not do anything, and fail. Nope, you need to specifically delete each offending svn external directory. I make extensive use of branches (I do most work on a branch for daily work), so you can multiply these problems across the number of branches you might need to be merging to, etc.
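To find exactly which directories need deleting, you can work from the svn:externals property itself. The sketch below is an illustration under assumptions: it expects the older property format where each line starts with the local directory name (as printed by svn propget svn:externals on the parent directory), the URLs are placeholders, and the external_dirs helper is hypothetical.

```ruby
# Lists the individual external directories under a parent directory, given
# the text of its svn:externals property. Assumes the older line format:
#   local_dir [-r REV] URL
def external_dirs(propget_output, parent_dir)
  propget_output
    .each_line
    .map(&:strip)
    .reject { |line| line.empty? || line.start_with?("#") }
    .map { |line| File.join(parent_dir, line.split.first) }
end

# Placeholder property text; the plugin URLs are not real.
sample = <<~EXTERNALS
  acts_as_list http://svn.example.com/plugins/acts_as_list/tags/1.0
  will_paginate -r 123 http://svn.example.com/plugins/will_paginate/trunk

EXTERNALS

# Print the delete commands rather than running them, so they can be reviewed.
external_dirs(sample, "vendor/plugins").each { |dir| puts "rm -rf #{dir}" }
```

Note that only the individual external directories are listed, never the parent directory itself.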

Having said all that, this problem isn't really all that illogical. I don't know how SVN works internally, but the whole svn:externals thing seems a bit like a hack, or at least not a first class citizen in SVN land. SVN merge or update, should be able to see: hey, you were up to date (for your current revision) on directory X, but this update is going to replace that with new code with the same dir name. But, it doesn't, maybe because it doesn't look at the externals properly in relation. I don't know, and I don't care, since it's broken, and my fix is that I'm moving to Git soon enough :) Also, as another point of view, I know Perforce handles this kind of thing just fine (we used remote mounted Perforce depots all the time at Adobe, and made seriously extensive use of branches (in fact, we required working on a branch)).

Now that I've spent entirely too much time on the build-up, what's the solution? Simple: use Piston (or Braid if using Git). What Piston does, is to not use svn:externals, and instead check the code in directly, yet maintain linkage to the external it came from. My take is this is really probably how svn:externals should've worked (I presume that constantly updating an external is actually a rarely desired trait). You import an svn external using Piston, and it will pull the latest code from whatever SVN URL you supply. In this case, you could use trunk, or you could as usual use a tag or branch. But then it's fixed - it will not update that anytime you do "svn update". Instead, it is up to you to explicitly tell it to update. This avoids svn externals as far as your daily operations go, and also causes zero problems for merges. It does more though.

The second benefit of Piston is that you can then modify the external code, but still bring down updates from the external, allowing a synergy between using external code and your app's specific needs. This is exactly what I needed on a couple of plugins we use, where those plugins' code had deviated significantly from our codebase so I couldn't use a newer version, but I needed to make some changes.
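For a feel of what that looks like day to day, here is a minimal sketch of the two Piston commands involved, import and update. The repository URL and plugin path are made up for illustration, and the `PISTON` variable is just my own dry-run switch: by default the script only prints the commands, so set `PISTON=piston` to really run them.

```shell
# Dry run by default: prints the piston commands instead of executing them.
# Set PISTON=piston to run them for real (requires Piston to be installed).
PISTON="${PISTON:-echo piston}"

# Hypothetical external and target directory.
plugin_url="http://svn.example.com/some_plugin/tags/1.0"
plugin_dir="vendor/plugins/some_plugin"

# Pull the external's code into your own working copy (you then commit it with svn as usual).
import_out=$($PISTON import "$plugin_url" "$plugin_dir")

# Later, and only when you ask for it, bring down upstream changes.
update_out=$($PISTON update "$plugin_dir")

echo "$import_out"
echo "$update_out"
```

Between those two steps, a plain "svn update" leaves the imported directory alone, which is the whole point.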

To summarize, the evil is SVN itself not handling changing of externals (i.e. to/from an external) in basic operations like updates and merges, which may cause a lot of manual work on your end, and break automated builds or similar. The solution: use Piston or Braid and get the best of everything.

Filed under  //   git   Rails   Version Control  


Git: Start As a Superior SVN, then Leverage Even More

Git has been getting a fair bit of attention lately. I am relatively new to Git, but am definitely a convert and big fan after only a short time using it. I'm to the point where I really don't want to use anything else. I have existing projects using SVN, and also have extensive experience with Perforce, both of these being centralized version control systems.

So, why Git, why as a superior SVN, and so on? If you are using Subversion, or for that matter, many other choices, it is worth taking a serious look at Git, if only as a superior solution to existing centralized version control. You can ignore the distributed version control aspects to start out. I am a strong proponent of using developer "sandboxes." My definition of this stems from our use of version control at Adobe. Put simply, a sandbox is really a developer's private branch. Those familiar with Git, Mercurial, or other distributed SCMs will immediately see the parallel. With team development, each team member works in a sandbox, and then when they have completed some amount of work that they deem suitable for the main line, or that follows their team's checkin policies, etc., they merge their branch into the mainline (aka trunk). Doing this in SVN is fairly painful (svnmerge.py helps, but it's still weak; SVN 1.5's merge abilities may help, but it's still not even up to what SVK does). Perforce has great support for this, but it's not all that fast, and setup isn't quite as easy as Git. Also, Perforce has a locking model (i.e. to edit files, you must check them out first), which annoys me to no end after having also used SVN, etc.

A sandbox is like your own private repository, and while I don't recommend ever checking in code that doesn't compile, etc., you can if you want, and thus gain the security of your code at least being backed up/in a second location, check pointing it as much as you want, and leveraging version control, all without hosing your teammates. On larger projects at Adobe, like Photoshop, we even took this a level further, and had a sandbox for the sub-team, so you would merge your sandbox to that, your sub-team's QA would test that, and then that got pushed to main, etc.

With Git, however, this "sandbox" model comes for free, due to the distributed/decentralized model. But, do not fear, you can (and do) still have a central repository that is the official mainline/trunk of code! The mainline is set up as a repository; then, to begin work, each developer "clones" that mainline, which creates a FULL repository on their machine (as in, not just the latest version of the code, but all history, etc.). Now, said developer can simply do their work, committing changes at will, taking full advantage of the version control system. Then, when they are ready to push their changes to the mainline/rest of the team, they simply do a push (pulling first if teammates have pushed in the meantime) and all their changes land in the main repository. Very much like working on a branch/in a sandbox model, but the beauty is that you aren't having to set up a branch, you don't have to manage your branch (more painful in SVN, fairly easy in P4), and it's all VERY fast (the speed is crazy fast compared to both SVN and P4 for all these operations).
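To make that concrete, here is a small self-contained sketch of the clone / commit / push cycle. Everything in it is a stand-in: the "mainline" is just a bare repository in a temp directory rather than a real server, and the names (seed, sandbox, README, the demo identity) are invented for the example.

```shell
set -e
# Throwaway identity so commits work on a machine with no git config.
export GIT_AUTHOR_NAME="Demo Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Demo Dev" GIT_COMMITTER_EMAIL="dev@example.com"

work=$(mktemp -d)
cd "$work"

# 1. Stand up a "central" mainline: seed a repo, then publish a bare copy of it.
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/main   # name the first branch "main"
echo "hello" > README
git add README
git commit -q -m "Initial import"
cd ..
git clone -q --bare seed mainline.git

# 2. A developer clones the mainline and gets the full history locally.
git clone -q mainline.git sandbox
cd sandbox

# 3. Commit locally as often as you like; nothing reaches the team yet.
echo "feature work" >> README
git commit -q -am "Start feature"
echo "more feature work" >> README
git commit -q -am "Finish feature"

# 4. Publish to the mainline when ready.
git push -q origin main

mainline_commits=$(git --git-dir="$work/mainline.git" rev-list --count main)
echo "mainline now has $mainline_commits commits"
```

In real life step 1 is done once, on a server or a hosting service, and every developer starts at step 2.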

What's also cool, is that you can create branches off your own repository to do experiments or sub-projects, or isolate changes for say a bug fix, or whatever. Creating branches is so dang easy in Git that there is no reason not to do it for even the smallest thing.
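Here is a quick sketch of how cheap that is, using a throwaway repository in a temp directory. The branch name, file name, and demo identity are all invented for the example.

```shell
set -e
# Throwaway identity so commits work on a machine with no git config.
export GIT_AUTHOR_NAME="Demo Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Demo Dev" GIT_COMMITTER_EMAIL="dev@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/main   # name the first branch "main"
echo "v1" > app.txt
git add app.txt
git commit -q -m "Initial commit"

# Create and switch to a branch for a small bug fix: one command.
git checkout -q -b bugfix
echo "v1 (fixed)" > app.txt
git commit -q -am "Fix the bug"

# Back on main, the fix is not there yet.
git checkout -q main
before_merge=$(cat app.txt)

# Merge the fix in, then throw the branch away.
git merge -q bugfix
after_merge=$(cat app.txt)
git branch -q -d bugfix

echo "before merge: $before_merge"
echo "after merge:  $after_merge"
```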

Thus, it's not that you can't do any of this without Git, but Git simply makes it far easier and far faster to do this, lowering the barriers to great use of source control, and making management of your code that much better.

And, leveraging this further, you can use Git to collapse a bunch of checkins down into one. So, in this sandbox model, say you were doing a bunch of really small incremental commits, you could "squash" some or all of those prior to pushing your code into main. Here's one blog entry on this kind of thing.
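There is more than one way to do the squash. The sketch below uses "git merge --squash" because it is easy to script; "git rebase -i" is the interactive route. The repository, branch, and file names are placeholders in a temp directory, not anything from a real project.

```shell
set -e
# Throwaway identity so commits work on a machine with no git config.
export GIT_AUTHOR_NAME="Demo Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Demo Dev" GIT_COMMITTER_EMAIL="dev@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/main   # name the first branch "main"
echo "project" > README
git add README
git commit -q -m "Initial commit"

# Lots of tiny incremental commits on a topic branch.
git checkout -q -b topic
for step in 1 2 3; do
  echo "step $step" >> notes.txt
  git add notes.txt
  git commit -q -m "WIP step $step"
done

# Collapse them into a single commit on main.
git checkout -q main
git merge -q --squash topic          # stages the combined change, does not commit
git commit -q -m "Add notes (squashed from 3 WIP commits)"

main_commits=$(git rev-list --count main)
topic_commits=$(git rev-list --count topic)
echo "main: $main_commits commits, topic: $topic_commits commits"
```

The topic branch keeps its detailed history, while main only ever sees the one tidy commit.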

Now, Git does offer one feature that I find really cool, that is not in SVN or P4 (nor any other system that I'm aware of, but of course there are many I haven't used either). This is the "stash" (see git-stash).

You can probably guess what it does. The stash allows you to take some work, and stash it away (without checking it in!) while you then work on something else in the meantime. Maybe you are trundling along on a new feature, and then something quick comes up that you need to make your top priority. Just stash your existing work away, do that new work, then apply the stash back when ready to work on that code again. The stash is like a temporary holding spot - allowing you to keep track of work, but without having to check it in. Sure, you could simply whip up a branch and check it in to that, and that certainly works, but the stash is great when appropriate.
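Here is that interruption scenario as a small sketch in a throwaway repository. The file names and the demo identity are invented for the example.

```shell
set -e
# Throwaway identity so commits (and the stash itself) work without git config.
export GIT_AUTHOR_NAME="Demo Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Demo Dev" GIT_COMMITTER_EMAIL="dev@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/main   # name the first branch "main"
echo "stable" > feature.txt
git add feature.txt
git commit -q -m "Initial commit"

# You are midway through a new feature (uncommitted)...
echo "half-finished feature" >> feature.txt

# ...when something urgent comes up. Set the work aside.
git stash
during_fix=$(cat feature.txt)        # working tree is back to the last commit

# Do the urgent work and commit it.
echo "urgent fix" > hotfix.txt
git add hotfix.txt
git commit -q -m "Urgent fix"

# Pick the feature back up where you left off.
git stash pop
after_pop=$(cat feature.txt)
stash_list=$(git stash list)         # empty once the stash has been popped
```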

Some of this might seem small, but as we developers know, some of these small things can make a huge impact on your efficiency and make your day that much nicer. As said, I'm totally sold on Git, and have been converting my SVN projects to Git. I've been using GitHub as my "central" repository, or rather, the way I look at it, it's my offsite copy, or backup. But, setting up your own on a server is relatively simple as well, and you can use gitosis to manage access control and so on. Garry Dolley has a great writeup of the entire process (which is really rather short).

It might seem like a pain to change source/version control systems, but Git has tools to import an SVN repository, including all history, etc. I've used this on a relatively simple SVN repo and it worked fine - I haven't tried it on one with a slew of branches and tags, etc. Regardless, I would highly recommend checking out Git.
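For reference, the import is done with "git svn". Below is a minimal sketch of the command, assuming a repository with the standard trunk/branches/tags layout; the URL and target directory are made up, and `GIT_CMD` is my own dry-run switch, so by default the script only prints the command. Set `GIT_CMD=git` to actually run it.

```shell
# Dry run by default: prints the command instead of executing it.
# Set GIT_CMD=git to run the import for real (needs git-svn and network access).
GIT_CMD="${GIT_CMD:-echo git}"

# Hypothetical SVN repository and local directory name.
svn_url="http://svn.example.com/myproject"
target_dir="myproject"

# --stdlayout tells git-svn to expect trunk/, branches/ and tags/ under the URL.
import_out=$($GIT_CMD svn clone --stdlayout "$svn_url" "$target_dir")
echo "$import_out"
```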

Filed under  //   git  
