The Devver Blog

A Boulder startup improving the way developers work.

Archive for the ‘Development’ Category

Single-file Sinatra apps with specs baked-in

It’s so easy to create little single-file apps in Sinatra that it almost seems a shame to start a second file just for tests.  The other day Dan and I decided to see if we could create a Sinatra app with everything – including the tests – baked right in.  Here’s what we came up with.

#!/usr/bin/env ruby
require 'rubygems'
gem 'rack', '=0.9.1'
gem 'thin', '=1.0.0'
require 'sinatra'

get '/' do
  content_type 'text/plain'
  "Hello, world"
end

# Run me with 'spec' executable to run my specs!
if $0 =~ /spec$/
  set :environment, :test
  set :run,         false       # Don't autostart server

  require 'spec/interop/test'
  require 'sinatra/test'

  describe "Example App" do
    include Sinatra::Test

    it "should serve a greeting" do
      get '/'
      response.should be_ok
      response.body.should == "Hello, world"
    end

    it "should serve content as text/plain" do
      get '/'
      response.headers['Content-Type'].should == 'text/plain'
    end

  end
end

The code switches modes on the name of the executable used to run the file. If we run it with the spec command, we get a test run:

$ spec -fs sinatra-tests-baked-in.rb

Example App
- should serve a greeting
- should serve content as text/plain

Finished in 0.007221 seconds

2 examples, 0 failures

Otherwise, if we call it as a Ruby program, it runs the Sinatra server as we would expect:

$ ruby sinatra-tests-baked-in.rb
== Sinatra/0.9.1.1 has taken the stage on 4567 for development with backup from Thin
>> Thin web server (v1.0.0 codename That's What She Said)
>> Maximum connections set to 1024
>> Listening on 0.0.0.0:4567, CTRL+C to stop

And there you have it: a true single-file application, specs and all.
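
The mode switch itself is tiny. As a minimal sketch (the `spec_mode?` helper is a hypothetical name, not part of Sinatra or RSpec), the same check on `$0` can be pulled into a predicate to see which program names would trigger test mode:

```ruby
# Hypothetical helper: true when the program name ends in "spec",
# which is the same test the app above applies to $0.
def spec_mode?(program_name = $0)
  !!(program_name =~ /spec$/)
end

puts spec_mode?("/usr/bin/spec")              # true
puts spec_mode?("sinatra-tests-baked-in.rb")  # false
```

Because the pattern is only anchored at the end of the name, a runner whose name merely ends in "spec" would also match.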

Written by avdi

May 13, 2009 at 9:00 am

Posted in Development, Hacking, Ruby


Our Tools & Practices for Remote Collaboration

Last week, we had Avdi, the newest addition to our team, join us in Boulder, CO. It was great to get some face-to-face time, since Avdi will primarily be working from his home in Pennsylvania while Dan and I continue to work in Boulder.

We are excited about the benefits of having a distributed team, but we’re also aware that there are a number of challenges. As a result, one of the things we worked on last week was figuring out the tools and practices we’ll be using to work effectively from across the country. Luckily, both Avdi and Dan have experience working remotely which we can draw upon.

We evaluated a number of options, but settled on the following tools and practices.

Practices

  • Daily Standup. Every day at the same time, we all get on video chat. We cover what we did yesterday, what we’re working on today, and whether or not we’re blocked on anything. The goal is to keep this meeting at 15 min or less.
  • Minimize interruptions. Whenever we need to communicate with each other, we try to do so on the channel that is the least disruptive (and disrupts the fewest team members). Of course, sometimes we need to be disruptive if an issue is pressing, if someone is blocked, or if we need to have high-bandwidth communication (information, especially cues like body language, doesn’t come across very effectively on channels like email).
  • Keep it simple. We want to use the smallest number of tools and channels that will allow us to work effectively.

Channels and Tools

Listed from less disruptive to more disruptive:

Channel: Passive Updates. Tool: Present.ly.
  • Asynchronous
  • Not required reading

Channel: Email. Tool: any email client (in practice, Gmail).
  • Asynchronous
  • Required reading (usually)
  • Sometimes time-sensitive, sometimes not

Channel: IM. Tool: Skype.
  • Semi-synchronous (but usually synchronous)
  • Usually time-sensitive

Channel: Voice/video chat. Tool: Skype.
  • Synchronous
  • High bandwidth* (especially video chat)
  • Best for meetings

* By “high bandwidth”, I don’t mean that the tool itself requires a lot of TCP/IP traffic (although this is true, it doesn’t really matter). What I mean is that we can communicate a lot of information between team members in a short amount of time.

Other Tools

  • Lighthouse for issue tracking
  • GitHub for source control and our project wiki
  • RealVNC for screen sharing (essential for remote pair programming)

This is our first attempt at finding a good set of tools and practices for remote collaboration. As time goes on, we’ll undoubtedly iterate and improve upon these.

For another perspective (with a slightly different set of tools), here is a presentation from 2008 about virtual teams.

What tools and practices have worked (and which have not worked) for your team?

Written by Ben

April 28, 2009 at 8:57 am

Managing Amazon EC2 with your iPhone

I wanted a quick way to manage our AWS EC2 instances while out and about. It hasn’t happened often, but occasionally I am away from the computer and I need to reboot the instances. Or perhaps I remember our developer cluster isn’t being used and want to shut it down to save some money.

I didn’t find anything simple and free with a quick Google search, so in about an hour I wrote a nice little Sinatra app that lets me view our instances and shut down or reboot any specific instance or all of them. The tiny framework actually turned out to be even more useful, as I now have options that let us tail error logs, reboot Apache, reboot mongrel clusters, or execute any common system administration task.

I won’t be going into detail on how to build an iPhone webapp using Sinatra and iUI, because Ben already created an excellent post detailing all of those steps. In fact I used his old project as the template when I created this project. I can’t begin to explain how amazingly simple it is to build an iPhone webapp using Sinatra, so if you have been thinking of a quick project I highly recommend it.

Here are some screenshots showing the final app (screenshots courtesy of iPhoney):

ec2 manager home view.

ec2 manager describe instances view.

ec2 manager instance view.

This app uses the Amazon EC2 API Tools to do all the heavy lifting. So this app assumes that you already have the tools installed and working on the machine you want this app to run on. This normally involves installing the tools and setting up some environment variables like EC2_HOME, so make sure you can run ec2-describe-instances from the machine. After that you should just have to change EC2_HOME in the Sinatra app to match the path where you installed the EC2 tools.

Let me know if you have any issues. It is quick and dirty, but I have already found it useful.

To run the app:
cmd> ruby -rubygems ./ec2_manager.rb

require 'sinatra'

EC2_HOME = '~/.ec2'

use Rack::Auth::Basic do |username, password|
  [username, password] == ['some_user', 'some_pass']
end

get "/" do
  @links = %w{describe_ec2s restart_all_ec2s shutdown_all_ec2s}.map { |cmd|
    cmd_link(cmd)
  }.join
  erb :index
end

get "/describe_ec2s" do
  results = `cd #{EC2_HOME}; ec2-describe-instances`
  instances = results.scan(/INSTANCE\ti-\w*/).each{|i| i.sub!("INSTANCE\t",'')}
  @links = instances.map { |i|
    instance_link(i)
  }.join
  erb :index
end

get "/restart_all_ec2s" do
  @results = `cd #{EC2_HOME}; ec2-describe-instances`
  instances = @results.scan(/INSTANCE\ti-\w*/).each{|i| i.sub!("INSTANCE\t",'')}
  cmd="cd #{EC2_HOME}; ec2-reboot-instances #{instances.join(' ')}"
  @results = `#{cmd}`  # run the command string built on the previous line
  erb :index
end

get "/shutdown_all_ec2s" do
  @results = `cd #{EC2_HOME}; ec2-describe-instances`
  instances = @results.scan(/INSTANCE\ti-\w*/).each{|i| i.sub!("INSTANCE\t",'')}
  cmd="cd #{EC2_HOME}; ec2-terminate-instances #{instances.join(' ')}"
  @results = `#{cmd}`  # run the command string built on the previous line
  erb :index
end

get "/instance/:id" do
  id = params[:id] if params[:id]
  verify_id(id)
  @results = `cd #{EC2_HOME}; ec2-describe-instances #{id}`
  @links = "<li><a href='/shutdown/#{id}' target='_self'>shutdown #{id}</a></li>"
  @links += " <li><a href='/reboot/#{id}' target='_self'>reboot #{id}</a></li>"
  erb :index
end

get "/reboot/:id" do
  id = params[:id] if params[:id]
  verify_id(id)
  @results = `cd #{EC2_HOME}; ec2-reboot-instances #{id}`
  erb :index
end

get "/shutdown/:id" do
  id = params[:id] if params[:id]
  verify_id(id)
  @results = `cd #{EC2_HOME}; ec2-terminate-instances #{id}`
  erb :index
end

helpers do

  def cmd_link(cmd)
    "<li><a href='#{cmd}' target='_self'>#{cmd}</a></li>"
  end

  def instance_link(instance)
    "<li><a href='/instance/#{instance}' target='_self'>#{instance}</a></li>"
  end

  def verify_id(id)
    # Anchor the pattern so the whole id must match before it reaches the shell.
    raise Sinatra::ServerError, 'bad-id, What you doin?' unless id.match(/\Ai-\w+\z/)
  end

end

use_in_file_templates!

__END__

@@ index



@import "/stylesheets/iui.css";




<div class="toolbar">
<h1 id="pageTitle"></h1>
</div>


<ul id="home">
<li><a href='/' target='_self'>home</a></li>


</ul>





<li><strong>results</strong></li>

<ul id="home">
<li><a href='/' target='_self'>home</a></li>

<%= @results.gsub("\n","<br />") %>
</ul>




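
The two text-handling steps in the app (pulling instance ids out of the tool output, and checking an id before it is used in a shell command) can be tried without AWS. This is a minimal sketch: `instance_ids` and `valid_instance_id?` are hypothetical helper names, and the sample output is made up, borrowing only the `INSTANCE<TAB>i-...` shape that the app's regular expression looks for.

```ruby
# Sample text is made up for illustration; only the "INSTANCE<TAB>i-..." shape
# is taken from the regular expression used in the app above.
SAMPLE_OUTPUT = "HEADER\tnot-an-instance\n" \
                "INSTANCE\ti-0a1b2c3d\trunning\n" \
                "INSTANCE\ti-4e5f6a7b\trunning\n"

# Hypothetical helper: pull instance ids out of describe-instances output.
def instance_ids(output)
  output.scan(/^INSTANCE\t(i-\w+)/).flatten
end

# Hypothetical helper: accept only strings that are entirely an instance id,
# so nothing extra can ride along into a shell command.
def valid_instance_id?(id)
  !!(id =~ /\Ai-\w+\z/)
end

p instance_ids(SAMPLE_OUTPUT)                 # ["i-0a1b2c3d", "i-4e5f6a7b"]
p valid_instance_id?("i-0a1b2c3d")            # true
p valid_instance_id?("i-0a1b2c3d; rm -rf /")  # false
```

Using a capture group in `scan` returns the ids directly, so no second pass is needed to strip the `INSTANCE` prefix.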

Written by DanM

March 5, 2009 at 10:03 am

Using Ruby to Send Update Emails to Our Mentors

At Devver.net, we send out weekly email updates to an awesome set of mentors. We do this for a number of reasons. First and foremost, we get valuable feedback and advice from our mentors on a variety of issues. But it’s also an easy and effective way to keep us on track and even maximize our chances of success. As Paul Graham says in How Not To Die (he was talking directly to YC teams, but you’ll get the idea):

“For us the main indication of impending doom is when we don’t hear from you. When we haven’t heard from, or about, a startup for a couple months, that’s a bad sign.

Maybe if you can arrange that we keep hearing from you, you won’t die.

That may not be so naive as it sounds. … [The] mere constraint of staying in regular contact with us will push you to make things happen, because otherwise you’ll be embarrassed to tell us that you haven’t done anything new since the last time we talked.”

Foodzie started emailing their mentors early in the summer. We actually borrowed (stole) their email format and best practices.

One thing we’ve tried not to do is send out a completely generic email to all our mentors. Depending on the content and the interaction we’ve had with a specific mentor, we’ll adjust their email accordingly. We begin each email with the mentor’s name and send it directly to them (in other words, we don’t put a huge list of addresses in the To, CC, or BCC fields). We do this so we can tailor each email, and because it helps elicit individual responses from each mentor (it’s easier to ignore a question if it’s sent to a group).

But, of course, sometimes the emails to a few mentors can be identical. In this case, my not-so-well-kept secret is that I just use a simple Ruby script to send out a duplicate email that appears to be hand-crafted (or at least copied and pasted).

I’ve been told that Outlook can perform this functionality easily, but I don’t know of any way to do this within Gmail. If there is, let me know so I can feel a little silly (in any case, the Ruby code was fun to write).

To run this code, you’ll need to install the highline gem. You’ll also need to add your Gmail account, recipients, subject message, etc. Finally, you’ll want to put your message inside a separate file within the project directory. That way, you can easily modify, spellcheck, and format to your heart’s content before sending.
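
To give a feel for the idea, here is a minimal sketch of the personalization step only. It is not the gmailr script: the `Mentor` struct, the `personalized_messages` helper, and the names and addresses are all hypothetical, and actual delivery through Gmail is left out.

```ruby
# Hypothetical data holder: names and addresses below are placeholders.
Mentor = Struct.new(:name, :email)

# Build one message per mentor: a personal greeting followed by the shared body.
def personalized_messages(mentors, subject, body)
  mentors.map do |mentor|
    {
      to: mentor.email,
      subject: subject,
      text: "Hi #{mentor.name},\n\n#{body}"
    }
  end
end

mentors = [
  Mentor.new("Alice", "alice@example.com"),
  Mentor.new("Bob", "bob@example.com")
]
body = "Here is this week's update."  # in practice, read from a separate file

personalized_messages(mentors, "Weekly update", body).each do |msg|
  puts "To: #{msg[:to]}"
  puts msg[:text]
  puts "---"
end
```

Each recipient gets their own message with a single address in the To field, which is the property that matters here.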

You can get the entire gmailr source code (all two files!) at Github. Please use this script for good, not evil – no one likes a spammer. Enjoy!

Written by Ben

January 20, 2009 at 3:46 pm

Installing and running git-svn on Mac OSX 10.4 Tiger

I am shocked at how much time it took me to get git-svn working on my Mac. I use MacPorts, which works well most of the time. Sometimes it has problems that make me really wish for apt-get on OS X. apt-get has normally worked much better for me, but it can have its issues too. I even occasionally wish for Windows and a simple install.exe which works 95% of the time out of the box. Really, I wish Apple would throw some engineering support at MacPorts and make the service rock solid.

I have had git installed and working for a while, but while preparing to switch our main project from Subversion (svn) to git, I thought I should start using git-svn. It seemed smart to use git-svn for a while to get used to git before a full switch, so I could fall back on svn in a crunch. The first run of the git-svn command caused this error, and I had no idea how much of my night was about to be wasted…

Can't locate SVN/Core.pm in @INC

Searching led to a couple of webpages, but the most useful was getting git to work on OS X Tiger. It had a quick fix that might work, or the long route fix. For some lucky people it is just a path problem. I checked whether that was the case for me with the following command:

PATH=/opt/local/bin:$PATH; git svn

Unfortunately for me, I got the same error. OK, I need to reinstall SVN with additional bindings…

> sudo port uninstall -f subversion-perlbindings
> sudo port install -f subversion-perlbindings

leading to this error:

--->  Building serf with target all
Error: Target org.macports.build returned: shell command " cd "/opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_ports_www_serf/work/serf-0.2.0" && make all " returned error 2
Command output: /opt/local/share/apr-1/build/libtool --silent --mode=compile /usr/bin/gcc-4.0 -O2 -I/opt/local/include -DDARWIN -DSIGPROCMASK_SETS_THREAD_MASK -no-cpp-precomp -I. -I/opt/local/include/apr-1 -I/opt/local/include/apr-1  -c -o buckets/aggregate_buckets.lo buckets/aggregate_buckets.c && touch buckets/aggregate_buckets.lo
libtool: compile: unable to infer tagged configuration
libtool: compile: specify a tag with `--tag'
make: *** [buckets/aggregate_buckets.lo] Error 1

I spent some time searching and eventually found the solution to the serf error. I couldn’t read the blog because it wasn’t in English, but I could read enough to solve my MacPorts serf install problem. I followed these few lines from the blog:

$ cd /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_ports_www_serf/work/serf-0.2.0
$ sudo ./configure --prefix=/opt/local --with-apr=/opt/local --with-apr-util=/opt/local
$ sudo make all
$ sudo port install serf

Awesome, I have serf. Now what is next? Back to building svn with perl bindings, that works. Now, let’s build git again since svn with perl bindings is finally installed.

sudo port install git-core +svn

Which fails because of p5-svn-simple

dyld: lazy symbol binding failed: Symbol not found: _Perl_Gthr_key_ptr
Referenced from: /usr/local/lib/libsvn_swig_perl-1.0.dylib
Expected in: flat namespace
dyld: Symbol not found: _Perl_Gthr_key_ptr
Referenced from: /usr/local/lib/libsvn_swig_perl-1.0.dylib
Expected in: flat namespace
Error: Status 1 encountered during processing.

OK, I need to get p5-svn-simple working. Searching leads to this thread MacPort errors related to git. Here you will find the amazingly useful comment by Orestis:

“As mentioned move your libsvn_swig_perl* out of /usr/local/lib AND out of /usr/lib into temporary folders.

Uninstall and reinstall subversion-perlbindings

Install p5-svn-simple (and git-core +svn which is what lead me here)

Move the libsvn_swig_perl files back in /usr/lib and /usr/local/lib (or else git svn won’t work).”

> cd /usr/local
> mv ./lib/libsvn_swig_perl* ./bak/
> sudo port install p5-svn-simple

Sweet that works now

> sudo port install git-core +svn
> cd /usr/local
> mv ./bak/libsvn_swig_perl* ./lib/

Finally I try to run git-svn, only to see the same ERROR I had from the very beginning! I am about to lose it but decide that I should try the quick fix again to see if it is the path issue…

PATH=/opt/local/bin:$PATH; git svn

It works! Alright now it is just a path problem. So I open up my .bash_profile, and notice I already have that path included

# Setting the path for MacPorts.
export PATH=/opt/local/bin:/opt/local/sbin:/Applications/MzScheme\ v352/bin:$PATH

But I also have an additional path added from when I originally built git from source, and it looks like I was running my old broken version of git-svn. So I just had to remove this one line from my .bash_profile

export PATH=~/projects/git-1.5.6.1:$PATH

and hours later and with a ton of frustration I have a fully functioning git-svn.
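
In hindsight, listing every copy of a command on the PATH, in search order, would have exposed the stale build right away. A minimal sketch of that check follows; the `executables_on_path` helper is a hypothetical name, not part of git or MacPorts.

```ruby
# Hypothetical helper: list every executable called `name` on the PATH,
# in the order the shell searches. The first entry is the one that runs.
def executables_on_path(name, path = ENV["PATH"].to_s)
  path.split(File::PATH_SEPARATOR)
      .reject(&:empty?)
      .map { |dir| File.join(dir, name) }
      .select { |file| File.file?(file) && File.executable?(file) }
end

puts executables_on_path("git")
```

If more than one path is printed, an earlier directory is shadowing a later one, which is the situation described above.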

Now that it is working, you can move on to learning git-svn in 5 minutes.

Written by DanM

December 9, 2008 at 11:16 am

Revisiting additional Ruby Tools

I have heard about new Ruby tools since I did my Ruby Tools Roundup. I am always interested in tools that can help improve our code, so I had to check some of them out. Similar to my last tools post, I will be trying out a tool and writing my general impressions along with the basic usage.

reek


I have to start with reek, since it has been the most requested and searched on our site since I originally wrote about tools. reek will help identify code smells, allowing you to fix up your code. Instead of looking at cyclomatic complexity or other metrics, reek looks at patterns to warn you about bad code. Reek currently detects a few code smells (Long Method, Large Class, Feature Envy, Uncommunicative Name, Long Parameter List, Utility Function, Nested Iterators, Control Couple, Duplication) but more are on the way.

I think this project is useful but would need to be more customized before a nightly run would yield very useful results. The biggest problem I have is that the signal-to-noise ratio seemed pretty low. Reek was warning me about “long methods” that were only 7 statements long, which just isn’t something I am concerned about. The warnings on duplicate method calls can be useful: after running reek on a few files I found a couple of places where duplicate method calls were wasting time. Many of the other smells are interesting, like ‘Feature Envy’ and ‘Utility Function’. I will need to use reek more before I know if these smells are good indicators or often false positives.

Below reek finds a utility function next_tick which is definitely a helper function that actually exists in two of our files, which probably should be moved into a helper mixin.

def next_tick
    if(EM.reactor_running?)
      EM.next_tick do
        yield
      end
    else
      yield
    end
end
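
Moving the helper above into a mixin could look roughly like this. It is a minimal sketch: `TickHelper` is a hypothetical module name, the two classes are empty stand-ins for the files that each carry a copy, and the `defined?(EM)` guard is an addition so the sketch also runs where EventMachine is not loaded.

```ruby
# Hypothetical mixin name; the method body follows the helper above.
# The defined?(EM) guard is an addition so the sketch also runs where
# EventMachine is not loaded.
module TickHelper
  def next_tick(&block)
    if defined?(EM) && EM.reactor_running?
      EM.next_tick(&block)
    else
      block.call
    end
  end
end

# Empty stand-ins for the two classes that currently carry a copy each.
class Client
  include TickHelper
end

class RevClient
  include TickHelper
end

Client.new.next_tick { puts "ran from Client" }
RevClient.new.next_tick { puts "ran from RevClient" }
```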

I am really looking forward to seeing how the tool progresses. If the project allows for a simple config customization to change the thresholds as well as ignore some files/smells, this could become a very useful tool to help a team maintain a high expectation of code quality. It would be useful to get nightly reports about any code that might not meet expectations, so a quick group code review could decide if it is an exception (which can be quickly added to the config) or if the code should be refactored and cleaned up.

dmayer$ sudo gem install reek
dmayer$ reek ./lib/client/client.rb
[Utility Function] Client#next_tick doesn't depend on instance state
[Long Method] Client#process_done has approx 7 statements
[Duplication] Client#process_ready calls @buffer.create_reload_msg more than once
[Long Method] Client#process_ready has approx 10 statements
[Duplication] Client#report_system_message calls result.msg more than once
[Feature Envy] Client#report_system_message refers to result more than self
[Duplication] Client#send_tests calls Time.now more than once
[Long Method] Client#send_tests has approx 24 statements
[Feature Envy] Client#send_tests refers to tests more than self
#check a whole directory
dmayer$ reek ./lib/client/*

Towelie


Towelie helps discover duplication in Ruby code, which will help keep your code DRY. It doesn’t have a nice interface at the moment and it is pretty young code. That being said, it can still be a really useful tool to help guide refactoring and code cleanup.

~/projects dmayer$ git clone git://github.com/gilesbowkett/towelie.git
dmayer$ cd ~/projects/devver/
dmayer$ irb -r ~/projects/towelie/lib/towelie.rb
irb(main):001:0> @t = Towelie.new
=> #, @model=#>
irb(main):002:0> @t.parse "lib/client"
(string):24: warning: useless use of a variable in void context
=> nil
irb(main):003:0> puts @t.duplicates
found in:
lib/client/test_unit_reporter.rb
lib/client/rspec_reporter.rb

def nl
report_nl
end

... 2 more dupes in the reporters ...

found in:
lib/client/test_unit_reporter.rb
lib/client/rspec_reporter.rb

def report(str)
print(str.to_s)
end

found in:
lib/client/sync_client.rb
lib/client/rev_sync_client.rb
lib/client/rev_client.rb
lib/client/client.rb

def quit
send(@buffer.create_quit_msg)
end

found in:
lib/client/sync_client.rb
lib/client/rev_sync_client.rb
lib/client/rev_client.rb
lib/client/client.rb

def send_quit
send(@buffer.create_quit_msg)
end

=> nil
irb(main):004:0>

There are currently many duplications because we are maintaining two clients while deciding what route to eventually take. We have also moved a lot of our shared client code into a mixin, and Towelie finds some methods that really should be moved there as well, such as the methods “quit” and “send_quit”, which are currently duplicated in 4 files. Towelie also points to the fact that we should refactor our reporters because they both duplicate code.

I have always been annoyed with copied and pasted functions accidentally working their way into code, so this could be a useful nightly run to keep a team DRY. Sometimes two team members implement the same functionality without even knowing a solution already exists in the code base. If you want to go a bit more in depth, check out Giles Bowkett’s (creator of Towelie) How to use Towelie.

Flay


Flay is another great tool by Ryan Davis, who also works on Heckle and Flog, which I covered in the past. Flay, like Towelie, helps keep your code DRY; it detects exact and similar code throughout a project. It seems to be more powerful than Towelie, as seen in this Towelie and Flay comparison. My biggest complaint is that the current release has some pretty basic output, which you see below. The output I got from Towelie was immediately more recognizable and useful, while Flay currently requires you to dig a bit deeper into its suggestions on your own. An improvement is already being worked on and a verbose output mode should be in the release soon. Once better output is included I think Flay will be immediately useful out of the box even with small amounts of developer effort.

I like that Flay has a weight system, which should make it easy to set a threshold below which results are ignored; results with high weights are more likely to be worth your time and attention. One piece of code Flay tagged with a low weight was code that rescued and logged different errors, which, while similar, actually served a purpose.

rescue Errno::EISDIR => ed
  @stderr.puts "Error: #{ed.message}" if @stderr
  @stderr.puts "You can't pass a directory to devver only test files. Quitting." if @stderr
  send_quit
rescue LoadError => le
  @stderr.puts "Error: #{le.message}" if @stderr
  @stderr.puts "Not all of the files can be found. Quitting." if @stderr
  send_quit
rescue SyntaxError, NameError => se
  @stderr.puts "Error: #{se.message}" if @stderr
  @stderr.puts "This file doesn't appear to be a valid Ruby file. Quitting." if @stderr
  send_quit
end

Digging into the Flay results turned up some duplicate code that Towelie had missed. Since Towelie also caught a method that was duped in 4 client files that Flay missed (I was expecting Towelie’s results to be a subset of what Flay found), perhaps there is room for both of the tools and learning to work with both a little bit is worth the time. After a little bit of work perhaps one of the projects will become a clearly better option. Until then I will be following both of these projects.

sudo gem install flay
dmayer$ flay lib/client/*.rb
Processing lib/client/client.rb...
Processing lib/client/mod_client.rb...
...
Processing lib/client/syncer.rb...

Matches found in :defn (mass = 84)
lib/client/mod_client.rb:86
lib/client/mod_rev_client.rb:124

Matches found in :block (mass = 57)
lib/client/client.rb:201
lib/client/client.rb:205
lib/client/client.rb:209

... 6 more results ...

Matches found in :if (mass = 34)
lib/client/mod_client.rb:63
lib/client/mod_rev_client.rb:111

Matches found in :defn (mass = 32)
lib/client/mod_rev_client.rb:36
lib/client/mod_rev_client.rb:50

Conclusions


That should cover it for this Ruby tools post, but I am really enjoying checking out the tools showing up in the Ruby scene. So as always let me know if I missed something, or if there is a tool you would like to see a full write up on. After some of the tools mature a little bit I will have to revisit a few of the tools which are currently in the early stages. I hope the Ruby tools scene keeps as active as it has been lately because there are some interesting projects being worked on.

Honorable mentions (things I didn’t think really needed a full write-up)


  • metric-fu a great gem to give quick access to a bunch of tools and metrics about your code (RCov, Saikuro, Flog, SCM Churn, and Rails Stats)
  • CruiseControl.rb when you start using all of these tools, continuous integration starts to become more important (or doing nightly runs). CruiseControl.rb is dead simple continuous integration.
  • Simian another code duplication tool, which is mentioned in 3 tools for drying your Ruby code (free for OSS, $99 for a license)
  • Ruby Tidy a tool for cleaning up HTML (I haven’t used this in Ruby, but loved the Java version in my Java days)
  • Watir is an open-source library for automating web browsers. It allows you to write tests that are easy to read and maintain. It is simple and flexible.
  • Autotest, if you haven’t heard of autotest, check it out, continuously run your tests every time you save a file in your project.
  • Rufus a tool that checks if code you are about to load is safe. Allows you to look for custom patterns that you don’t want to run.
  • I wrote about a couple benchmarking tools last time and here is a great article / tutorial on Ruby benchmarking

Written by DanM

December 3, 2008 at 10:01 am

Ruby Beanstalkd distributed worker intermediate lessons

This post is a follow-up to Ruby beanstalkd basics. I will try to make the example code a little more interesting and useful. I am calling this a Ruby beanstalkd intermediate write-up: it sets up a few workers, then distributes jobs and receives results simultaneously. In this example the code resembles real code a bit more (using a queue cache and block passing). If there is enough interest in the Ruby/beanstalkd community, I will follow up with beanstalkd advanced lessons, and go into how we deal with failure cases such as workers dying during jobs, random jobs failing, processing multiple ‘projects’ at one time, using job priority settings, and using TTR/timeouts.

So in this example we are making an estimate of PI. Yes I know that there are far better approximations out there than my simple results, but this was what I came up with for an incredibly simple distributed computing problem. I based my example on the PI Calculation problem from an Introduction to Parallel Computing. The basic idea is that you can calculate pi by guessing random points in a square and then seeing how many points are inside a circle that fits inside the square (PI= 4 * points_in_circle/total_points).
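
Before distributing anything, the estimate itself fits in a few lines. The following is a minimal single-process sketch of the formula above; the `estimate_pi` name and the choice of the square from -1 to 1 are assumptions made for illustration, not taken from the beanstalkd example.

```ruby
# Estimate pi by sampling random points in the square [-1, 1] x [-1, 1]
# and counting how many land inside the circle of radius 1.
def estimate_pi(total_points, rng = Random.new)
  points_in_circle = total_points.times.count do
    x = rng.rand * 2 - 1
    y = rng.rand * 2 - 1
    x * x + y * y <= 1.0
  end
  4.0 * points_in_circle / total_points
end

puts estimate_pi(100_000)
```

The circle covers pi/4 of the square's area, so the fraction of points inside it, multiplied by 4, approaches pi as the number of points grows.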

I made a bunch of comments in the code that should help you follow but there are a few key sections worth pointing out.

In the Ruby beanstalkd Basics, both the Server and the Clients only used one queue at a time. Now since we are sending on one queue while also listening on another we need access to both queues at once. We simply have a helper function with a queue_cache to make getting and reusing multiple queues incredibly easy.

def get_queue(queue_name)
  @queue_cache ||= {}
  if @queue_cache.has_key?(queue_name)
    return @queue_cache[queue_name]
  else
    queue = Beanstalk::Pool.new(["#{SERVER_IP}:#{DEFAULT_PORT}"])
    queue.watch(queue_name)
    queue.use(queue_name)
    queue.ignore('default')
    @queue_cache[queue_name] = queue
    return queue
  end
end
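
The caching pattern can be seen in isolation with a stand-in for the connection. This is a sketch of an alternative formulation, not the code used in the example: `FakeConnection` and `QueueCache` are hypothetical names, and a `Hash` default block replaces the explicit `has_key?` check.

```ruby
# FakeConnection stands in for a real beanstalkd connection so the sketch
# runs without a server; it is not part of the beanstalk-client gem.
FakeConnection = Struct.new(:tube)

class QueueCache
  attr_reader :connections_opened

  def initialize
    @connections_opened = 0
    # The default block runs only on a cache miss and stores what it builds.
    @cache = Hash.new do |cache, name|
      @connections_opened += 1
      cache[name] = FakeConnection.new(name)
    end
  end

  def get_queue(name)
    @cache[name]
  end
end

cache = QueueCache.new
requests = cache.get_queue("requests")
cache.get_queue("results")
puts cache.get_queue("requests").equal?(requests)  # true
puts cache.connections_opened                      # 2
```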

In the basic example each class had a function that got a job and did some work and deleted the job. It is easy to imagine workers that might have many different kinds of work to do on jobs. In every case they are going to grab a job, work on the job, and delete the job. We decided to break that up and make it easy to just pass a work block when workers get a job.

def take_msg(queue)
  msg = queue.reserve
  #by calling ybody we get the content of the message and convert it from yml
  body = msg.ybody
  if block_given?
    yield(body)
  end
  msg.delete
end

#call take_msg like so
take_msg(queue) do |body|
  #work on body
end
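
To see the reserve, work, delete flow without a running beanstalkd, the queue can be replaced by in-memory stand-ins. In this minimal sketch `FakeJob` and `FakeQueue` are hypothetical classes that mimic only the three calls `take_msg` relies on; they are not the beanstalk-client API.

```ruby
# In-memory stand-ins (not the beanstalk-client API) that mimic only the
# three calls take_msg relies on: reserve, ybody and delete.
class FakeJob
  attr_reader :ybody

  def initialize(body, queue)
    @ybody = body
    @queue = queue
  end

  def delete
    @queue.deleted << self
  end
end

class FakeQueue
  attr_reader :deleted

  def initialize(bodies)
    @jobs = bodies.map { |body| FakeJob.new(body, self) }
    @deleted = []
  end

  def reserve
    @jobs.shift
  end
end

# Same shape as the helper above: reserve, hand the body to the block, delete.
def take_msg(queue)
  msg = queue.reserve
  yield(msg.ybody) if block_given?
  msg.delete
end

queue = FakeQueue.new([{ "count" => 3 }, { "count" => 4 }])
total = 0
2.times { take_msg(queue) { |body| total += body["count"] } }
puts total               # 7
puts queue.deleted.size  # 2
```

Because the delete comes after the yield, an exception raised inside the block leaves the job undeleted. With a real beanstalkd server, such a job should become available again once its TTR expires.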

One other thing to look out for in the code below is checking whether a queue has any jobs. Many times workers will check if jobs exist and take them, and if there aren’t any jobs the process is free to do something else. I do this in this example: the server continually checks for incoming results so it can display them immediately. If no results have arrived yet, the server continues sending out job requests as fast as it can. This check is useful because taking a job from beanstalkd is a blocking call. They did add support for non-blocking calls in beanstalkd 1.1, but I haven’t started using the newest version yet. I think everything else should be pretty self-explanatory; feel free to ask me any questions. Running the code is the same as before: download beanstalk_intermediate.rb, start beanstalkd, and run the example with ruby.

$ beanstalkd &
$ ruby beanstalk_intermediate.rb
starting distributor
starting client(s)
distributor sending out  jobs
.......................................................
.............................................
received all the results our estimate for pi is: 3.142776

# of workers    time to complete
1               real 0m7.282s    user 0m4.114s    sys 0m0.978s
2               real 0m5.667s    user 0m2.736s    sys 0m0.670s
3               real 0m4.999s    user 0m2.014s    sys 0m0.515s
4               real 0m4.612s    user 0m1.608s    sys 0m0.442s
5               real 0m4.517s    user 0m1.474s    sys 0m0.416s

require 'beanstalk-client.rb'

DEFAULT_PORT = 11300
SERVER_IP = '127.0.0.1'
#beanstalk orders the jobs in a queue by priority; jobs with the same priority
#are handled FIFO. In a later example we will use the priority
#(lower numbers are more urgent; 0 is the most urgent priority)
DEFAULT_PRIORITY = 65536
#TTR (time to run) is how many seconds a worker has to finish a reserved job.
#If a worker dies before completing the work and never calls job.delete,
#the job returns to the queue once the TTR has passed.
TTR = 3

class BeanBase

  #To work with multiple queues you must tell beanstalk which queues
  #you plan on writing to (use), and which queues you will reserve jobs from
  #(watch). In this case we also want to ignore the default queue.
  #You need a different queue object for each tube you plan on using, or
  #you can keep switching which tube a single object is watching and using.
  #We just keep a few queues open on the tubes we want.
  def get_queue(queue_name)
    @queue_cache ||= {}
    if @queue_cache.has_key?(queue_name)
      return @queue_cache[queue_name]
    else
      queue = Beanstalk::Pool.new(["#{SERVER_IP}:#{DEFAULT_PORT}"])
      queue.watch(queue_name)
      queue.use(queue_name)
      queue.ignore('default')
      @queue_cache[queue_name] = queue
      return queue
    end
  end

  #this will take a message off the queue, and process it with the block
  def take_msg(queue)
    msg = queue.reserve
    #by calling ybody we get the content of the message and convert it from yml
    body = msg.ybody
    if block_given?
      yield(body)
    end
    msg.delete
  end

  def results_ready?(queue)
    queue.peek_ready!=nil
  end

end

class BeanDistributor < BeanBase

  def initialize(chunks,points_per_chunk)
    @chunks = chunks
    @points_per_chunk = points_per_chunk
    @messages_out = 0
    @circle_count = 0
  end

  def get_incoming_results(queue)
    if(results_ready?(queue))
      result = nil
      take_msg(queue) do |body|
        result = body.count
      end
      @messages_out -= 1
      print "." #display that we received another result
      @circle_count += result
    else
      #do nothing
    end
  end

  def start_distributor
    request_queue = get_queue('requests')
    results_queue = get_queue('results')
    #put all the work on the request queue
    puts "distributor sending out #{@chunks} jobs"
    @chunks.times do |num|
      msg = BeanRequest.new(1,@points_per_chunk)
      #Take our ruby object and convert it to yml and put it on the queue
      request_queue.yput(msg,pri=DEFAULT_PRIORITY, delay=0, ttr=TTR)
      @messages_out += 1
      #if there are results get them if not continue sending out work
      get_incoming_results(results_queue)
    end

    while @messages_out > 0
      get_incoming_results(results_queue)
    end
    npoints = @chunks * @points_per_chunk
    pi = 4.0*@circle_count/(npoints)
    puts "\nreceived all the results our estimate for pi is: #{pi}"
  end

end

class BeanWorker < BeanBase

  def initialize()
  end

  def write_result(queue, result)
    msg = BeanResult.new(1,result)
    queue.yput(msg,pri=DEFAULT_PRIORITY, delay=0, ttr=TTR)
  end

  def in_circle
    #generate 2 random numbers see if they are in the circle
    range = 1000000.0
    radius = range / 2
    xcord = rand(range) - radius
    ycord = rand(range) - radius
    if( (xcord**2) + (ycord**2) <= (radius**2) )
      return 1
    else
      return 0
    end
  end

  def start_worker
    request_queue = get_queue('requests')
    results_queue = get_queue('results')
    #get requests and do the work until the worker is killed
    while(true)
      result = 0
      take_msg(request_queue) do |body|
        chunks = body.count
        chunks.times { result += in_circle}
      end
      write_result(results_queue,result)
    end

  end

end

############
# These are just simple message classes that we pass using beanstalks
# to yml and from yml functions.
############
class BeanRequest
  attr_accessor :project_id, :count
  def initialize(project_id, count=0)
    @project_id = project_id
    @count = count
  end
end

class BeanResult
  attr_accessor :project_id, :count
  def initialize(project_id, count=0)
    @project_id = project_id
    @count = count
  end
end

#how many different jobs we should do
chunks = 100
#how many points to calculate per chunk
points_per_chunk = 10000
#how many workers should we have
#(normally different machines, in our example fork them off)
workers = 5

# Most of the time you will have two entirely separate classes
# but to make it easy to run this example we will just fork and start our server
# and client separately. We will wait for them to complete and check
# if we received all the messages we expected.
puts "starting distributor"
server_pid = fork {
  BeanDistributor.new(chunks,points_per_chunk).start_distributor
}

puts "starting client(s)"
client_pids = []
workers.times do |num|
  client_pid = fork {
    BeanWorker.new.start_worker
  }
  client_pids << client_pid
end

Process.wait(server_pid)
#take down the clients
client_pids.each do |pid|
  Process.kill("HUP",pid)
end
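
To experiment with the same distribute-and-collect shape without a beanstalkd server, here is an in-process analog that uses Ruby's built-in Thread and Queue. It is a sketch under stated assumptions, not the implementation above: the function names are hypothetical, threads replace forked workers, a `:stop` marker replaces killing the workers, and `results.empty?` plays the role of `results_ready?`.

```ruby
require "thread"

# Count how many of `points` random points in the unit square fall inside
# the quarter circle of radius 1.
def points_in_circle(points, rng)
  points.times.count do
    x = rng.rand
    y = rng.rand
    x * x + y * y <= 1.0
  end
end

def estimate_pi_with_workers(chunks, points_per_chunk, workers)
  requests = Queue.new
  results  = Queue.new

  threads = Array.new(workers) do |i|
    Thread.new do
      rng = Random.new(i + 1)
      # pop blocks until a job arrives; :stop tells the worker to exit
      while (job = requests.pop) != :stop
        results << points_in_circle(job, rng)
      end
    end
  end

  circle_count = 0
  outstanding = 0
  chunks.times do
    requests << points_per_chunk
    outstanding += 1
    # Collect a result if one is already waiting, otherwise keep sending.
    unless results.empty?
      circle_count += results.pop
      outstanding -= 1
    end
  end

  # All jobs are sent; block until the remaining results arrive.
  while outstanding > 0
    circle_count += results.pop
    outstanding -= 1
  end

  workers.times { requests << :stop }
  threads.each(&:join)

  4.0 * circle_count / (chunks * points_per_chunk)
end

puts estimate_pi_with_workers(100, 10_000, 5)
```

Only the main thread ever pops from the results queue, so checking `results.empty?` before popping cannot race with another consumer.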

Written by DanM

November 19, 2008 at 3:19 pm

Posted in Development, Hacking, Ruby
