The Devver Blog

A Boulder startup improving the way developers work.

Archive for the ‘Tips & Tricks’ Category

Ruby people on Twitter

The Ruby community moves quickly, constantly changing and adopting new things. It is good to keep your ear to the ground so you can learn about the things the community is finding really useful. There are a number of ways to do this, like watching the most popular Ruby projects on GitHub, the most active projects on RubyForge, Ruby Reddit, or listening to the Rails podcast. The way I have found most effective is following a good collection of the Ruby community on Twitter; many of the most active Ruby community members and companies are there. Twitter is where I first heard of many things going on in Ruby, like the recent Merb/Rails merge.

You can find a great list of 50+ (now 100+) Rubyists to follow on Twitter from RubyLearning. I thought we might as well share a list of some of the Ruby people Devver.net is following on Twitter.

technoweenie
jamis / Jamis Buck
obie / Obie Fernandez
chadfowler / Chad Fowler
engineyard / Engine Yard
d2h / DHH
rjs / Ryan Singer
jasonfried / Jason Fried
37signals
foodzie
fiveruns
_why / why the lucky stiff
gilesgoatboy / Giles Bowkett
dlsspy / Dustin Sallings
julien51 / julien
rbates / Ryan Bates
defunkt / Chris Wanstrath
chrismatthieu / Chris Matthieu
littleidea / Andrew Clay Shafer
headius / Charles Nutter
bascule / Tony Arcieri
atmos / Corey Donohoe
ubermajestix / Tyler Montgomery
raganwald / Reg Braithwaite
chriseppstein


Of course I have to give a special shout-out to ourselves:

danmayer / Dan Mayer
bbrinck / Ben Brinckerhoff
devver.net

If we should be following you too, send us an email at contact@devver.net and we can connect on Twitter as well.

Written by DanM

January 8, 2009 at 2:08 pm

Posted in Misc, Ruby, Tips & Tricks

Boulder CTO December Lunch with Tim Wolters

The Boulder CTO Lunch meets once a month with a guest speaker and covers topics and questions that startup CTOs should find interesting. This month, the group had Tim Wolters from Collective Intellect come lead the discussion. Tim is a serial entrepreneur currently working on using artificial intelligence and semantic analysis to extract knowledge from unstructured text found in social media. Collective Intellect’s customers use this analysis to inform and measure the effectiveness of their PR and marketing strategy.

Tim is considering working on a book, a startup survival guide for CTOs. Some of his ideas for the book helped guide our discussion during the meeting. I will try to present my notes under the topic headings that Tim mentioned, but since this was an open, free-form discussion, I am sure I couldn't capture everything, and not all of my notes are completely accurate.

The Idea
People should keep a journal of ideas. Tim keeps one that he regularly updates, tagging entries and adding new ideas. For any idea, keep track of what is near term, what resources are needed, what the cost is, and what the related market looks like. (I highly recommend this! Ben and I keep a wiki, which has grown into an incredibly useful resource and was the initial starting point for our last two companies.)

Ideas should have an “Aha!” factor that makes you wonder why someone else isn’t already doing it (or some emotional appeal that makes lives better).

During the first few years of a startup you can't work on every idea that comes to mind; that is why it is best to keep a journal. Just add little notes to an idea to keep it on the back burner.

Talk to others about your ideas; perhaps a group can move on an idea and lay the groundwork while you lead as an adviser.

Don’t be worried about people taking ideas. After starting a few companies you know how hard it is to really bring something to market.

What about brainstorming for ideas with a group?
Brainstorming groups have never worked for Tim. If you have the right people around the table (people who can make things happen), it could work, but Tim hasn't seen it.

Ideas depend a lot on timing in the marketplace. If the market is moving slowly, you can look at an idea slowly. If the market is really moving fast, you need to spin it up quickly and get a lot of people working on it to really make a move on the idea.

Look over your ideas once in awhile and see what still really interests you.

The Role
As a CTO, you paint a landscape of the product and market.

There are two kinds of CTOs: tactical and visionary. A tactical CTO is internally focused: managing the team and handling the day-to-day tactics so the product gets out there. A visionary CTO sees where the product could go in the marketplace, signs early deals and customers, and looks for features that lead toward or away from markets, competitors, and partnerships. The visionary isn't working on architecture but on the market landscape: which partners will benefit the product or get it out sooner.

A CTO should be thinking about things such as the three hardest problems the company faces, so they know what will also be affecting their competitors.

People who liked architectural purity but learned it isn't as important as winning at the business end up making great CTOs.

CTOs need to stay involved with customers to make good decisions about product innovation and development. Stay active on sales calls, talk with salespeople, and read all the RFPs.

Becoming the CTO vs VP of engineering?
Are you good at managing or not? VP of engineering is a management role. If you are not, split off the management role as soon as possible (in Tim's case that wasn't possible until the company was about 20 people).

Good salespeople leverage a CTO as a company evangelist. If you are a CTO, you have to be comfortable with presenting and publicity. You will be at conferences, on sales calls, giving presentations, and fundraising. If you aren't comfortable with these things, get comfortable with them.

Time spent:

  • 10% guiding research
  • 30% Sales
  • 30% Partnerships
  • 30% Biz Dev Dealings

Reputation
After a few startups, successes, and the growth of your network, things like building a team, raising funding, and getting a startup off the ground are much easier the next time around.

It will take 3 to 5 times longer than you think to get a project going if you are an unknown entrepreneur with no reputation.

Don't try to solve the big unsolvable problems first. The first time around, start with smaller problems and develop a reputation while solving them. Angels and VCs aren't funding research efforts, so don't just chase after big impossible goals.

After a company is bought, it makes sense to make the purchaser successful. It builds on your reputation.

Become a big fish in a small pond and then move to a bigger pond.

Putting together the team
The ideal size for an engineering team is 6-8 people; bigger teams have difficulty maintaining the right amount of communication.

For hiring, Tim personally sits down with the key hires, and if the position is research-related he interviews the applicants as well.

The Traps and Pitfalls of Startup Companies
Three things that companies get stuck on that can kill the company:

  • Getting over-enamored with the original idea; startups must be able to adapt
  • Getting enamored with the research technology for technology's sake
  • Getting emotionally tied to architectural purity, building layers of abstraction on top of abstraction to avoid some possible future problem.

Other things that kill companies (which are kind of like a marriage):

  • Not the right chemistry
  • Bad culture or losing company culture
  • Employees need some sense of allegiance; if they don't have it, cut them immediately
  • Lacks a culture of adaptability
  • Not thinking about how to quickly get to the market and solve problems

The continual code death march: sometimes companies go on code marches to get something to the marketplace. This can't be done too many nights or it will start taking a toll on other aspects of your life. Strive for balance.

During a startup you are continually hitting false summits: you think that if you could just get that contact, solve that roadblock, pass this milestone, or make this key hire, then everything will fall into place. These milestones are important and you should celebrate them, but you are not done. Or rather, it typically doesn't get any easier; what each one does is take more risk out, allowing you to go solve bigger and other problems.

When founders or others in a company argue, which they need to do sometimes, don’t do it in front of everyone. Discuss disputes offline, reach agreement and present a unified front to the company.


Thanks so much to Tim for sharing some of his thoughts with our group. I will leave you with a final question and quote. Someone once asked Tim why he likes to start companies:
“I like to pick where I work and who I work with.”

Written by DanM

December 11, 2008 at 9:50 am

Installing and running git-svn on Mac OSX 10.4 Tiger

I am shocked at how much time it took me to get git-svn working on my Mac. I use MacPorts, which works well most of the time. Sometimes it has problems, which makes me really wish for apt-get on OS X; apt-get has normally worked much better for me, but it can have its issues too. I even occasionally wish for Windows and a simple install.exe that works 95% of the time out of the box. Really, I wish Apple would throw some engineering support behind MacPorts and make the service rock solid.

I have had git installed and working for awhile, but in preparation for switching our main project from Subversion (svn) to git, I thought I should start using git-svn. It seemed smart to use git-svn for awhile to get used to git before a full switch, so I could fall back on svn in a crunch. The first run of the git-svn command caused this error, and I had no idea how much of my night was about to be wasted…

Can't locate SVN/Core.pm in @INC

Searching led to a couple of webpages, but the most useful was getting git to work on OS X Tiger. It offers a quick fix that might work, or the long-route fix. For some lucky people it is just a path problem. I checked whether that was the case for me with the following command:

PATH=/opt/local/bin:$PATH; git svn

Unfortunately I got the same error. OK, I need to reinstall svn with the additional bindings…

> sudo port uninstall -f subversion-perlbindings
> sudo port install -f subversion-perlbindings

leading to this error:

--->  Building serf with target all
Error: Target org.macports.build returned: shell command " cd "/opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_ports_www_serf/work/serf-0.2.0" && make all " returned error 2
Command output: /opt/local/share/apr-1/build/libtool --silent --mode=compile /usr/bin/gcc-4.0 -O2 -I/opt/local/include -DDARWIN -DSIGPROCMASK_SETS_THREAD_MASK -no-cpp-precomp -I. -I/opt/local/include/apr-1 -I/opt/local/include/apr-1  -c -o buckets/aggregate_buckets.lo buckets/aggregate_buckets.c && touch buckets/aggregate_buckets.lo
libtool: compile: unable to infer tagged configuration
libtool: compile: specify a tag with `--tag'
make: *** [buckets/aggregate_buckets.lo] Error 1

I spent some time searching and eventually found the solution to the serf error. I couldn't read the whole blog post because it wasn't in English, but I could read enough to solve my MacPorts serf install problem. I followed these few lines from the post:

cd /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_ports_www_serf/work/serf-0.2.0
$ sudo ./configure --prefix=/opt/local --with-apr=/opt/local --with-apr-util=/opt/local
$ sudo make all
$ sudo port install serf

Awesome, I have serf. What's next? Back to building svn with Perl bindings, which works this time. Now let's build git again, since svn with Perl bindings is finally installed:

sudo port install git-core +svn

This fails because of p5-svn-simple:

dyld: lazy symbol binding failed: Symbol not found: _Perl_Gthr_key_ptr
Referenced from: /usr/local/lib/libsvn_swig_perl-1.0.dylib
Expected in: flat namespace
dyld: Symbol not found: _Perl_Gthr_key_ptr
Referenced from: /usr/local/lib/libsvn_swig_perl-1.0.dylib
Expected in: flat namespace
Error: Status 1 encountered during processing.

OK, I need to get p5-svn-simple working. Searching leads to the thread MacPort errors related to git, where you will find this amazingly useful comment by Orestis:

“As mentioned move your libsvn_swig_perl* out of /usr/local/lib AND out of /usr/lib into temporary folders.

Uninstall and reinstall subversion-perlbindings

Install p5-svn-simple (and git-core +svn which is what lead me here)

Move the libsvn_swig_perl files back in /usr/lib and /usr/local/lib (or else git svn won’t work).”

> cd /usr/local
> mv ./lib/libsvn_swig_perl* ./bak/
> sudo port install p5-svn-simple

Sweet, that works now.

> sudo port install git-core +svn
> cd /usr/local
> mv ./bak/libsvn_swig_perl* ./lib/

Finally I try to run git-svn, only to see the same error I had at the very beginning! I am about to lose it, but decide I should try the quick fix again to see if it really is a path issue…

PATH=/opt/local/bin:$PATH; git svn

It works! Alright, so it is just a path problem. I open up my .bash_profile and notice I already have that path included:

# Setting the path for MacPorts.
export PATH=/opt/local/bin:/opt/local/sbin:/Applications/MzScheme\ v352/bin:$PATH

But I also have an additional path entry from when I originally built git from source, and it looks like I was running that old, broken version of git-svn. So I just had to remove this one line from my .bash_profile:

export PATH=~/projects/git-1.5.6.1:$PATH

Hours later, and with a ton of frustration, I have a fully functioning git-svn.

Now that it is working, you can move on to learning git-svn in 5 minutes.

Written by DanM

December 9, 2008 at 11:16 am

Ruby Beanstalkd distributed worker basics

At Devver we have a lot of jobs that need to run quickly, so we distribute our work out to a group of EC2 workers. We have tried a number of queuing solutions with Ruby, and in the end beanstalkd seemed to be the best fit for us at the time.

I have only seen a few posts about the basics of using beanstalkd with Ruby, so I decided to write two posts evolving a simple Ruby beanstalkd example into a more complicated one. This way, people new to beanstalkd can see how easy it is to get up and running with distributed processing using Ruby and beanstalkd, and people doing more advanced work can see some examples of how we are working with it here at Devver. It would also be great for more experienced beanstalkd warriors to share their thoughts, as there aren't many examples out in the wild. The lack of examples makes it harder to learn and difficult to decide what the best practices are when working with beanstalkd queues.

I have also shared two scripts we have found useful while working with beanstalkd. beanstalk_monitor.rb lets you see statistics about current usage across all queues, or monitor a single queue you are interested in. beanstalk_killer.rb is useful if you want to see how your code reacts to beanstalkd getting backed up or stalling (in beanstalkd speak, "putting on the brakes"). It was a little harder than I expected to pull everything out of our code and make a simple example, and obviously the example itself is a bit useless, but it should still give a solid picture of how to do the basics of distributing jobs with beanstalkd.
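
The core of beanstalk_monitor.rb boils down to something like the sketch below. Treat it as an approximation: it assumes the beanstalk-client pool exposes stats and stats_tube wrappers around the beanstalkd protocol's stats and stats-tube commands, and the real script adds formatting and polling on top.

require 'beanstalk-client'

# Rough sketch of the idea behind beanstalk_monitor.rb (not the real script).
# Assumes the beanstalk-client pool wraps the beanstalkd protocol's
# 'stats' and 'stats-tube' commands as stats / stats_tube.
queue = Beanstalk::Pool.new(['127.0.0.1:11300'])

# Server-wide statistics (jobs ready, reserved, buried, and so on)
queue.stats.each { |key, value| puts "#{key}: #{value}" }

# Statistics for a single tube we care about
puts queue.stats_tube('requests').inspect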

For those new to beanstalk, there are a few things you will need to know like how to get a queue object, how to put objects on the queue, how to take objects off the queue, and how to control which queue you are working with. For a higher level overview or more detailed information, I recommend checking out the beanstalkd FAQ. The full example code is below, but first taking a look at the basic snippets might help.

#to work with beanstalk you need to get a client connection
queue = Beanstalk::Pool.new(["#{SERVER_IP}:#{DEFAULT_PORT}"])
#by default you will be working on the 'default' tube or queue
#if we wanted to work on a different queue we could change tubes, like so
queue.watch('test_queue')
queue.use('test_queue')
queue.ignore('default')
#to put a simple string on a queue
queue.put('hello queue world')
#to receive a simple string
job = queue.reserve
puts job.body #prints 'hello queue world'
#if you don't delete the job when you're done, beanstalkd assumes the worker failed
#and the job will show back up on the queue once its TTR expires
job.delete

How to run this example (on OS X, with MacPorts installed):

> sudo port install beanstalkd
> sudo gem install beanstalk-client
> beanstalkd
> ruby beanstalk_tester.rb

Download: beanstalk_tester.rb

require 'beanstalk-client.rb'

DEFAULT_PORT = 11300
SERVER_IP = '127.0.0.1'
#beanstalkd orders jobs based on priority; jobs with the same priority
#are handled FIFO. In a later example we will use the priority
#(lower numbers are higher priority in beanstalkd)
DEFAULT_PRIORITY = 65536
#TTR (time to run, in seconds) is how long a reserved job may be held
#before it reappears on the queue. If a worker dies before completing
#the work and never calls job.delete, the same job shows up again.
TTR = 3

class BeanBase


  #To work with multiple queues you must tell beanstalk which queues
  #you plan on writing to (use), and which queues you will reserve jobs from
  #(watch). In this case we also want to ignore the default queue
  def get_queue(queue_name)
    queue = Beanstalk::Pool.new(["#{SERVER_IP}:#{DEFAULT_PORT}"])
    queue.watch(queue_name)
    queue.use(queue_name)
    queue.ignore('default')
    queue
  end

end

class BeanDistributor < BeanBase

  def initialize(amount)
    @messages = amount
  end

  def start_distributor
    #put all the work on the request queue
    bean_queue = get_queue('requests')
    @messages.times do |num|
      msg = BeanRequest.new(1,num)
      #Take our ruby object and convert it to yml and put it on the queue
      bean_queue.yput(msg,pri=DEFAULT_PRIORITY, delay=0, ttr=TTR)
    end

    puts "distributor now getting results"
    #get all the results from the results queue
    bean_queue = get_queue('results')
    @messages.times do |num|
      result = take_msg(bean_queue)
      puts "result: #{result}"
    end

  end

  #this will take a message off the queue, process it and return the result
  def take_msg(queue)
    msg = queue.reserve
    #by calling ybody we get the content of the message and convert it from yml
    count = msg.ybody.count
    msg.delete
    return count
  end

end

class BeanWorker < BeanBase

  def initialize(amount)
    @messages = amount
    @received_msgs = 0
  end

  def start_worker
    results = []
    #get and process all the requests, on the requests queue
    bean_queue = get_queue('requests')
    @messages.times do |num|
      result = take_msg(bean_queue)
      results << result
      @received_msgs += 1
    end

    #return all of the results, by placing them on the separate results queue
    bean_queue = get_queue('results')
    results.each do |result|
      msg = BeanResult.new(1,result)
      bean_queue.yput(msg,pri=DEFAULT_PRIORITY, delay=0, ttr=TTR)
    end

    #this is just to pass information out of the forked process
    #we return the number of messages we received as our exit status
    exit @received_msgs
  end

  #this will take a message off the queue, process it and return the result
  def take_msg(queue)
    msg = queue.reserve
    #by calling ybody we get the content of the message and convert it from yml
    count = msg.ybody.count
    result = count*count
    msg.delete
    return result
  end

end

############
# These are just simple message classes that we pass around using beanstalk's
# to-YAML and from-YAML helpers (yput and ybody).
############
class BeanRequest
  attr_accessor :project_id, :count
  def initialize(project_id, count=0)
    @project_id = project_id
    @count = count
  end
end

class BeanResult
  attr_accessor :project_id, :count
  def initialize(project_id, count=0)
    @project_id = project_id
    @count = count
  end
end

#write X messages on the queue
numb = 10

recv_count = 0

# Most of the time the distributor and workers will run as entirely separate
# processes, but to make it easy to run this example we just fork and start
# them separately. We wait for them to complete and check
# whether we received all the messages we expected.
puts "starting distributor"
server_pid = fork {
  BeanDistributor.new(numb).start_distributor
}

puts "starting client"
client_pid = fork {
  BeanWorker.new(numb).start_worker
}

Process.wait(client_pid)
recv_count = $?.exitstatus
puts "client finished received #{recv_count} msgs"
if(numb==recv_count)
  puts "received the expected number of messages"
else
  puts "error didn't receive the correct number of messages"
end

Process.wait(server_pid)

Written by DanM

October 28, 2008 at 2:35 pm

Tracking down open files with lsof

The other day I was running into a weird error on Devver. After around twenty test runs on the system, the component that actually runs individual unit tests was crashing with “Too many open files – (Errno::EMFILE)”.

Unfortunately, I didn't know much more than that. Which files were being kept open? I knew that this component loaded quite a few files, and that by default, OS X only allows 256 open file descriptors (ulimit -n will tell you the default on your system). If this was a valid case of needing to load more files, I could just up the limit using ulimit -n <bigger_number>.

Fortunately, a quick Google or two pointed the way to lsof. Unfortunately, my Unix-fu is never nearly as good as I wish and I didn't know much about this handy utility. But I quickly discovered that it's very useful for tracking down problems like this. I used ps to find the PID of the Devver process, and then a quick

lsof -p <PID>

displayed all the files that the process had open. So easy!

Sure enough, there were a ton of redundant file handles to the file that we use to store information about the Devver run. Armed with this information, it was easy to find the buggy code where we called File.open but failed to ever close the file.

Unfortunately, I still don't know how to write a good unit test for this case. I guess I could do something ugly like calling system("lsof -p pid | wc -l") before and after calling the code and making sure the number of descriptors stays constant, but that's really ugly. Is there a way to test this within Ruby? I'm open to ideas.
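
One Ruby-only idea (just a sketch, not something we actually use) is to count the File objects Ruby still has open before and after exercising the code; run_the_code_under_test below is a hypothetical stand-in for the code being checked:

require 'test/unit'

class FileLeakTest < Test::Unit::TestCase
  # Count File objects that are still referenced and not yet closed. We run
  # the GC first so files that were simply dropped (and will be closed by the
  # GC anyway) don't show up as noise; what is left is a real leak. This only
  # sees files opened from Ruby, not every descriptor the process holds, but
  # that is enough to catch a missing File#close.
  def open_file_count
    GC.start
    count = 0
    ObjectSpace.each_object(File) { |f| count += 1 unless f.closed? }
    count
  end

  def test_does_not_leak_file_handles
    before = open_file_count
    run_the_code_under_test # hypothetical stand-in for the code being checked
    assert_equal before, open_file_count
  end
end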

Still, it's always good to learn more about a powerful Unix tool. I'm constantly amazed by the power and depth of the Unix tool set.

Written by Ben

October 9, 2008 at 12:23 pm

Ruby Tools Roundup

Update: Devver now offers a hosted metrics service for Ruby developers which can give you useful feedback about your code. Check out Caliper to get started with metrics for your project.

I collected all of the Ruby tools posts I made this week into a single roundup. You can quickly jump to any tool that interests you or read my reviews start to finish. If you just want to read an individual section, here are the previous posts: Ruby Code Quality Tools, Ruby Test Quality Tools, and Ruby Performance Tools.

There have been a bunch of interesting tools released for Ruby lately. I decided to write about a few of my favorite Ruby tools and give some of the new tools a shot as well. Simply put, better tools can help you be a better developer. I am ignoring the entire topic of IDEs as tools, since I have written about Ruby IDEs before and it is basically a religious war. If you use any Ruby tools I don't mention, be sure to let me know, as I am always interested in trying something new.

Code Quality Tools

  • Roodi: gives developers information about common mistakes in their Ruby code. It makes it easy to clean up your code before things start to get ugly.
  • Dust: a new tool that analyzes your code and detects unsafe blocks and unused code. Dust is being created by the same mind behind Heckle.
  • Flog: essentially scores an ABC metric, giving you a good understanding of the overall code complexity of any given file or method.
  • Saikuro: when given Ruby source code, Saikuro generates a report listing the cyclomatic complexity of each method found.

Test Quality Tools

  • Heckle: helps test your Ruby tests (how cool is that?). Heckle is a mutation tester: it alters/breaks code and verifies that tests fail.
  • rcov: the easiest way to get information about your current code coverage.

Ruby/Rails Performance Tools

  • ruby-prof: a fast and easy-to-use Ruby profiler. The first of four tools that can help you solve performance issues.
  • New Relic: one of the three Rails plugin performance debugging and monitoring tools recently released.
  • TuneUp: a Rails performance tool from FiveRuns. This tool has an interesting community built around it as well.
  • RubyRun: a Rails performance tool similar to New Relic and TuneUp.

Let's get into it…

Roodi


Roodi gives you a bunch of interesting warnings about your Ruby code. We are about to release some code, so I took the opportunity to fix up everything Roodi complained about. It helped identify refactoring opportunities, both with long methods and with overly complex methods. The code and tests became cleaner and more granular after breaking some of the methods down. I even found and fixed one silly performance issue that was easy to see after refactoring, which improved the speed of our code. Spending some time with Roodi looks like it could easily improve the quality and readability of most Ruby projects with very little effort. I didn't solve every problem, because in one case I just didn't think the method could be simplified any more, but the majority of the suggestions were right on. Below is an example session with Roodi:

dmayer$ sudo gem install roodi
dmayer$ roodi lib/client/syncer.rb
lib/client/syncer.rb:136 - Block cyclomatic complexity is 5.  It should be 4 or less.
lib/client/syncer.rb:61 - Method name "excluded" has a cyclomatic complexity is 10.  It should be 8 or less.
lib/client/syncer.rb:101 - Method name "should_be_excluded?" has a cyclomatic complexity is 9.  It should be 8 or less.
lib/client/syncer.rb:132 - Method name "find_changed_files" has a cyclomatic complexity is 10.  It should be 8 or less.
lib/client/syncer.rb:68 - Rescue block should not be empty.
lib/client/syncer.rb:61 - Method name "excluded" has 25 lines.  It should have 20 or less.
lib/client/syncer.rb:132 - Method name "find_changed_files" has 27 lines.  It should have 20 or less.
Found 7 errors.

After Refactoring:

~/projects/gridtest/trunk dmayer$ roodi lib/client/syncer.rb
lib/client/syncer.rb:148 - Block cyclomatic complexity is 5.  It should be 4 or less.
lib/client/syncer.rb:82 - Rescue block should not be empty.
Found 2 errors.

I did have one problem with Roodi: the errors about rescue blocks seemed to be incorrect. For code like the little example below, it kept throwing the error even though I am obviously doing some work in the rescue block.

Roodi output: lib/client/syncer.rb:68 - Rescue block should not be empty.
begin
  socket = TCPSocket.new(server_ip,server_port)
  socket.close
  return true
rescue Errno::ECONNREFUSED
  return false
end

Dust


Dust detects unused code like unused variables, branches, and blocks. I look forward to seeing how the project progresses. Right now there doesn't seem to be much out there on the web, and the README is pretty bare bones. Once you can pass it some files to scan, I think this will be something really useful. For now there wasn't much I could actually do besides check it out. Kevin, who also helped create the very cool Heckle, does claim that code scanning is coming soon, so I look forward to doing a more detailed write-up eventually.

Flog


Flog gives feedback about the quality of your code by scoring it with the ABC metric. Using Flog to help guide refactoring, code cleanup, and testing efforts can be highly effective. It is a little easier to understand the reports after reading about how Flog scores your code and what a good Flog score is. Once you get used to working with Flog, you will likely want to run it often against your whole project after making any significant changes. There are two easy ways to do this: the handy Flog Rake task, or MetricFu, which works with both Flog and Saikuro.

Running Flog against any subset of a project is easy; here I am running it against our client libraries:

find ./lib/client/ -name \*.rb | xargs flog -n -m > flog.log
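
If you would rather run that from Rake (the Flog Rake task linked above is the full-featured version), a minimal sketch is simply the same shell command wrapped in a task; it assumes the flog gem is installed and on your PATH:

# Rakefile (sketch): flog the client libraries and write the report to flog.log
desc 'Run Flog over lib/client and save the report'
task :flog do
  files = FileList['lib/client/**/*.rb'].join(' ')
  sh "flog -n -m #{files} > flog.log"
end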

Here is some example Flog output when run against our client code:

Total score = 1364.52395469781

Client#send_tests: (64.3)
    14.3: assignment
    13.9: puts
    10.7: branch
    10.5: send
     4.7: send_quit
     3.4: message
     3.4: now
     2.0: create_queue_test_msg
     1.9: create_run_msg
     1.9: test_files
     1.8: dump
     1.7: each
     1.7: report_start
     1.7: length
     1.7: get_tests
     1.7: -
     1.7: open
     1.7: load_file
     1.6: empty?
     1.6: nil?
     1.6: use_cache
     1.6: exists?
ModClient#send_file: (32.0)
    12.4: branch
     5.4: +
     4.3: assignment
     3.9: send
     3.1: puts
     2.9: ==
     2.9: exists?
     2.9: directory?
     1.9: strftime
     1.8: to_s
     1.5: read
     1.5: create_file_msg
     1.4: info
Syncer#sync: (30.8)
    13.2: assignment
     8.6: branch
     3.6: inspect
     3.2: info
     3.0: puts
     2.8: +
     2.6: empty?
     1.7: map
     1.5: now
     1.5: length
     1.4: send_files
     1.3: max
     1.3: >
     1.3: find_changed_files
     1.3: write_sync_time
Syncer#find_changed_files: (26.2)
    15.6: assignment
     8.7: branch
     3.5: <<
     1.8: to_s
     1.7: get_relative_path
     1.7: >
     1.7: mtime
     1.6: exists?
     1.6: ==
     1.5: prune
     1.4: should_be_excluded?
     1.3: get_removed_files
     1.3: find
... and so on ...

Saikuro


Saikuro is another code complexity tool. It seems to give a little less information than some of the others, but it does generate nice HTML reports. Like other code complexity tools, it can help you discover the most complex parts of your project for refactoring and help focus your testing. I liked the way Flog broke things down into a bit more detail, but either is a useful tool; it is a matter of preference depending on what you are looking for.

[Screenshot: Saikuro HTML report]

Heckle


Heckle is an interesting tool for doing mutation testing of your tests. Heckle currently supports Test::Unit and RSpec, but it does have a number of issues. I had to run it on a few different files and methods before I got useful output that helped me improve my testing. The first problem was that it crashed when I passed it entire files (the majority of the time). I then began passing it single methods I was curious about, which still occasionally caused Heckle to get into an infinite loop. This is a known problem in Heckle, and the -T option to provide a timeout should solve that issue. In my case it was actually not an infinite-loop timing error but an error when attempting to rewrite the code, which led to a continual failure loop that wouldn't time out. When I found a class and method that Heckle could test, I got some good results: I found one badly written test case and one case that was never tested. Let's run through a simple Heckle example.

#install heckle
dmayer$ sudo gem install heckle

#example of the infinite loop Error Heckle run
heckle Syncer should_be_excluded? --tests test/unit/client/syncer_test.rb -v

Setting timeout at 5 seconds.
Initial tests pass. Let's rumble.

**********************************************************************
*** Syncer#should_be_excluded? loaded with 13 possible mutations
**********************************************************************
...
2 mutations remaining...
Replacing Syncer#should_be_excluded? with:

2 mutations remaining...
Replacing Syncer#should_be_excluded? with:
... loops forever ...

#Heckle run against our Client class and the process method

dmayer$ heckle Client process --tests test/unit/client/client_test.rb

Initial tests pass. Let's rumble.

**********************************************************************
*** Client#process loaded with 9 possible mutations
**********************************************************************

9 mutations remaining...
8 mutations remaining...
7 mutations remaining...
6 mutations remaining...
5 mutations remaining...
4 mutations remaining...
3 mutations remaining...
2 mutations remaining...
1 mutations remaining...

The following mutations didn't cause test failures:

--- original
+++ mutation

def process(command)

case command
when @buffer.Ready then
process_ready
- when @buffer.SetID then
+ when nil then
process_set_id(command)
when @buffer.InitProject then
process_init_project
when @buffer.Result then
process_result(command)
when @buffer.Goodbye then
kill_event_loop
when @buffer.Done then
process_done
when @buffer.Error then
process_error(command)
else
@log.error("client ignoring invalid command #{command}") if @log
end
end

--- original
+++ mutation
def process(command)
case command
when @buffer.Ready then
process_ready
when @buffer.SetID then
process_set_id(command)
when @buffer.InitProject then
process_init_project
when @buffer.Result then
process_result(command)
when @buffer.Goodbye then
kill_event_loop
when @buffer.Done then
process_done
when @buffer.Error then
process_error(command)
else
- @log.error("client ignoring invalid command #{command}") if @log
+ nil if @log
end
end

Heckle Results:

Passed : 0
Failed : 1
Thick Skin: 0

Improve the tests and try again.

#Tests added / changed to improve Heckle results

def test_process_process_loop__random_result
    Client.any_instance.expects(:start_tls).returns(true)
    client = Client.new({})
    client.stubs(:send_data)
    client.log = stub_everything
    client.log.expects(:error).with("client ignoring invalid command this is random")
    client.process("this is random")
  end

  def test_process_process_loop__set_id
    Client.any_instance.expects(:start_tls).returns(true)
    client = Client.new({})
    client.stubs(:send_data)
    client.log = stub_everything
    cmd = DataBuffer.new.create_set_ids_msg("4")
    client.expects(:process_set_id).with(cmd)
    client.process(cmd)
  end

#A final Heckle run, showing successful results

dmayer$ heckle Client process --tests test/unit/client/client_test.rb

Initial tests pass. Let's rumble.

**********************************************************************
*** Client#process loaded with 9 possible mutations
**********************************************************************

9 mutations remaining...
8 mutations remaining...
7 mutations remaining...
6 mutations remaining...
5 mutations remaining...
4 mutations remaining...
3 mutations remaining...
2 mutations remaining...
1 mutations remaining...
No mutants survived. Cool!

Heckle Results:

Passed : 1
Failed : 0
Thick Skin: 0

All heckling was thwarted! YAY!!!

rcov


rcov is a code coverage tool for Ruby. If you are doing testing, you should probably be monitoring your coverage with a code coverage tool, and I don't know of a better tool for code coverage than rcov. It is simple to use and generates beautiful, easy-to-read HTML charts showing the current coverage broken down by file. An easy way to make your project more stable is to occasionally spend some time increasing your coverage. I have always found it a great way to get back into a project if you have been off of it for awhile: you just need to find some weak coverage points and get to work.
[Screenshot: rcov coverage report]
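
For reference, here is a minimal way to wire rcov into a Rakefile using the Rake task bundled with the rcov gem; this is a sketch of a typical setup rather than our exact configuration:

# Rakefile (sketch): `rake rcov` generates an HTML report in coverage/index.html
require 'rubygems'
require 'rcov/rcovtask'

Rcov::RcovTask.new do |t|
  t.libs << 'test'
  t.test_files = FileList['test/**/*_test.rb']
  t.output_dir = 'coverage'
  t.verbose = true
end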

ruby-prof


ruby-prof does what every other profiler does, but it is much faster than the profiler built into Ruby. It also makes it easy to output the information you are seeking to HTML pages, such as call graphs. If you are just looking for a simple write-up to get started with ruby-prof, I recommend the previous link. Here I will talk a little more about the kinds of problems I find and how I have solved them with ruby-prof.

I have used ruby-prof a number of times to isolate ways to speed up my code. I haven't used it to identify why an entire Rails application is slow (there are better tools for that, which I discuss later), but if you have a small but highly important piece of code, ruby-prof is often the best way to isolate the problem. I used ruby-prof to identify the two slowest lines of code in a spellchecker, which was then rewritten to become twice as fast.

Most recently I used it to identify where the code was spending all of its time in the loop of a file syncer. It turned out that for thousands of files, each time through the loop we were calling Pathname.new(path).relative_path_from(@dir_path) over and over. Putting a small cache around that call essentially eliminated all delays in our file synchronization. Below is a simple example of how a few lines of code can make all the difference in performance, and how easily ruby-prof can help you isolate the problem areas and decide where to spend your time. I think seeing the code that ruby-prof helped isolate, and the changes made to it, might be useful if you are new to profiling and performance work.
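
To show how little ceremony this takes, here is a small, self-contained ruby-prof sketch. It profiles repeated Pathname#relative_path_from calls, similar to the hot spot described above, rather than our actual syncer code:

require 'rubygems'
require 'ruby-prof'
require 'pathname'

# Profile only the suspect code path and print a flat report of where the
# time went. The repeated relative_path_from calls stand in for the hot loop.
base = Pathname.new(Dir.pwd)
result = RubyProf.profile do
  1_000.times { Pathname.new(__FILE__).expand_path.relative_path_from(base) }
end

RubyProf::FlatPrinter.new(result).print(STDOUT)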

Changes in our spellchecker / recommender

#OLD Way
 alteration = []
    n.times {|i| LETTERS.each_byte {
        |l| alteration << word[0...i].strip+l.chr+word[i+1..-1].strip } }
 insertion = []
     (n+1).times {|i| LETTERS.each_byte {
        |l| insertion << word[0...i].strip+l.chr+word[i..-1].strip } }
 #NEW Way
    #pre-calculate the word breakups
    word_starts = []
    word_missing_ends = []
    word_ends = []
    (n+1).times do |i|
      word_starts << word[0...i]
      word_missing_ends << word[i+1..-1]
      word_ends << word[i..-1]
    end

 alteration = []
    n.times {|i|
      alteration = alteration.concat LETTERS.collect { |l|
        word_starts[i]+l+word_missing_ends[i] } }
 insertion = []
    (n+1).times {|i|
      insertion = insertion.concat LETTERS.collect { |l|

        word_starts[i]+l+word_ends[i] } }

Changes in our file syncer

#OLD
 path_name = Pathname.new(path).relative_path_from(@dir_path).to_s
 #NEW
 path_name = get_relative_path(path)

  def get_relative_path(path)
    return @path_cache[path] if @path_cache.member?(path)
    retval = Pathname.new(path).relative_path_from(@dir_path).to_s
    @path_cache[path] = retval
    return retval
  end

New Relic


New Relic is a performance monitoring tool for Rails apps. It has a great development mode that will help you track down performance issues before they even become a problem, and live monitoring so you can find any hiccups that are slowing down the production application. The entire performance monitoring space for Ruby/Rails seems to be heating up, and I guess it is easy to see why, when scaling has been such an issue for some Rails apps. Just playing around with New Relic was exciting and fun. I could quickly track down the slowest pages and our most problematic SQL calls. In this case I was testing New Relic on Seekler (an old project of ours), since I didn't think I would find much of interest on our current Devver site. Seekler had some glaring performance issues, and I think if we had had New Relic from the beginning we could have avoided many of them. Sounds like I might have a day project involving New Relic and giving Seekler as much of a performance boost as possible. New Relic turned out to be my favorite of the performance monitoring tools. For a much more detailed write-up, check out the RailsTips New Relic review.

[Screenshot: New Relic]

TuneUp


TuneUp is another easy-to-install and easy-to-use Rails performance monitoring solution. The problem I had with TuneUp was that I couldn't get it working on my test app for these sorts of things. I tried running Seekler with TuneUp, but had no luck. I found that many people on the message boards seemed to be having various compatibility issues. Looking at the TuneUp screencast and the kind of information it gives you, I feel like it would be equal to New Relic if it works for you. I am emailing back and forth with FiveRuns support, who have been very attentive and helpful, so if I get it working I will update this section.

Update: FiveRuns is pretty amazing with their support. I haven't gotten TuneUp fully working yet, but I have made some progress. Some good things to know: plugins like safe_erb and output_compression can cause problems with TuneUp. They are aware of the issues and actively looking into them.

Ruby Run


RubyRun provides live performance monitoring and debugging tools. I hadn't heard of this product before I started doing research for this blog article. I am sorry to say it was the hardest to set up and gave back the least valuable information. I think they need a simple screencast on how to get set up and get useful information back. After getting set up and running, I could only get ugly CSV reports that didn't tell me much more than the regular Rails log files. I started reading the RubyRun manual, but it was about as long as Moby Dick, and all I wanted was to view simple, easy-to-read reports, which is a snap in New Relic and TuneUp. Since the site didn't claim RubyRun provides better data than New Relic or TuneUp, which were much more user friendly, I don't think I would recommend RubyRun.

UPDATE: After reading about my difficulties with RubyRun, the great folks from Rubysophic got in touch with me. They offered to help me get the tool working and posted a RubyRun quick start guide to their site. I got it working in a snap thanks to an email from their dev and the amazingly simple quick start guide. I still didn't get the same depth of information that I got with New Relic, although RubyRun has a ton of settings, so it is likely you can get more depth in the reports. Something worth pointing out is that RubyRun works on Seekler, which I haven't been able to get TuneUp running on. So if you have been having problems with TuneUp or New Relic, definitely give RubyRun a look. In the end I think the other offerings are slightly more user friendly (less complex settings) and make it easier to explore the data (a link in the feed to both reports, at least when in developer mode). That being said, RubyRun offers some great information and options that the others don't, and with a bit more UI tuning RubyRun would be at the top of the pack. Thanks to the helpful devs at Rubysophic for helping me get the most out of RubyRun.

[Screenshots: two different RubyRun reports]

That is it. I hope you learned about a new Ruby tool. So get to work, try a new tool, and get to know your code a little better than you did before.

While I was writing this article, people pointed out two more tools worth mentioning. I didn't get a chance to try them out or review them, but thought I should include them. Towlie helps keep your code DRY by finding redundant methods. Finally, Source ANalysis (SAN) is described as "a Ruby gem for analyzing the contents of source code including comment to script ratios, todo items, declared functions, classes, and much more".

Written by DanM

October 3, 2008 at 10:25 am

Ruby Performance Tools

Update: We now offer a hosted metrics service for Ruby developers which can give you useful feedback about your code. Check out Caliper to get started with metrics for your project.

I have been interested in all of the tools that exist for Ruby developers, so I give you the first in a series of posts looking at some of the more interesting Ruby tools.

Recently a few different Rails performance tools and web services have been released. They are leaps and bounds better than just watching your development log and trying to fix performance issues, or tracking down slow MySQL queries by hand. I went ahead and installed and worked with a few of these tools and wrote a bit about my thoughts and experience with each of them.

ruby-prof


ruby-prof does what every other profiler does, but it is much faster than the profiler built into Ruby. It also makes it easy to output the information you are seeking to HTML pages, such as call graphs. If you are just looking for a simple write-up to get started with ruby-prof, I recommend the previous link. Here I will talk a little more about the kinds of problems I find and how I have solved them with ruby-prof.

I have used ruby-prof a number of times to isolate ways to speed up my code. I haven't used it to identify why an entire Rails application is slow (there are better tools for that, which I discuss later), but if you have a small but highly important piece of code, ruby-prof is often the best way to isolate the problem. I used ruby-prof to identify the two slowest lines of code in a spellchecker, which was then rewritten to become twice as fast.

Most recently I used it to identify where the code was spending all of its time in the loop of a file syncer. It turned out that for thousands of files, each time through the loop we were calling Pathname.new(path).relative_path_from(@dir_path) over and over. Putting a small cache around that call essentially eliminated all delays in our file synchronization. Below is a simple example of how a few lines of code can make all the difference in performance, and how easily ruby-prof can help you isolate the problem areas and decide where to spend your time. I think seeing the code that ruby-prof helped isolate, and the changes made to it, might be useful if you are new to profiling and performance work.
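
Since the HTML output is one of ruby-prof's nicest features, here is a small, self-contained sketch that writes a call graph report for the same kind of Pathname hot spot described above (again, not our actual syncer code):

require 'rubygems'
require 'ruby-prof'
require 'pathname'

# Profile the suspect code path and write an HTML call graph to profile_graph.html.
base = Pathname.new(Dir.pwd)
result = RubyProf.profile do
  1_000.times { Pathname.new(__FILE__).expand_path.relative_path_from(base) }
end

File.open('profile_graph.html', 'w') do |file|
  RubyProf::GraphHtmlPrinter.new(result).print(file)
end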

Changes in our spellchecker / recommender

 #OLD Way
 alteration = []
    n.times {|i| LETTERS.each_byte {
        |l| alteration << word[0...i].strip+l.chr+word[i+1..-1].strip } }
 insertion = []
     (n+1).times {|i| LETTERS.each_byte {
        |l| insertion << word[0...i].strip+l.chr+word[i..-1].strip } }
 #NEW Way
    #pre-calculate the word breakups
    word_starts = []
    word_missing_ends = []
    word_ends = []
    (n+1).times do |i|
      word_starts << word[0...i]
      word_missing_ends << word[i+1..-1]
      word_ends << word[i..-1]
    end

 alteration = []
    n.times {|i|
      alteration = alteration.concat LETTERS.collect { |l|
        word_starts[i]+l+word_missing_ends[i] } }
 insertion = []
    (n+1).times {|i|
      insertion = insertion.concat LETTERS.collect { |l|

        word_starts[i]+l+word_ends[i] } }

Changes in our file syncer

#OLD
 path_name = Pathname.new(path).relative_path_from(@dir_path).to_s
 #NEW
 path_name = get_relative_path(path)

  def get_relative_path(path)
    return @path_cache[path] if @path_cache.member?(path)
    retval = Pathname.new(path).relative_path_from(@dir_path).to_s
    @path_cache[path] = retval
    return retval
  end

New Relic


New Relic is a performance monitoring tool for Rails apps. It has a great development mode that will help you track down performance issues before they even become a problem, and live monitoring so you can find any hiccups that are slowing down the production application. The entire performance monitoring space for Ruby/Rails seems to be heating up, and I guess it is easy to see why, when scaling has been such an issue for some Rails apps. Just playing around with New Relic was exciting and fun. I could quickly track down the slowest pages and our most problematic SQL calls. In this case I was testing New Relic on Seekler (an old project of ours), since I didn't think I would find much of interest on our current Devver site. Seekler had some glaring performance issues, and I think if we had had New Relic from the beginning we could have avoided many of them. Sounds like I might have a day project involving New Relic and giving Seekler as much of a performance boost as possible. New Relic turned out to be my favorite of the performance monitoring tools. For a much more detailed write-up, check out the RailsTips New Relic review.

[Screenshot: New Relic]

TuneUp


TuneUp is another easy-to-install and easy-to-use Rails performance monitoring solution. The problem I had with TuneUp was that I couldn't get it working on my test app for these sorts of things. I tried running Seekler with TuneUp, but had no luck. I found that many people on the message boards seemed to be having various compatibility issues. Looking at the TuneUp screencast and the kind of information it gives you, I feel like it would be equal to New Relic if it works for you. I am emailing back and forth with FiveRuns support, who have been very attentive and helpful, so if I get it working I will update this section.

Ruby Run


RubyRun provides live performance monitoring and debugging tools. I hadn't heard of this product before I started doing research for this blog article. I am sorry to say it was the hardest to set up and gave back the least valuable information. I think they need a simple screencast on how to get set up and get useful information back. After getting set up and running, I could only get ugly CSV reports that didn't tell me much more than the regular Rails log files. I started reading the RubyRun manual, but it was about as long as Moby Dick, and all I wanted was to view simple, easy-to-read reports, which is a snap in New Relic and TuneUp. Since the site didn't claim RubyRun provides better data than New Relic or TuneUp, which were much more user friendly, I don't think I would recommend RubyRun.

UPDATE: After reading about my difficulties with RubyRun, the great folks from Rubysophic got in touch with me. They offered to help me get the tool working and posted a RubyRun quick start guide to their site. I got it working in a snap thanks to an email from their dev and the amazingly simple quick start guide. I still didn't get the same depth of information that I got with New Relic, although RubyRun has a ton of settings, so it is likely you can get more depth in the reports. Something worth pointing out is that RubyRun works on Seekler, which I haven't been able to get TuneUp running on. So if you have been having problems with TuneUp or New Relic, definitely give RubyRun a look. In the end I think the other offerings are slightly more user friendly (less complex settings) and make it easier to explore the data (a link in the feed to both reports, at least when in developer mode). That being said, RubyRun offers some great information and options that the others don't, and with a bit more UI tuning RubyRun would be at the top of the pack. Thanks to the helpful devs at Rubysophic for helping me get the most out of RubyRun.

[Screenshots: two different RubyRun reports]

If there are any other Ruby performance tools worth checking out, let me know. And obviously, if you have experience with any of these tools, I would love to hear your thoughts on them.

Written by DanM

September 28, 2008 at 2:35 pm

Useful Gem Shortcuts Tip

When you are working with projects on the command line all day, it can be really annoying to have to remember the exact location of everything. Often when programming against a gem in Ruby, it can be really useful to read over its documentation. Adding a couple of lines to your .bash_profile can make loading the documentation for your gems much easier. Thanks to Stephen Celis for sharing this great tip.

Written by DanM

July 29, 2008 at 9:08 am

Learning RSpec and Merb

WARNING: This post is basically completely out of date; Merb changed very fast before 1.0. Please see merbivore.com for current information!

We have been trying out some different Ruby technologies lately. We are moving from Test::Unit to RSpec because we believe it has several advantages. It also seems that all the cool projects are moving to RSpec: Rubinius, Typo, Mephisto, and of course Merb.

In learning these two technologies together, I came across a few resources that I found really useful. I thought it would be good to share them for anyone looking to write specs for their Merb projects.

If you are just learning Merb and want to create a basic project and learn to test with RSpec as you develop, I can't recommend enough that you follow the Merb Slapp tutorial. It is a great source for Merb basics that is very up to date and gives good examples of RSpec tests.

If you are new to Merb, the newest documentation will be your friend. I also recommend checking out the Merb Wiki. For RSpec, specifically check out these wiki pages: Merb Controller Specs, Merb Model Specs, and Merb View Specs.

There were some things I had to search and stumble around a bit for: session variables and mock objects. I needed to mock the session because a user is expected to be logged in (verified by a session variable) before the action is allowed to continue. I needed a mock object for my ProjectWriter because it normally makes live calls to a web service. Both are easy to do, but both are done differently than with Test::Unit and Rails. I found out about RSpec mocking and Merb session mocking at the links provided.

Here is some code that demonstrates mocking both sessions and model objects.

#create a mock object named ProjectWriter
project_writer = mock("ProjectWriter")
#mock expects this call
project_writer.should_receive(:get_all_user_projects).with('ben')
@controller = dispatch_to(Project, :index) do |controller|
  #mock the session hash
  controller.stub!(:session).and_return({:logged_in => true})
  #return my mocked object
  controller.stub!(:get_project_writer).and_return(project_writer)
  #we aren't testing the view, so don't render it
  controller.stub!(:render) # don't render this action
end

@controller.should respond_successfully

Written by DanM

July 24, 2008 at 2:22 pm

Crappy email filtering on GoDaddy

People are complaining that GoDaddy censored a security site, bid against customers in domain auctions (more), and totally screwed up .me registrations. There’s even an entire site dedicated to exposing GoDaddy’s problems.

We had been lucky enough to never really have any problems with them. Since everything worked for us (even though we hated their UI with a passion), we stuck around; moving 20+ domains to another registrar just seemed like a hassle. We run most of our services on our own server, so we didn't think we had much of a problem.

One service we didn't run was our own mail server. We didn't want to use GoDaddy's webmail, but thought just having GoDaddy forward mail would work fine. We forwarded all mail to our Gmail accounts, and after that it was simple to send and receive from Gmail, essentially taking GoDaddy out of the picture. We had run GoDaddy email forwarding for a few sites for over a year with no problems.

In the last couple of months we started having problems with our email. We were getting reports from friends who were having trouble emailing us. People were receiving random bounced emails; some emails, when retried, never reached us, while others arrived successfully on the second try. We were concerned, but the issue seemed to occur rarely, and usually simply resending would solve the problem.

Then the problem got worse. After sending out an email and not hearing back, we received messages explaining that every time anyone responded to our last email, it bounced. The bounce message warned that the message was spam or a virus and would not be delivered. Hmmm… not good. We looked into it and found that we couldn't even send the original email to each other without getting bounced error responses. The email included a link to download our presentation video from DropBox, which GoDaddy filtered as spam. We had the same issue receiving responses to emails with RightScale's developers. After making sure I hadn't accidentally turned on any spam protection for our forwarded email accounts, I called GoDaddy support.

Convincing GoDaddy that the emails were being marked as spam by their servers, as opposed to other email servers, took awhile. Finally, after talking to internal GoDaddy tech support, they acknowledged it was their email system. They explained that all emails, including those for forwarding accounts, go through a GoDaddy-wide spam and virus scanner which won't let anything flagged through. I explained that I wanted to disable their filtering and trust Google to do my spam filtering after my mail is forwarded. This was not an option, as the shared filter is in place for all GoDaddy email. I then asked about the criteria for flagging emails, which I knew in our case contained no spam links or viruses. I was told that if a single virus or spam message is sent from a domain, GoDaddy blocks all emails linking to that domain. This is clearly a bad policy.

Blocking all of DropBox is an example of why this shared filter is bound to block valid emails. DropBox allows users to upload arbitrary files, so of course some virus-infected file has ended up on their domain. After some virus-infected file hosted on DropBox was emailed to a GoDaddy user, the domain was added to the blocked list for GoDaddy's email filter. As a result, we can never mail a link to a presentation video hosted on DropBox.

That's pretty amazing, but the best part is that there is simply no way to opt out, and no way to get domains removed from their blacklist. One of the scariest things is that they only filter incoming mail, so we can send out supposedly virus-filled emails, but if anyone hits reply it will bounce, because our link will be part of the response message. This leaves the sender blind to the problem. Who knows how many people we have emailed in the last 3 months who tried and failed to respond to us.

After learning all of this, I did the only reasonable thing: I switched mail providers so our email wouldn't touch GoDaddy at all. We are now hosting devver.net email accounts with Gmail for your domain. It was easy enough to set up, and we have already confirmed that we can send supposedly risky emails through our new email servers. I guess I should start listening to everyone's horror stories about GoDaddy, and the next time I purchase a domain I can slowly start moving away from the terrible beast.

Written by DanM

July 22, 2008 at 3:22 pm

Posted in Misc, Tips & Tricks
