The Devver Blog

A Boulder startup improving the way developers work.

Archive for the ‘Tips & Tricks’ Category

Speeding up multi-browser Selenium Testing using concurrency

I haven’t used Selenium for a while, so I took some time to dig into the options for getting some mainline tests running against Caliper in multiple browsers. I wanted to be able to test a variety of browsers against our staging server before pushing new releases. Eventually this could be integrated into Continuous Integration (CI) or Continuous Deployment (CD).

The state of Selenium testing for Rails is currently in flux:

So there are multiple gems / frameworks:

I decided to investigate several options to determine which is the best approach for our tests.

selenium-on-rails

I originally wrote a couple of example tests using the selenium-on-rails plugin. It allows you to browse to your local development web server at ‘/selenium’ and run tests in the browser using the Selenium test runner. It is the simplest and most basic Selenium mode, but it obviously has limitations. It wasn’t easy to run many different browsers using this plugin, or to use it with Selenium-RC, and the plugin was fairly dated. This led me to try the next simplest thing: selenium-client.

open '/'
assert_title 'Hosted Ruby/Rails metrics - Caliper'
verify_text_present 'Recently Generated Metrics'

click_and_wait "css=#projects a:contains('Projects')"
verify_text_present 'Browse Projects'

click_and_wait "css=#add-project a:contains('Add Project')"
verify_text_present 'Add Project'

type 'repo','git://github.com/sinatra/sinatra.git'
click_and_wait "css=#submit-project"
verify_text_present 'sinatra/sinatra'
wait_for_element_present "css=#hotspots-summary"
verify_text_present 'View full Hot Spots report'

view this gist

selenium-client

I quickly converted my selenium-on-rails tests to selenium-client tests with some small modifications. To run tests using selenium-client, you need a running Selenium-RC server. I set up Sauce RC on my machine and was ready to go. I configured the tests to run locally on a single browser (Firefox). Once that was working, I wanted to run the same tests in multiple browsers. I found that it was easy to dynamically create a test for each browser type and run them using Selenium-RC, but that it was incredibly slow, since the tests run one after another and not concurrently. Also, you need to install each browser (plus multiple versions) on your machine. This led me to Sauce Labs’ OnDemand.

browser.open '/'
assert_equal 'Hosted Ruby/Rails metrics - Caliper', browser.title
assert browser.text?('Recently Generated Metrics')

browser.click "css=#projects a:contains('Projects')", :wait_for => :page
assert browser.text?('Browse Projects')

browser.click "css=#add-project a:contains('Add Project')", :wait_for => :page
assert browser.text?('Add Project')

browser.type 'repo','git://github.com/sinatra/sinatra.git'
browser.click "css=#submit-project", :wait_for => :page
assert browser.text?('sinatra/sinatra')
browser.wait_for_element "css=#hotspots-summary"
assert browser.text?('View full Hot Spots report')

view this gist
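The "dynamically create a test for each browser type" idea mentioned above can be sketched roughly like this. This is a hypothetical illustration, not the article's code: the BROWSERS list and the method bodies are assumptions, and a real suite would start a Selenium-RC session for each browser_spec.

```ruby
# Hypothetical sketch: define one test method per browser launcher string.
# BROWSERS and the method bodies are illustrative assumptions only.
BROWSERS = ['*firefox', '*safari', '*iexplore']

class BrowserWalkthrough
  BROWSERS.each do |browser_spec|
    # e.g. defines test_walkthrough_in_firefox, test_walkthrough_in_safari, ...
    define_method("test_walkthrough_in_#{browser_spec.delete('*')}") do
      # real code: start a Selenium-RC session for browser_spec and run the steps
      browser_spec
    end
  end
end
```

Run serially like this, each generated test pays the full browser startup cost in turn, which is why the next section moves to running them concurrently.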

Using Selenium-RC and Sauce Labs Concurrently

Running on all the browsers Sauce Labs offers (12) took 910 seconds, which is cool, but way too slow. Since I am just running the same tests over in different browsers, I decided it should be done concurrently. If you are running your own Selenium-RC server, concurrency will slow things down a lot as your machine has to start and run all of the various browsers, so this approach isn’t recommended on your own Selenium-RC setup unless you configure Selenium-Grid. If you are using Sauce Labs, the tests run concurrently with no slowdown. After switching to running my Selenium tests concurrently, run time went down to 70 seconds.

My main goal was to make it easy to write pretty standard tests a single time, but be able to change the number of browsers I ran them on and the server I targeted. One approach that has been offered explains how to set up Cucumber to run Selenium tests against multiple browsers. This basically runs the rake task over and over, once for each browser environment.
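That serial rake-per-browser approach can be sketched as follows. The variable and command names here are assumptions for illustration, not from the linked article:

```ruby
# Sketch of the one-rake-run-per-browser approach: the same suite is
# re-invoked serially with a different browser setting each time, so total
# runtime grows linearly with the number of browsers.
BROWSER_ENVS = %w[firefox iexplore safari]

def run_suite_for(browser)
  # real code would shell out, e.g.:
  #   system({ 'SELENIUM_BROWSER' => browser }, 'rake test:selenium')
  "ran suite in #{browser}"
end

results = BROWSER_ENVS.map { |browser| run_suite_for(browser) }
```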

Although this works, I also wanted to run all my tests concurrently. One option would be to concurrently run all of the Rake tasks and join the results, but joining the results is difficult to do cleanly, and otherwise you end up outputting the full rake test output once per browser (ugly when running 12 times). I took a slightly different approach, which wraps any Selenium-based test in a run_in_browsers block. Depending on the options set, the code can run a single browser against your locally hosted application, or many browsers against a staging or production server. Then simply create a separate Rake task for each of the configurations you expect to use (against local Selenium-RC and Sauce Labs OnDemand).

I am pretty happy with the solution I have for now. It is simple and fast, and it gives another layer of assurance that Caliper is running as expected. Adding additional tests is simple, as is integrating the solution into our CI stack. There are likely many ways to solve the concurrent Selenium testing problem, but I was able to go from no Selenium tests to a fast multi-browser solution in about a day, which works for me. There are downsides to the approach: the error output isn’t exactly the same when run concurrently, but it is pretty close. Instead of seeing multiple errors for each test, you get a single error per test which includes details about which browsers the error occurred on.

In the future I would recommend closely watching Webrat and Capybara which I would likely use to drive the Selenium tests. I think the eventual merge will lead to the best solution in terms of flexibility. At the moment Capybara doesn’t support selenium-RC, and the tests I originally wrote didn’t convert to the Webrat API as easily as directly to selenium-client (although setting up Webrat to use Selenium looks pretty simple). The example code given could likely be adapted easily to work with existing Webrat tests.

namespace :test do
  namespace :selenium do

    desc "selenium against staging server"
    task :staging do
      exec "bash -c 'SELENIUM_BROWSERS=all SELENIUM_RC_URL=saucelabs.com SELENIUM_URL=http://caliper-staging.heroku.com/  ruby test/acceptance/walkthrough.rb'"
    end

    desc "selenium against local server"
    task :local do
      exec "bash -c 'SELENIUM_BROWSERS=one SELENIUM_RC_URL=localhost SELENIUM_URL=http://localhost:3000/ ruby test/acceptance/walkthrough.rb'"
    end
  end
end

view this gist

require "rubygems"
require "test/unit"
gem "selenium-client", ">=1.2.16"
require "selenium/client"
require 'threadify'

class ExampleTest < Test::Unit::TestCase
  # browsers, selenium_rc_url, and test_url are small helpers (driven by
  # the SELENIUM_* environment variables) defined in the full gist

  def run_in_all_browsers(&block)
    if browsers.length > 1
      errors = []
      browsers.threadify(browsers.length) do |browser_spec|
        begin
          run_browser(browser_spec, block)
        rescue => error
          type = browser_spec.match(/browser\": \"(.*)\", /)[1]
          version = browser_spec.match(/browser-version\": \"(.*)\",/)[1]
          errors << {:browser => type, :version => version, :error => error}
        end
      end
      message = ""
      errors.each_with_index do |error, index|
        message +="\t[#{index+1}]: #{error[:error].message} occurred in #{error[:browser]}, version #{error[:version]}\n"
      end
      assert_equal 0, errors.length, "Expected zero failures or errors, but got #{errors.length}\n #{message}"
    else
      run_browser(browsers[0], block)
    end
  end

  def run_browser(browser_spec, block)
    browser = Selenium::Client::Driver.new(
                                           :host => selenium_rc_url,
                                           :port => 4444,
                                           :browser => browser_spec,
                                           :url => test_url,
                                           :timeout_in_second => 120)
    browser.start_new_browser_session
    begin
      block.call(browser)
    ensure
      browser.close_current_browser_session
    end
  end

  def test_basic_walkthrough
    run_in_all_browsers do |browser|
      browser.open '/'
      assert_equal 'Hosted Ruby/Rails metrics - Caliper', browser.title
      assert browser.text?('Recently Generated Metrics')

      browser.click "css=#projects a:contains('Projects')", :wait_for => :page
      assert browser.text?('Browse Projects')

      browser.click "css=#add-project a:contains('Add Project')", :wait_for => :page
      assert browser.text?('Add Project')

      browser.type 'repo','git://github.com/sinatra/sinatra.git'
      browser.click "css=#submit-project", :wait_for => :page
      assert browser.text?('sinatra/sinatra')
      browser.wait_for_element "css=#hotspots-summary"
      assert browser.text?('View full Hot Spots report')
    end
  end

  def test_generate_new_metrics
    run_in_all_browsers do |browser|
      browser.open '/'
      browser.click "css=#add-project a:contains('Add Project')", :wait_for => :page
      assert browser.text?('Add Project')

      browser.type 'repo','git://github.com/sinatra/sinatra.git'
      browser.click "css=#submit-project", :wait_for => :page
      assert browser.text?('sinatra/sinatra')

      browser.click "css=#fetch"
      browser.wait_for_page
      assert browser.text?('sinatra/sinatra')
    end
  end

end

view this gist

Written by DanM

April 8, 2010 at 10:07 am

Making Rack::Reloader work with Sinatra

According to the Sinatra FAQ, source reloading was taken out of Sinatra in version 0.9.2 due to “excess complexity” (in my opinion, that’s a great idea, because it’s not a feature that needs to be in a minimal web framework like Sinatra). Also according to the FAQ, Rack::Reloader (included in Rack) can be added to a Sinatra application to do source reloading, so I decided to try it out.

Setting up Rack::Reloader is easy:

require 'sinatra'
require 'rack'

configure :development do
  use Rack::Reloader
end

get "/hello" do
  "hi!"
end
$ ruby hello.rb
== Sinatra/0.9.4 has taken the stage on 4567 for development with backup from Thin
>> Thin web server (v1.2.4 codename Flaming Astroboy)
>> Maximum connections set to 1024
>> Listening on 0.0.0.0:4567, CTRL+C to stop
[on another terminal]
$ curl http://localhost:4567/hello
hi!

If you add another route, you can access it without restarting Sinatra:

get "/goodbye" do
  "bye!"
end
$ curl http://localhost:4567/goodbye
bye!

But what happens when you change the contents of a route?

get "/hello" do
  "greetings!"
end
$ curl http://localhost:4567/hello
hi!

You still get the old value! What is going on here?

Rack::Reloader simply looks at all files that have been required and, if they have changed on disk, re-requires them. So each Sinatra route is re-evaluated when a reload happens.
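The mechanism can be sketched in a few lines. This is a simplified illustration of the idea, not Rack's actual source:

```ruby
# Simplified illustration of the Rack::Reloader idea: on each request,
# re-load any required file whose mtime on disk is newer than the one we
# recorded last time we saw it.
class NaiveReloader
  def initialize(app)
    @app    = app
    @mtimes = {}
  end

  def call(env)
    $LOADED_FEATURES.each do |file|
      next unless File.file?(file)
      mtime = File.mtime(file)
      # re-evaluates the file, re-running its top-level route definitions
      load(file) if @mtimes[file] && @mtimes[file] < mtime
      @mtimes[file] = mtime
    end
    @app.call(env)
  end
end
```

Because reloading just re-evaluates the file, every `get` block in it is registered again, which is exactly why the duplicate-route behavior below matters.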

However, identical Sinatra routes do NOT override each other. Rather, the first route that was evaluated wins (more precisely, all routes are appended to a list and the first matching one is used, so additional identical routes are never run).

We can see this with a simple example:

require 'sinatra'

get "/foo" do
 "foo"
end

get "/foo" do
 "bar"
end
$ curl http://localhost:4567/foo
foo   # The result is 'foo', not 'bar'

Clearly, Rack::Reloader is not very useful if you can’t change the contents of any route. The solution is to throw away the old routes when the file is reloaded using

Sinatra::Application.reset!

, like so:

configure :development do
  Sinatra::Application.reset!
  use Rack::Reloader
end
$ curl http://localhost:4567/hello
greetings!

Success!

A word of caution: you MUST call

reset!

very early in your file – before you add any middleware, do any other configuration, or add any routes.

This method has worked well enough for our Sinatra application. However, code reloading is always tricky and is bound to occasionally produce some weird results. If you want to significantly reduce the chances for strange bugs (at the expense of code loading time), try Shotgun or Rerun. Happy reloading!

Written by Ben

December 21, 2009 at 3:20 pm

Lone Star Ruby Conf 2009 Wrapup Review

I recently went to the Lone Star Ruby Conference (LSRC) in Austin, TX. It was great to be able to put faces to many people I had interacted with in the Ruby community via blogs and Twitter. I also got to meet Matz and briefly talk with him, which was cool; meeting someone who created a language that is such a large part of your day-to-day life is just an interesting experience. I enjoyed LSRC, and I just want to give a quick summary of some of the talks that I saw and enjoyed. This is by no means full coverage of the event, but hopefully sharing my experience with others is worth something. If you are interested in seeing any of the talks, keep an eye out for Confreaks: they taped the event, and many of the talks should be coming online soon.

Dave Thomas
Dave was the first speaker for LSRC, and it was a great way to kick off the event. Dave gave a talk about Ruby not being perfect and why that is exactly why he likes it. I have heard Dave speak before, and I always enjoy his talks. It isn’t that you learn anything specific about Ruby development, but you learn about the Ruby community. Actually, Dave would say we are a collection of Ruby communities, and that having a collection of communities is a good thing. It was also interesting to hear Dave speak about the entire Zed “Rails is a Ghetto” incident. Sometimes when you are angrily ranting online, it is easy to forget that there are real people attached to things. Feelings can get hurt, and while Dave agrees there are some valid points in the post, I think the incident shows that ranting probably isn’t a good way to go about fixing them. Dave really loves Ruby and the weird things you can do with the language, and it shows.

Glenn Vanderburg, Programming Intuition
Glenn talked about physical sensations tied to code, such as a sense of touch or smell. The talk generally evoked memories of Paul Graham’s “Hackers and Painters” in my head; in fact, Glenn talked about PG during his talk. The best programmers talk about code as if they can see it. The talk explored ways to feel the code and react to it, and promoted the idea that it is OK to have a gut reaction that some code is a bad way to do things, because we should sense the code. Glenn also played a video of Bobby McFerrin teaching the audience the pentatonic scale, which I really enjoyed.

James Edward Gray II, Module Magic
James visited Japan recently and went to a Ruby conference there, and he really enjoyed it. About half his talk was about why Japan is awesome… He then found little ways to tie this back into his talk about Ruby and modules. It covered some interesting topics, like load order, that many people just don’t know enough about but use every day, along with examples of the differences between include and extend. Modules are terrific at limiting scope: limit the scope of scary magic and keep it hidden away. I enjoyed talking with James a good amount throughout the conference. I had never met him before LSRC, but I used to practice Ruby working on Ruby Quiz, which he ran for a long time.

James has his slides up: Module Magic

Fernand Galiana, R-House
Fernand gave a really cool and demo-heavy talk about home automation. He has a web front end that lets him interact with all his technology, and his house tweets most of the events that it runs. The web interface has an iPhone front end, so he can change the temperature or turn off lights while on the go. I have always been a real home automation geek. When I was growing up, my dad loved playing with an X-10 system that we had in the house. I am really interested in playing with some of this stuff when I have my own place, mostly looking at ways I could use it to cut waste in my energy usage.

Mike Subelsky, Ruby for Startups: Battle Scars and Lessons Learned
* You Ain’t Gonna Need It (YAGNI): don’t worry about being super-scalable at the beginning…
* In the early days, focus on learning more about what you’re building and what your customers want; concentrate on the first 80% solution.
* Don’t over-build, over-design, or over-engineer.
* Eventually plan to move everything out of your web request; build it so that this will be easy to do in the future, but it isn’t worth doing at first (delayed_job, EM, etc.).
* Make careful use of concurrency; prefer processes communicating via messages (SQS, etc.). If you are doing threading in your software, EM is your friend.
* Avoid touching your RDBMS when you are storing non-critical data:
– Store large text blobs in S3, messages across SQS, and tons of logging in SDB.
* Don’t test all the time at the beginning; it gets in the way of exploration… Things that are mission-critical maybe should be BDD’d, as they will be the most stable and least likely to change parts of your code.

Mike posted his slides on his blog, Ruby for Startups.

Jeremy Hinegardner, Playing nice with others. — Tools for mixed language environments

Jeremy wanted to show how easy it is to work with a system that uses multiple languages. He brought up that most projects in the room utilize more than one language, and that this will become more common as systems grow in complexity. He looked at a lot of queues, key-value stores, and cache-like layers that can be talked to from a variety of languages. He then showed some code that quickly demonstrated how easy it is to work with some of these tools. Extra points because he talked about Beanstalkd, which I personally think is really cool. I think nearly everyone is starting to look at work queues, messaging systems, and non-standard databases for their projects, and this was a good overview of the options that are out there.

Yukihiro Matsumoto (Matz), Keynote and Q&A
Matz gave a talk about why we, as a community, love Ruby. There weren’t really any takeaways specifically about Ruby code, but more about the community and why Ruby is fun. He spent a good amount of time talking about Quality Without A Name (QWAN). More interesting than the talk was the Q&A session. I thought the most interesting question was why Ruby isn’t on Git yet. He said the team doesn’t have time to convert all the tools they use from SVN to Git. He also mentioned that the Git mirror tracking SVN stays very close to the SVN master and is a good way to work on the Ruby code.

Evan Light, TDD: More than just “testing”
Evan first covered that the tools we as a community keep getting excited about aren’t really what matters; what matters is TDD technique. After discussing why tools aren’t as important for a while, Evan began live coding with the audience, something I thought was pretty impressive, as it would be difficult to do. It made for a weird pair-programming exercise, with the entire audience trying to drive, which sometimes worked well and sometimes led to conflicting ideas and discussion (which made for interesting debate). It was overall a really interesting session, but it is hard to pull out specific tidbits of wisdom from the talk.

Jake Scruggs, What’s the Right Level of Testing?
I have known of Jake for a while from his work on the excellent metric_fu gem. Jake explored what the right level of testing for a project is, based on his experience on his last nine projects over the years. He explored what worked, what didn’t, and what sometimes works, depending on the people and the project. I think it comes to this conclusion: what works for one project won’t work for all projects, but having some testing and getting the team on a similar testing goal will make things much better. He also stressed the importance of metrics along with testing (really? From the metric_fu guy? Haha). If testing is old and broken, causing team backlash, low morale, and gridlock, it might be better to lessen the testing burden or throw away difficult-to-maintain tests. Getting rid of them and getting them out of the way might be worth more than the value the tests were providing. In general, he isn’t big into view testing, and he likes to avoid slow testing. He likes to have a small ‘smoke screen’ of integration tests to help verify the system is all working together. In the end, what is the right level of testing for a project? The answer: what level of assurance does the given project really need? In a start-up you probably don’t need a huge level of assurance; speed and market feedback matter more. If you’re building parts for a rocket or medical devices, it is entirely different.

I enjoyed this talk quite a bit, and it inspired me to fix our broken metric_fu setup and start tracking our project metrics again. Jake also wrote a good roundup of LSRC.

Corey Donohoe @atmos, think simple
Corey gave quick little thoughts and ideas about how to stay productive, happy, learn more, do more, fail less, and keep things simple and interesting… Honestly, with something like 120+ slides, I can’t even begin to summarize this talk. I checked around and couldn’t find his slides online, but they couldn’t really do the talk justice anyway. Keep your eyes peeled for the video, as it was a fun talk, which I enjoyed. Until then, here is a post he made about heading to LSRC.

Joseph Wilk, Outside-in development with Cucumber
Cucumber is something I keep hearing and reading about, but I haven’t really gotten a chance to play with it myself. Joseph’s talk was a good split between a quick intro to Cucumber and diving in deeper to actually show testing examples and how it worked. From the talk, it sounded to me like Cucumber is mostly a DSL for conversation between the customer and the developer/tester; I don’t know if that is how others would describe it. I thought Cucumber was an odd mix of English and Ruby, but it helps effectively tell a story. Since returning from LSRC, I have started working on my first Cucumber test.

Yehuda Katz, Bundler
This was just a lightening talk about Bundler, which I had read about briefly online. Seeing the work that was done for this blew me away. I can honestly say I hope this takes over the Ruby world. We have been dealing with so many problems related to gems at Devver, and if Bundler becomes a standard, it would make the Ruby community a better place. I am really excited about this so go check out the Bundler project now.

Rich Kilmer, Encoding Domains
The final keynote of the event was about encoding domains. I didn’t really know what to expect going into this talk, but I was happily surprised. Rich talked about really encapsulating a domain in Ruby and then being able to make the entire programming logic much simpler. He gave compelling examples of working with knowledge workers in the field and just writing code with them to express their domain of knowledge in Ruby code. Live coding the domain with experts is what he jokingly called “syntax-driven development”: you write code with them until it doesn’t raise syntax errors. Rich spoke energetically and kept a tiring audience paying attention to his stories about projects he has worked on throughout the years. Just hearing from people who have created successful projects and have been working with Ruby in industry this long is interesting. There were great little pieces of knowledge shared during the talk, but again, this was a talk where it was too hard to pull out tiny bits of information, so I recommend looking for the video when it is released.

Final Thoughts
LSRC was a good time beyond hearing all the speakers. In fact, like most conferences, some of the best knowledge sharing happened during breaks, at dinner, and in the evenings. It also gave me a chance to get to know some of the community better than as faceless Twitter avatars. It was fun to talk with Ruby people about things that had nothing to do with Ruby. I am also interested in possibly living in Austin at some point in my life, so it was great to check it out a bit. Friday night after the conference, I went out with a large group of Rubyists to Ruby’s BBQ, which was delicious. We ate outside with good food, good conversation, and live music playing next door. As we were leaving, someone pointed out that the guitarist playing next door was Jimmie Vaughan, brother of the even more famous Stevie Ray Vaughan. We went over to listen to the show and have a beer, which quickly changed into political speeches and cheers. Suddenly I realized we were at a libertarian political rally. I never expected to end up at a Texan political rally with a bunch of Rubyists, but I had a good time.

Hopefully the next Ruby conference I attend will be as enjoyable as LSRC was. Congrats to everyone who helped put the conference together and to all those who attended the event and made it worthwhile.

Written by DanM

September 3, 2009 at 3:58 pm

A command-line prompt with timeout and countdown

Have you ever started a long operation and walked away from the computer, and come back half an hour later only to find that the process is hung up waiting for some user input? It’s a sub-optimal user experience, and in many cases it can be avoided by having the program choose a default if the user doesn’t respond within a certain amount of time. One example of this UI technique in the wild is powering off your computer – most modern operating systems will pop up a dialogue to confirm or cancel the shutdown, with a countdown until the shutdown proceeds automatically.

This article is about how to achieve the same effect in command-line programs using Ruby.

Let’s start with the end result. We want to be able to call our method like this:

puts ask_with_countdown_to_default("Do you like pie?", 30.0, false)

We pass in a question, a (possibly fractional) number of seconds to wait, and a default value. The method should prompt the user with the given question and a visual countdown. If the user types ‘y’ or ‘n’, it should immediately return true or false, respectively. Otherwise when the countdown expires it should return the default value.

Here’s a high-level implementation:

def ask_with_countdown_to_default(question, seconds, default)
  with_unbuffered_input($stdin) do
    countdown_from(seconds) do |seconds_left|
      write_then_erase_prompt(question, seconds_left) do
        wait_for_input($stdin, seconds_left % 1) do
          case char = $stdin.getc
          when ?y, ?Y then return true
          when ?n, ?N then return false
          else                  # NOOP
          end
        end
      end
    end
  end
  return default
ensure
  $stdout.puts
end                             # ask_with_countdown_to_default

Let’s take it step-by-step.

By default, *NIX terminals operate in “canonical mode”, where they buffer a line of input internally and don’t send it until the user hits RETURN. This is so that the user can do simple edits like backspacing and retyping a typo. This behavior is undesirable for our purposes, however, since we want the prompt to respond as soon as the user types a key. So we need to temporarily alter the terminal configuration.

  with_unbuffered_input($stdin) do

We use the POSIX Termios library, via the ruby-termios gem, to accomplish this feat.

def with_unbuffered_input(input = $stdin)
  old_attributes = Termios.tcgetattr(input)
  new_attributes = old_attributes.dup
  new_attributes.lflag &= ~Termios::ECHO
  new_attributes.lflag &= ~Termios::ICANON
  Termios::tcsetattr(input, Termios::TCSANOW, new_attributes)

  yield
ensure
  Termios::tcsetattr(input, Termios::TCSANOW, old_attributes)
end                             # with_unbuffered_input

POSIX Termios defines a set of library calls for interacting with terminals. In our case, we want to disable some of the terminal’s “local” features – functionality the terminal handles internally before sending input on to the controlling program.

We start by getting a snapshot of the terminal’s current configuration. Then we make a copy for our new configuration. We are interested in two flags: “ECHO” and “ICANON”. The first, ECHO, controls whether the terminal displays characters that the user has typed. The second controls canonical mode, which we explained above. After turning both flags off, we set the new configuration and yield. After the block is finished, or if an exception is raised, we ensure that the original terminal configuration is reinstated.

Now we need to arrange for a countdown timer.

    countdown_from(seconds) do |seconds_left|

Here’s the implementation:

def countdown_from(seconds_left)
  start_time   = Time.now
  end_time     = start_time + seconds_left
  begin
    yield(seconds_left)
    seconds_left = end_time - Time.now
  end while seconds_left > 0.0
end                             # countdown_from

First we calculate the wallclock time at which we should stop waiting. Then we begin looping: we yield the number of seconds left, and when the block returns we recalculate that number. We keep this up until the time has expired.

Next up is writing, and re-writing, the prompt.

      write_then_erase_prompt(question, seconds_left) do

This method is implemented as follows:

def write_then_erase_prompt(question, seconds_left)
  prompt_format = "#{question} (y/n) (%2d)"
  prompt = prompt_format % seconds_left.to_i
  prompt_length = prompt.length
  $stdout.write(prompt)
  $stdout.flush

  yield

  $stdout.write("\b" * prompt_length)
  $stdout.flush
end                             # write_then_erase_prompt

We format and print a prompt, flushing the output to ensure that it is displayed immediately. The prompt includes a count of the number of seconds remaining until the query times out. In order to give it a nice, visually consistent length, we use a fixed-width field for the countdown (“%2d”). Note that we don’t use

puts

to print the prompt – we don’t want it to advance to the next line, because we want to be able to dynamically rewrite the prompt as the countdown proceeds.

After we are done yielding to the block, we erase the prompt in preparation for the next cycle. In order to erase it, we create and output a string of backspaces (“\b”) the same length as the prompt.

Now we need a way to wait until the user types something, while still periodically updating the prompt.

        wait_for_input($stdin, seconds_left % 1) do

We pass

wait_for_input

an input stream and a (potentially fractional) number of seconds to wait. In this case we only want to wait until the next second-long “tick” so that we can update the countdown. So we pass in the remainder of dividing seconds_left by 1. E.g. if seconds_left was 5.3, we would set a timeout of 0.3 seconds. After 3/10 of a second of waiting for input, the wait would time out, the prompt would be erased and rewritten to show 4 seconds remaining, and then we’d start waiting for input again.
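That fractional-remainder arithmetic can be shown in isolation (the helper name here is an illustration, not from the article; note the usual floating-point caveat that 5.3 % 1 is only approximately 0.3):

```ruby
# The wait timeout for each tick is the fractional part of the seconds
# remaining, so the prompt redraws on (roughly) whole-second boundaries.
def tick_timeout(seconds_left)
  seconds_left % 1
end

tick_timeout(5.3)   # ~0.3: wait until the next whole-second boundary
tick_timeout(0.45)  # 0.45: less than a second left, wait out the rest
```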

Here’s the implementation of

wait_for_input

:

def wait_for_input(input, timeout)
  # Wait until input is available
  if select([input], [], [], timeout)
    yield
  end
end                             # wait_for_input

We’re using

Kernel#select

to do the waiting. The parameters to

#select

are a set of arrays – one each for input, output, and errors. We only care about input, so we pass the input stream in the first array and leave the others blank. We also pass how long to wait until timing out.

If new input is detected,

select

returns an array of arrays, corresponding to the three arrays we passed in. If it times out while waiting, it returns

nil

. We use the return value to determine whether to execute the given block or not. If there is input waiting, we yield to the block; otherwise we just return.

While it takes some getting used to, handling IO timeouts with

select

is safer and more reliable than using the

Timeout

module. And it’s less messy than rescuing

Timeout::Error

every time a read times out.

Finally, we need to read and interpret the character the user types, if any.

          case char = $stdin.getc
          when ?y, ?Y then return true
          when ?n, ?N then return false
          else                  # NOOP
          end

If the user types ‘y’ or ‘n’ (or uppercase versions of the same), we return true or false, respectively. Otherwise, we simply ignore any characters the user types. Typing characters other than ‘y’ or ‘n’ will cause the loop to be restarted.

Note the use of character literals like ?y to compare against the integer character code returned by IO#getc. We could alternately use Integer#chr to convert the character codes into single-character strings, if we wanted.
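A note for readers on newer Rubies: this article targets Ruby 1.8, where ?y evaluates to the Integer 121 and IO#getc returns an Integer. On Ruby 1.9 and later, both produce one-character strings, so the case comparison still works. A small sketch of the conversions:

```ruby
# Integer#chr maps a character code to a one-character string, and
# String#ord goes the other way.
puts 121.chr    # "y"
puts "y".ord    # 121
```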

Wrapping up, we make sure to return the default value should the timeout expire without any user input; and we output a newline to move the cursor past our prompt.

  return default

And there you have it; a yes/no prompt with a timeout and a visual countdown. Static text doesn’t really capture the effect, so rather than include sample output I’ll just suggest that you try the code out for yourself (sorry, Windows users, it’s *NIX-only).

Full source for this article at: http://gist.github.com/148765

Written by avdi

July 16, 2009 at 10:45 pm

A dozen (or so) ways to start sub-processes in Ruby: Part 2

In the previous article we looked at some basic methods for starting subprocesses in Ruby. One thing all those methods had in common was that they didn’t permit a lot of communication between parent process and child. In this article we’ll examine a few built-in Ruby methods which give us the ability to have a two-way conversation with our subprocesses.

The complete source code for this article can be found at http://gist.github.com/146199.

Method #4: Opening a pipe

As you know, the Kernel#open method allows you to open files for reading and writing (and, with addition of the open-uri library, HTTP sockets as well). What you may not know is that Kernel.open can also open processes as if they were files.

  puts "4a. Kernel#open with |"
  cmd = %Q<|#{RUBY} -r#{THIS_FILE} -e 'hello("open(|)", true)'>
  open(cmd, 'w+') do |subprocess|
    subprocess.write("hello from parent")
    subprocess.close_write
    subprocess.read.split("\n").each do |l|
      puts "[parent] output: #{l}"
    end
    puts
  end
  puts "---"

By passing a pipe (“|”) as the first character in the command, we signal to open that we want to start a process, not open a file. For a command, we’re starting another Ruby process and calling our trusty hello method (see the first article or the source code for this article for the definitions of the hello method and the RUBY and THIS_FILE constants).

open yields an IO object which enables us to communicate with the subprocess. Anything written to the object is piped to the process’ STDIN, and anything the process writes to its STDOUT can be read back as if reading from a file. In the example above we write a line to the child, read some text back from the child, and then end the block.

Note the call to close_write on line 5. This call is important. Because the OS buffers input and output, it is possible to write to a subprocess, attempt to read back, and wait forever because the data is still sitting in the buffer. In addition, filter-style programs typically wait until they see an EOF on their STDIN to exit. By calling close_write, we cause the buffer to be flushed and an EOF to be sent. Once the subprocess exits, its output buffer will be flushed and any read calls on the parent side will return.

Also note that we pass “w+” as the file open mode. Just as with files, by default the IO object will be opened in read-only mode. If we want to both write to and read from it, we need to specify an appropriate mode.

Here’s the output of the above code:

4a. Kernel#open with |
[child] Hello, standard error
[parent] output: [child] Hello from open(|)
[parent] output: [child] Standard input contains: "hello from parent"

---

Another way to open a command as an IO object is to call IO.popen:

  puts "4b. IO.popen"
  cmd = %Q<#{RUBY} -r#{THIS_FILE} -e 'hello("popen", true)'>
  IO.popen(cmd, 'w+') do |subprocess|
    subprocess.write("hello from parent")
    subprocess.close_write
    subprocess.read.split("\n").each do |l|
      puts "[parent] output: #{l}"
    end
    puts
  end
  puts "---"

This behaves exactly the same as the Kernel#open version. Which one you use is a matter of preference; the IO.popen version arguably makes it a little more obvious what is going on.
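To see why close_write matters with a real filter, here’s a minimal sketch driving the standard sort(1) command (assumes a *NIX system with sort on the PATH):

```ruby
# Feed unsorted lines to sort(1), signal EOF with close_write, read the result.
IO.popen("sort", "w+") do |subprocess|
  subprocess.puts "banana"
  subprocess.puts "apple"
  subprocess.close_write   # flush the buffer and send EOF so sort can run
  puts subprocess.read     # "apple\nbanana"
end
```

Without the close_write, sort never sees EOF and the read blocks forever – exactly the deadlock described above.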

Method #5: Forking to a pipe

This is a variation on the previous technique. If Kernel#open is passed a pipe followed by a dash (“|-“) as its first argument, it starts a forked subprocess. This is like the previous example except that instead of executing a command, it forks the running Ruby process into two processes.

  puts "5a. Kernel#open with |-"
  open("|-", "w+") do |subprocess|
    if subprocess.nil?             # child
      hello("open(|-)", true)
      exit
    else                        # parent
      subprocess.write("hello from parent")
      subprocess.close_write
      subprocess.read.split("\n").each do |l|
        puts "[parent] output: #{l}"
      end
      puts
    end
  end
  puts "---"

Both processes then execute the given block. In the child process, the argument yielded to the block will be nil. In the parent, the block argument will be an IO object. As before, the IO object is tied to the forked process’ standard input and standard output streams.

Here’s the output:

5a. Kernel#open with |-
[child] Hello, standard error
[parent] output: [child] Hello from open(|-)
[parent] output: [child] Standard input contains: "hello from parent"

---

Once again, there is an IO.popen version which does the same thing:

  puts "5b. IO.popen with -"
  IO.popen("-", "w+") do |subprocess|
    if subprocess.nil?             # child
      hello("popen(-)", true)
      exit
    else                        # parent
      subprocess.write("hello from parent")
      subprocess.close_write
      subprocess.read.split("\n").each do |l|
        puts "[parent] output: #{l}"
      end
      puts
    end
  end
  puts "---"

Applications and Caveats

The techniques we’ve looked at in this article are best suited for “filter” style subprocesses, where we want to feed some input to a process and then use the output it produces. Because of the potential for deadlocks mentioned earlier, they are less suitable for running highly interactive subprocesses which require multiple reads and responses.

Neither open nor popen gives us access to the subprocess’ standard error (STDERR) stream. Any error output the subprocess generates will go to the same place as the parent process’ STDERR.
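In the meantime, one blunt workaround – a sketch that relies on the shell, so it only applies to the command form, not the fork form – is to merge STDERR into STDOUT before the pipe sees it:

```ruby
# The shell's 2>&1 redirects the child's stderr into its stdout, so both
# streams come back through the pipe.
cmd = %q{ruby -e '$stderr.puts "error line"; puts "output line"' 2>&1}
IO.popen(cmd, "r") do |subprocess|
  puts subprocess.read    # contains both lines
end
```

This loses the distinction between the two streams, which is why the libraries covered later are the better answer.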

In the upcoming parts of the series we’ll look at some libraries which overcome both of these limitations.

Conclusion

In this article we’ve explored two (or four, depending on how you count it) built-in ways of starting a subprocess and communicating with it as if it were a file. In part 3 we’ll move away from built-ins and on to the facilities provided in Ruby’s Standard Library for starting and controlling subprocesses.

Written by avdi

July 13, 2009 at 4:29 pm

Posted in Ruby, Tips & Tricks

Tagged with ,

A dozen (or so) ways to start sub-processes in Ruby: Part 1

Introduction

It is often useful in Ruby to start a sub-process to run a particular chunk of Ruby code. Perhaps you are trying to run two processes in parallel, and Ruby’s green threading doesn’t provide sufficient concurrency. Perhaps you are automating a set of scripts. Or perhaps you are trying to isolate some untrusted code while still getting information back from it.

Whatever the reason, Ruby provides a wealth of facilities for interacting with sub-processes, some better known than others. In this series of articles I will be focusing on running Ruby as a sub-process of Ruby, although many of the techniques I’ll be demonstrating are applicable to running any type of program in a sub-process. I’ll also be keeping the focus on UNIX-style platforms, such as Linux and Mac OS X. Sub-process handling on Windows differs significantly, and we’ll leave that for another series.

In the first and second articles, I’ll demonstrate some of the facilities for starting sub-processes that Ruby possesses out-of-the-box, no requires needed. In the third article we’ll look at some tools provided in Ruby’s Standard Library which build on the methods introduced in part one. And in the fourth installment I’ll briefly survey a few of the many Rubygems which simplify sub-process interactions.

Getting Started

To begin, let’s define a few helper methods and constants which we’ll refer back to throughout the series. First, let’s define a simple method which will serve as our “slave” code – the code we want to execute in a sub-process. Here it is:

def hello(source, expect_input)
  puts "[child] Hello from #{source}"
  if expect_input
    puts "[child] Standard input contains: \"#{$stdin.readline.chomp}\""
  else
    puts "[child] No stdin, or stdin is same as parent's"
  end
  $stderr.puts "[child] Hello, standard error"
end

(Note: The full source code for this article can be found at http://gist.github.com/137705)

This method prints a message to the standard output stream, a message to the standard error stream, and optionally reads and prints a message from the standard input stream. One of the things we’ll be exploring in this series is the differing ways in which the various sub-process-starting methods handle standard I/O streams.

Next, let’s define a couple of helpful constants.

require 'rbconfig'
THIS_FILE = File.expand_path(__FILE__)

RUBY = File.join(Config::CONFIG['bindir'], Config::CONFIG['ruby_install_name'])

The first, THIS_FILE, is simply the fully-qualified name of the file containing our demo source code. RUBY, the second constant, is set to the fully-qualified path of the running Ruby executable. These constants will come in handy with sub-process methods which require an explicit shell command to be run.

In order to make the order of events clearer, we’ll force the standard output stream into synchronised mode. This will cause it to flush its buffer after every write.

$stdout.sync = true

Finally, we’ll be surrounding all of the code which follows in the following protective IF-statement:

if $PROGRAM_NAME == __FILE__
# ...
end

This will ensure that the demo code won’t be re-executed when we require the source file within sub-processes.

Method #1: The Backtick Operator

The simplest way to execute a sub-process in Ruby is with the backtick (`) operator. This method, which harks back to Bourne Shell scripting and Perl, is concise and often gives us exactly as much interaction as we need with a sub-process. The backtick, while it may look like a part of Ruby’s core syntax, is technically an operator defined by Kernel. Like most Ruby operators it can be redefined in your own code, although that’s beyond the scope of this article. Kernel defines the backtick operator as a method which executes its argument in a subshell.

puts "1. Backtick operator"
output = `#{RUBY} -r#{THIS_FILE} -e'hello("backticks", false)'`
output.split("\n").each do |line|
  puts "[parent] output: #{line}"
end
puts

Here, we use backticks to execute a child Ruby process which loads our demo source code and executes the hello method. This yields:

1. Backtick operator
[child] Hello, standard error
[parent] output: [child] Hello from backticks
[parent] output: [child] No stdin, or stdin is same as parent's

The backtick operator doesn’t return until the command has finished. The sub-process inherits its standard input and standard error streams from the parent process. The process’ ending status is made available as a Process::Status object in the $? global (aka $CHILD_STATUS if the English library is loaded).

We can use the %x operator as an alternate syntax for backticks, which enables us to select arbitrary delimiters for the command string. E.g. %x{echo `which cowsay`}.
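A quick sketch of inspecting $? after a backtick call, using a throwaway child that exits with a known status:

```ruby
# Run a child that exits with status 3, then examine $?.
`ruby -e 'exit 3'`
status = $?              # a Process::Status object
puts status.exitstatus   # 3
puts status.success?     # false (nonzero exit means failure)
```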

Method #2: Kernel#system

Kernel#system is similar to the backtick operator in operation, with one important difference. Where the backtick operator returns the STDOUT of the finished command, system returns a Boolean value indicating the success or failure of the command. If the command exits with a zero status (indicating success), system will return true. Otherwise it returns false.

puts "2. Kernel#system"
success = system(RUBY, "-r", THIS_FILE, "-e", 'hello("system()", false)')
puts "[parent] success: #{success}"
puts

This results in:

2. Kernel#system
[child] Hello from system()
[child] No stdin, or stdin is same as parent's
[child] Hello, standard error
[parent] success: true

Just like the backtick operator, system doesn’t return until its process has exited, and leaves the process exit status in $?. The sub-process inherits the parent process’ standard input, output, and error streams.

As we can see in the example above, when system() is given multiple arguments, the first is treated as the command and the rest are passed to it directly as arguments, without any shell interpretation. This sidesteps quoting issues and can make system() a little more convenient than backticks for executing complex commands. For this reason, and because it’s more visually apparent in the code, I prefer to use Kernel#system over backticks unless I need to capture the command’s output. Note that there are some other ways system() can be called; see the Kernel#exec documentation for the details.
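A sketch of both points – the Boolean return value, and the fact that the multi-argument form bypasses the shell so metacharacters stay literal:

```ruby
# system returns true for a zero exit status, false otherwise.
ok  = system("ruby", "-e", "exit 0")
bad = system("ruby", "-e", "exit 1")
puts ok    # true
puts bad   # false

# No shell is involved in the multi-argument form, so "$HOME" is not expanded:
system("ruby", "-e", "puts ARGV.first", "--", "$HOME")   # prints: $HOME
```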

Method #3: Kernel#fork (aka Process.fork)

Ruby provides access to the *NIX fork() system call via Kernel#fork. On UNIX-like OSes, fork splits the currently executing Ruby process in two. Both processes run concurrently and independently from that point on. Unlike the methods we’ve examined so far, fork enables us to execute in-line Ruby code in a sub-process, rather than explicitly starting a new Ruby interpreter and telling it to load our code.

Traditionally we would need to put in some conditional code to examine the return value of fork and determine whether the code was executing in the parent or child process. Ruby makes it easy to specify what code should be run in the child by allowing us to pass a block to fork. The contents of the block will be run in the child process, after which it will exit. The parent will continue running at the point where the block ends.

puts "3. Kernel#fork"
pid = fork do
hello("fork()", false)
end
Process.wait(pid)
puts "[parent] pid: #{pid}"
puts

This produces the following output:

3. Kernel#fork
[child] Hello from fork()
[child] No stdin, or stdin is same as parent's
[child] Hello, standard error
[parent] pid: 19935

Note the call to Process.wait. Since the process spawned by fork runs concurrently with the parent process, we need to explicitly wait for the child process to finish if we want to synchronize with it. We use the child process ID, returned by fork, as the argument to Process.wait.

The sub-process inherits its standard error and output streams from the parent. Since fork is a *NIX-only syscall, it will only reliably work on UNIX-style systems.
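A defensive sketch that guards fork behind a feature check and recovers the child’s exit status from $? after waiting:

```ruby
if Process.respond_to?(:fork)
  pid = fork { exit 7 }   # the block runs only in the child, which then exits
  Process.wait(pid)       # reap the child and populate $?
  puts $?.exitstatus      # 7
else
  warn "fork is not supported on this platform"
end
```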

Conclusion

In this first installment in the Ruby Sub-processes series we’ve looked at three of the simplest ways to start another Ruby process from inside a Ruby program. Stay tuned for part 2, in which we’ll delve into some methods for doing more complex communication with spawned sub-processes.

Written by avdi

June 30, 2009 at 8:57 am

Posted in Ruby, Tips & Tricks

Tagged with ,

Spellcheck your files with Aspell and Rake

We recently redid our website. The new site included a new design and much more content explaining what we do. We wanted a quick way to check over everything and make sure we didn’t miss any spelling errors or typos. First I started looking for a web service that could scan the site for spelling errors. I found spellr.us, which is nice but would only catch errors once they were live. It also can’t scan all of the pages which require being logged in.

I was pairing with Avdi who thought we should just run Aspell, which worked out great. We were originally trying to just create a simple Emacs macro to go through all our HTML files and check them but in the end created simple Rake tasks, which makes it really easy to integrate spellcheck into CI. After Avdi figured out the commands we needed to use on each file to get the information we needed from Aspell, it was easy to just wrap the command using Rake’s FileList. To keep everyone on the same setup, we created a local dictionary of words to ignore or accept and keep that checked into source control as well.

The final solution grabs all the files you want to spellcheck, then runs them through Aspell with HTML filtering. We have two tasks: one that runs in interactive mode so the user can fix mistakes, and one for CI that simply fails if it finds any errors.

def run_spellcheck(file,interactive=false)
  if interactive
    cmd = "aspell -p ./config/devver_dictionary -H check #{file}"
    puts cmd
    system(cmd)
    [true,""]
  else
    # In list mode Aspell reads the file on STDIN and prints misspelled words
    cmd = "aspell -p ./config/devver_dictionary -H list < #{file}"
    puts cmd
    results = `#{cmd}`
    [results.empty?, results]
  end
end

task :spellcheck => 'spellcheck:interactive'

namespace :spellcheck do
  files = FileList['app/views/**/*.html.erb']

  desc "Spellcheck interactive"
  task :interactive do
    files.each do |file|
      run_spellcheck(file,true)
    end
    puts "spelling check complete"
  end

  desc "Spellcheck for ci"
  task :ci do
    files.each do |file|
      success, results = run_spellcheck(file)
      unless success
        puts results
        exit 1
      end
    end
    puts "no spelling errors"
    exit 0
  end
end

view this gist

Written by DanM

May 26, 2009 at 8:33 am