The Devver Blog

A Boulder startup improving the way developers work.

Archive for November 2008

Building an iPhone web app in under 50 lines with Sinatra and iUI

One awesome thing about the iPhone is that it can display documents very nicely, including Word, Excel, and PDF files. However, the other day I was complaining that it’s not very easy to view the documents you store on your computer on your iPhone. Sure, you could email them to yourself, but then you have to search through your mail on your iPhone to find your documents. And I’m sure there is a snazzy iPhone app from the App Store to do this as well. But instead, let’s build a quick web app using Sinatra and iUI.

Here’s what we’ll be building (screenshot courtesy of iPhoney, which rules, by the way):

A screenshot of the butler iPhone web app. It doesn’t look like much, but hey, it’s less than 50 lines of code.

When you click on git-tutorial.pdf, you’ll see the full document.

Sinatra is a really awesome minimalist web framework. It lets you build web applications with just a few lines of code. iUI is a collection of JavaScript, CSS, and images that lets you easily make your web sites look great on the iPhone. Using these two tools, it’s really easy to build simple iPhone apps.

To begin, install the Sinatra gem:

$ gem install sinatra

Now, let’s start with the simplest version of our app, which we’ll call ‘butler’. Let’s make a directory for butler.

$ mkdir butler
$ cd butler
$ touch butler.rb

Open up butler.rb in your favorite editor and type:

require 'sinatra'
get "/" do
  "<h1>Your files, sir.</h1>"
end

Now start butler on your command line:

$ ruby -rubygems ./butler.rb

and point your browser to http://localhost:4567 (you can use your computer’s browser or the one on your iPhone – it doesn’t matter. I find it’s better to use the one on my computer while building the app, since it’s easier to read Sinatra’s debugging messages if something goes wrong). You should see a page that just says “Your files, sir.” Congrats! You’ve made your first Sinatra app. Wasn’t that easy?

OK, let’s make butler a little more useful. Sinatra will serve up any files in a subdirectory named public. Since we’ll eventually be using this public directory for holding other JavaScript and CSS files as well, we’ll actually put our files in ./public/files. We’ll also make a link for convenience. Finally, while we’re at it, let’s put a few test files in there.

$ mkdir -p public/files
$ ln -s public/files files
$ echo "foo" > public/files/foo.txt
$ echo "bar" > public/files/bar.txt 

We want butler to link to each file, so let’s build a little helper for that. In Sinatra, you can define helpers within a helpers block. We’ll also try out our helper on one file.

require 'sinatra'
require 'pathname'

get "/" do
  html = "<h1>Your files, sir.</h1>"
  dir = "./files/"
  html += file_link("./files/foo.txt")
  html
end

helpers do
  def file_link(file)
    filename = Pathname.new(file).basename
    "<a href='#{file}'>#{filename}</a><br />"
  end
end

Go refresh your browser to see the changes. There’s no need to restart your application, because Sinatra automatically reloads changes (very cool!). You should see a link to foo.txt. Click on it, and you’ll see the contents.

Clearly, we don’t want to hardcode this for just one file. Let’s alter butler to look for every file within the ./files directory.

require 'sinatra'
require 'pathname'

get "/" do
  html = "<h1>Your files, sir.</h1>"
  dir = "./files/"
  Dir[dir+"*"].each do |file|
    html+=file_link(file)
  end
  html
end

helpers do

  def file_link(file)
    filename = Pathname.new(file).basename
    "<a href='#{file}'>#{filename}</a><br />"
  end

end

OK, refresh your browser and you should see both foo.txt and bar.txt. This is looking pretty good, but we’re not really creating valid HTML right now. We’re missing html, head, and body tags at the very least. We could add this all within our “get” handler, but that would clutter up the code.

Instead, let’s put this code into a view. Sinatra actually lets you put the view right after your other code, so you can build an entire application in one file. For simplicity, I’m going to do that for this tutorial. However, if this approach bothers you (or just messes up syntax highlighting in your editor), rest assured you can place the view code in a views directory and it would work the same way.
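
If you do go the separate-file route, the setup is simple (a minimal sketch, assuming Sinatra’s default views directory):

$ mkdir views
$ touch views/index.erb

The erb :index call we’ll use in the handler below will then render ./views/index.erb instead of the in-file template.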

Let’s add the view to the end of our file and use it in our handler. Notice that I name the view ‘index’ by beginning my declaration with @@ index – if I wanted a separate file, I would just put it in ./views/index.erb (you can also use Haml, if that’s your cup of tea). Note that @links is assigned in the handler and is automatically available in the view.

require 'sinatra'
require 'pathname'

get "/" do
  dir = "./files/"
  @links = Dir[dir+"*"].map { |file|
    file_link(file)
  }.join
  erb :index
end

helpers do

  def file_link(file)
    filename = Pathname.new(file).basename
    "<a href='#{file}'>#{filename}</a><br />"
  end

end

use_in_file_templates!

__END__

@@ index
<html>
  <head>
  </head>

  <body>
    <h1>Your files, sir.</h1>
    <%= @links %>
  </body>
</html>

Refreshing the browser now isn’t really that exciting, since things look the same, but if you wanted, you could easily play around with the view to make things look different.

One glaring problem is that this page isn’t very usable on the iPhone itself. That’s where iUI comes in. Start by downloading it (URL is in instructions below) to your butler directory, unzipping it, and copying the necessary files into your public directory.

$ mkdir iui
$ cd iui
$ wget http://iui.googlecode.com/files/iui-0.13.tar.gz
$ tar -xzvf iui-0.13.tar.gz
$ cd ..
$ mkdir public/images
$ cp iui/iui/*.png public/images
$ cp iui/iui/*.gif public/images
$ mkdir public/javascripts
$ cp iui/iui/*.js public/javascripts
$ mkdir public/stylesheets
$ cp iui/iui/*.css public/stylesheets

To use iUI, you’ll need to include the JavaScript and CSS in your view. You’ll also need to add some elements to the body of your view. When you’re done, the view will look like this:

<html>
  <head>
    <meta name="viewport" content="width=320; initial-scale=1.0; maximum-scale=1.0; user-scalable=0;"/>
    <style type="text/css" media="screen">@import "/stylesheets/iui.css";</style>
    <script type="application/x-javascript" src="/javascripts/iui.js"></script>
  </head>

  <body>
    <div class="toolbar">
    <h1 id="pageTitle"></h1>
    </div>
    <ul id="home" title="Your files, sir." selected="true">
       <%= @links %>
    </ul>
  </body>

</html>

This HTML is probably a bit confusing, but don’t worry. There are a few examples in ./iui/samples/ to learn from (and good iUI tutorials on the web). Finally, you’ll want to alter the file_link helper to print out iUI code, like so:

helpers do

  def file_link(file)
    filename = Pathname.new(file).basename
    "<li><a href='#{file}' target='_self'>#{filename}</a></li>"
  end

end

Note the target='_self' attribute. You need it to get iUI to open a link in the normal way. If you leave it off, iUI will use an AJAX call to load the file within the current page, which looks really funny when you try to open a binary file like a PDF.

The final code looks like this:

require 'sinatra'
require 'pathname'

get "/" do
  dir = "./files/"
  @links = Dir[dir+"*"].map { |file|
    file_link(file)
  }.join
  erb :index
end

helpers do

  def file_link(file)
    filename = Pathname.new(file).basename
    "<li><a href='#{file}' target='_self'>#{filename}</a></li>"
  end

end

use_in_file_templates!

__END__

@@ index
<html>
  <head>
    <meta name="viewport" content="width=320; initial-scale=1.0; maximum-scale=1.0; user-scalable=0;"/>
    <style type="text/css" media="screen">@import "/stylesheets/iui.css";</style>
    <script type="application/x-javascript" src="/javascripts/iui.js"></script>
  </head>

  <body>
    <div class="toolbar">
    <h1 id="pageTitle"></h1>
    </div>
    <ul id="home" title="Your files, sir." selected="true">
       <%= @links %>
    </ul>
  </body>

</html>

And there you have it – an iPhone web app in less than 50 lines of code, thanks to Sinatra and iUI. Now, whenever you want to view some files on your iPhone, either copy the file:

$ cp path/to/my_file ./files

or if you prefer, link it:

$ ln -s path/to/my_file ./files

… and then run butler

$ ruby -rubygems ./butler.rb

Figure out the IP address of your computer and simply point your iPhone browser to http://<ip>:4567
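
One quick way to find that address on OS X (assuming your Mac is on the AirPort interface, usually en1; a wired connection is usually en0) is:

$ ipconfig getifaddr en1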

I use butler primarily within my home network, but if you want to be able to view your files on the go, you’ll need to poke a hole in your firewall. That’s a bit outside the scope of this tutorial, but a quick Google search should give you some good results.

Enjoy!

Update: Removed an unused parameter from the code after pmccann called it to my attention.
Update: Added -z option to tar after Peter pointed out the omission. The tar command without -z worked for me on OS X 10.5, but this is definitely more correct.
Update: Added -rubygems option to ruby command. If you’d prefer to not use this option, check the comments below for ways to use RubyGems in a Ruby script.

Written by Ben

November 25, 2008 at 9:14 am

Posted in Hacking, Tools

Ruby Beanstalkd distributed worker intermediate lessons

This post is a follow-up to Ruby beanstalkd basics, and I will try to make the example code a little more interesting and useful. I am calling this a Ruby beanstalkd intermediate write-up: it sets up a few workers and distributes jobs and receives results simultaneously. In this example the code resembles real code a bit more (using a queue cache and block passing). If there is enough interest in the Ruby/beanstalkd community, I will follow up with beanstalkd advanced lessons and go into how we deal with failure cases such as workers dying during jobs, random jobs failing, processing multiple ‘projects’ at one time, using job priority settings, and using TTR/timeouts.

So in this example we are making an estimate of pi. Yes, I know that there are far better approximations out there than my simple results, but this was what I came up with for an incredibly simple distributed computing problem. I based my example on the Pi Calculation problem from an Introduction to Parallel Computing. The basic idea is that you can estimate pi by picking random points in a square and then seeing how many of them fall inside a circle that fits inside the square (pi = 4 * points_in_circle / total_points).
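
To make the formula concrete, here is a tiny standalone sketch of the same estimate in plain Ruby (no beanstalkd involved; the distributed version below just splits this loop across workers):

points = 100_000
in_circle = 0
points.times do
  # pick a random point in a unit square centered at the origin
  x = rand - 0.5
  y = rand - 0.5
  # the inscribed circle has radius 0.5, so radius**2 is 0.25
  in_circle += 1 if x**2 + y**2 <= 0.25
end
puts 4.0 * in_circle / points # prints something close to 3.14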

I made a bunch of comments in the code that should help you follow along, but there are a few key sections worth pointing out.

In Ruby beanstalkd basics, both the server and the clients only used one queue at a time. Now, since we are sending on one queue while also listening on another, we need access to both queues at once. We simply have a helper function with a queue_cache to make getting and reusing multiple queues incredibly easy.

def get_queue(queue_name)
    @queue_cache ||= {}
    if @queue_cache.has_key?(queue_name)
      return @queue_cache[queue_name]
    else
      queue = Beanstalk::Pool.new(["#{SERVER_IP}:#{DEFAULT_PORT}"])
      queue.watch(queue_name)
      queue.use(queue_name)
      queue.ignore('default')
      @queue_cache[queue_name] = queue
      return queue
    end
  end

In the basic example each class had a function that got a job and did some work and deleted the job. It is easy to imagine workers that might have many different kinds of work to do on jobs. In every case they are going to grab a job, work on the job, and delete the job. We decided to break that up and make it easy to just pass a work block when workers get a job.

def take_msg(queue)
    msg = queue.reserve
    #by calling ybody we get the content of the message and convert it from yml
    body = msg.ybody
    if block_given?
      yield(body)
    end
    msg.delete
  end

#call take_msg like so
take_msg(queue) do |body|
  #work on body
end

One other thing you should keep an eye out for in the code below is checking whether a queue has any jobs. Many times workers will check if jobs exist and take them, and if there aren’t any jobs the process is free to do something else. I do this in this example: the server continually checks for incoming results so it can display them immediately. If no results have arrived yet, the server continues sending out job requests as fast as it can. This is useful since taking jobs from beanstalkd is a blocking call. (Support for non-blocking calls was added in beanstalkd 1.1, but I haven’t started using the newest version yet.) I think everything else should be pretty self-explanatory; feel free to ask me any questions. Running the code is the same as before: download beanstalk_intermediate.rb, start beanstalkd, and run the example with ruby.

$ beanstalkd &
$ ruby beanstalk_intermediate.rb
starting distributor
starting client(s)
distributor sending out 100 jobs
.......................................................
.............................................
received all the results our estimate for pi is: 3.142776
# of workers: time to complete
1: real 0m7.282s, user 0m4.114s, sys 0m0.978s
2: real 0m5.667s, user 0m2.736s, sys 0m0.670s
3: real 0m4.999s, user 0m2.014s, sys 0m0.515s
4: real 0m4.612s, user 0m1.608s, sys 0m0.442s
5: real 0m4.517s, user 0m1.474s, sys 0m0.416s

require 'beanstalk-client.rb'

DEFAULT_PORT = 11300
SERVER_IP = '127.0.0.1'
#beanstalk will order jobs based on priority; with the same priority
#it acts FIFO. In a later example we will use the priority
#(lower numbers are higher priority)
DEFAULT_PRIORITY = 65536
#TTR is time for the job to reappear on the queue.
#Assuming a worker died before completing work and never called job.delete
#the same job would return back on the queue (in TTR seconds)
TTR = 3

class BeanBase

  #To work with multiple queues you must tell beanstalk which queues
  #you plan on writing to (use), and which queues you will reserve jobs from
  #(watch). In this case we also want to ignore the default queue.
  #You need a different queue object for each tube you plan on using, or
  #you can keep switching what a single queue watches and uses; we just keep
  #a few queues open on the tubes we want.
  def get_queue(queue_name)
    @queue_cache ||= {}
    if @queue_cache.has_key?(queue_name)
      return @queue_cache[queue_name]
    else
      queue = Beanstalk::Pool.new(["#{SERVER_IP}:#{DEFAULT_PORT}"])
      queue.watch(queue_name)
      queue.use(queue_name)
      queue.ignore('default')
      @queue_cache[queue_name] = queue
      return queue
    end
  end

  #this will take a message off the queue, and process it with the block
  def take_msg(queue)
    msg = queue.reserve
    #by calling ybody we get the content of the message and convert it from yml
    body = msg.ybody
    if block_given?
      yield(body)
    end
    msg.delete
  end

  def results_ready?(queue)
    queue.peek_ready!=nil
  end

end

class BeanDistributor < BeanBase

  def initialize(chunks,points_per_chunk)
    @chunks = chunks
    @points_per_chunk = points_per_chunk
    @messages_out = 0
    @circle_count = 0
  end

  def get_incoming_results(queue)
    if(results_ready?(queue))
      result = nil
      take_msg(queue) do |body|
        result = body.count
      end
      @messages_out -= 1
      print "." #display that we received another result
      @circle_count += result
    else
      #do nothing
    end
  end

  def start_distributor
    request_queue = get_queue('requests')
    results_queue = get_queue('results')
    #put all the work on the request queue
    puts "distributor sending out #{@messages} jobs"
    @chunks.times do |num|
      msg = BeanRequest.new(1,@points_per_chunk)
      #Take our ruby object and convert it to yml and put it on the queue
      request_queue.yput(msg,pri=DEFAULT_PRIORITY, delay=0, ttr=TTR)
      @messages_out += 1
      #if there are results get them if not continue sending out work
      get_incoming_results(results_queue)
    end

    while @messages_out > 0
      get_incoming_results(results_queue)
    end
    npoints = @chunks * @points_per_chunk
    pi = 4.0*@circle_count/(npoints)
    puts "\nreceived all the results our estimate for pi is: #{pi}"
  end

end

class BeanWorker < BeanBase

  def initialize()
  end

  def write_result(queue, result)
    msg = BeanResult.new(1,result)
    queue.yput(msg,pri=DEFAULT_PRIORITY, delay=0, ttr=TTR)
  end

  def in_circle
    #generate 2 random numbers see if they are in the circle
    range = 1000000.0
    radius = range / 2
    xcord = rand(range) - radius
    ycord = rand(range) - radius
    if( (xcord**2) + (ycord**2) <= (radius**2) )
      return 1
    else
      return 0
    end
  end

  def start_worker
    request_queue = get_queue('requests')
    results_queue = get_queue('results')
    #get requests and do the work until the worker is killed
    while(true)
      result = 0
      take_msg(request_queue) do |body|
        chunks = body.count
        chunks.times { result += in_circle}
      end
      write_result(results_queue,result)
    end

  end

end

############
# These are just simple message classes that we pass around using beanstalk's
# to-yml and from-yml helpers (yput and ybody).
############
class BeanRequest
  attr_accessor :project_id, :count
  def initialize(project_id, count=0)
    @project_id = project_id
    @count = count
  end
end

class BeanResult
  attr_accessor :project_id, :count
  def initialize(project_id, count=0)
    @project_id = project_id
    @count = count
  end
end

#how many different jobs we should do
chunks = 100
#how many points to calculate per chunk
points_per_chunk = 10000
#how many workers should we have
#(normally different machines, in our example fork them off)
workers = 5

# Most of the time you will have two entirely separate classes
# but to make it easy to run this example we will just fork and start our server
# and client separately. We will wait for them to complete and check
# if we received all the messages we expected.
puts "starting distributor"
server_pid = fork {
  BeanDistributor.new(chunks,points_per_chunk).start_distributor
}

puts "starting client(s)"
client_pids = []
workers.times do |num|
  client_pid = fork {
    BeanWorker.new.start_worker
  }
  client_pids << client_pid
end

Process.wait(server_pid)
#take down the clients
client_pids.each do |pid|
  Process.kill("HUP",pid)
end

Written by DanM

November 19, 2008 at 3:19 pm

Posted in Development, Hacking, Ruby

Shazam vs Mashup Music

I had a funny thought while listening to some music recently and decided I needed to stage a battle between music mashups and music recognition software.

Shazam is a pretty cool iPhone app that helps you recognize and find music: it uses the microphone to listen to whatever music is playing near the iPhone, then compares the few seconds of recorded sound to an online database of music data to try to determine the song. I was curious what would happen when it was given mashup music, which contains parts of many songs. Would it just be entirely confused? Would it recognize the mashup? Would it recognize the currently most prominent sample in the mashup?

First let’s test it on some normal songs and check the accuracy (played on iTunes).

Artist, Song Title: Result
Beck, Loser: correct
Aquabats, Red Sweater!: unrecognized
Beastie Boys, Intergalactic: correct
Pearl Jam, Elderly Woman Behind the Counter in a Small Town: correct
Billy Joel, Captain Jack: correct
Flobots, Handlebars: correct
Weezer, Surf Wax America: correct

Now let’s throw it some mashups and see what happens (Shazam used at random points during each song).

Jay-Zeezer, Surf Wax Off your Shoulder rating: Good
Fails twice

identifies as Lil Jay, Dirt off your Shoulda Shurga twice (Should be Jay-Z, Dirt Off Your Shoulder, but it is basically a cover)

identifies as Weezer, Surf Wax America twice

Girl Talk, Set It Off rating: Great
Fails three times

identifies as Jay-Z, Roc Boys (correctly in song)

identifies as Mary J. Blige, Real Love (correctly in song)

identifies as Fatman Scoop, Be Faithful (correctly in song)

Girl Talk, Play Your Part (part 2) rating: Poor
Fails eight times

identifies as Huey, Pop, Lock & Drop It once (correctly in song)

Girl Talk, Shut The Club Down rating: OK
Fails two times

identifies as Dolla Feat. T-Pain and Tay Dizm, Who The F**k is That? (correctly in song (there are multiple versions of this song see below))

identifies as T-Pain Feat. Dolla & Akon, Who the F is That (correctly in song (there are multiple versions of this song see above))

identifies as Rich Boy Feat. Polow Da Don, Throw Some D’s (correctly in song)

Danger Mouse, 99 Problems (The Grey Album) rating: Awesome!
No Failures!

identifies as Jay-Z, 99 Problems (correctly in song)

identifies as Dangermouse, 99 Problems/Helter Skelter five times (This is the actual mashup)

Danger Mouse, Public Service Announcement (The Grey Album) rating: Poor

identifies as Jay-Z, Interlude three times (correct artist and album but the wrong song)

Danger Mouse, What More Can I Say (The Grey Album) rating: Great

identifies as Jay-Z, What More Can I Say nine times (correctly in song)

identifies as George Harrison, While My Guitar Gently Weeps twice (correctly in song)

Conclusion
The dominant sample was far more noticeable in both Jay-Zeezer and The Grey Album, which seemed to mean that Shazam could pick out the individual songs more easily. I was quite impressed when it even correctly identified some of The Grey Album as actually being a mashup. Girl Talk gave the program a harder time, which makes sense, seeing as many samples are put together for very short amounts of time and many of the samples are slightly altered. On songs where there was a dominant sample for a period of time, it did a decent job. It is a cool way to figure out what songs contribute to a mashup if you are unsure about a piece of a song. I was pretty impressed with Shazam’s ability to pick apart the pieces of the songs. I guess the results aren’t really that surprising, but it was still a fun way to spend some of a Saturday afternoon listening to music and testing some software.

Written by DanM

November 14, 2008 at 10:21 am

Posted in Uncategorized

Displaying code on your blog

A few weeks ago, I wrote about my desire for an awesome embeddable code widget. The most common solution that people suggested in the comments (and on the thread on Stack Overflow) was SyntaxHighlighter. I had seen the SyntaxHighlighter (SH) widget on a few blogs and was impressed, so I decided to install the WordPress plugin. The installation was easy enough, but unfortunately the plugin comes with a slightly old version of SH. Luckily, it was really easy to download the newest version from the SH site and drop it into the plugin.

Now we can just wrap code in

<pre name='code' class='ruby'>...some code...</pre>

and it will look like this:

class Foo
   def bar
     puts "Hello, world!"
   end
end

Pretty sweet! In fact, we ended up converting all of the code snippets on our blog to use SH.

Although SH is really, really cool, it does suffer from inherent drawbacks: it has to be installed and it has to be periodically updated. What I initially described in my previous post was something simpler – a web app where you could easily post code and then embed it in your blog. In other words, Pastie with a YouTube-like embed feature.

Dirceu Jr. pointed out Gist from GitHub, which is exactly what I was looking for. Even without signing up, you can post code (like so) and then embed it in your blog.
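
For reference, embedding a gist is just a matter of pasting the script tag that GitHub gives you into your post. With a made-up gist ID, it looks something like this:

<script src="http://gist.github.com/12345.js"></script>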

[UPDATE: Since this post was published, we moved to WordPress.com, which doesn’t work with Gist. Now we use a code plugin.]

class Foo

   def bar
     puts "Hello, world!"
   end

end

Dirceu Jr. even wrote a WordPress plugin if you really want to install something! The only weird thing is that while the embedded code is nicely syntax-highlighted, the original gist is not for some reason. Update: Yoan pointed out that you can get syntax-coloring by naming your gist properly (I renamed my gist foo.rb and it worked). It’s still not clear to me why it doesn’t work if I manually select ‘Ruby’ as the language for an unnamed gist, however.

Finally, while SyntaxHighlighter supports the most popular languages (and therefore is likely to be fine for most programming blogs), Gist supports a truly insane number of languages, making it a better choice if you want to post code snippets of say, Lua, Erlang, or Haskell.

Language SyntaxHighlighter Gist
Popular languages
ActionScript X
Bash X
C X
C# X X
C++ X X
CSS X X
Delphi X X
Diff X
Erlang X
HTML X X
Haskell X
Io X
Java X X
JavaScript X X
Lua X
OCaml X
Objective-C X
PHP X X
Perl X
Plain Text X
Python X X
RHTML X
Ruby X X
SQL X X
Scheme X
Smalltalk X
Smarty X
VB or VB.net X X
XML X X
Other crazy stuff
Batchfile X
Befunge X
Boo X
Brainfuck X
Common Lisp X
D X
Darcs Patch X
Dylan X
Fortran X
GAS X
Genshi X
Gettext Catalog X
Groff X
HTML+PHP X
INI X
IRC logs X
Java Server Page X
LLVM X
Literate Haskell X
Logtalk X
MOOCode X
Makefile X
Mako X
Matlab X
MiniD X
MuPAD X
Myghty X
NumPy X
Python Traceback X
Raw token data X
Redcode X
S X
Tcl X
Tcsh X
TeX X
Text only X
VimL X
XSLT X
c-objdump X
cpp-objdump X
d-objdump X
objdump X
reStructuredText X

Both of these projects provide great-looking ways to display code on your blog. If you know of other projects that you like, let me know in the comments.

Written by Ben

November 13, 2008 at 11:19 am

Posted in Tools

Notes from the Boulder CTO lunch (11/3/2008)

Although I’m not actually the CTO of Devver, I had the pleasure of attending the Boulder CTO lunch this past Monday since Dan was out of town.

This week, the group had Todd Vernon from Lijit come lead the discussion. Although Todd is currently CEO of Lijit, he was CTO at his former company, Raindance.

The group that was assembled was small but awesome – I had the opportunity to learn not only from Todd, but also from the CTOs of a few of last year’s TechStars companies.

The discussion touched on a ton of topics, but two (related) themes that were heavily discussed were the role of the CTO and how a company grows from a technology perspective. I’ve organized my notes below. Keep in mind that these are the collected thoughts from a number of different participants and I may not have captured their ideas with 100% accuracy.

The Role of a CTO

What is the difference between a CTO and a VP of Engineering?

The CTO is about leadership on technical issues: interfacing with the business side, guiding the product, and getting people excited about the product from a technical point of view.

VPoE is almost one step above Chief Architect, more on a management side, getting product delivered.

1st time CTOs need to figure out their exact role. It’s a very amorphous role, depending on the company.

CTO needs to be able to tell the whole business story, understand the good parts and bad.

Early on, CTO should insert themselves into the sales process as much as possible (especially right after you hire the sales person). You need to be able to hear what customers say they want, so you can translate that into what they really need.

The Technology Onion

There is a technology onion – make sure the core of the onion is owned by the company (the outer layers are not as important). The CTO needs to figure out the relationship between the technologies and the company’s partnerships. The closer a technology is to the core of the onion, the more important it is to own it and to make sure it scales.

For instance, if you’re depending on Google for search, you’re powerless to change features if things don’t work as your customers want/expect. If search is core to your business (near the core of the onion), consider building it internally. It’s the CTO’s role to make that case, because business people will never understand the need to spend money to get “the same thing.”

Having to re-architect a core component of a company can really hurt growth. Assume you’re going to be successful, so plan for that.

Along the same lines, one concern about using EC2 is that you get tied to the platform and your business is dependent on an outside force you can’t control. Hosting on EC2 can be quite different than hosting your own boxes.

Acceptable Failures

What is acceptable downtime? It depends on when – between midnight and 1-2 AM, it might be OK to be down for a few minutes. CTOs need to determine what acceptable downtime is, tell the rest of management, and have people agree.

CTOs need to make decisions (for instance, what is the acceptable down time, acceptable data loss, or acceptable time for page load) and then tell the entire organization. That way, when something bad happens, you can explain that everyone agreed on the specific numbers. It’s unlikely your business will need to be (or can be) 100% perfect on all metrics, but people need to understand what the goal is and why it’s realistic.

Growing/Scaling/Monitoring

If at least one person is using your service, you should have two web servers. It gives you a ton of flexibility. Having two boxes forces you to work out most of the issues early (it’s a lot different getting to 2 boxes than 3 or 4). It’s not about load, it’s about reliability.

No matter how useful you are, if you are not reliable, someone will blog, “it’s cool, but it doesn’t work reliably.”

Downtime spreads very fast across Twitter. Consider tweeting about upcoming service interruptions ahead of time so customers are aware.

After more than 15 people, you need a dedicated operations person. Get some basic monitoring services early – after a server is under load, it’s really hard to diagnose. Try to detect stuff early, it’s easier to debug.

With startups, generally the problem tends to be slow requests rather than complete service downtime. Make sure your monitoring service will alert you with slow requests.

Get app-specific monitoring – a warning like “High CPU load” is harder to understand (it might be a problem, or maybe the machine is just handling a lot of requests successfully), but “Page X takes 80 sec to load” is more obvious.

As you grow, try to measure more and more. Things often degrade slowly, and one day you just notice it’s too slow and it’s hard to go back and find the root of the problem.

Make two lists: a) the most catastrophic things that could happen and b) the most likely things that could happen. Where those lists overlap, you need to fix something. But there will be some risks that you decide are reasonable risks for the business (revisit these risks regularly as things change).

Regarding backup – always make sure you try to restore some data (before you really need it). You need to make sure it works and make sure it’s fast enough.

You should always be able to describe at a high level how the service will scale infinitely (it doesn’t have to be technically perfect, but it has to be believable). When someone wants to purchase, that’ll be a huge help – the business guy on the other side of the table will want to buy, but the technical guy doesn’t want to buy (he wants to build it in-house).


I hope those notes make some sense and give you a good feel for the discussion we had. I’m looking forward to attending more of these lunches (I hope they’ll continue to let a few CEOs sneak in…)

Written by Ben

November 7, 2008 at 10:21 am

Posted in Devver, Misc, TechStars