The Devver Blog

A Boulder startup improving the way developers work.


Announcing Caliper Community Statistics

For the past few months, we’ve been building Caliper to help you easily generate code metrics for your Ruby projects. We’ve recently added another dimension of metrics information: community statistics for all the Ruby projects that are currently in Caliper.

The idea of community statistics is two-fold. From a practical perspective, you can now compare your project’s metrics to the rest of the community. For example, Flog measures the complexity of methods. Many people wonder what exactly defines a good Flog score for an individual method. In Jake Scruggs’ opinion, a score of 0-10 is “Awesome”, while a score of 11-20 is “Good enough”. That sounds about right, but with Caliper’s community metrics, we can also compare the average Flog scores of entire projects to see what defines a good average score.

To do so, we calculate the average Flog method score for each project and plot those averages on a histogram, like so:

[Figure: histogram of average Flog method scores per project]

Looking at the data, we see that a lot of projects have an average Flog score between 6 and 12 (the mean is 10.3 and the max is 21.3).
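For the curious, the aggregation behind this graph is straightforward. Here is a minimal sketch that bins made-up per-project averages (not Caliper’s actual data):

# Sketch: bin per-project average Flog scores into buckets of width 3.
averages = [4.2, 6.1, 7.8, 8.5, 9.0, 9.3, 10.3, 11.7, 12.2, 14.9, 21.3] # made up

averages.group_by { |avg| (avg / 3).floor * 3 }.sort.each do |low, scores|
  puts format("%2d-%2d | %s", low, low + 3, "*" * scores.size)
end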

If your project’s average Flog score is 9, does that mean it has only “Awesome” methods, Flog-wise? Well, remember that we’re looking at the average score for each project. I suspect that in most projects, lots of tiny methods are pulling down the average, but there are still plenty of big, nasty methods. It would be interesting to look at the community statistics for maximum Flog score per project or see a histogram of the Flog scores for all methods across all projects (watch this space!).

Since several of the metrics (like Reek, which detects code smells) produce scores that grow in proportion to the number of lines of code, we divide the raw score by each project’s lines of code. As a result, we can sensibly compare your project to other projects, no matter what the difference in size.
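As a rough illustration of that normalization (a sketch with made-up numbers, not Caliper’s actual implementation):

# Sketch: divide raw smell counts by project size so projects of very
# different sizes can be compared on the same scale. Numbers are made up.
projects = {
  "small_gem" => { :smells => 12,  :loc => 800    },
  "big_app"   => { :smells => 240, :loc => 21_000 },
}

projects.each do |name, stats|
  smells_per_kloc = stats[:smells] / (stats[:loc] / 1000.0)
  puts format("%-10s %5.1f smells per 1000 LOC", name, smells_per_kloc)
end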

The second reason we’re calculating community statistics is so we can discover trends across the Ruby community. For example, we can compare the ratio of lines of application code to test code. It’s interesting to note that a significant portion of the projects in Caliper have no tests, but among the projects that do, most have a code:test ratio in the neighborhood of 2:1.

[Figure: histogram of code-to-test ratios per project]
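If you want a rough code-to-test ratio for a local project, here is a minimal sketch (assuming code lives under lib/ or app/ and tests under test/ or spec/):

# Sketch: rough code:test LOC ratio, counting non-blank, non-comment lines.
def loc(patterns)
  Dir.glob(patterns).inject(0) do |total, file|
    total + File.readlines(file).count { |line| line !~ /^\s*(#|$)/ }
  end
end

code_loc = loc(["lib/**/*.rb", "app/**/*.rb"])
test_loc = loc(["test/**/*.rb", "spec/**/*.rb"])
puts format("code:test ratio = %.1f:1", code_loc.to_f / test_loc) if test_loc > 0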

Other interesting observations from our initial analysis:
* A lot of projects (mostly small ones) have no Flay duplications.
* Many smaller projects have no Reek smells, but the average project has about 1 smell per 9 lines of code.

Want to do your own analysis? We’ve built a scatter plotter so you can see if any two metrics have any correlation. For instance, note the correlation between code complexity and code smells.

Here’s a scatter plot of that data (zoomed in):

[Figure: scatter plot of code complexity vs. code smells (zoomed in)]

Over the coming weeks, we’ll improve the graphs we have and add new graphs that expose interesting trends. But we need your help! Please let us know if you spot problems, have ideas for new graphs, or have any questions. Additionally, please add your project to Caliper so it can be included in our community statistics. Finally, feel free to grab the raw stats from our alpha API* and play around yourself!

* Quick summary:

curl http://api.devver.net/metrics

for JSON,

curl -H 'Accept:text/x-yaml' http://api.devver.net/metrics

for YAML. The API is still under development, so please send us feedback!
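For example, here is a minimal Ruby sketch to pull the stats and poke around (it assumes the endpoint returns a JSON document; the exact schema may change while the API is in alpha):

require 'net/http'
require 'json'
require 'uri'

# Sketch: fetch the community stats as JSON and parse them. The structure
# of the response is an assumption; inspect `stats` to see what the alpha
# API actually returns.
uri   = URI.parse("http://api.devver.net/metrics")
body  = Net::HTTP.get(uri)
stats = JSON.parse(body)

puts stats.class               # Hash or Array, depending on the API
puts stats.inspect[0, 300]     # peek at the beginning of the payload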


Written by Ben

November 19, 2009 at 8:37 am

Posted in Development, Ruby, Tools


Improving Code using Metric_fu

Often, when people see code metrics they think, “that’s interesting, but I don’t know what to do with it.” Metrics are great, but they become far more valuable when you can actually use them to improve your project’s code. metric_fu provides a wealth of metric information, but if you don’t know which parts of it are actionable, it’s merely interesting instead of useful.

One thing to keep in mind when looking at code metrics is that a single measurement may not tell you much on its own; watching how a metric trends over time gives you far more meaningful information. Showing this trending information is one of our goals with Caliper. Metrics act like a second set of eyes watching over the project, alerting you to problem areas before they get out of control. Working with code over time, it can be hard to keep everything in your head (I know I can’t). As the code base grows, it becomes difficult to keep track of all the places where duplication or complexity is building up. Addressing problem areas as the metrics reveal them keeps them from getting out of hand and makes future additions to the code easier.

I want to show how metrics can drive changes and improve a code base by working on a real project, and I figured there was no better place to look than pointing metric_fu at our own devver.net website source and fixing up some of the most notable problem areas. We have had our backend code under metric_fu for a while, but hadn’t been following the metrics on our Merb code. That, along with some spiked features that ended up turning into Caliper, led to some areas getting a little out of control.

[Screenshot: Flay score before cleanup]

Going through metric_fu, the first thing I wanted to work on was making the code a bit more DRY. The team and I had been noticing more duplication in the code than we liked. I brought up the Flay results for code duplication and found that four database models shared some of the same methods.

Flay highlighted the duplication. Since we are planning to change how we handle timestamps soon, it seemed like a good place to start cleaning up. Below are the methods that existed in all four models; a third method, update_time, existed in two of the four.

def self.pad_num(number, max_digits = 15)
  "%%0%di" % max_digits % number.to_i
end

def get_time
  Time.at(self.time.to_i)
end

Nearly all of our DB tables store timestamps in a way that can be sorted with SimpleDB queries. We want to change the stored times to UTC in ISO 8601 format, and as a first step it was easy to pull these methods into a helper module and include it in all the database models.
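A quick aside on why both formats work: SimpleDB compares attribute values as plain strings, so a timestamp needs a representation whose lexicographic order matches chronological order. Both the zero-padded epoch produced by pad_num and a UTC ISO 8601 string have that property. A small sketch (illustration only, not our production code):

require 'time'

now = Time.now.utc

# Zero-padded epoch seconds, as pad_num produces them: fixed width, so
# string comparison matches numeric (and therefore chronological) order.
padded = "%015i" % now.to_i        # e.g. "000001258617420"

# UTC ISO 8601, the format we are moving to: also sorts lexicographically.
iso = now.iso8601                  # e.g. "2009-11-19T15:37:00Z"

# Round-tripping back to Time objects:
Time.at(padded.to_i)               # from the padded epoch
Time.parse(iso)                    # from the ISO 8601 string

With that background, here is the helper module the shared methods moved into: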

module TimeHelper

  # Hook so that including this module also adds the class-level helper
  # (pad_num) to the including model.
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    def pad_num(number, max_digits = 15)
      "%%0%di" % max_digits % number.to_i
    end
  end

  def get_time
    Time.at(self.time.to_i)
  end

  def update_time
    self.time = self.class.pad_num(Time.now.to_i)
  end

end
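And a hypothetical example of the helper in use (the Job model and its properties are made up for illustration):

# Hypothetical DataMapper model that mixes in the shared time helper.
class Job
  include DataMapper::Resource
  include TimeHelper

  property :id,   Serial
  property :time, String   # stored as a zero-padded epoch string
end

job = Job.new
job.update_time      # sets job.time to the current time via pad_num
job.get_time         # => Time object rebuilt from the stored string
Job.pad_num(42)      # => "000000000000042" (added by ClassMethods)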

Besides reducing the duplication across the DB models, this also made it easy to pull in the third time method, update_time, which had existed in two of the models. All of the DB time logic is now consolidated in one file, so changing the time format to UTC ISO 8601 will be a snap. While this is a trivial example of an obvious refactoring, it is easy to see how helper methods can end up duplicated across classes, and Flay comes in really handy at pointing out the duplication that creeps in over time.

Flog gives a score showing how complex the measured code is: the higher the score, the greater the complexity. The more complex code is, the harder it is to read, and it likely has a higher defect density. After removing some duplication from the DB models, I found that our worst database model by Flog score was our MetricsData model, which included an appalling score of 149 for a single method.

File                       Total score   Methods   Average score   Highest score
/lib/sdb/metrics_data.rb   327           12        27              149

The method in question was extract_data_from_yaml. It was doing too much work and was changing frequently: it extracted the data from six different metric tools and built a summary of that data. After a little refactoring, extract_data_from_yaml dropped from a score of 149 to a series of smaller methods, the largest being extract_flog_data! at 33.6.

The method went from a sprawling 42 lines of code to a cleaner, smaller 10-line method plus a collection of helper methods, which look something like the code below:

  def self.extract_data_from_yaml(yml_metrics_data)
    metrics_data = Hash.new {|hash, key| hash[key] = {}}
    extract_flog_data!(metrics_data, yml_metrics_data)
    extract_flay_data!(metrics_data, yml_metrics_data)
    extract_reek_data!(metrics_data, yml_metrics_data)
    extract_roodi_data!(metrics_data, yml_metrics_data)
    extract_saikuro_data!(metrics_data, yml_metrics_data)
    extract_churn_data!(metrics_data, yml_metrics_data)
    metrics_data
  end

  def self.extract_flog_data!(metrics_data, yml_metrics_data)
    metrics_data[:flog][:description] = 'measures code complexity'
    metrics_data[:flog]["average method score"] = Devver::Maybe(yml_metrics_data)[:flog][:average].value(N_A)
    metrics_data[:flog]["total score"]   = Devver::Maybe(yml_metrics_data)[:flog][:total].value(N_A)
    metrics_data[:flog]["worst file"] = Devver::Maybe(yml_metrics_data)[:flog][:pages].first[:path].fmap {|x| Pathname.new(x)}.value(N_A)
  end

Churn gives you an idea of which files might be in need of refactoring. Often, if a file changes a lot, the code is doing too much and would be more stable and reliable if broken into smaller components. Looking through our churn results, it looks like we might need another layout to accommodate some of the different styles on the site. Another thing that jumps out is that both the TestStats and Caliper controllers have fairly high churn. The Caliper controller has grown fairly large because it does double duty for user-facing features and admin features, which should be split up. TestStats is admin controller code that has also been growing in size and should be split into more isolated cases.

[Screenshot: churn results]
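For context, churn boils down to counting how often each file changes in version control. A rough approximation of the idea with git (a sketch; metric_fu’s churn implementation is more sophisticated):

# Sketch: count how many commits touched each file, most-churned first.
log = `git log --name-only --pretty=format:`
counts = Hash.new(0)

log.each_line do |line|
  file = line.strip
  counts[file] += 1 unless file.empty?
end

counts.sort_by { |_file, count| -count }.first(10).each do |file, count|
  puts format("%4d  %s", count, file)
end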

Churn gave me an idea of where it might be worth focusing my effort, and diving into the other metrics made it clear that the Caliper controller needed some attention.

The Flog, Reek, and Roodi scores for the Caliper controller:

File                           Total score   Methods   Average score   Highest score
/app/controllers/caliper.rb    214           14        15              42

[Screenshot: Reek results before cleanup]

Roodi Report
app/controllers/caliper.rb:34 - Method name "index" has a cyclomatic complexity is 14.  It should be 8 or less.
app/controllers/caliper.rb:38 - Rescue block should not be empty.
app/controllers/caliper.rb:51 - Rescue block should not be empty.
app/controllers/caliper.rb:77 - Rescue block should not be empty.
app/controllers/caliper.rb:113 - Rescue block should not be empty.
app/controllers/caliper.rb:149 - Rescue block should not be empty.
app/controllers/caliper.rb:34 - Method name "index" has 36 lines.  It should have 20 or less.

Found 7 errors.

Roodi and Reek both tell you about design and readability problems in your code. The screenshot of the Reek ‘code smells’ in the Caliper controller shows how far it had gotten out of hand: the smells filled an entire browser page! Roodi similarly had many complaints about the Caliper controller, and Flog showed the file getting more complex than it should be. After picking off some of the worst Roodi and Reek complaints and splitting up methods with high Flog scores, the code became easily readable and understandable at a glance. In fact, I nearly cut the Reek complaints for the controller in half.

[Screenshot: Reek results after cleanup]

Refactoring this one controller, which had been quickly hacked together and was growing out of control, brought it from a dizzying 203 LOC down to 138 LOC. The metrics drove me to refactor long methods (one of 52 LOC became 3 methods, the largest being 23 LOC), rename unclear variables (s => stat, p => project), and move some helper methods out of the controller into the helper class where they belong. Yes, all of these refactorings and good design decisions can be made without metrics, but it is easy to overlook bad code smells when they start small; metrics give you an early warning that a section of code is becoming unmanageable and likely prone to higher defect rates. The smaller file was a huge improvement in terms of cyclomatic complexity, LOC, code duplication, and, more importantly, readability.

Obviously I think code metrics are cool and that your projects can be improved by paying attention to them as part of the development lifecycle. I wrote about metric_fu so that anyone can try these metrics out on their own projects. I think metric_fu is awesome, and my interest in Ruby tools is part of what drove us to build Caliper, which is really the easiest way to try out metrics for your project. Currently, you can think of it as hosted metric_fu, but we are hoping to go even further and make the metrics clearly actionable for users.

In the end, yep, this is a bit of a plug for a product I helped build, but it is really because I think code metrics can be a great tool to help anyone with their development. So submit your repo and give Caliper’s hosted Ruby metrics a shot. We are trying to make metrics more actionable and useful for all Ruby developers out there, so we would love to hear from you. If you have any ideas about how to improve Caliper, please contact us.

Written by DanM

October 27, 2009 at 10:30 pm

Announcing Caliper

If you’re anything like us, you recognize the usefulness of code quality metrics. But configuring metrics tools and linking them to your Continuous Integration system is a chore you may not have found time for. Here at Devver, we’re all about taking the pain out of common development chores, which is why we’re pleased to announce a new Devver service: Caliper.

Caliper is the easy way to get code metrics – like cyclomatic complexity and code duplication – about your project. Just go to http://devver.net/caliper. Enter the public Git repository URL of a Ruby project. In a few moments, we’ll give you a full set of metrics, generated by the awesome metric_fu gem.


If your project is hosted on GitHub, we can even regenerate metrics automatically every time someone pushes a commit. Just add the URL

http://api.devver.net/github

to your project’s Post-Receive Hooks.
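If you want to sanity-check the hook before a real push, you could mimic the shape of GitHub’s post-receive request, which POSTs a form-encoded payload parameter containing JSON about the push. Which parts of that payload our endpoint actually reads is an assumption here; normally GitHub sends the request for you automatically:

require 'net/http'
require 'uri'
require 'json'

# Hypothetical smoke test of the hook endpoint; once the URL is added to
# your repository's Post-Receive Hooks, GitHub issues a request like this
# on every push.
payload = { "repository" => { "url" => "http://github.com/youruser/yourproject" } }

uri = URI.parse("http://api.devver.net/github")
response = Net::HTTP.post_form(uri, "payload" => JSON.generate(payload))
puts response.code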

Caliper is currently an alpha product. We will be rapidly adding features and refinements over the next few weeks. If you have a problem, a question, or a feature request, please let us know!

Written by avdi

October 9, 2009 at 8:44 am

SimpleDB DataMapper Adapter: Progress Report

From the beginning of Devver, we decided we wanted to work with some new technologies and to be able to scale easily. After looking at the options, AWS seemed to offer many technologies that could help us build and scale a system like Devver; one of them was SimpleDB. Another new thing we decided to try was DataMapper (DM) rather than the more familiar ActiveRecord. This eventually led me to work on my own SimpleDB DataMapper adapter.

Searching for ways to work with SDB from Ruby, we found a SimpleDB DM adapter by Jeremy Boles. It worked well initially, but as our needs grew (and to make it compatible with the current version of DM) it became necessary to add to and update the adapter’s features. These changes lived hidden in our project’s code for a while, for no other reason than that we were too lazy to commit it all back to GitHub. Recently, though, there has been renewed interest in working with SimpleDB from Ruby. I started pushing the code updates to GitHub, and then got a couple of requests and suggestions here and there to improve the adapter. One of these suggestions came from Ara Howard, who is doing impressive work of his own on Ruby and AWS, specifically SimpleDB. He suggested moving from the aws_sdb gem to right_aws, which, along with other changes, improved performance significantly (1.6x on writes, up to 36x on reads for large queries over the default limit of 100 objects). Besides the performance improvements, we have recently added limit and sorting support to the adapter.

#new right_aws branch using AWS select
$ ruby scripts/simple_benchmark.rb
      user     system      total        real
creating 200 users
 1.020000   0.240000   1.260000 ( 35.715608)
Finding all users age 25 (all of them), 100 Times
 59.280000   8.640000  67.920000 ( 99.727380)

#old aws_sdb using query with attributes
$ ruby scripts/simple_benchmark.rb
      user     system      total        real
creating 200 users
  1.290000   0.530000   1.820000 ( 52.916103)
Finding all users age 25 (all of them), 100 Times
  356.640000  53.090000 409.730000 (3574.260988)

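To illustrate the newly added limit and sorting support, a query through the adapter looks like any other DataMapper query. The User model here is the one from the benchmark script, and the exact property names are assumptions:

# Hypothetical query exercising the adapter's limit and order support;
# standard DataMapper query syntax, with an assumed User model.
young_users = User.all(:age => 25, :order => [:name.asc], :limit => 10)

young_users.each do |user|
  puts user.name
end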

As I added features, testing the adapter also became slow (over a minute a run) because the functional tests actually connect to and use SimpleDB. Since Devver is all about speeding up Ruby tests, I decided to get the tests running on Devver. It was actually very easy, and it sped up the test suite from 1 minute and 8 seconds down to 28 seconds. You can check out the speedup results yourself.

We are currently using the SimpleDB adapter to power our Devver.net website as well as the Devver backend service. It has been working well for us, but we know it doesn’t cover everyone’s needs. Next time you are creating a simple project, give SimpleDB a look. We would love feedback about the DM adapter, and it would be great to get other people contributing to the project. If you fork my SDB Adapter GitHub repo, feel free to send me pull requests. Also, let me know if you want to try using Devver as you hack on the adapter; it can really speed up testing, and I would be happy to give out a free account.

Lastly, at a recent Boulder Ruby users group meetup, the group did a code review of the adapter. It went well, and I should soon finish cleaning up the code and get the improvements suggested by the group committed to GitHub.

Update: The refactorings suggested at the code review are now live on GitHub.

Written by DanM

June 22, 2009 at 11:27 am

Boulder CTO Lunch with Matt McAdams

Dan usually goes to the Boulder CTO lunches, but he was out of town this month, which meant I had the pleasure of hanging out with some of Boulder’s best and brightest.

This month’s guest was Matt McAdams of TrackVia. TrackVia is an online database that is powerful yet simple enough to be used by people who are used to keeping data in spreadsheets (primarily business people). Matt gave a candid and often hilarious talk that touched on technical topics and, luckily for me, a discussion of pricing and metrics, two topics I’m currently very interested in.

On technology decisions:

* Matt wasn’t a database guy originally, but drew on the practical knowledge he gained working on a previous startup.
* They went with the simplest design that could work, and it has continued to scale well.
* Smart technology decisions have allowed TrackVia to compete with a small, lean development team.

On product development:

* TrackVia started as a contract project for a single customer, but they saw the broader appeal.
* One of the earliest databases in TrackVia is the bug database (still around). In other words, they’ve been dogfooding since day one.
* They don’t worry about the competition. Instead, they focus on building the features that get people to sign up and pay.

On pricing:

* You’ve got to try stuff and iterate. TrackVia has changed their pricing several times.
* Customers on the old pricing models have always been grandfathered in.
* Sometimes raising your price can actually gain customers, because some people assume that a cheap product or service must be low-quality (even if it’s actually very high quality).
* If big customers really want feature X, it’s OK to ask them to pay extra to accelerate the development of that feature (or to customize their experience).

On metrics:

* Good metrics allow you to try different strategies and measure their effect.
* You must measure, tweak, and iterate.
* If you can iterate on a weekly basis and your competition can iterate on a quarterly basis, you’ll win.
* Metrics must continually be improved. TrackVia spends a lot of time tracking useful metrics, but even they know they need to add additional metrics in some key areas.


As usual, the CTO lunch was a great place to hear from other Boulder companies, and I learned a lot. Thanks to everyone who attended, and special thanks to Matt for leading our discussion.

Written by Ben

June 4, 2009 at 12:08 pm

Posted in Boulder
