Creating Custom Cane Metrics

Code quality is a major component of code maintainability. Sloppy or overly complex code is sure to doom a project. If another developer can't tell the purpose of the code at first glance, they will be less inclined to invest time in paying down that code's technical debt.

Code complexity, style, and documentation are all factors that developers can control with discipline. Luckily for those of us without discipline (myself included), Square has open sourced Cane, their code quality gem. Out of the box, Cane can run complexity, style, and documentation checks. Cane also supports custom checks, which let you target the specific coding habits you want to discourage.

Custom Cane metrics start off easy enough. There are only three requirements:

  • A class level options method that defines the command line options your custom check is expecting. If the check doesn't require any options this method can return an empty Hash.
  • A one argument constructor. The command line options defined in your options will be supplied as a Hash to the constructor.
  • A method to return the violations found for the current project named violations

Those requirements can easily be achieved with the following:

class MyCheck < Struct.new(:opts)
  def self.options
    {}
  end

  def violations
    []
  end
end

Of course this check doesn't actually check anything, so let's make something useful. Something I know I am guilty of is using the puts and print methods to output debugging information to the console while troubleshooting a test or bug. It can annoy others and, honestly, it is very easy to leave behind. So let's implement a PutsCheck metric so these rogue puts and print calls are caught. We are going to start with a simple class declaration, and I'll go ahead and add some of the boilerplate Cane attributes:

class PutsCheck < Struct.new(:opts)
  DESCRIPTION =
    "Lines output to console using `puts` or `print`"
  PUTS_REGEX = /^\s*p(uts|rint)?[\s\(]+(.+?)\s*[\)\s]*$/

  def self.key; :puts; end
  def self.name; "puts output checking"; end
  def self.options
    {
      puts_glob:    ['Glob to run puts checks over',
                       default:   '{app,lib}/**/*.rb',
                       variable:  'GLOB',
                       clobber:   :no_puts],
      puts_exclude:  ['Exclude file or glob from puts checking',
                       variable:  'GLOB',
                       type:      Array,
                       default:   [],
                       clobber:   :no_puts],
      no_puts:      ['Disable puts checking', cast: ->(x) { !x }]
    }
  end

  # The helper and violation methods shown below also live in this class.
end

To walk through what is being declared here: there are two constants, DESCRIPTION and PUTS_REGEX, along with three class level methods, key, name, and options. The constants will be explained when we implement the violation checking. As for the class methods, key and name are implemented to match Square's own implementation of the metrics shipped with Cane. I have yet to find where they are being used.

The options method requires more explanation. It defines the command line switches used by the metric. The Hash keys become the command line argument names, so :puts_glob becomes --puts-glob on the command line. The values in the Hash define the defaults for the arguments as well as their documentation. The first element in each Array is the help text for the argument. The key :default defines the argument's default value. The key :variable is used to determine how the value should be interpreted, and :type defines the Ruby type the value should be coerced to. Using :clobber with the name of another argument tells the system that if that other argument is set then this one is ignored. Finally, :cast allows you to define a lambda that performs a custom coercion.
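As a rough illustration of the naming rule described above, here is a minimal sketch that turns option keys into switch names and collects the declared defaults. The cli_flag helper and the OPTIONS constant are hypothetical names made up for this example. This is not how Cane's own parser is written, only a way to see the mapping.

```ruby
# A trimmed copy of the options Hash from PutsCheck, for illustration only.
OPTIONS = {
  puts_glob:    ['Glob to run puts checks over',
                 default: '{app,lib}/**/*.rb', variable: 'GLOB'],
  puts_exclude: ['Exclude file or glob from puts checking',
                 variable: 'GLOB', type: Array, default: []],
  no_puts:      ['Disable puts checking', cast: ->(x) { !x }]
}

# Hypothetical helper (not part of Cane): :puts_glob becomes "--puts-glob".
def cli_flag(option_key)
  "--#{option_key.to_s.tr('_', '-')}"
end

# Switch names derived from the Hash keys.
flags = OPTIONS.keys.map { |key| cli_flag(key) }

# Each value is [help_text, settings]; pull out the declared default, if any.
defaults = OPTIONS.map { |key, (_help, settings)| [key, settings[:default]] }.to_h
```

Here flags holds the three switch names and defaults maps each key to its :default value, with nil for options that declare none.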

With the options definition out of the way it is a good time to implement some helper functions to make using the arguments easier.

def file_names
  Dir[opts.fetch(:puts_glob)].reject { |file| excluded?(file) }
end

def exclusions
  @exclusions ||= opts.fetch(:puts_exclude, []).flatten.map do |i|
    Dir[i]
  end.flatten
end

def excluded?(file)
  exclusions.include?(file)
end

def worker
  Cane.task_runner(opts)
end

The methods exclusions and excluded? help to filter out files specified by the --puts-exclude argument, while file_names returns the files that the user wishes to run the puts check on. The worker method is just a convenient way to use Cane's SimpleTaskRunner when running in a single process or to use Parallel when using multiple. With these methods in place we can implement the violation checking for our metric.

def violations
  return [] if opts[:no_puts]

  worker.map(file_names) do |file_name|
    find_violations(file_name)
  end.flatten
end

def find_violations(file_name)
  Cane::File.iterator(file_name).map.with_index do |line, number|
    puts_match = line.match(PUTS_REGEX)
    next unless puts_match

    {
      file: file_name,
      line: number + 1,
      label: "Line outputs '#{puts_match[2]}'",
      description: DESCRIPTION
    }
  end.compact
end

The violations method returns an Array of violations unless the user passes the command line switch --no-puts, in which case it returns an empty Array. The meat of the metric lies in the find_violations method, which uses Cane::File.iterator() so that file encoding is accounted for. We iterate through the lines of the file to see if they include a call to p, puts, or print, using the previously defined constant PUTS_REGEX. If an offending method call is found it is added to the Array to be returned. Each violation is represented by a Hash with four keys: :file, the name of the file containing the violation; :line, where in the file the violation was found; :label, specific information about the violation; and :description, a general description that the console output groups the violations by.
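To get a feel for what PUTS_REGEX does and does not catch, it can be run against a few sample lines outside of Cane. The sample lines below are made up for this sketch. The regular expression is the one from the class above.

```ruby
PUTS_REGEX = /^\s*p(uts|rint)?[\s\(]+(.+?)\s*[\)\s]*$/

# Hypothetical source lines, one per String.
samples = [
  'puts "debug"',            # plain puts call
  '  print(value)',          # indented print with parentheses
  'p user',                  # the short p form
  'params = {}',             # starts with "p" but is not an output call
  '# puts "commented out"',  # comment, the line does not start with p
  'p = 5'                    # assignment to a local named p
]

# Lines the check would report.
flagged = samples.select { |line| line.match(PUTS_REGEX) }

# The second capture group is what ends up in the violation label.
captures = flagged.map { |line| line.match(PUTS_REGEX)[2] }
```

The first three samples are flagged, with the captured argument text used for the label. The params line and the comment are left alone. The last sample is flagged as well, so a local variable named p would show up as a false positive. The pattern is a heuristic rather than a parser.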

At this point you can use the custom metric by running cane -r puts_check.rb --check PutsCheck. So the check doesn't have to be added every time, you can add a .cane file to the project like so:

> cat .cane
-r puts_check.rb
--check PutsCheck

The PutsCheck metric is now ready to be used as part of a git pre-commit hook or as part of a build server process.

Rack::Test and JSON

I recently had the opportunity to make a simple JSON web service using Sinatra, and while testing it I came across something that wasn't apparent from the documentation. I needed to test the web service's response when JSON was POST'ed to an endpoint. The obvious way to do this, or so I thought, would be to use the post() method like so:

post('/endpoint', { 'j': 's', 'o': 'n' })

Turns out that doesn't actually work the way one would expect. Looking at the source code on GitHub, it looks to me like the above code results in a Content Type of application/x-www-form-urlencoded with the Hash being URL encoded. It seems that if you pass a Hash at all, Rack::Test will try to URL encode the values. As long as you set a specific Content Type, though, Rack::Test won't mess with it. Something like the following will successfully result in a request with application/json for the Content Type and the JSON as the request body.

post('/endpoint', '{ "j": "s", "o": "n" }', { "CONTENT_TYPE" => "application/json" })

This is a bit more to type but it gets the job done. You can write a helper to wrap this into something shorter like:

def post_json(uri, json)
  post(uri, json, { "CONTENT_TYPE" => "application/json" })
end
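To see why a URL encoded Hash is not the same thing as a JSON body, the two encodings can be compared using only Ruby's standard library. This is a sketch of the difference between the formats. It is not a reproduction of what Rack::Test does internally, and the payload is made up.

```ruby
require 'json'
require 'uri'

# Hypothetical payload for illustration.
payload = { 'j' => 's', 'o' => 'n' }

# What a form-encoded body looks like: key=value pairs joined by "&".
form_body = URI.encode_www_form(payload)

# What the JSON endpoint actually expects to receive.
json_body = JSON.generate(payload)
```

Here form_body is "j=s&o=n" while json_body is {"j":"s","o":"n"}, so a service that parses its request body as JSON would fail on the first form. Building the String with JSON.generate also avoids hand-writing JSON in a test.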

I have yet to find another way to achieve this so I am open to any suggestions.

What is in a Type?

Ruby views value types differently than statically typed languages. Traditionally values like Integers, Chars, and Booleans are treated differently than the reference-types. For example, in Objective-C, in order to pass an NSInteger by reference or store it in an NSArray you have to box the value in a reference-type like NSNumber. Or in C#, since value-type variables can't store null, if your logic requires a variable to store null or an integer you have to use the wrapper Nullable<int> instead.

This isn't needed in Ruby since the language doesn't have the notion of value-types. Instead, everything is a reference to an object on the heap. This allows classes like Integer to have methods for developer ease such as .even?, .times, and .upto(). Having the value-types as objects also provides the (albeit minimal) benefit of not requiring that the value be boxed and unboxed as it crosses between the realm of value and reference types.
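A few lines of Ruby show what this looks like in practice. This is only a small sketch with made-up variable names, and the class names in the last line assume Ruby 2.4 or later, where Fixnum and Bignum were unified into Integer.

```ruby
# Integers respond to methods directly, no wrapper type involved.
evens = (1..6).select(&:even?)

squares = []
3.times { |i| squares << i * i }

counted = 1.upto(3).to_a

# A variable can hold nil or an Integer without any Nullable-style wrapper.
maybe_count = nil
maybe_count = 42

# Integers sit in a collection next to any other object, no boxing step.
mixed = [1, 'two', nil, 3.0]
classes = mixed.map(&:class)
```

Here evens is [2, 4, 6], squares is [0, 1, 4], counted is [1, 2, 3], and classes is [Integer, String, NilClass, Float].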

So while the benefits might be minimal, the difference is part of a greater mindset. Ruby is a language written for humans first and machines second. This might seem counterintuitive at first, since a human will never execute the code, but it makes writing the code more enjoyable.