2019-09-02 14:55 — By Erik van Eykelen

Writing Better Ruby (Part 2)

This is part 2 of a series of blog posts on writing better Ruby. The first installment can be found here.

1. count vs size

Don’t use count as a substitute for size. For Enumerable objects other than Array it will iterate the entire collection in order to determine its size – style guide

The article gives these examples:

# bad
some_hash.count

# good
some_hash.size

I created a benchmark to check how much slower count is (spoiler alert: crazy slow!).

2. Hash#fetch defaults

Introduce default values for hash keys via Hash#fetch as opposed to using custom logic – style guide

The article provides this example:

batman = { name: 'Bruce Wayne', is_evil: false }

# bad - if we just use || operator with falsy value we won’t get the expected result
batman[:is_evil] || true # => true

# good - fetch works correctly with falsy values
batman.fetch(:is_evil, true) # => false

The batman[:is_evil] || true statement yields true because false OR true equals true (simple Boolean logic). Use fetch() to get the actual value of :is_evil, or true if it’s nil.

3. Hash#fetch

Use Hash#fetch when dealing with hash keys that should be present – style guide

A benefit of fetch is that it raises KeyError on non-existing keys. I find this particularly useful for ENV variables e.g. in Rails apps because an invalid fetch key will crash the app making it very clear that’s something’s gone awry.

Suppose your .env file contains STRIPE_PUBLIC_KEY=pk_test_123

# bad - using the wrong variable name returns nil, making debugging harder
stripe_public_key = Rails.ENV['STRIPE-PUBLIC-KEY']

# good - using fetch raises `KeyError`
stripe_public_key = Rails.ENV.fetch('STRIPE-PUBLIC-KEY')

4. Hash#each

Use Hash#each_key instead of Hash#keys.each and Hash#each_value instead of Hash#values.each – style guide

The reason for this advice is performance since .keys and .values create temporary arrays, while the each_* methods return laze enumerators. See this benchmark for a performance comparison between arrays and enumerators on strings and symbols.

5. Set vs Array

Use Set instead of Array when dealing with unique elements. Set implements a collection of unordered values with no duplicates. This is a hybrid of Array’s intuitive inter-operation facilities and Hash’s fast lookup – style guide

Since the article doesn’t provide additional details, I’ve tried to sum up the differences between Set and Array:

  • A Set can only contain unique items. This enables optimizations to its internal list operations, making it a lot faster for some operations. For instance include? can be much faster when operating on a Set versus an Array.
  • As opposed to Array, a Set is an unordered list of values. Again, this enables Set to apply optimizations to the way it handles the list internally, making it faster than arrays. However, just as with Hashes prior to Ruby 1.9, if your code relies on the insertion order then you can’t use Set.
  • Set provides powerful mathematical set operations such as union, intersection, and subtraction. It also offers a way to test for equality of two sets regardless the position of the values.

6. Inline Annotations

In cases where the problem is so obvious that any documentation would be redundant, annotations may be left at the end of the offending line with no note. This usage should be the exception and not the rule – style guide

The article lists TODO, FIXME, OPTIMIZE, HACK, and REVIEW as suggested annotations. I only use the first two but I can see the benefit of using all five.

Note that Rails’ notes command only scans for FIXME, TODO, and OPTIMIZE. If you want to scan for e.g. HACK use rails notes:custom ANNOTATION=HACK (or grep -R "# *HACK" *). Edit: Rails 6 now offers an annotations argument, see https://blog.saeloun.com/2019/09/03/rails-notes-command-upgrades-in-rails-6.html#rails-6.

7. Class and self

When class (or module) methods call other such methods, omit the use of a leading self or own name followed by a . when calling other such methods. This is often seen in “service classes” or other similar concepts where a class is treated as though it were a function. This convention tends to reduce repetitive boilerplate in such classes – style guide

I’ve made this mistake a lot in the past, writing e.g. code like this:

# bad
def profile_unfinished?
  self.email_address.blank? ||  self.name.blank?
end

# good
def profile_unfinished?
  email_address.blank? ||  name.blank?
end

8. module_function

Prefer the use of module_function over extend self when you want to turn a module’s instance methods into class methods – style guide

Frankly, I am on the fence over this one. I tend to follow this advice because standards are good even if you don’t agree with every individual rule. See this article for a discussion whether extend self is an anti-pattern.

The advice made me revisit all the ways (I know of) to create class methods:

Method 1: using extend

module Mod
  def mod
    "hello"
  end
end

class Test
  extend Mod
end

Test.mod # "Hello"

Method 2: using module_function

module Mod
  module_function
  def mod
    "Hello"
  end
end

class Test
  def self.mod
    Mod.mod
  end
end

Test.mod # "Hello"

Method 3: using extend self

module Mod
  extend self
  def mod
    "Hello"
  end
end

class Test
  def self.mod
    Mod.mod
  end
end

Test.mod # "Hello"

Method 4: using self.{method_name}

module Mod
  def self.mod
    "Hello"
  end
end

class Test
  def self.mod
    Mod.mod
  end
end

Test.mod # "Hello"

Method 5: using class << self

module Mod
  class << self
    def mod
      "Hello"
    end
  end
end

class Test
  def self.mod
    Mod.mod
  end
end

Test.mod # "Hello"

9. Keyword Arguments vs Option Hashes

Use keyword arguments instead of option hashes – style guide

I’ve started to refactor old code to use keyword arguments, and it’s really making a difference in terms of nicer, self-explaining code.

For example:

data = TwoFactor.scannable_png(@user.account, @2fa_provisioning)

Compare this to:

data = TwoFactor.scannable_png(email_address: @user.account, uri: @2fa_provisioning)

With a single glance you know that @user.account contains an email address and that @2fa_provisioning apparently uses to_s to return the provisioning URI (which is OK for domain objects).

10. No Exceptional Flows

Don’t use exceptions for flow of control – style guide

I’ve definitely violated this rule, especially in IO-based code where it’s “easier” (okay, lazy) to catch exceptions than to write conditions to deal with expected and unexpected situations.

Example of “bad code”:

begin
  filepath = File.join(self.config.working_directory, self.config.settings.file_matter.blog_manifest_file)
  manifest = File.read(filepath)
rescue
  puts("Can't find blog manifest '#{filepath}'")
  exit
end

Rewritten to omit the exceptional flow:

filepath = File.join(self.config.working_directory, self.config.settings.file_matter.blog_manifest_file)
path = Pathname.new(filepath)
raise "Can't find blog manifest '#{filepath}'" unless path.exist?
manifest = File.read(filepath)

Again, that’s a wrap

In an upcoming third installment I will discuss the next 10 mistakes I often make. The first installment can be found here.

Check out my product Operand, a collaborative tool for due diligences, audits, and assessments.