A Benefit of Rails: Surrogate Keys

Today I helped a client wrestle with a database task which concluded:

. . . We’ll have to watch out inventing new “fake” customers because one day there may be a real customer with our made up ID :(

This is a problem in lots of apps, but not in Rails.

Rails enforces the “best practice” of giving every table a primary key named id, which is an auto-incrementing integer. But many non-Rails apps do not do this. Instead, they use some real-world data as the primary key, e.g., a customer’s Social Security number. This is called a natural primary key. There are arguments for and against this. This is a good little Wikipedia page about the differences.

One big problem with natural primary keys is that they often turn out to not be as unique as we think. Plus, then we’re put in the position of “faking” the SS number when we want to manually add customers who don’t have one, e.g. “999-….”. (I remember a college I attended which assigned 999- SS numbers to international students.) And this is the hack that the task description refers to.

In contrast to natural keys, the id field is a so-called surrogate key. It intentionally has no meaning in the real world, and so it has no reason to ever change. It doesn’t matter at all what an object’s id is, just so long as it’s unique. And since it gets set up as an auto-incrementing integer, that will be the case.

So, Rails is “opinionated software” and has made the decision for us. Every table will have:

  1. a surrogate key,
  2. named id,
  3. which is an auto-incrementing integer.

And in one swoop, a whole category of potential problems is eliminated. As a bonus, because every table follows the same convention, it’s very easy to get up to speed on a new Rails app.
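To make the idea concrete outside of a database, here is a minimal plain-Ruby sketch of a surrogate key. It is only an in-memory stand-in, not how Rails or any database implements it, and the names CustomerTable, insert, and find are made up for illustration. The point is that the id is assigned by the table, carries no real-world meaning, and a customer without an SS number needs no fake one.

```ruby
# In-memory stand-in for a table with a surrogate primary key.
# CustomerTable, insert, and find are hypothetical names for illustration.
class CustomerTable
  Customer = Struct.new(:id, :name, :ssn)

  def initialize
    @rows = {}
    @next_id = 1
  end

  # Assigns the next integer id; the caller never chooses it.
  def insert(name:, ssn: nil)
    customer = Customer.new(@next_id, name, ssn)
    @rows[customer.id] = customer
    @next_id += 1
    customer
  end

  # Look up a row by its surrogate key.
  def find(id)
    @rows[id]
  end
end

customers = CustomerTable.new
alice = customers.insert(name: 'Alice', ssn: '123-45-6789')
bob   = customers.insert(name: 'Bob') # no SSN, and no made-up "999-..." value

puts "#{alice.name} has id #{alice.id}"
puts "#{bob.name} has id #{bob.id} and ssn #{bob.ssn.inspect}"
```

Because the SS number is just an ordinary column here, it can be missing, corrected, or duplicated by mistake without disturbing the identity of the row.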

Self-validating Ruby objects with ActiveModel Validations

I’m importing lots of CSV restaurant inspection data with Ruby, and I need to make sure the cleaned-up data matches the spec. For example, a violation must have a business_id and date. It can optionally have a code and description. My goal was to be able to write a class like this:

class Violation < ValidatedObject
  attr_accessor :business_id, :date, :code, :description

  validates :business_id, presence: true
  validates :date, presence: true, type: Date
end

…and it would just work, Rails-style:

Violation.new do |v|
  v.business_id = '1234'
  v.date = '2015-01-15'
end
# => ArgumentError: date must be of the class Date

Violation.new do |v|
  v.date = Date.new(2015, 1, 25)
end
# => ArgumentError: business_id is required

Violation.new do |v|
  v.business_id = '1234'
  v.date = Date.new(2015, 1, 25)
end
# => new instance Violation<...>

ActiveModel::Validations was refactored out of ActiveRecord to enable just this sort of use case. This is because the awesome list of built-in validations (here and here) and methods like #valid? are useful in a variety of contexts, not just in Rails.

It turned out that it was easy to write a small base class with the ActiveModel::Validations mixin to make the checking automatic upon initialization:

require 'active_model'

class ValidatedObject
  include ActiveModel::Validations

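  # Yield the new instance to the block so the caller can set attributes,
  # then validate immediately.
  def initialize(&block)
    block.call(self) if block
    check_validations!
  end

  # Raise if any validation fails.
  def check_validations!
    fail ArgumentError, errors.messages.inspect if invalid?
  end
end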

For my importing and parsing purposes, I created a small custom TypeValidator:

 # Ensure an object is a certain class. This is an example of a custom
 # validator. It's here as a nested class for easy access by subclasses.
 # @example
 #   class Dog < ValidatedObject
 #     attr_accessor :weight
 #     validates :weight, type: Float
 #   end
 class TypeValidator < ActiveModel::EachValidator
   def validate_each(record, attribute, value)
     return if value.class == options[:with]

     message = options[:message] || "is not of class #{options[:with]}"
     record.errors.add attribute, message
   end
 end
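
One thing worth knowing about the check in validate_each: it compares classes exactly, so an instance of a subclass is rejected. Whether that’s what you want depends on your data. This small sketch uses only the standard library (it isn’t part of the validator) to show how the exact comparison differs from is_a?, using the fact that DateTime is a subclass of Date:

```ruby
require 'date'

timestamp = DateTime.new(2015, 1, 25) # DateTime inherits from Date

exact_match = timestamp.class == Date # the style of comparison validate_each uses
kind_match  = timestamp.is_a?(Date)   # also accepts subclasses

puts "class == Date: #{exact_match}"
puts "is_a?(Date):   #{kind_match}"
```

For strict import checking the exact comparison is a reasonable choice; if subclasses should pass, is_a? would be the looser alternative.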

Here’s the finished ValidatedObject, and an example subclass: information about a CSV feed. This setup is working great; the only change I can see making soon is changing ValidatedObject to be mixed in via include rather than subclassing.


Yes, Rails does support case-insensitive queries

They’re unfortunately just a little buried and a little undocumented. And so conventional wisdom is that Rails doesn’t do case-insensitive finds. But in fact it does:

Example 1: Find all the statutes whose name begins with texas, case-insensitively:

t = Statute.arel_table


. . . it is supremely important that we ensure our data is safe, consistent and reliable. We can dramatically increase these factors by taking full advantage of the tools at hand.

Yes. This is the most critical, important task in software development. The quote is from a great set of posts, Coding Rails with Data Integrity by Jay Hayes: part 1, part 2, and part 3.