Dynamic slugs on Rails 3

UPDATE: I gemified this code, and you can now find the ‘inferred_slug’ gem (rubygems, github).

Abstract

Slugs have typically been stored in databases. At least that is what I read about slugs. It turns out that I am starting an amazing Rails project, which is pretty complex, and I want to keep it simple, stupid. So, I am setting up some facts:

  • slugs are mainly used for two reasons:
    • securing access, by repelling information-fetching scripts that just increment an id.
    • prettifying URIs.
  • slugs are actually duplicated information:
    • duplicated information is never good. Even if you have amazing callbacks, like those that ActiveRecord provides.
      • you need to take care of keeping them updated. Forget a callback, and your slugs will be outdated. Inconsistent.
      • update the ‘no-brainer-logic’ that creates your slugs, and you need to ‘migrate’ your whole database, full of information. If you want to be consistent, of course. And I really expect you to be.

From my point of view, when I thought about this issue yesterday, it was a no-brainer that slugs need to be logic, not data. So, here is my take on the problem.

Please note that to run this example correctly you need the stringex gem, which adds some neat features to Ruby strings (among other nice candy).

Prettify URIs

First of all, the easiest part: let’s prettify all URIs. To keep the example short we will do it globally (obviously you could decide which models you want this on).

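module Slug

  # Returns "id-name" in URL-friendly form (to_url comes from stringex),
  # or just the id when the model has no usable name.
  def slug
    if !respond_to?(:name) || name.blank?
      id.to_s
    else
      "#{id}-#{name}".to_url
    end
  end

  # Rails calls to_param when it builds a URL for a record.
  def to_param
    slug
  end

end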

Ok, this is a no-brainer. If there is a name attribute on your model, it will be used to prettify your URIs. It’s important to note that the returned string contains the id at the beginning, followed by a dash, and whatever you want after that (in this case, the name). This will be explained later.

You can also reimplement the slug method on a model if, instead of name, it has a title column that you want to show in the slug.
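As a minimal sketch of such an override, assume a hypothetical Article model with a title column. To keep the snippet runnable outside Rails, the models here are plain Structs, and simple_to_url is a simplified stand-in for stringex’s to_url, not its real implementation.

```ruby
# Simplified stand-in for stringex's String#to_url (an assumption, not the
# real implementation): lowercase, then collapse non-alphanumerics into dashes.
def simple_to_url(text)
  text.downcase.gsub(/[^a-z0-9]+/, '-').gsub(/\A-+|-+\z/, '')
end

module Slug
  # Default slug: "id-name", or just the id when there is no usable name.
  def slug
    if !respond_to?(:name) || name.to_s.empty?
      id.to_s
    else
      simple_to_url("#{id}-#{name}")
    end
  end

  def to_param
    slug
  end
end

# Hypothetical model that has a title instead of a name.
Article = Struct.new(:id, :title) do
  include Slug

  # Override the default: build the slug from the title.
  def slug
    simple_to_url("#{id}-#{title}")
  end
end

# Hypothetical models that rely on the default behaviour.
Client = Struct.new(:id, :name) { include Slug }
Tag    = Struct.new(:id) { include Slug }

article_param = Article.new(7, 'Hello, World!').to_param
client_param  = Client.new(14, 'John Murray').to_param
tag_param     = Tag.new(3).to_param
```

Because to_param simply calls slug, overriding slug in the model is enough to change the generated URLs for that model only.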

Now, let’s write an initializer that includes this module in every ActiveRecord model:

class ActiveRecord::Base
  include Slug
end

That’s it! Now, you might be wondering why it works at all. You did not change any code in your controllers that call, for example, Model.find, and they are still able to find the record when the passed parameter is 14-john-murray. Well, check what '14-john-murray'.to_i returns in an irb session before continuing.

Yes, it returns 14. This is the reason why our slugs are of the form id-whatever-you-want-here: "#{id}-whatever-you-want-here".to_i, where id is an integer, returns that integer. We can still use the regular ActiveRecord find method!
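A quick way to see this in plain Ruby (the variable names are only illustrative). Note that this shows only how String#to_i behaves; how ActiveRecord casts the value internally may differ between versions.

```ruby
# String#to_i parses the leading digits and ignores everything after them.
id_from_slug = '14-john-murray'.to_i  # 14
id_from_fake = '14-john-doe'.to_i     # 14, the text after the id is never examined
id_from_tail = 'john-murray-14'.to_i  # 0, because there are no leading digits
```

This is why the id has to come first in the slug: put it at the end and the conversion yields 0.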

We can still follow the links on our website, but not everything is so cool. Take /users/14-john-murray, for instance, and try pointing directly to /users/14-john-doe. It loads. It just fetches the id (14), and nothing checks the rest of the slug. So it’s time to fix that. We made URIs pretty, but we aren’t checking that 14-john-murray really is our record with ID 14, whose name should be John Murray.

Protect yourself!

The first internal debate that comes up here is where to place this logic, and how. Let’s state some basic requirements:

  • This point should not exist, really. This should be crystal clear, but just in case. Obey MVC. No hacks.
  • We need to keep the find method untouched. We’d like to still use it.
  • We don’t like to copy and paste code. We really do love kitties.

The controller looks like a nice place at first sight. However, you then depend on every developer remembering to validate the slug somehow. Doesn’t look so nice now. The model cannot know anything about the controller. However, the model is given the whole id by the controller. There we go.

Let’s modify our simple slug initializer then:

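class ActiveRecord::Base
  include Slug
end

module YourApplication
  module SlugFinders
    # Like find, but returns nil unless the parameter matches the record's full slug.
    def find_by_slug(*args)
      record = begin
        find(*args)
      rescue ActiveRecord::RecordNotFound
        nil # find raises when no record has that id
      end
      if record and record.respond_to? :slug
        return nil unless record.slug.to_s == args.first.to_s
      end
      record
    end

    # Same as find_by_slug, but raises instead of returning nil.
    def find_by_slug!(*args)
      find_by_slug(*args) or raise ActiveRecord::RecordNotFound
    end
  end
end

class ActiveRecord::Base
  extend YourApplication::SlugFinders
end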

Okay, great. We added two methods that we can use to search: find_by_slug and find_by_slug!, following the usual Rails convention.

The idea is really easy. Let’s focus on find_by_slug, since find_by_slug! contains the exact same logic from the point of view that we are interested in. First, we do a regular find call, which will return (or not) the record. Let’s suppose we got a record. If that record responds to the :slug message, we just check that its slug is the same as the one the model was asked for. If they differ, we return nil (and find_by_slug! raises ActiveRecord::RecordNotFound).
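The check can be tried outside Rails with a small in-memory stand-in. FakeClient and its RECORDS hash are hypothetical and exist only for this sketch. Unlike ActiveRecord’s find, the stand-in find returns nil instead of raising when nothing matches.

```ruby
# Hypothetical in-memory stand-in for a model class; this is not ActiveRecord.
class FakeClient
  RECORDS = {}

  attr_reader :id, :name

  def initialize(id, name)
    @id = id
    @name = name
    RECORDS[id] = self
  end

  # Same shape as the slug from the post, without depending on stringex.
  def slug
    "#{id}-#{name.downcase.gsub(/[^a-z0-9]+/, '-')}"
  end

  # Mimics find: only the leading integer of the parameter is used.
  def self.find(param)
    RECORDS[param.to_i]
  end

  # The extra check: the full parameter must equal the record's slug.
  def self.find_by_slug(param)
    record = find(param)
    record if record && record.slug == param
  end
end

john = FakeClient.new(1, 'John Doe')

found_by_find = FakeClient.find('1-alice-smith')         # john: the tail is ignored
found_by_slug = FakeClient.find_by_slug('1-john-doe')    # john: full slug matches
rejected      = FakeClient.find_by_slug('1-alice-smith') # nil: slug mismatch
```

The lookup itself is unchanged; the only addition is one string comparison after the record has been loaded.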

Wrapping up

So now we can do things like:

  • Client.find_by_slug '1' => nil
  • Client.find_by_slug '1-john-doe' => Client(…)
  • Client.find_by_slug '1-alice-smith' => nil
  • Client.first.payments.find_by_slug '1' => nil
  • Client.first.payments.find_by_slug '1-niceproduct' => Payment(…)
  • Client.first.payments.find_by_slug '1-fakeproduct' => nil

We got lots of things for free (yeah, as in free beer):

  • Avoid data replication. It sounds so wrong, and feels even worse…
  • Avoid database headaches.
  • Save space in our database. Sounds stupid, right ?
  • Our logic controls the whole slug system. Change your slug logic, and magically all of them change.
  • Little impact in our project source code.

I hope this post was useful to you. Please take into account that this is just an example. If someone asks for it, I can create an empty Rails project showing how it works.

Happy coding !

  • JohnR

    Very awesome!
    Agreed, databases are for data, not view formatting!!! I was looking for a solution without creating database tables like friendly_id or other gems.
    Great work!

  • JohnR

    By the way, I’m using Rails 3.0~ and the initializer is missing a
    require 'slug'

  • http://www.ereslibre.es ereslibre

    @JohnR: thanks :) and the initializer isn’t missing a require 'slug' if you add 'stringex' to your Gemfile and run bundler :)

  • Enrique García

    I’ve been thinking about this for a while now and I have to disagree.

    The main axioms are:

    1) That maintaining an additional column for storing the slugs is wrong, because there is information duplication, which means maintenance costs in the long term, and there are lots of issues with that.
    2) That slugs can always be programmatically deduced from other fields.

    I’d like to first address 1). It is actually *very easy* to use a column for storing slugs; there are several gems that implement that very well. They also come with an extensive test suite, and have been used by lots of peers (btw, my favorite gem for this stuff is friendly_id).

    The whole process involves adding a gem, bundle installing, adding a migration, migrating, and adding 1 line to one model. And there you go, you have something working, with most of the issues solved.

    Your current proposed solution involves writing and maintaining more code. You could argue that you could package your software as a gem, so it would work just as easily as friendly_id. And that would probably be true. But friendly_id is already here, already works, and has already been tested by my peers.

    Please don’t take this the wrong way. I appreciate your efforts. But I don’t care about code purity, if I have to choose between maintaining 30 LOC and maintaining 2. *Right now* your solution just seems worse to me, in all the code-related aspects that I care about, than the already existing solutions.

    There is the additional stylistic bit that I don’t like having the ids there on the slug, if at all possible. friendly_id allows me to do that very easily, since each slug is a unique id in the database.

    Now to address 2): In my experience, as soon as the slugs are viewed by humans, 2) goes out of the window. URLs have lots of importance to certain types of humans, and for some reason they end up being *clients*.

    You come up with a pretty convenient way to generate slugs automatically, and they just don’t like it!

    Or what’s worse! They come up with *logical* reasons for which the urls ought to be modifiable by them!

    Take the quintessential example: a blog post. Imagine that someone creates a post called “My euruko 2012 slides”. Usually the slug would be “my-euruko-2012-slides”. The thing is, the blog automatically adds the date to the url, as in “/2012/06/06/my-euruko-2012-slides”, so the user wants to go and remove the 2012 from the slug, to end up with something like “/2012/06/06/euruko-slides”.

    Another example: a page on a CMS. Let’s say that pages can be “nested”. One page could be titled “Contractor taxation documentation”, but given that this page is a “child” of “taxes”, which is a child of “contractors”, the slug could be “documentation”, so the url was nice and short: “/contractors/taxes/documentation” instead of “/contractors/taxes/contractor-taxation-documentation”.

    I guess that if you were *really sure* that humans would never want to modify the urls, then … nah, even in that case, I would recommend just adding the slug and the form field. It’s not worth the hassle to fight an end user over such a trivial thing to include.

    This said, I really appreciate your effort. Consider sending a patch to friendly_id so that it has a “tableless” mode ;)

  • http://www.ereslibre.es ereslibre

    @Enrique García: First of all, thank you very much for taking the time to write a well reasoned comment.

    Please, take into account several other circumstances:
    – This is obviously not intended for permalinks. Permalinks would require the slug to be saved on the database, without any doubt.
    – For this solution to work, the ID must be there. There is no other way. You need resources of the form “#{id}-whatever”
    – This is not the ‘holy grail’ for slugs. It’s really up to you where you want to use this solution. I will not personally use this just *everywhere* in my project, but where I believe it makes sense.
    – I do care about code perfection, as well as data perfection. This meaning that when I see data duplication *somehow* I start to worry. That said, and as I stated before, there are two exceptions to this personal rule: you need permalinks, or you want to strip out the ID from the resource ID on the route.

    friendly_id is cool. However the project that I am working on has some restrictions that simply don’t let me use it.

    The main reason for this post is that when I searched about slugs all I could find was ‘store it in the database’, with some sugar: ‘if you want permalinks, just let it be, if you don’t want permalinks, use ActiveRecord callbacks to update your slug’. I think that is a nice summary of what I found out there.

    So I took some minutes to think and ask myself if I *always* need to save slugs on the database. And found out that the answer is no.

    That said, I have used friendly_id in personal projects and I personally love it.

    You were also right when you said that this code could have been ‘gemified’, and probably will be. Overall, I think this gem will have less LOC than friendly_id. With friendly_id you have to add two LOC per model. With this solution you would need to add none (initializer on ActiveRecord::Base), or one: (Slug is included per model, ‘include Slug’). So overall, your code will have less slug-related code too.

    However, I still think they serve different purposes. Of course I prefer ‘/states/washington’ over ‘/states/74-washington’, but you also need to balance, and think about what are the costs of that. On certain parts of my website/service I prefer not to save this information in the database, and allow my code to infer it.

    Thanks again for your comment.