teohm.dev

I enjoy life, and make stuff for people I care about :)

UTF-8 Param Name Issue in Rails Multipart Form

I first stumbled upon this issue when Yasith (@meaningful) showed me a strange bug in a Rails project. Here’s what happened:

Issue

When you submit a multipart form that contains a Unicode parameter name, e.g.

<form method="post" enctype="multipart/form-data" action="">
  <input name="Iñtërnâtiônàlizætiøn_name"
         value="Iñtërnâtiônàlizætiøn_value" />
</form>

The Rails controller returns the param value "Iñtërnâtiônàlizætiøn_value" as expected.

But the param name becomes: "I\xC3\xB1t\xC3\xABrn\xC3\xA2ti\xC3\xB4n\xC3\xA0liz\xC3\xA6ti\xC3\xB8n_name".

This makes life miserable if you are not expecting it:

params["Iñtërnâtiônàlizætiøn_name"] # => nil
params["I\xC3\xB1t\xC3\xABrn\xC3\xA2ti\xC3\xB4n\xC3\xA0liz\xC3\xA6ti\xC3\xB8n_name"] # => "Iñtërnâtiônàlizætiøn_value"

What happened?

When Rack returns multipart form data to Rails, it returns:

{ "I\xC3\xB1t\xC3\xABrn\xC3\xA2ti\xC3\xB4n\xC3\xA0liz\xC3\xA6ti\xC3\xB8n_name" =>
  "I\xC3\xB1t\xC3\xABrn\xC3\xA2ti\xC3\xB4n\xC3\xA0liz\xC3\xA6ti\xC3\xB8n_value" }

However, ActionDispatch::Http::Parameters#encode_params in Rails encodes only the parameter values, not the parameter names. As a result, we get:

{ "I\xC3\xB1t\xC3\xABrn\xC3\xA2ti\xC3\xB4n\xC3\xA0liz\xC3\xA6ti\xC3\xB8n_name" =>
  "Iñtërnâtiônàlizætiøn_value" }

Solutions?

  1. Don’t use Unicode param names.
  2. Patch Rails source code. I added a fix in my forked branch, and reported the issue. Hopefully it will get fixed soon in the coming release.
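
Until a fix lands, the mismatch can be reproduced and worked around in plain Ruby. The sketch below is not the Rails patch: it assumes the raw names and values arrive tagged as binary (as in the Rack output above) and that the bytes really are UTF-8. The helper name retag_as_utf8 is hypothetical, and it only handles a flat hash, not nested params.

```ruby
# Simulate what Rack hands over: name and value are both tagged as binary.
name  = "Iñtërnâtiônàlizætiøn_name".dup.force_encoding(Encoding::BINARY)
value = "Iñtërnâtiônàlizætiøn_value".dup.force_encoding(Encoding::BINARY)
raw_params = { name => value }

# Hypothetical helper: re-tag both names and values as UTF-8.
# The bytes are left untouched; only the encoding label changes.
def retag_as_utf8(params)
  params.each_with_object({}) do |(key, val), out|
    out[key.dup.force_encoding(Encoding::UTF_8)] = val.dup.force_encoding(Encoding::UTF_8)
  end
end

fixed_params = retag_as_utf8(raw_params)

# A UTF-8 literal does not match the binary-tagged key...
raw_lookup = raw_params["Iñtërnâtiônàlizætiøn_name"]
# ...but it matches once the key is tagged as UTF-8 too.
fixed_lookup = fixed_params["Iñtërnâtiônàlizætiøn_name"]
```

The lookup fails on the raw hash because Ruby treats two strings with the same non-ASCII bytes but different encodings as different keys.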

Working Effectively With iTerm2

I have been using iTerm in my daily work for almost a year now. Along the way, I learned a few handy settings tweaks and shortcut keys that boost my productivity in a command-line environment.

Install iTerm2

If you haven’t heard of iTerm, it’s a popular open source alternative to the Mac OS X Terminal. Give it a try: download and install it from http://www.iterm2.com.

Fine-Tune Settings

Launch iTerm and open iTerm > Preferences, or just press Cmd + ,.

Open tab/pane with current working directory

Under the Profiles tab, go to the General subtab and set Working Directory to “Reuse previous session’s directory”.

Enable Meta key

To enable the Meta key for Bash readline editing (e.g. Alt + b to move to the previous word), go to the Keys subtab under the Profiles tab and set “Left option key acts as” to “+Esc”.

Hotkey to toggle iTerm2

Under the Keys tab, in the Hotkey section, enable “Show/hide iTerm2 with a system-wide hotkey” and enter your hotkey combination. I use Ctrl + Shift + L.

Switch pane with mouse cursor

Under Pointer, in the Miscellaneous Settings section, enable “Focus follows mouse”.

Handy Shortcut Keys

Here’s a set of shortcut keys I commonly use. You can always look for other shortcut keys in the iTerm menu.

Tab navigation

  • open new tab Cmd + t
  • next tab Cmd + Shift + ]
  • previous tab Cmd + Shift + [

Pane navigation

  • split pane left-right Cmd + d
  • split pane top-bottom Cmd + Shift + d
  • next pane Cmd + ]
  • previous pane Cmd + [

Search

  • open search bar Cmd + f
  • find next Cmd + g

Input to all panes

  • input to all panes in current tab Cmd + Alt + i

Clear screen

  • clear buffer Cmd + k
  • clear lines (Bash command) Ctrl + l

Zooming / Font Resize

  • toggle maximize window Cmd + Alt + =
  • toggle full screen Cmd + Enter
  • make font larger Cmd + +
  • make font smaller Cmd + -

iTerm lovers, did I miss anything out?

Shortcuts to Move Faster in Bash Command Line

Nowadays, I spend more time in the Bash shell, typing longer commands. One of my New Year’s resolutions this year is to stop using the left/right arrow keys to move around the command line. I learned a few shortcuts a while ago.

Last night, I spent some time reading about “Command Line Editing” in the Bash manual. The Bash manual is a well-written piece of documentation. I think I should read it more often.

Well, here are the new shortcuts I learned:

Basic moves

  • Move back one character. Ctrl + b
  • Move forward one character. Ctrl + f
  • Delete current character. Ctrl + d
  • Delete previous character. Backspace
  • Undo. Ctrl + -

Moving faster

  • Move to the start of line. Ctrl + a
  • Move to the end of line. Ctrl + e
  • Move forward a word. Meta + f (a word consists of letters and digits, no symbols)
  • Move backward a word. Meta + b
  • Clear the screen. Ctrl + l

What is Meta? Meta is normally your Alt key. Mac OS X users need to enable it themselves: open Terminal > Preferences > Settings > Keyboard and enable “Use option as meta key”. By convention, the Meta key is used for operations on words.

Cut and paste (‘Kill and yank’ for old schoolers)

  • Cut from cursor to the end of line. Ctrl + k
  • Cut from cursor to the end of word. Meta + d
  • Cut from cursor to the start of word. Meta + Backspace
  • Cut from cursor to previous whitespace. Ctrl + w
  • Paste the last cut text. Ctrl + y
  • Loop through and paste previously cut text. Meta + y (use it after Ctrl + y)
  • Loop through and paste the last argument of previous commands. Meta + .

Search the command history

  • Search as you type. Ctrl + r and type the search term; Repeat Ctrl + r to loop through results.
  • Search the last remembered search term. Ctrl + r twice.
  • End the search at current history entry. Ctrl + j
  • Cancel the search and restore original line. Ctrl + g

Need more?

Using RABL in Rails JSON Web API

Let’s use an event management app as the example.

The app has a simple feature: a user can add events, then invite other users to attend them. Its data is represented by 3 models: User, Event, and Event Guest. An ER diagram shows the domain models.

Let’s say we are going to add a read-only JSON web API for clients to browse data records from the app.

Problems

Model is not view

When working on a non-trivial web API, you will soon realize that a model often cannot be serialized directly in a web API.

Within the same app, one API may need to render a summary view of the model, while another needs a detail view of the same model. You want to serialize a view or view object, not a model.

The RABL (Ruby API Builder Language) gem is designed for this purpose.

Define once, reuse everywhere

Let’s say we need to render these user attributes: id, username, email, display_name, but not password.

In RABL, we can define the attribute whitelist in a RABL template.

# tryrabl/app/views/users/base.rabl
attributes :id, :username, :email, :display_name

To show an individual user, we can now reuse the template through RABL extends.

# tryrabl/app/views/users/show.rabl
extends "users/base"
object @user

## JSON output:
# {
#     "user": {
#         "id": 8,
#         "username": "blaise",
#         "email": "matteo@wilkinsonhuel.name",
#         "display_name": "Ms. Noe Lowe"
#     }
# }

Here’s another example to show a list of users.

# tryrabl/app/views/users/index.rabl
extends "users/base"
collection @users

## JSON output:
# [{
#     "user": {
#         "id": 1,
#         "username": "alanna",
#         "email": "rubie@hayes.name",
#         "display_name": "Mrs. Gaylord Hoeger"
#     }
# }, {
#     "user": {
#         "id": 2,
#         "username": "jarrell.robel",
#         "email": "jarod@eichmann.com",
#         "display_name": "Oran Lebsack"
#     }
# }]

The template can be reused in a nested child as well, through RABL child.

attributes :id, :title, :description, :start, :end, :location
child :creator => :creator do
  extends 'users/base'
end

## JSON output:
# {
#     "event": {
#         "id": 7,
#         "title": "Et earum sed fuga.",
#         "description": "Quis sed ..e ad.",
#         "start": "2011-05-31T08:31:45Z",
#         "end": "2011-06-01T08:31:45Z",
#         "location": "Saul Tunnel",
#         "creator": {
#             "id": 1,
#             "username": "alanna",
#             "email": "rubie@hayes.name",
#             "display_name": "Mrs. Gaylord Hoeger"
#         }
#     }
# }

Join table rendered as subclass

I noticed a recurring pattern in two recent projects. For instance, in this example, from the client’s point of view, an Event Guest is basically a User with an additional attribute: RSVP status.

When querying the database, we usually need to query the join table: event_guests.

class GuestsController < ApplicationController
  def index
    # Same instance variable name that the guests/index.rabl template reads.
    @event_guests = EventGuest.where(:event_id => params[:event_id])
  end
end

But when rendering, the result set needs to be rendered as a list of Users. RABL lets you do that easily using its glue feature (a weird name, though :).

# tryrabl/app/views/guests/index.rabl
collection @event_guests

# include the additional attribute
attributes :rsvp

# add child attributes to parent model
glue :user do
  extends "users/base"
end

## JSON output:
# [{
#     "event_guest": {
#         "rsvp": "PENDING",
#         "id": 3,
#         "username": "myrna_harvey",
#         "email": "shad.armstrong@littelpouros.name",
#         "display_name": "Savion Balistreri"
#     }
# }, {
#     "event_guest": {
#         "rsvp": "PENDING",
#         "id": 4,
#         "username": "adelle.nader",
#         "email": "brendon.howe@cormiergrady.info",
#         "display_name": "Edgardo Dickens"
#     }
# }]
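
To make the two ideas above concrete (one attribute whitelist reused everywhere, and glue flattening a child into its parent), here is a plain-Ruby sketch that builds the same JSON shape by hand. It does not use RABL. The Struct definitions and the helpers user_view and guest_view are hypothetical stand-ins for the models and templates.

```ruby
require "json"

# Hypothetical stand-ins for the ActiveRecord models.
User       = Struct.new(:id, :username, :email, :display_name, :password)
EventGuest = Struct.new(:user, :rsvp)

# The whitelist is defined once, like users/base.rabl.
USER_ATTRIBUTES = [:id, :username, :email, :display_name]

# Build the "view" of a user: only whitelisted attributes, so no password.
def user_view(user)
  USER_ATTRIBUTES.each_with_object({}) do |attr, view|
    view[attr.to_s] = user.public_send(attr)
  end
end

# "Glue": merge the user's view into the guest's own attributes
# instead of nesting it under a "user" key.
def guest_view(guest)
  { "event_guest" => { "rsvp" => guest.rsvp }.merge(user_view(guest.user)) }
end

user  = User.new(3, "myrna_harvey", "shad.armstrong@littelpouros.name",
                 "Savion Balistreri", "secret")
guest = EventGuest.new(user, "PENDING")

guest_json = JSON.generate([guest_view(guest)])
```

Writing this by hand for every endpoint is the repetition that a template language like RABL takes away.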

I think I will use RABL for the next Rails web API project.

The complete Rails example code is available at github.com/teohm/tryrabl.

Learning Git Internals by Example

Status: Draft.
I plan to revise this post, and will probably simplify it in the future.

Motivation

After switching to Git from Subversion and Mercurial a few months ago, I felt that Git was fundamentally different from both, but I couldn’t really say how. I often saw terms like tree and parent on GitHub without knowing what they actually meant.

So I decided to spend some time learning Git.

I will try to summarize and publish the important things I learn about Git along the way. Here is the first entry, about Git internals, which helped me understand how Git differs from other source control tools.

Objects, References, The Index

To understand the core of Git internals, there are 3 things we should know: objects, references, and the index.

I find this model elegant. It fits well in a small diagram, as well as in my head.

A picture illustrating the files in the .git directory mentioned in this article.

Objects

All files that you commit to a Git repository, including the commit info, are stored as objects in .git/objects/.

An object is identified by a 40-character string: the SHA-1 hash of the object’s content (prefixed with a short type-and-size header).

There are 4 types of objects:

  1. blob - stores file content.
  2. tree - stores directory layouts and filenames.
  3. commit - stores commit info and forms the Git commit graph.
  4. tag - stores annotated tag.

The example will illustrate how these objects relate to each other.
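
As a rough sketch of how an object id is derived, the Ruby below hashes a small header followed by the content, then maps the id to a path under .git/objects/. The helper names are hypothetical. The header format ("<type> <size in bytes>", then a NUL byte) is how Git identifies loose objects, but treat the code as an illustration, not a reimplementation of Git.

```ruby
require "digest"
require "zlib"

# The bytes Git hashes and stores: "<type> <byte size>\0" followed by the content.
def git_object_payload(type, content)
  "#{type} #{content.bytesize}\0#{content}"
end

# The object id is the SHA-1 of that payload, as 40 hex characters.
def git_object_id(type, content)
  Digest::SHA1.hexdigest(git_object_payload(type, content))
end

# Loose objects live at .git/objects/<first 2 hex chars>/<remaining 38>.
def loose_object_path(object_id)
  File.join(".git", "objects", object_id[0, 2], object_id[2..-1])
end

# On disk, the payload is zlib-compressed.
def loose_object_bytes(type, content)
  Zlib::Deflate.deflate(git_object_payload(type, content))
end

readme_id   = git_object_id("blob", "A roti canai project.\n")
readme_path = loose_object_path(readme_id)
empty_id    = git_object_id("blob", "")
```

You can compare readme_id with the blob id that git ls-files --stage prints in the walkthrough below. Because the id depends only on the content, two files with identical content share one blob.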

References

A branch, remote branch or a tag (also called lightweight tag) in Git, is just a pointer to an object, usually a commit object.

They are stored as plain text files in .git/refs/.

Symbolic References

Git has a special kind of reference, called symbolic reference. It doesn’t point to an object directly. Instead, it points to another reference.

For instance, .git/HEAD is a symbolic reference. It points to the current branch you are working on.
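
Resolving a symbolic reference is just a matter of following "ref:" lines until a plain object id turns up. The sketch below does this against a temporary directory that imitates a .git layout. The helper resolve_ref is hypothetical, and it ignores details of real repositories such as packed refs.

```ruby
require "tmpdir"
require "fileutils"

# Follow symbolic references ("ref: <path>") until a plain object id is found.
# Returns nil when the target file does not exist (e.g. a branch with no commits yet).
def resolve_ref(git_dir, name, depth = 0)
  raise "too many levels of symbolic references" if depth > 5

  path = File.join(git_dir, name)
  return nil unless File.file?(path)

  content = File.read(path).strip
  if content.start_with?("ref: ")
    resolve_ref(git_dir, content.sub("ref: ", ""), depth + 1)
  else
    content
  end
end

before_first_commit = nil
after_first_commit  = nil

Dir.mktmpdir do |git_dir|
  # Right after `git init`: HEAD exists, but the branch file does not.
  File.write(File.join(git_dir, "HEAD"), "ref: refs/heads/master\n")
  before_first_commit = resolve_ref(git_dir, "HEAD")

  # After the first commit, the branch file holds a commit id.
  FileUtils.mkdir_p(File.join(git_dir, "refs", "heads"))
  File.write(File.join(git_dir, "refs", "heads", "master"),
             "d9976cfe0430557885d162927dd70186d0f521e8\n")
  after_first_commit = resolve_ref(git_dir, "HEAD")
end
```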

The Index

The index is a staging area, stored as a binary file in .git/index.

When you git add a file, Git adds the file info to the index. When you git commit, Git only commits what’s listed in the index.


Examples

Let’s walk through a simple example: create a Git repository, commit some files, and see what happens behind the scenes in the .git directory.

Initialize New Repository

$ git init canai


What happened:

  • Empty .git/objects/ and .git/refs/ created.
  • No index file yet.
  • HEAD symbolic reference created.
    $ cat .git/HEAD 
    ref: refs/heads/master
    

Add New File

$ echo "A roti canai project." >> README
$ git add README


What happened:

  • Index file created.
    It has a SHA1 hash that points to a blob object.
    $ git ls-files --stage
    100644 5f89c6f016cad2d419e865df380595e39b1256db 0 README
    
  • Blob object created.
    The content of README file is stored in this blob.
    # .git/objects/5f/89c6f016cad2d419e865df380595e39b1256db
    $ git cat-file blob 5f89c6
    A roti canai project.
    

First Commit

$ git commit -m'first commit'
[master (root-commit) d9976cf] first commit
 1 files changed, 1 insertions(+), 0 deletions(-)
 create mode 100644 README


What happened:

  • Branch ‘master’ reference created.
    It points to the latest commit object in the ‘master’ branch.
    $ cat .git/refs/heads/master 
    d9976cfe0430557885d162927dd70186d0f521e8
    
  • First commit object created.
    It points to the root tree object.
    # .git/objects/d9/976cfe0430557885d162927dd70186d0f521e8
    $ git cat-file commit d9976cf
    tree 0ff699bbafc5d17d0637bf058c924ab405b5dcfe
    author Huiming Teo <huiming@favoritemedium.com> 1306739524 +0800
    committer Huiming Teo <huiming@favoritemedium.com> 1306739524 +0800
    
    first commit
    
  • Tree object created.
    This tree represents the ‘canai’ directory.
    # .git/objects/0f/f699bbafc5d17d0637bf058c924ab405b5dcfe
    $ git ls-tree 0ff699
    100644 blob 5f89c6f016cad2d419e865df380595e39b1256db  README
    

Add Modified File

$ echo "Welcome everyone." >> README
$ git add README


What happened:

  • Index file updated.
    Notice it points to a new blob?
    $ git ls-files --stage
    100644 1192db4c15e019da7fc053225d09dea14bc3ac07 0 README
    
  • Blob object created.
    The entire README content is stored as a new blob.
    # .git/objects/11/92db4c15e019da7fc053225d09dea14bc3ac07
    $ git cat-file blob 1192db
    A roti canai project.
    Welcome everyone.
    

Add File into Subdirectory

$ mkdir doc
$ echo "[[TBD]] manual toc" >> doc/manual.txt
$ git add doc


What happened:

  • Index file updated.
    $ git ls-files --stage
    100644 1192db4c15e019da7fc053225d09dea14bc3ac07 0 README
    100644 ea283e4fb22719fad512405d41dffa050cd16f9a 0 doc/manual.txt
    
  • Blob object created.
    # .git/objects/ea/283e4fb22719fad512405d41dffa050cd16f9a
    $ git cat-file blob ea283
    [[TBD]] manual toc
    

Second Commit

$ git commit -m'second commit'
[master 556eaf3] second commit
 2 files changed, 2 insertions(+), 0 deletions(-)
 create mode 100644 doc/manual.txt


What happened:

  • Branch ‘master’ reference updated.
    It points to the latest commit in this branch.
    $ cat .git/refs/heads/master 
    556eaf374886d4c07a1906b9fdcaba195292b96
    
  • Second commit object created. Notice its ‘parent’ points to the first commit object. This forms a commit graph.
    $ git cat-file commit 556e
    tree 7729a8b15b747bce541a9752a8f10d57daf221b6
    parent d9976cfe0430557885d162927dd70186d0f521e8
    author Huiming Teo <huiming@favoritemedium.com> 1306743598 +0800
    committer Huiming Teo <huiming@favoritemedium.com> 1306743598 +0800
    
    second commit
    
  • New root tree object created.
    $ git ls-tree 7729
    100644 blob 1192db4c15e019da7fc053225d09dea14bc3ac07  README
    040000 tree 6ff17d485bf857514f299f0bde0e2a5c932bd055  doc
    
  • New subdir tree object created.
    $ git ls-tree 6ff1
    100644 blob ea283e4fb22719fad512405d41dffa050cd16f9a  manual.txt
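
The commit object shown above is plain text: header lines, a blank line, then the message. As a small sketch, the hypothetical parse_commit helper below pulls out the tree and parent ids, which are the links that form the commit graph. It works on the text that git cat-file commit prints and does not read the repository itself.

```ruby
# The text printed by `git cat-file commit 556e` in the example above.
COMMIT_TEXT = <<~COMMIT
  tree 7729a8b15b747bce541a9752a8f10d57daf221b6
  parent d9976cfe0430557885d162927dd70186d0f521e8
  author Huiming Teo <huiming@favoritemedium.com> 1306743598 +0800
  committer Huiming Teo <huiming@favoritemedium.com> 1306743598 +0800

  second commit
COMMIT

# Split headers from the message. "parent" may appear zero times (root commit),
# once (normal commit) or several times (merge), so it is collected into an array.
def parse_commit(text)
  header, message = text.split("\n\n", 2)
  fields = { "parent" => [] }
  header.each_line do |line|
    key, value = line.chomp.split(" ", 2)
    if key == "parent"
      fields["parent"] << value
    else
      fields[key] = value
    end
  end
  fields.merge("message" => message.to_s.strip)
end

second_commit = parse_commit(COMMIT_TEXT)
```

Here second_commit["tree"] is the new root tree and second_commit["parent"] holds the id of the first commit. Following parent ids from commit to commit is how Git walks the history.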
    

Add Annotated Tag

$ git tag -a -m'this is annotated tag' v0.1 d9976


What happened:

  • Tag reference created.
    It points to a tag object.
    $ cat .git/refs/tags/v0.1 
    c758f4820f02acf20bb3f6d7f6098f25ee6ed730
    
  • Tag object created.
    $ git cat-file tag c758
    object d9976cfe0430557885d162927dd70186d0f521e8
    type commit
    tag v0.1
    tagger Huiming Teo <huiming@favoritemedium.com> 1306744918 +0800
    
    this is annotated tag
    

Add new (lightweight) tag

$ git tag root-commit d9976


What happened:

  • Tag reference created.
    It points to a commit object.
    $ cat .git/refs/tags/root-commit 
    d9976cfe0430557885d162927dd70186d0f521e8
    

More Readings

What’s Next?

Looking for a minimal Git workflow suitable for a distributed team on a long-running project.

Using jQuery Validation in Rails Remote Form

In a recent project, I was trying to use jQuery Validation with a Rails 3 remote form, using an earlier version of jquery-ujs. They didn’t work well together in IE.

After experimenting with the latest jquery-ujs (and making an embarrassing mistake), it turns out that the issue is resolved in the latest version.

(Mistake: You may notice I removed a previous post about this topic, where I mistakenly concluded that the latest jquery-ujs does not work with jQuery Validation. Thanks to JangoSteve for pointing it out. The post was misleading, so I believe it’s best to remove it to avoid confusion. :-)

Get the latest jquery-ujs

There are 2 reasons to use the latest jquery-ujs:

  1. it has a patch that fixes the issue (see issue #118).
  2. it exposes an internal function that we may need – $.rails.handleRemote() (see more details)

Working example

The example is tested with:

When using submitHandler in jQuery Validation

If you are using jQuery Validation’s submitHandler(form) function, you need to call $.rails.handleRemote( $(form) ) manually, so that the form is submitted via XHR.

$('#submit_form').validate({

  submitHandler: function(form) {
    // .. do something before submit ..
    $.rails.handleRemote( $(form) );  // submit via xhr
    //form.submit();                  // don't use, it submits the form directly
  }

});

ActiveRecord Mass Assignment

Even if you haven’t heard of mass assignment, it is already in your first Rails-generated scaffold code.

def create
  @comment = Comment.new(params[:comment])
  ..
end

def update
  ..
  @comment.update_attributes(params[:comment])
  ..
end

Why is knowing about mass assignment important?

By default, mass assignment opens up an undesirable security hole by allowing web clients to update any attribute they pass in, including attributes like created_at.

For details, you can read:

Mass assignment methods

There are a few ActiveRecord methods that accept mass assignment, including new and update_attributes as used above.

Using any of these methods means you are now responsible for the safety of the web application.

Minimal protection

To use mass assignment safely, we want to specify exactly which attributes are allowed to be updated.

1. Define attr_accessible on every model

attr_accessible defines a white-list of attributes that can be updated during mass assignment.

class Comment < ActiveRecord::Base
  attr_accessible :title, :content
end
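
To see what a whitelist buys you, here is a plain-Ruby sketch of the filtering idea, without ActiveRecord. The constant, the helper filter_attributes and the sample params are all hypothetical. The real attr_accessible works inside the model, but the effect on incoming attributes is similar in spirit.

```ruby
# Hypothetical whitelist, mirroring attr_accessible :title, :content.
ACCESSIBLE_ATTRIBUTES = %w[title content]

# Keep only whitelisted keys; everything else is dropped before assignment.
def filter_attributes(params, allowed)
  params.select { |key, _value| allowed.include?(key.to_s) }
end

# What a malicious client might send along with the legitimate fields.
incoming = {
  "title"      => "Hi",
  "content"    => "First!",
  "created_at" => "1999-01-01",
  "admin"      => "true"
}

safe_attributes = filter_attributes(incoming, ACCESSIBLE_ATTRIBUTES)
```

Only title and content survive. The created_at and admin values never reach the model.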

2. Enforce attr_accessible usage

You may set the default white-list to empty. This forces you to define the white-list explicitly on a model before mass assignment can be used.

# config/initializers/enforce_attr_accessible.rb

ActiveRecord::Base.send(:attr_accessible, nil)

ActiveRecord Model Equality

In daily web app development, we often need to check if two model objects are equal. When I first started using Rails, this is what I wrote:

if user.id == current_user.id
  # do something
end

Use obj1 == obj2 instead

Of course, that reflected my ignorance while learning the new tool.

So, here’s the proper way to check whether two model objects are equal:

test "2 records with same #id are equal to each other." do
  c1 = Comment.create
  c2 = Comment.find(c1.id)

  assert c1 == c2
  assert c2 == c1
end

What about new record?

Basically, a new model object is only equal to itself:

test "New record does not equal to another new record." do
  new_record = Comment.new
  another_new = Comment.new

  assert new_record == new_record
  assert another_new == another_new
  assert new_record != another_new
end

Learning to use the == operator for checking model equality is useful, because we can then use other methods that depend on ==, for instance Array#include?.

test "When applied in Array" do
  c1 = Comment.create
  c2 = Comment.find(c1.id)

  assert [c1].include? c2
  assert [c2].include? c1
  assert [c1, c2] == [c1, c2]
end
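
The behaviour in these tests can be imitated in plain Ruby. The Record class below is a hypothetical stand-in, not ActiveRecord's source. It assumes the rule is roughly "same object, or same class with the same non-nil id", which is what the tests above suggest.

```ruby
# Hypothetical stand-in for a model; a nil id means "new record".
class Record
  attr_reader :id

  def initialize(id = nil)
    @id = id
  end

  # Equal if it is the very same object, or a record of the same class
  # with the same id. Unsaved records (nil id) only equal themselves.
  def ==(other)
    equal?(other) ||
      (other.instance_of?(self.class) && !id.nil? && other.id == id)
  end
end

saved        = Record.new(1)
reloaded     = Record.new(1)  # e.g. the same row fetched again
new_record   = Record.new
another_new  = Record.new

same_row_equal     = (saved == reloaded)
new_records_differ = (new_record != another_new)
found_in_array     = [saved].include?(reloaded)  # Array#include? relies on ==
```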

Rails learning tests

You may notice the code examples are written as tests. This is my first Rails learning test. The plan is to write more tests as a way to learn Rails. I will post them here occasionally if they are interesting enough to share.

Source code: model_equality_test.rb