I have been working in Ruby on Rails for about a month now. I had done a few smaller things before, investigating and trying things out, but now I am really building a full application site in Rails! It feels great :)

But of course, just starting out in such a new environment, with both Ruby and Rails being new to me, I ran into a lot of initial problems.

One of them was showing international, accented characters, like é, è, ê … from the database. Neither Rails nor irb could show the correct characters. I really needed to solve this problem, so this started an investigative journey.

The symptoms

When displaying a table of results from Rails in any browser, all accented characters were shown as white question marks inside a black rhombus (diamond?). When I tested this with irb I also got very weird results.
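
The black diamond with a question mark is most likely U+FFFD, the Unicode replacement character, which a browser shows when it is told a page is UTF-8 but finds bytes that are not valid UTF-8. Below is a minimal sketch of that effect in Ruby, assuming Ruby 2.1 or later (for String#scrub) and assuming the database client hands over Windows-1252 bytes. The helper names are made up for illustration.

```ruby
# Hypothetical helpers, for illustration only.

# What a UTF-8 consumer (such as a browser) effectively sees:
# every invalid byte sequence becomes U+FFFD.
def seen_as_utf8(bytes)
  bytes.dup.force_encoding("UTF-8").scrub("\uFFFD")
end

# What the bytes mean when they are read as Windows-1252.
def decoded_from_cp1252(bytes)
  bytes.dup.force_encoding("Windows-1252").encode("UTF-8")
end

# 0xE9 is "é" in Windows-1252, but a lone 0xE9 is not valid UTF-8.
raw = "\xE9t\xE9".b

puts raw.dup.force_encoding("UTF-8").valid_encoding?  # false
puts seen_as_utf8(raw)         # replacement characters around the "t"
puts decoded_from_cp1252(raw)  # the intended word
```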

When using TOAD, the results of a query are displayed correctly. But maybe TOAD is just doing something very clever? If I use the GUI version of SQL*Plus (sqlplusw), the results are also shown correctly. But when I use the console (command-line) version of sqlplus, the characters are distorted as well. So maybe the clue could be found there.

Oracle NLS_LANG

No matter which character set the Oracle database uses to store the data (e.g. UTF-8 or UTF-16), the data is always converted to the client's character set. This is truly an amazing feature. Which client character set is used depends on a setting called NLS_LANG (NLS stands for National Language Support). NLS_LANG is built up as follows:
NLS_LANG=language_territory.charset

On Windows, NLS_LANG is set by default in the registry, with a charset that matches the Windows codepage:

NLS_LANG="AMERICAN_AMERICA.WE8MSWIN1252"
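
To make the three parts concrete, here is a small sketch that splits an NLS_LANG value into language, territory and charset. The NlsLang struct and parse_nls_lang are hypothetical names, not part of any Oracle or Ruby library, and the sketch only handles the full language_territory.charset form shown above.

```ruby
# Hypothetical value object for the three parts of NLS_LANG.
NlsLang = Struct.new(:language, :territory, :charset)

# Splits "LANGUAGE_TERRITORY.CHARSET" into its parts.
# Only the complete form is handled here; shortened forms raise.
def parse_nls_lang(value)
  match = /\A([^_.]+)_([^.]+)\.(.+)\z/.match(value)
  raise ArgumentError, "unexpected NLS_LANG format: #{value.inspect}" unless match
  NlsLang.new(match[1], match[2], match[3])
end

parts = parse_nls_lang("AMERICAN_AMERICA.WE8MSWIN1252")
puts parts.language   # AMERICAN
puts parts.territory  # AMERICA
puts parts.charset    # WE8MSWIN1252
```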

But why doesn’t it work inside the command-line console? Apparently because the console uses a different codepage. If you type chcp (which reports the active codepage) at the DOS prompt, it normally returns 437 (it did on my machine). So you need to enter


> set NLS_LANG=american_america.US8PC437
> irb
> require 'oci8'
> OCI8.new('user','pw','db').exec('select * from emp') { |r| puts r.join('|') }

The first line goes at the DOS prompt, before irb is started; the last two go inside irb. All results are then displayed correctly. The crucial line is the correct setting of NLS_LANG.
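
The step from the codepage reported by chcp to an NLS_LANG value can be sketched as a small lookup. The table below is an assumption on my part and far from complete; it lists only a few Oracle charset names that correspond to common Windows codepages, and both helper names are hypothetical.

```ruby
# Partial, illustrative mapping from Windows codepage to Oracle charset name.
ORACLE_CHARSET_FOR_CODEPAGE = {
  437  => "US8PC437",
  850  => "WE8PC850",
  1252 => "WE8MSWIN1252"
}.freeze

# Extracts the number from chcp output such as "Active code page: 437".
def codepage_from_chcp(output)
  digits = output[/\d+/]
  digits && digits.to_i
end

# Builds an NLS_LANG value for the given codepage.
def nls_lang_for_codepage(codepage, language = "AMERICAN", territory = "AMERICA")
  charset = ORACLE_CHARSET_FOR_CODEPAGE.fetch(codepage) do
    raise ArgumentError, "no Oracle charset known for codepage #{codepage}"
  end
  "#{language}_#{territory}.#{charset}"
end

codepage = codepage_from_chcp("Active code page: 437")
puts nls_lang_for_codepage(codepage)  # AMERICAN_AMERICA.US8PC437
```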

Wow! I got the correct results in Ruby! I assumed that Rails would now be a piece of cake, but that was wrong.

Fixing Rails

The easy idea would be to set NLS_LANG correctly in Rails, before the oci8 library is required. My first approach was to set NLS_LANG on the first line of environment.rb:

ENV["NLS_LANG"] = "AMERICAN_AMERICA.WE8MSWIN1252"

But this didn’t work. I am using NetBeans 6.5, and it took me a while to realise that if I typed special characters directly into a file (fixed text, e.g. on a menu) they were displayed correctly. NetBeans (and Rails, for that matter) works with UTF-8 by default, so all files are encoded that way.

The easy solution would be to use

ENV["NLS_LANG"] = "AMERICAN_AMERICA.AL32UTF8"

But that didn’t work. I tried the alternative AMERICAN_AMERICA.UTF8, but that didn’t work either. I am just guessing here, but I think the NLS_LANG setting somehow didn’t get picked up in the Rails environment. It kept using the registry setting.
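
The ordering this attempt relies on can be sketched as follows. This is only a sketch under assumptions: load_oci8_with is a hypothetical helper, not part of Rails or ruby-oci8, and whether the Oracle client on Windows then prefers the environment variable over the registry is exactly the part that remained unclear here.

```ruby
# Hypothetical helper: set NLS_LANG first, then load oci8.
# If oci8 was already required earlier, setting the variable here comes too late.
def load_oci8_with(nls_lang)
  ENV["NLS_LANG"] = nls_lang  # must be in place before the library is loaded
  require "oci8"
  true
rescue LoadError
  false                       # oci8 gem or Oracle client not available
end

loaded = load_oci8_with("AMERICAN_AMERICA.AL32UTF8")
puts "NLS_LANG seen by Ruby: #{ENV['NLS_LANG']}"
puts "oci8 loaded: #{loaded}"
```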

So I tried to turn it around: make Rails use the Windows codepage instead of UTF-8.

This took several steps:

  • change the default setting inside NetBeans to use the Windows codepage (windows-1252) instead of UTF-8. This in fact only affects the encoding of the files that are saved, but it is nevertheless important that all files being served use the same encoding. Also convert all previously edited files from UTF-8 to the same codepage (often labelled ANSI in editors). I used an editor to do that.
  • added the following to application_controller.rb:

    before_filter :headers_iso

    def headers_iso
      # make sure the charset matches the default Oracle NLS setting
      headers["content-type"] = "text/html; charset=windows-1252"
    end

  • added the correct META-tag to application.rhtml in the HEAD-section:

    Note: this meta-tag is not really needed; it has no real effect.

And that finally worked! :)
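
I did the file conversion in an editor, but the same step could also be scripted. Below is a minimal sketch, assuming Ruby 1.9.3 or later (so newer than the setup described here) and files that really are valid UTF-8. Both helper names are hypothetical.

```ruby
# Hypothetical helpers, for illustration only.

# Re-encodes UTF-8 text as Windows-1252.
# Raises if a character has no Windows-1252 equivalent
# or if the input is not valid UTF-8.
def utf8_to_cp1252(text)
  text.encode("Windows-1252", "UTF-8")
end

# Rewrites a file in place. Binary reads and writes avoid any
# newline or encoding translation by Ruby itself.
def convert_file_to_cp1252(path)
  File.binwrite(path, utf8_to_cp1252(File.binread(path)))
end

puts utf8_to_cp1252("\u00E9").bytes.inspect  # [233], i.e. the single byte 0xE9
```

Trying this on a copy of the files first seems wise, since the conversion overwrites the original.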