my little technology blog

Thoughts about technology (mostly about programming ...)

Thursday, April 06, 2006

Localizing Rails pluralization/singularization, why and how ..

I hate introductory posts, so I won't introduce myself :) You'll pick up bits and pieces about me from the posts I am planning to write.

I started learning Ruby on Rails when version 1.0 came out. I had heard about Ruby and Rails before, but didn't bother checking them out. After watching the Rails video, I sent an email to my friend saying how impressive it was. I had to update a project I was working on (boring and unfinished accounting software for the Bosnian market called Sklad, "harmony" in Bosnian, written in C++ and Qt), so I decided to rewrite it entirely in RoR instead.

After a month or two my friend and I started another project (secret, for now), and the first thing we discussed was the language of the table names, variable names, comments etc. My friend's point was that software development is English-based and that, if a project is meant to be successful, it should be written in English. I agreed that we would do the project in English, since it's oriented towards the international market and we were considering opening the source (as in "speech" ;) ). What I didn't agree with is using English in every situation.

"Sklad" was written with English table names (because of pluralization: I didn't want to set the table names manually, since I was already late with the project) and with Bosnian column names. The generated scaffold gave me a pretty good application base, and I had to do very little translation to get the interface in Bosnian. But what felt very strange while programming was the mixture of Bosnian and English when working with models, for example Item.cijena ("cijena" means "price").

Another reason for having localized pluralization rules is existing projects. A lot of the existing large information systems in Bosnia are built on Oracle databases. In Designer, Oracle's CASE tool, you must supply the plural form of an entity's name in order to get a plural table name. Having localized pluralization/singularization in Rails helps because one can switch from Java to RoR and reuse the existing database schema more easily.

Another good reason is learning. At most universities in Bosnia, examples are in Bosnian (or Serbian or Croatian). RoR could be used for teaching web application programming. The existing Java (Eclipse, Tomcat etc.) web application course at my university is too complicated for around 90% of the students. It teaches students to create their own frameworks or extend existing ones without teaching them, for example, what MVC means.

At the time I started writing Sklad I didn't know much about Ruby, so I thought it would be hard to change the pluralization rules, but I was wrong. After a while, I found out that Inflector is the responsible class, and that it has a very nice interface (don't all Rails classes?) for changing them. After I saw some examples, I started writing the Bosnian rules. I thought a nice place to put the changes would be ./config/environment.rb, and was pleasantly surprised when I saw a commented-out example of changing the rules there ... Here is what I got ...

Inflector.inflections do |inflect|

  # clearing English rules
  inflect.clear

  # adding Bosnian pluralization rules
  # (the A-z range in the patterns also spans "_", so underscored names still match)
  # masculine gender ... basic rule - add "i" at the end

  # "automobil" -> "automobili"
  inflect.plural(/^([a-zA-z]*[^aoieukgh])$/i, '\1i')
  # k in front of i becomes c "krik" -> "krici"
  inflect.plural(/^([a-zA-z]*)k$/i, '\1ci')
  # g in front of i becomes z "hirurg" -> "hirurzi"
  inflect.plural(/^([a-zA-z]*)g$/i, '\1zi')
  # h in front of i becomes s "trbuh" -> "trbusi"
  inflect.plural(/^([a-zA-z]*)h$/i, '\1si')

  # feminine gender - basic rule: change "a" at the end to "e"
  # "jabuka" -> "jabuke"
  inflect.plural(/^([a-zA-z]*)a$/i, '\1e')

  # neuter gender - basic rule: change "o" at the end to "a"
  inflect.plural(/^([a-zA-z]*)o$/i, '\1a')

  # singularization rules
  # (each pattern ends with $ so that it only rewrites the ending of the word)
  inflect.singular(/^([a-zA-z]*)i$/i, '\1')
  inflect.singular(/^([a-zA-z]*)ci$/i, '\1k')
  inflect.singular(/^([a-zA-z]*)zi$/i, '\1g')
  inflect.singular(/^([a-zA-z]*)si$/i, '\1h')
  inflect.singular(/^([a-zA-z]*)e$/i, '\1a')
  inflect.singular(/^([a-zA-z]*)a$/i, '\1o')

  inflect.irregular 'covjek', 'ljudi'
  inflect.irregular 'cvijet', 'cvjetovi'
  inflect.irregular 'dijete', 'djeca'

  inflect.uncountable %w( novine )

end
It's easy ... the first argument of the plural() / singular() method is the regular expression to match, and the second one is the replacement string, which can reuse parts matched by the regular expression (\1 is the first group). The irregular() method, as the name says, is used for words that form an irregular plural: the first argument is the singular form, the second is the plural form. The last method, uncountable(), is used for words that have only one form.
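
To see what a single rule does in isolation, the same kind of substitution can be tried in plain Ruby with String#sub, outside of Rails. This is only a sketch of the idea: apply_rule is a made-up helper, not part of the Inflector interface, and the character class is shortened to [a-z] for readability.

```ruby
# Hypothetical helper (not part of Rails): apply one pattern/replacement
# pair the way a single plural or singular rule would rewrite a word.
def apply_rule(word, pattern, replacement)
  # '\1' in the replacement stands for whatever the first group matched;
  # if the pattern does not match, the word comes back unchanged
  word.sub(pattern, replacement)
end

puts apply_rule("krik",   /^([a-z]*)k$/i,  '\1ci')  # krici
puts apply_rule("jabuka", /^([a-z]*)a$/i,  '\1e')   # jabuke
puts apply_rule("trbusi", /^([a-z]*)si$/i, '\1h')   # trbuh
```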
The important thing when forming the rules is that "word" == "word".pluralize.singularize. If that's not the case, you'll have problems ...
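
One way to catch such problems early is to run the round trip over a list of words. The sketch below does this in plain Ruby, without Rails. PLURALS and SINGULARS copy the patterns from the rules in this post (the singular patterns are anchored with $ here, and the character class is shortened to [a-z]), while inflect and round_trip_failures are made-up helpers. It assumes, as Rails does, that rules defined later are tried first and that the first match wins; irregular and uncountable words are left out.

```ruby
# Simplified stand-in for the Inflector, for experimenting only.
# Assumption: rules are tried newest-first, so each list is reversed.
PLURALS = [
  [/^([a-z]*[^aoieukgh])$/i, '\1i'],
  [/^([a-z]*)k$/i, '\1ci'],
  [/^([a-z]*)g$/i, '\1zi'],
  [/^([a-z]*)h$/i, '\1si'],
  [/^([a-z]*)a$/i, '\1e'],
  [/^([a-z]*)o$/i, '\1a'],
].reverse

SINGULARS = [
  [/^([a-z]*)i$/i,  '\1'],
  [/^([a-z]*)ci$/i, '\1k'],
  [/^([a-z]*)zi$/i, '\1g'],
  [/^([a-z]*)si$/i, '\1h'],
  [/^([a-z]*)e$/i,  '\1a'],
  [/^([a-z]*)a$/i,  '\1o'],
].reverse

# Hypothetical helper: apply the first rule whose pattern matches.
def inflect(word, rules)
  rules.each do |pattern, replacement|
    return word.sub(pattern, replacement) if word =~ pattern
  end
  word
end

# Hypothetical helper: words for which pluralize + singularize
# does not give back the original word.
def round_trip_failures(words)
  words.reject { |w| inflect(inflect(w, PLURALS), SINGULARS) == w }
end

words = %w(automobil krik hirurg trbuh jabuka selo proces)
p round_trip_failures(words)  # prints ["proces"]
```

Under these assumptions the first six words survive the round trip, while "proces" comes back as "proceh", because its plural "procesi" also matches the si -> h rule. Words like that could be handled with inflect.irregular.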

Here are some testing examples:

[senadu@localhost proj]$ ./script/console
Loading development environment.
>> "automobil".pluralize
=> "automobili"
>> "automobili".singularize
=> "automobil"
>> "jabuka".pluralize.singularize
=> "jabuka"
>> "dijete".pluralize
=> "djeca"
>> "novine".pluralize
=> "novine"

These rules don't cover all the possibilities in the Bosnian language, but they work fine for most of the words used in software engineering. Also, the rules can be used for Croatian and Serbian without change ...

