Sunday, December 12, 2010

Mongoid: autocreate_indexes

Indexes help make your database reads faster, but they have the downside of making your database writes slower. Either way, you will come across the need for indexes at some point.

Mongoid has a flag called autocreate_indexes. This flag tells Mongoid to create a model's indexes in MongoDB every time the class is loaded. It is set to false by default. However, when you add an index to a model, your application will complain since the index doesn't exist in MongoDB yet.

To get around this, you can set autocreate_indexes to true. You only want to do this for development and test, since creating indexes on every class load can be slow. For staging and production, you will need to log into MongoDB and create the indexes manually before pushing your code.
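
For illustration, here's roughly what the pieces look like with Mongoid 2.x (the User model and email field are made-up examples):
class User
  include Mongoid::Document

  field :email

  # Only built automatically when autocreate_indexes is true.
  index :email, :unique => true
end

# e.g. in config/initializers/mongoid.rb -- or set autocreate_indexes
# per environment in mongoid.yml instead.
Mongoid.configure do |config|
  config.autocreate_indexes = Rails.env.development? || Rails.env.test?
end

# For staging/production, build the declared indexes explicitly
# (from a console or a deploy task) before relying on them:
User.create_indexes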

Thursday, November 18, 2010

RSpec 2: #raise_error

I was writing a spec and attempted to use #raise_error. To my surprise, it wasn't working. Here's what I did and how I solved it.

Code to be tested:
class TestClass
  def run; raise "Error"; end
end

Spec:
it 'should raise' do
  TestClass.new.run.should raise_error
end

To my surprise, the error wasn't caught and the spec failed with the raised error. There was no syntax error, and the way I wrote it felt natural, but it wasn't behaving the way I thought it should. Upon further investigation, I realized that #raise_error must be used on a Proc or lambda rather than on the return value of the call.

The spec should have been:
it 'should raise' do
  Proc.new { TestClass.new.run }.should raise_error
  lambda { TestClass.new.run }.should raise_error
end

And if you really want it to read well:
it 'should raise' do
  expect { TestClass.new.run }.to raise_error
end
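
If you want to be stricter, raise_error can also match the expected error class and message (RuntimeError is simply what a bare raise produces):
it 'should raise a RuntimeError with the right message' do
  expect { TestClass.new.run }.to raise_error(RuntimeError, "Error")
end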

Thursday, November 11, 2010

Fixing Ruby on Ubuntu

Before using RVM, I was using the default Ruby installation on Ubuntu. Then I needed different versions of Ruby and started messing around with installing other Rubies. Apparently, I messed up my default Ruby install without knowing it. I then moved to RVM and never looked back.

Recently, I needed my system Ruby again. This is when I found out it was completely messed up. It kept giving me this error whenever I tried to use RubyGems:
'require': no such file to load -- thread (LoadError)

This was supposed to be impossible since thread ships with Ruby's standard library. I then tried other things like requiring pp and date. Both gave me the same error.

I attempted to reinstall Ruby.
sudo apt-get purge ruby ruby-dev ruby1.8 ruby1.8-dev rubygems rubygems1.8
sudo apt-get install ruby ruby-dev rubygems

Unfortunately, this didn't help at all. I tried it several times to no avail. I then tried installing RubyGems from source, but I couldn't even run setup.rb since it requires core libraries.

I decided to look through Synaptic Package Manager for anything that matched Ruby. That's when I realized I hadn't uninstalled all of Ruby. I had missed the Ruby libraries! Duh! I had messed up my Ruby libraries and wasn't reinstalling them.
sudo apt-get purge libruby libruby1.8
sudo apt-get install libruby

This fixed my require problems and all was right again.

Friday, November 5, 2010

Brother HL-2170W: Installation Gotchas + Reset to Factory Settings

Using Ubuntu 10.04, I decided to install this brand new printer and enable wireless printing. If you are using a USB cable, it's just plug-and-play. It just works. If you are planning to use wireless, read on!

Network Cable

You will need an extra network cable if your computer doesn't have wireless.

Wireless Authentication

The first gotcha is wireless authentication. This printer seems to behave properly only with AES encryption. TKIP will not work. Go ahead and change the settings on your router. I'll wait.

Wireless Setup

Get hold of a Windows or Mac machine. You NEED this in order to set up wireless. Ubuntu does not have an interface to this part of the printer. Unfortunately, you will also need the CD that comes with the printer. Plop the CD into your computer and run it. A wizard will show up. It's pretty straightforward until you get to the "Top Menu". Here, you need to click the following:
  1. Install Printer Driver (even though we just want to setup wireless)
  2. Wireless Network users (window pops up)
  3. Wireless Setup Only
  4. Step by Step install
From here, follow the rest of the wizard. It should detect your printer and allow you to configure settings such as IP address (if you decided to use static IP) and wireless authentication. Fill it all in. At the end, it will tell you to unplug the network cable from the printer and ask if you want to print the settings. Make sure you print the settings. If anything goes wrong, it tells you on the print-out. When I wasn't using AES encryption, I had "Failed To Associate" under "Wireless Link Status".

MAC Addresses

If you do anything with MAC addresses, the second gotcha is that the wired interface and the wireless interface have different MAC addresses. Be careful that you use the wireless one.

Ubuntu

Time to add the printer! Go to System > Administration > Printing > Add. Wait a little bit. The New Printer popup should automatically detect your printer. Here's the third gotcha. I saw two entries: one that said "LPD network printer via DNS-SD" and one that asked for Host (Probe) and Queue. You must select the DNS-SD one. The other one doesn't work! After selecting DNS-SD, it should look for drivers and then ask to print a test page. Make sure the test page prints.

Done!

Reset to Factory Settings

In case you screw up the settings and can't access the printer anymore, you can reset it to factory settings.
  1. Turn off the printer using the switch on the side.
  2. While holding down the Go button (big button that lights up blue in the corner), turn the switch on.
  3. Hold the Go button until the Toner, Drum, and Error lights turn on (this may be immediate).
  4. Let go of the Go button.
  5. Wait for the Toner, Drum, and Error lights to turn off (this may be immediate).
  6. Press the Go button 7 times. 
  7. The Toner, Drum, and Error lights should light up after the 7th time. This means the printer has been reset to factory settings.

Monday, October 25, 2010

Rails 3.0: MongoDB + Mongoid

Why Mongoid?

The de facto ORM for MongoDB in Rails is MongoMapper, so why choose Mongoid? You can read why straight from the horse's mouth here.

To summarize, Mongoid is built for Rails 3.0 and handles larger documents better. It also feels like NoSQL when you use it. MongoMapper was built in the Rails 2.x days, when MongoDB was young. It is modeled closely on ActiveRecord to make the transition easier, so it feels more like SQL. MongoMapper is more extensible, though, and has a larger community.

MongoDB

Grab the latest build from MongoDB's download page. At this time, it is 1.6.3.

Download and install it. We make a symlink so that, if we upgrade, we just switch the symlink and everything else stays the same.
cd /usr/local/src/
sudo wget http://fastdl.mongodb.org/linux/mongodb-linux-i686-1.6.3.tgz
sudo tar xzf mongodb-linux-i686-1.6.3.tgz
sudo rm -rf mongodb-linux-i686-1.6.3.tgz
sudo ln -s mongodb-linux-i686-1.6.3 mongodb

Add /usr/local/src/mongodb/bin to your path.
PATH=$PATH:/usr/local/src/mongodb/bin

Make /data/db and give ownership to your user. There are obviously several ways to do this, but this is the easiest.
sudo mkdir -p /data/db
sudo chown -R wesley:wesley /data

Start your engines!
mongod

Mongoid

This builds on top of my previous post on installing Rails 3.0 with BDD. This assumes that the Rails application was created without ActiveRecord using -O. Check out Rails 3.0 agnosticism for an explanation.

Add mongoid to Gemfile. bson_ext is installed for a speed boost.
gem 'mongoid', '2.0.0.rc.6'
gem 'bson_ext', '~>1.2'

Install mongoid.
bundle install
rails generate mongoid:config
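
With that in place, models simply include Mongoid::Document. A minimal sketch (the Post class and its fields are made up for illustration):
class Post
  include Mongoid::Document

  field :title
  field :body
  field :tags, :type => Array, :default => []
end

# Quick sanity check from the rails console:
# Post.create(:title => "Hello", :tags => ["mongodb"])
# Post.where(:title => "Hello").first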

Cucumber

Add a cucumber environment to mongoid.yml.
cucumber:
  <<: *defaults
  database: myproject_cucumber 

Cucumber makes use of Database Cleaner. There are posts saying Database Cleaner doesn't work with Mongoid; however, it does now, according to the official documentation. We could modify features/support/env.rb, but that is not recommended since it is regenerated on a cucumber-rails upgrade. Instead, we will create features/support/local_env.rb:
require 'database_cleaner'
DatabaseCleaner.strategy = :truncation
DatabaseCleaner.orm = "mongoid"
Before { DatabaseCleaner.clean }

RSpec 2

RSpec includes ActiveRecord-specific lines in spec/spec_helper.rb. Since we aren't using ActiveRecord, you need to comment out the following two lines.
# config.fixture_path = "#{::Rails.root}/spec/fixtures"
# config.use_transactional_fixtures = true

To properly clean the database, RSpec needs to know how to do that with Mongoid. Again, we can use Database Cleaner for this.

Open up spec/spec_helper.rb and add the following in the RSpec.configure block:
RSpec.configure do |config|    
  # Other things

  # Clean up the database      
  require 'database_cleaner'   
  config.before(:suite) do     
    DatabaseCleaner.strategy = :truncation
    DatabaseCleaner.orm = "mongoid" 
  end

  config.before(:each) do
    DatabaseCleaner.clean      
  end
end 

Updates

December 2, 2010: Changed Cucumber local_env.rb for database cleaning.
December 2, 2010: Added a section for RSpec database cleaning.
February 1, 2011: Updated Mongoid version in the Gemfile.
February 26, 2011: Added a note to clean out ActiveRecord specific lines in RSpec.

Tuesday, October 19, 2010

Rails 3.0: Installing Cucumber + RSpec 2 + Capybara + AutoTest

Behaviour Driven Development (BDD) was created in response to Test Driven Development (TDD). TDD brought the idea of testing to the forefront, but stopped short because it applied mainly to developers and failed to include other stakeholders. BDD is supposed to remedy this by specifying the behaviour of the application at a high level in English. This allows non-developers to spec out the application and be included in the conversation.

Cucumber

Cucumber is a BDD framework for Ruby. Specs will be written at this higher layer first to drive behaviour.

RSpec 2

RSpec 2 just came out of beta. It is a TDD framework for Ruby. Specs will be written at this second layer for testing the details of the implementation.
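
As a tiny illustration of a spec at this layer (the Calculator class is made up):
require 'spec_helper'

describe Calculator do
  describe '#add' do
    it 'returns the sum of its arguments' do
      Calculator.new.add(1, 2).should == 3
    end
  end
end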

Capybara

Capybara is a replacement for Webrat. It is used to simulate how a real world user would interact with your application. A good post about why you would use Capybara over Webrat can be found here.
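
To give a rough feel for the Capybara DSL, here is a hypothetical Cucumber step definition (the path, field, and button names are invented):
When /^I sign up as "([^"]*)"$/ do |email|
  visit '/signup'
  fill_in 'Email', :with => email
  click_button 'Sign up'
end

Then /^I should see a welcome message$/ do
  page.should have_content('Welcome')
end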

AutoTest

AutoTest runs your Cucumber and RSpec specs automatically whenever a file that affects the specs is modified.

Install

This builds on top of my previous post on installing Rails 3.0. This assumes that the Rails application was created without Test::Unit using -T. Check out Rails 3.0 agnosticism for an explanation.

Open Gemfile and add the following.
group :development, :test do
  gem 'capybara'
  gem 'database_cleaner'
  gem 'cucumber-rails'
  gem 'cucumber'
  gem 'rspec-rails'
  gem 'autotest'
  gem 'spork'
  gem 'launchy'
end

Install the gems. If you aren't using ActiveRecord, you won't have a database.yml. cucumber:install will complain unless you pass -D to it.
bundle install
rails generate rspec:install
rails generate cucumber:install --rspec --capybara

AutoTest checks your entire project for changes. When tests fail, test.log is written to, and because that counts as a change, AutoTest will kick off again, and again, and again. To stop this from happening, create a .autotest file at the root of your project to ignore certain paths.
Autotest.add_hook :initialize do |at|
  %w{ .git doc log tmp vendor }.each { |ex| at.add_exception( ex ) } 
end

AutoTest does not run Cucumber out of the box. There is debate over whether autotesting Cucumber is a good idea, since Cucumber features are very high-level and can be quite heavy, and autotests should run quickly. If you want autotesting of Cucumber, you must add AUTOFEATURE to your environment before running AutoTest, or set it in your .bashrc.
export AUTOFEATURE=true

Start continuous testing. Hit ctrl+c twice to stop.
autotest

Updates

Feb. 8, 2011: Added solution to AutoTest continuously running on failure.

Monday, October 18, 2010

Rails 3.0: Agnosticism

When Rails and Merb merged, agnosticism was one of the big features to be added to Rails. They have stayed true to this vision.

Take a look at the help when creating a new Rails application.
rails new --help

You will notice three options.
  • -O: Skip Active Record
  • -T: Skip Test::Unit
  • -J: Skip Prototype

This allows you to create a fresh Rails application and have it ready to integrate with other gems more easily.
rails new myproject -OTJ

Monday, October 11, 2010

Rails 3.0: Installing with RVM + Thin

To keep things clean, we will use RVM's gemsets to install Rails 3.0. This keeps your work environment tidy and allows you to switch between projects without worrying about which Ruby version or gems are installed.

Setup

I rarely look at the local documentation of gems, so let's not download it.
Create ~/.gemrc and add the following to it.
gem: --no-ri --no-rdoc

Install Ruby 1.9.2 and create a gemset for your project.
rvm install 1.9.2
rvm gemset create myproject 

Rails 3.0

When using a gemset, all installed gems are bundled under that gemset. Therefore, you can have a gemset per project without worrying about cross-contamination!

Use your gemset to install Rails 3.0. We will also need to install Bundler.
rvm use 1.9.2@myproject
gem install bundler
gem install rails
rails new myproject
cd myproject

Create .rvmrc in the root of your project and add the following to it.
rvm use 1.9.2@myproject
This tells RVM to switch to the project's Ruby version and gemset whenever you enter the project.

Give it permission.
cd ..
cd myproject
This will prompt you to trust the .rvmrc. Type 'y' and hit enter. It will now automatically switch to 1.9.2@myproject whenever you enter the project.

Install the required Rails 3.0 gems. If you don't want sqlite, go into Gemfile and remove sqlite3-ruby and use your favourite database. I will be using MongoDB in a later post.
bundle install

Thin

WEBrick is slow. Let's install Thin.

Open up Gemfile and add thin.
gem 'thin'

Install thin.
bundle install

Start

Drum roll please!
rails server thin

Now visit http://localhost:3000/

Saturday, October 9, 2010

Why MongoDB?

MongoDB is a NoSQL implementation that I've decided to use for my project. One of the major deciding factors is that I deal with MongoDB at work and have experience with it. Unfortunately, this reason alone will not help you decide whether to use MongoDB, so I've outlined some other points below. Feel free to add more in the comments!

What is MongoDB?

MongoDB is a document-based database system. It stores everything in BSON, a binary format of JSON. A database holds a bunch of collections (tables). Each collection holds a bunch of documents (records/rows). Each document can be thought of as a large hash object: there are keys (columns) with values, and the values can be anything representable in JSON, such as hashes, arrays, numbers, serialized objects, etc. MongoDB has been implemented with ease of use and speed as its main goals. Every design decision is made with this in mind, which leads to prioritizing certain areas over others.
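
For a rough feel, here is a made-up document being inserted with the Ruby driver of that era (treat the exact calls as approximate):
require 'mongo'

# A made-up example document: values can be nested hashes, arrays, dates, etc.
user = {
  "name"    => "Ada",
  "email"   => "ada@example.com",
  "roles"   => ["admin", "author"],
  "profile" => { "bio" => "Loves databases", "joined" => Time.now.utc }
}

db = Mongo::Connection.new("localhost").db("mydb")
db["users"].insert(user)                          # no schema to declare first
puts db["users"].find("roles" => "admin").count   # query into the array field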

MongoDB vs RDBMS

This is similar to my previous post about NoSQL, but more specifically applied to MongoDB.

Advantages

Schema-less

Documents in a collection don't have to have the same format. This allows more flexible migrations, such as "lazy" migrations: certain migrations don't have to happen en masse but can occur individually when a document is read or written, which means less downtime.
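
One way to picture a lazy migration in a Mongoid model (a hypothetical sketch; the fields are invented): old documents are upgraded only when they are next touched.
class User
  include Mongoid::Document

  field :fullname    # legacy field, still present on old documents
  field :first_name
  field :last_name

  # Call this wherever the name is needed. Old documents get split and
  # saved the first time they are read; new documents pass straight through.
  def migrated_first_name
    if first_name.nil? && fullname.present?
      first, last = fullname.split(' ', 2)
      update_attributes(:first_name => first, :last_name => last)
    end
    first_name
  end
end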

Scalable

Sharding is one of the goals MongoDB is concentrating on. Data is dispersed over two or more servers, relieving the load on any single server and increasing speed. Downtime is decreased because if one shard goes down, data on the other shards is still accessible. Being able to add shards lends itself to easier horizontal scaling. MongoDB sharding has been in active development for a while and was unleashed as of version 1.6. Rough spots still exist, but MongoDB is looking to patch those up in the near future.

Failover

Database servers are often set up in a master-slave format. This is not always easy to do. It's even better when the master fails and a slave is automatically promoted to master. This is even harder to do. MongoDB does this seamlessly with replica sets. The servers in a set elect one server to be the master while the others replicate. If the master goes down, the others detect this and elect another server to be the master. New servers can be added without disturbing the setup. The application never has to know that the master has changed. No downtime. Elegant!

Speed

Speed is always a religious-like debate with a million benchmarks showing a million winners. MongoDB has documented a slew of benchmarks. What I take from this is that MongoDB is fast enough. It may or may not be the fastest, but it's definitely blazing. Coupled with the other advantages, I'd say this is a bonus.

GridFS

Ever needed to store large files? You've probably used the file system, Amazon S3, blobs, etc. They may or may not have been easy to integrate, but they were another thing you had to deal with. Not with MongoDB. It implements a file storage specification called GridFS, which allows you to store large objects in the database as if they were normal documents. Not only is it one less thing to learn, it also makes it easier to move your data since everything is in the database.
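
Very roughly, using GridFS with the 1.x Ruby driver looks something like this (an approximate sketch, not exact API documentation):
require 'mongo'

db   = Mongo::Connection.new("localhost").db("mydb")
grid = Mongo::Grid.new(db)

# Store a large file; GridFS chunks it into ordinary documents behind the scenes.
id = grid.put(File.open("backup.tar.gz"), :filename => "backup.tar.gz")

# Read it back later by id.
data = grid.get(id).read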

Disadvantages

Single server durability

MongoDB does not support this. Yet. Durability is the guarantee that anything committed to the database is actually written and resides there permanently. Single server durability is the idea that a single server alone will maintain durability. However, MongoDB has a different stance on this: they believe that single server durability is not the goal, durability itself is, and that it should be attained through the use of multiple servers and replication. This is a goal MongoDB is actively working towards, and one I believe they will achieve.

ACID

MongoDB does not support ACID. This prevents MongoDB from being used in certain situations, but the trade-off is worth it for applications that don't require ACID. The gain is in speed. And it's noticeable.

Transactions

Transactions can be viewed as part of ACID, but I thought I'd make this explicit. If you require transactions, MongoDB is out of the question unless you roll your own. Again, for many applications out there, transactions are unnecessary.

Relational

MongoDB does not handle highly relational data as well as an RDBMS does. I think this one is obvious, but many forget to take it into consideration when jumping feet first into MongoDB. You've been warned.

MongoDB vs Other NoSQL Implementations

Unfortunately, I haven't worked much with other implementations of NoSQL. However, most popular NoSQL implementations are in production use by notable companies. MongoDB is a little easier to wrap your mind around since it retains quite a lot of similarities with an RDBMS while still giving you that extra kick. One other thing to consider is that commercial support is available for MongoDB, which is not currently the case for all NoSQL implementations. If you have experience with this, I'd like to hear about it in the comments below.

Conclusion

I'm choosing MongoDB because I've worked with it and have really enjoyed the experience over RDBMS. However, if you're at a crossroads, I'd recommend MongoDB in your next project unless you have a highly relational database or you need ACID.

Saturday, October 2, 2010

Why NoSQL?

Recently, there has been a movement towards NoSQL and a disdain for traditional RDBMS when it comes to web development. There have been many discussions pitting the two against each other, almost reaching a religious fervor. With the movement, several new databases have become available.

Why NoSQL?

Haven't RDBMSs served us well for decades? They're mature, well thought out, and optimized for their task. SQL is a powerful querying language that allows you to do almost anything. So why the change?

PITA

It's human nature at its best. We're given a tool for a task. We use it to its fullest and try to increase the efficiency of the tool. However, there will always be frustrations and clumsiness inherent with the tool. No tool is ever perfect for the job at hand. Working with a tool for many years, the frustrations accumulate until, one day, we can't take it anymore. It's a PITA. We invent something new, even if it isn't as good yet. It at least solves the frustrations of the previous tool, which at this point, is all that matters. Why do I say this is human nature at its best? Because this is how we improve. This is how we take it to the next level. Human beings are never satisfied with the status quo for long.

Scaling

So what's so frustrating with RDBMS? Scaling. It's that simple. RDBMS has problems with scale. When an RDBMS gets large, think about what happens with the following:
  • migrations
  • expansion
  • backups
  • failover
  • locking

Migrations

What happens when you need to migrate data in a table that has millions of rows? Hours of downtime. Downtime for your site means frustrations for your customers. That's not a good thing.

Expansion

How easy is it to tack on another database server? Well, you're going to have to worry about sharding first in the application itself before you can even try. Don't have that code from the beginning? Well, start coding now.

Backups

Don't have sharding? Your backups are going to be huge. This isn't as much of a PITA these days with such large disk space, but no matter what, it's still easier to handle smaller pieces of data.

Failover

The current RDBMS implementations don't make it easy to set up failover. It's usually tacked on afterwards.

Locking

With ACID and transactions, locking is inevitable in RDBMS. With a large scale, there are bound to be long waits. Unfortunately, the larger the database, the more waiting. There are ways around this, but it's a PITA.

Variety

It's not to say that RDBMS isn't good. It's very good at what it does. But for certain web applications, the tool just doesn't fit. The PITA factor goes through the roof. Does this mean we should default to NoSQL every time? No. The point is variety. It's not about RDBMS vs NoSQL vs something else. It's about which tool fits best for the job. We should be glad that there's a choice. We can even decide to use an RDBMS alongside NoSQL if that works better. Variety is the spice of life.

Where does it fit?

If we can't throw out RDBMS, then when should we use NoSQL? It seems the jury is still out on this, as not all NoSQL implementations are built the same. No implementation solves all the frustrations of RDBMS. Plus, NoSQL hasn't been in production environments long enough to tell what the side effects are. You will need to do research on this front and see which implementation of NoSQL, if any, fits your problem. Here's a quick guideline though:
  • not many relations (low number of joins)
  • scaling required
  • read-intensive
  • ACID and transactions are not important
  • you just want to geek-out and live on the edge :D

Wednesday, September 15, 2010

Why RVM?

Ubuntu 10.04 comes with Ruby 1.9.1-p378. That's it for 1.9.1. Want another patch level or 1.9.2? Compile it yourself. This is ugly.

Enter RVM.

RVM allows you to install a plethora of Ruby versions and switch between them in a snap. This is a no-brainer for Ruby developers as we move forward. Along with Ruby versions, RVM provides the ability to create gemsets, which let you change the set of gems depending on the project.

Installing Ruby 1.9.2:
rvm install 1.9.2

Using Ruby 1.9.2:
rvm use 1.9.2

It's that easy. Here's a good installation guide. Look out for instructions after installing RVM. Make sure you do everything it tells you to, especially removing the return from your .bashrc.

Monday, September 13, 2010

Init

I'm starting a brand new project and rediscovering Rails along the way. There have been many changes since I last touched Rails (2.1), so I'll attempt to document my path as I relearn everything.

My goal is to learn as much as I can about all the new technologies and ideas that have exploded onto the scene recently including

Currently, I'm thinking my stack will be

Things I might consider