Showing posts with label mongodb. Show all posts
Showing posts with label mongodb. Show all posts

Monday, October 25, 2010

Rails 3.0: MongoDB + Mongoid

Why Mongoid?

The de facto ORM for MongoDB in Rails is Mongo Mapper, so why choose Mongoid? You can ready why straight from the horses mouth here.

To summarize, Mongoid is built for Rails 3.0 and it handles larger documents better. It also feels like NoSQL when you use it. MongoMapper was built during Rails 2.x days and when MongoDB was young. It is modeled very closely to ActiveRecord to make the transition easier. Hence, it feels more like SQL. MongoMapper is more extensible though with a larger community.

MongoDB

Grab the latest build from MongoDB's download page. At this time, it is 1.6.3.

Download and install. We make a softlink. This way, if we upgrade, we just switch the softlink and everything else stays the same.
cd /usr/local/src/
sudo wget http://fastdl.mongodb.org/linux/mongodb-linux-i686-1.6.3.tgz
sudo tar xzf mongodb-linux-i686-1.6.3.tgz
sudo rm -rf mongodb-linux-i686-1.6.3.tgz
sudo ln -s mongodb-linux-i686-1.6.3 mongodb

Add /usr/local/src/mongodb/bin to your path.
PATH=$PATH:/usr/local/src/mongodb/bin

Make /data/db and give ownership to your user. There are obviously several ways to do this, but this is the easiest.
sudo mkdir -p /data/db
sudo chown -R wesley:wesley /data

Start your engines!
mongod

Mongoid

This builds on top of my previous post on installing Rails 3.0 with BDD. This assumes that the Rails application was created without ActiveRecord using -O. Check out Rails 3.0 agnosticism for an explanation.

Add mongoid to Gemfile. bson_ext is installed for a speed boost.
gem 'mongoid', '2.0.0.rc.6'
gem 'bson_ext', '~>1.2'

Install mongoid.
bundle install
rails generate mongoid:config

Cucumber

Add a cucumber environment to mongoid.yml.
cucumber:
  <<: *defaults
  database: myproject_cucumber 

Cucumber makes use of Database Cleaner. There are posts saying Database Cleaner doesn't work with Mongoid. However, it seems like it is now according to the official documentation. We need to modify features/support/env.rb this is not recommended as it is regenerated on a cucumber-rails upgrade. Instead, we will create features/support/local_env.rb.
require 'database_cleaner'
DatabaseCleaner.strategy = :truncation
DatabaseCleaner.orm = "mongoid"
Before { DatabaseCleaner.clean }

RSpec 2

RSpec includes ActiveRecord specific lines in spec/spec_helper.rb. You need to comment out the following 2 lines.
# config.fixture_path = "#{::Rails.root}/spec/fixtures"
# config.use_transactional_fixtures = true

To properly clean the database, RSpec needs to know how to do that with Mongoid. Again, we can use Database Cleaner for this.

Open up spec/spec_helper.rb and add the following in the RSpec.configure block
RSpec.configure do |config|    
  # Other things

  # Clean up the database      
  require 'database_cleaner'   
  config.before(:suite) do     
    DatabaseCleaner.strategy = :truncation
    DatabaseCleaner.orm = "mongoid" 
  end

  config.before(:each) do
    DatabaseCleaner.clean      
  end
end 

References


Updates

December 2, 2010: Changed Cucumber local_env.rb for database cleaning.
December 2, 2010: Added a section for RSpec database cleaning.
February 1, 2011: Updated Mongoid version in the Gemfile
February 26, 2011: Added a note to clean out ActiveRecord specific lines in RSpec.

Saturday, October 9, 2010

Why MongoDB?

MongoDB is a NoSQL implementation that I've decided to use for my project. One of the major deciding factors is that I deal with MongoDB at work and have experience with it. Unfortunately, this reason alone will not help you decide whether to use MongoDB, so I've outlined some other points below. Feel free to add more in the comments!

What is MongoDB?

MongoDB is a document-based database system. It stores everything in BSON, which is the binary format of JSON. A database holds a bunch of collections (tables). Each collection holds a bunch of documents (records/rows). Each document can be thought of as a large hash object. There are keys (columns) with values and the values can be anything represented in JSON, such as hashes, arrays, numbers, serialized objects, etc. MongoDB has been implemented with ease and speed as its main goals. Every design decision is made with this in mind, which leads to priorities in certain areas over others.

MongoDB vs RDBMS

This is similar to my previous post about NoSQL, but more specifically applied to MongoDB.

Advantages

Schema-less

Documents in a collection don't have to have the same format. This allows more flexible migrations, such as "lazy-loaded" migrations. Basically, there are certain migrations that don't have to happen en masse. They can occur individually when the document is read or written to. This allows for less downtime.

Scalable

Sharding is one of the goals MongoDB is concentrating on. Data is dispersed over two or more servers relieving the load on any single server which increases speed. Downtime is decreased because if a shard goes down, data on the other shards are still accessible. Being able to add shards lends to easier horizontal scaling. MongoDB sharding has been in active development for a while and was unleashed as of version 1.6. Rough spots still exist but MongoDB is looking to patch those up in the coming future.

Failover

Database servers are often setup in a master-slave format. This is not always easy to do. It's even better when the master fails and the slave is automatically upgraded to be the master. This is even harder to do. MongoDB does this seamlessly with replica sets. The servers in a set elect one server to be the master while the others replicate. If the master goes down, the others detect this and elect another server to be the master. New servers can be added without disturbing the setup. The application never has to know if the master has changed. No downtime. Elegant!

Speed

Speed is always a religious-like debate with a million benchmarks showing a million winners. MongoDB has documented a slew of benchmarks. What I take from this is that MongoDB is fast enough. It may or may not be the fastest, but it's definitely blazing. Coupled with the other advantages, I'd say this is a bonus.

GridFS

Ever needed to store large files? You've probably used the file system, Amazon S3, blobs etc. They may or may not have been easy to integrate, but it was another thing you had to deal with. Not with MongoDB. It implements a file storage specification called GridFS. It allows you to store large objects into the database as if it was a normal document. Not only is it one less thing to learn, it makes it easier to move your data since everything is in the database.

Disadvantages

Single server durability

MongoDB does not support this. Yet. Durability is the concept that anything committed to the database is actually committed to the database and resides in there permanently. Single server durability is the idea that a single server alone will maintain durability. However, MongoDB has a different stance on this. They believe that single server durability is not the goal but durability itself is and that durability should be attained through the use of multiple servers and replication. This is a goal MongoDB is actively working towards and which I believe they will achieve.

ACID

MongoDB does not support ACID. This prevents MongoDB being used in certain situations, but the trade-off is worth it for applications that don't require ACID. That gain is in speed. And it's noticeable.

Transactions

Transactions can be viewed as part of ACID, but I thought I'd make this explicit. If you require transactions, MongoDB is out of the question unless you roll your own. Again, for many applications out there, transactions are unnecessary.

Relational

MongoDB does not handle highly relational databases as well as RDBMS. I think this one is obvious, but many forget to take this into consideration when hopping feet first into MongoDB. You've been warned.

MongoDB vs Other NoSQL Implementations

Unfortunately, I haven't worked too much with other implementations of NoSQL. However, most popular NoSQL implementations are in use in production by notable companies. MongoDB is a little easier to wrap your mind around since it still retains quite a lot of similarities with RDBMS, but still gives you that extra kick. One other thing to consider is that MongoDB provides commercial support. This is not currently available with all NoSQL implementations. If you have experience with this, I'd like to hear about it in the comments below.

Conclusion

I'm choosing MongoDB because I've worked with it and have really enjoyed the experience over RDBMS. However, if you're at a crossroads, I'd recommend MongoDB in your next project unless you have a highly relational database or you need ACID.