OmniTI ~ RubyRep - Yet Another Tool For PostgreSQL Replication

RubyRep - Yet Another Tool For PostgreSQL Replication

By: Denish Patel 19 Aug '09

Level:

This article reviews a fundamental concept or principle
This article reviews an intermediate concept or principle
This article reviews an advanced concept or principle
This article expresses an opinion or just a downright rant

One of the key features any enterprise considers when choosing a database technology for their architecture solution stack is that of replication. Oracle and MySQL both have built in replication solutions, but as of yet PostgreSQL doesn't support a built in replication solution. There are many replication solutions available however, and different companies are using different solutions customized for their needs.

Among all of the solutions, Slony is probably the most widely tested and deployed within organizations, although it does have the following limitations:

Replicated tables must have a unique or primary key
It does not support replication of large objects
Schema changes are not propagated (though they can be coordinated)
It does not support synchronizing databases outside of replication
There are limitations on version compatability; you can not replicate from PostgreSQL 8.2 to PostgreSQL 8.4 for example
It is more difficult to set up than many other replication solutions

One new alternative to Slony is a project known as RubyRep, which is designed to avoid some of the limitations of Slony. RubyRep provides both master-slave and master-master replication, and it works for PostgreSQL as well as MySQL. It is currently developed by Arndt Lehmann, a German who has been living since 2001 in Tokyo, Japan. He also provides great support to the RubyRep mailing list, especially for adding new features or fixing bugs.

RubyRep always operates on two databases. To make it simple to understand, the databases are referred to as "left" and "right" database respectively.

RubyRep's key features includes:

Simple configuration, complete setup can be done via single configuration file.
Simple Installation, if you have a JVM installed, then you just have to download and extract the files.
Platform Independent, it runs on Unix and Windows platform.
Table Design Independent, meaning that all commands work on tables no matter if they have a simple primary key (all data types acceptable), a combined primary key, or no primary key at all. It successfully processes multi-byte texts and "big" data types

In addition to the above, RubyRep actually provides three tools in one; a Compare, Sync, and Replication tools.

Compare

This tool scans corresponding tables of left and right database, looking for diverging data. Key features of the comparison tool are:

Different output modes, from a count of differences to full row dumps.
Low bandwidth mode available, reducing the number of round-trips so only actual differences go through the network.
A progress bar with estimated remaining amount of work.
Server load is targeted toward only the "right" database server.

In one test we ran, we compared two 50 million row tables in around 3 hours, without affecting production server load. This is accomplished by comparing rows in batches, and you can adjust the batch size in the configuration file.

Sync

The sync tool is used to synchronize data in corresponding tables of a left and right pair of databases. Key features of the sync tool are:

All features of the Compare tool also apply to syncs
Automatically orders table syncs to avoid foreign key conflicts.
You can configure the Sync policy to ignore deletes in left database, or to ignore creating records in right database, and other such combinations
Provides two prebuilt conflict resolution methods, either left db wins or right db wins
Custom conflict resolution methods specifiable via ruby code snippets
Merge decisions can optionally be logged in the rubyrep event log table.

Replicate

Of course RubyRep also provides a replication tool. Some of the key features of the replication tool include:

Automatically sets up all necessary triggers, log tables, etc.
Automatically discovers newly added tables and synchronizes the table content
Automatically reconfigures sequences to avoid duplicate key conflicts
Tracks changes to primary key columns
Can implement either master-slave or master-master replication
Prebuilt conflict resolution methods available include left or right wins, or earlier, later change wins
Custom conflict resolution specifiable via ruby code snippets
Replication decisions can optionally be logged in the rubyrep event log table

One of the problems common to replication solutions is that of setting up new nodes. With Slony, there are always some headaches caused by high load on master database server, as a result of the TRUNCATE/COPY cycle Slony goes through. In the case of RubyRep, most of the CPU load is on the slave server, and you can use the Sync command in advance before you start replicating database. RubyRep also provides some flexibility to ignore the Sync commands if you don't want to sync the database again.

RubyRep in action...

Help

 
; ./rubyrep --help

Usage: ./bin/rubyrep [general options] command [parameters, ...]

Asynchronous master-master replication of relational databases.

Available options:
        --verbose                    Show errors with full stack trace
    -v, --version                    Show version information.
        --help                       Show this message

Available commands:
  generate        Generates a configuration file template
  help            Shows detailed help for the specified command
  proxy           Proxies connections from rubyrep commands to the database
  replicate       Starts a replication process
  scan            Scans for differing records between databases
  sync            Syncs records between databases
  uninstall       Removes all rubyrep tables, triggers, etc. from "left" and "right" database

Generate configuration file

 
; ./rubyrep generate pagila.conf

Compare/Sync Example:

 
; cat pagila.conf
  RR::Initializer::run do |config|
  config.left = {
    :adapter  => 'postgresql', # or 'mysql'
    :database => 'pagila,
    :username => 'rubyrep',
    :password => 'rubyrep',
    :host     => '192.168.0.1',
    :port =>'5432'
  }

  config.right = {
    :adapter  => 'postgresql',
    :database => 'pagila',
    :username => 'rubyrep',
    :password => 'rubyrep',
    :host     => '127,0.0.1',
    :port     => '5483'
  }

  config.include_tables 'users'
  # config.include_tables /^e/ # regexp matching all tables starting with e
  # config.include_tables /./ # regexp matching all tables in the database
end

; ./rubyrep scan -d=keys -b -c pagila.conf  > users_diff.log

; cat users_diff.log
users             users .........................  1
--- 
:conflict: 
- lastname: patel
  zipcode: "2096"
  ipaddress: 192.168.0.126
 userid: 48212620
 address: columbia

; ./rubyrep sync -c pagila.conf

Replication example:

By default, RubyRep runs in master-master replication mode, but you can adjust the following configuration setting to make it master-slave replication:


; cat pagila_replicate.conf
 RR::Initializer::run do |config|
  config.left = {
    :adapter  => 'postgresql', 
    :database => 'pagila,
    :username => 'rubyrep',
    :password => 'rubyrep',
    :host     => '192.168.0.1',
    :port =>'5432',
    :schema_search_path => 'public,pagila' 
  }

  config.right = {
    :adapter  => 'postgresql',
    :database => 'pagila',
    :username => 'rubyrep',
    :password => 'rubyrep',
    :host     => '127,0.0.1',
    :port     => '5483',
    :schema_search_path => 'public,pagila' 
  }

   config.include_tables /./ # regexp matching all tables in the database
   config.options[:auto_key_limit] = 60
   config.options[:adjust_sequences] = false 
   config.options[:sequence_increment] = 1 
   #Sync Policy: Changes in the right database will not be applied to the left database. 
   config.options[:right_record_handling] = :ignore 
   config.options[:sync_conflict_handling] = :left_wins 
  # Additional logging 
  config.options[:logged_replication_events] = [
                                         :ignored_changes,
                                         :ignored_conflicts
                                                ]
#ignore history tables
 config.exclude_tables /_history/
 config.exclude_tables 'test1'
 config.exclude_tables 'pagila'
end

Detailed information for the each configuration setting can be found in the RubyRep documentation. There are also sample configuration files and a tutorial available too for getting familiar with each of the features RubyRep offers. Based on our initial testing, it should also be possible to upgrade some older PostgreSQL databases from 8.2 to 8.4; We found a problem with tsvector support, but that will not affect everyone, and hopefully it can be fixed soon.

In This Issue…

Marketing Malware

23 Dec '09 from Mark Hammonds
Level:
- This article reviews a fundamental concept or principle
- This article reviews an intermediate concept or principle
- This article reviews an advanced concept or principle
- This article expresses an opinion or just a downright rant
Internet registrar GoDaddy.com is notorious for two things: domain names and risque super bow…
Business Metrics Too

16 Dec '09 from Jason Dixon
Level:
- This article reviews a fundamental concept or principle
- This article reviews an intermediate concept or principle
- This article reviews an advanced concept or principle
- This article expresses an opinion or just a downright rant
When I began tinkering around with web services as a hobby, it was common to fiddle with an ap…
Transcending the Medium

15 Oct '09 from Mark Hammonds
Level:
- This article reviews a fundamental concept or principle
- This article reviews an intermediate concept or principle
- This article reviews an advanced concept or principle
- This article expresses an opinion or just a downright rant
I grew up in Boyd County, Kentucky, where John Deere tractors are practically an indigenous sp…
When Commodity Makes Sense

29 Sep '09 from Eric Sproul
Level:
- This article reviews a fundamental concept or principle
- This article reviews an intermediate concept or principle
- This article reviews an advanced concept or principle
- This article expresses an opinion or just a downright rant
We’d all like to spend as little money as possible to get the performance we desire from…
Stacking the Deck for Publishers

22 Sep '09 from Jon Tan
Level:
- This article reviews a fundamental concept or principle
- This article reviews an intermediate concept or principle
- This article reviews an advanced concept or principle
- This article expresses an opinion or just a downright rant
Newspapers and magazines have a unique opportunity with online publishing. They have the bes…
Concepts of Cloud(ish) Storage

22 Sep '09 from Theo Schlossnagle
Level:
- This article reviews a fundamental concept or principle
- This article reviews an intermediate concept or principle
- This article reviews an advanced concept or principle
- This article expresses an opinion or just a downright rant
It’s rare that I write an article simply to educate. Most of the time I am attempting t…
What is Web Operations?

22 Sep '09 from Theo Schlossnagle
Level:
- This article reviews a fundamental concept or principle
- This article reviews an intermediate concept or principle
- This article reviews an advanced concept or principle
- This article expresses an opinion or just a downright rant
The field of web operations is one with which I am intimately familiar. For the last twelve y…
ORMs Done Right

22 Sep '09 from Clinton Wolfe
Level:
- This article reviews a fundamental concept or principle
- This article reviews an intermediate concept or principle
- This article reviews an advanced concept or principle
- This article expresses an opinion or just a downright rant
Object-Relational Mapper (ORM) systems are one of the most contentious topics in database appl…
Virtualization, ZFS and Zetaback

22 Sep '09 from Mark Harrison
Level:
- This article reviews a fundamental concept or principle
- This article reviews an intermediate concept or principle
- This article reviews an advanced concept or principle
- This article expresses an opinion or just a downright rant
It used to be the case that when you wanted to deploy a new application you would need to buy …
Dissecting Today's Internet Traffic Spikes

22 Sep '09 from Theo Schlossnagle
Level:
- This article reviews a fundamental concept or principle
- This article reviews an intermediate concept or principle
- This article reviews an advanced concept or principle
- This article expresses an opinion or just a downright rant
Today's Internet has changed quite a bit from the Internet I used to know. The Internet has a…
YSlow! to YFast! in 45 minutes.

22 Sep '09 from Theo Schlossnagle
Level:
- This article reviews a fundamental concept or principle
- This article reviews an intermediate concept or principle
- This article reviews an advanced concept or principle
- This article expresses an opinion or just a downright rant
The web is a complex beast. There are many moving parts involved in delivering a complete web…
RubyRep - Yet Another Tool For PostgreSQL Replication

19 Aug '09 from Denish Patel
Level:
- This article reviews a fundamental concept or principle
- This article reviews an intermediate concept or principle
- This article reviews an advanced concept or principle
- This article expresses an opinion or just a downright rant
One of the key features any enterprise considers when choosing a database technology for their…
Under the Hood

8 Jun '09 from Sherry Schlossnagle
Level:
- This article reviews a fundamental concept or principle
- This article reviews an intermediate concept or principle
- This article reviews an advanced concept or principle
- This article expresses an opinion or just a downright rant
My perspective on the evolution of OmniTI is somewhat like that of a mechanic on a team of rac…
Increasing the Aperture on Security

13 Feb '09 from Jason Dixon
Level:
- This article reviews a fundamental concept or principle
- This article reviews an intermediate concept or principle
- This article reviews an advanced concept or principle
- This article expresses an opinion or just a downright rant
Security is good. Security is necessary. Security is someone else's concern. Security is for o…
Embracing Failure to Rise Above Enterprise-Class Thinking

13 Feb '09 from Eric Sproul
Level:
- This article reviews a fundamental concept or principle
- This article reviews an intermediate concept or principle
- This article reviews an advanced concept or principle
- This article expresses an opinion or just a downright rant
Failures in technical systems are inevitable. Drives die, network interfaces wink out, backho…
The Irony of Sun Database Technology

13 Feb '09 from Robert Treat
Level:
- This article reviews a fundamental concept or principle
- This article reviews an intermediate concept or principle
- This article reviews an advanced concept or principle
- This article expresses an opinion or just a downright rant
It's been just over a year since Sun announced it had agreed to purchase MySQL, the ever popul…
Custom Trending and the Benefits of Source Code Availability

13 Feb '09 from Mark Harrison
Level:
- This article reviews a fundamental concept or principle
- This article reviews an intermediate concept or principle
- This article reviews an advanced concept or principle
- This article expresses an opinion or just a downright rant
One of the self evident truths about system administration is that you need to know what is go…
Using Less is Green

13 Feb '09 from Theo Schlossnagle
Level:
- This article reviews a fundamental concept or principle
- This article reviews an intermediate concept or principle
- This article reviews an advanced concept or principle
- This article expresses an opinion or just a downright rant
Every time I hear about green computing I feel like there is a gap — an enor…
Oracle Flashback Versions Query

27 Jan '09 from Denish Patel
Level:
- This article reviews a fundamental concept or principle
- This article reviews an intermediate concept or principle
- This article reviews an advanced concept or principle
- This article expresses an opinion or just a downright rant
In database system, some times flashback for the older versions of the data or query is import…

Seeds 2009

Year:

RubyRep - Yet Another Tool For PostgreSQL Replication

Sync

Replicate

RubyRep in action...

In This Issue…

Marketing Malware

Business Metrics Too

Transcending the Medium

When Commodity Makes Sense

Stacking the Deck for Publishers

Concepts of Cloud(ish) Storage

What is Web Operations?

ORMs Done Right

Virtualization, ZFS and Zetaback

Dissecting Today's Internet Traffic Spikes

YSlow! to YFast! in 45 minutes.

RubyRep - Yet Another Tool For PostgreSQL Replication

Under the Hood

Increasing the Aperture on Security

Embracing Failure to Rise Above Enterprise-Class Thinking

The Irony of Sun Database Technology

Custom Trending and the Benefits of Source Code Availability

Using Less is Green

Oracle Flashback Versions Query

What We Do

Footer

Year:

RubyRep - Yet Another Tool For PostgreSQL Replication

Sync

Replicate

RubyRep in action...

Subscribe to Our Newsletter

Stay in Touch!

In This Issue…

What We Do

Footer