"realtime" fs mirror application (backup, Python and Linux inotify)

Discussion in 'Linux Networking' started by Roc Zhou, Oct 24, 2007.

  1. Roc Zhou

    Roc Zhou Guest

    Hello:

    Recently I started an open source project, "cutils", on SourceForge:
    http://sourceforge.net/projects/crablfs/

    The documentation can be found at:
    http://crablfs.sourceforge.net/#ru_data_man

    This project's mirrord/fs_mirror tool is a near-realtime file system
    mirroring application across 2 or more hosts. It is something like
    MySQL's replication, but for the file system, and especially for
    large numbers of small files, such as the PHP scripts and images of
    a website or of several (virtual) websites.

    There are several ways to use this tool. The simplest is to mirror a
    host's file system to another host for backup, and use the rotate
    function (planned for a future version) or rotate scripts to get
    daily or hourly snapshots with hard links.
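
As a rough illustration of the hard-link snapshot idea, here is a minimal Python sketch. It is not taken from cutils; the function names, the `snapshot.N` directory naming, and the `keep` parameter are all assumptions made for the example. It also assumes the snapshot directory lives on the same file system as the mirror, since hard links cannot cross file systems.

```python
import os
import shutil


def snapshot_with_hard_links(mirror_dir, snapshot_dir):
    """Create snapshot_dir as a tree of hard links to the files in mirror_dir."""
    # copy_function=os.link creates a hard link per file instead of copying data.
    shutil.copytree(mirror_dir, snapshot_dir, copy_function=os.link)


def rotate(snapshot_root, mirror_dir, keep=7):
    """Hypothetical rotation: shift snapshot.N to snapshot.N+1, then take snapshot.0."""
    oldest = os.path.join(snapshot_root, "snapshot.%d" % (keep - 1))
    if os.path.isdir(oldest):
        shutil.rmtree(oldest)  # drop the oldest snapshot
    for n in range(keep - 2, -1, -1):
        src = os.path.join(snapshot_root, "snapshot.%d" % n)
        if os.path.isdir(src):
            os.rename(src, os.path.join(snapshot_root, "snapshot.%d" % (n + 1)))
    snapshot_with_hard_links(mirror_dir, os.path.join(snapshot_root, "snapshot.0"))
```

Note that a hard-linked snapshot shares file data with the mirror, so it only stays frozen if the mirroring side replaces changed files (write a new file, then rename) rather than rewriting them in place.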

    Or, going further, you can use it as in the following diagram
    (which should be displayed with a monospaced font):

    +----------+
    |  worker  | -[mirrord] -----------\
    +----------+                       |
       ......                          |
                                       |
    +----------+                       |
    |  worker  | -[mirrord] -----------\
    +----------+                       |
                                       V
                                  [fs_mirror]
                                       |
    +----------+                  +----------+
    |  worker  | -[mirrord] --->  |  backup  |
    +----------+                  +----------+
         |                             |
    [take_over]                        |
         |                             |
         V                             |
    +----------+                       |
    |  rescue  | <------------------- NFS
    +----------+

    This is a many-to-one backup, which is cost efficient. If one of the
    worker hosts fails, you can substitute the rescue host for the failed
    worker, with the aid of any high-availability method, such as the
    heartbeat project. In this way, you can use 1 or 2 hosts to provide
    HA for 3 or more servers.

    You can also use it as an IDS (Intrusion Detection System), like a
    realtime "tripwire", or you can build a mirror chain in which host B
    mirrors from A and is in turn mirrored by C, and so on. I will also
    try to research a way to use it as a distributed implementation with
    a one-writer, multiple-reader model.

    mirrord/fs_mirror makes use of inotify, a file system event
    notification facility provided by recent Linux kernels (merged in
    2.6.13). It plays the same role as FAM, whose development on Linux
    stalled long ago.

    It works for me now, on a RHEL4 system and on LFS 6.2. I hope this
    tool can be useful to you too.

    Thanks.
     
    Roc Zhou, Oct 24, 2007
    #1

  2. Roc Zhou

    Roc Zhou Guest

    Now I have run into a strange problem.

    After the initial sync, fs_mirror enters the realtime replication
    state. I deployed the tools on 3 machines, and they ran for nearly
    half a month. Then one day, on one host, for reasons I don't know,
    I found that fs_mirror was getting empty records from its mirrord
    agent. Under normal conditions, the records look like this:
    "CREATE:/var/www/html"
    "FWRITE:/var/www/html/index.php"
    "DELETE:/var/www/html/temp"
    "MOVE:('/var/www/html/aa', '/var/www/html/bb')"
    ....
    There should be no empty records. The empty records send fs_mirror
    into an infinite loop.
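
One way a consumer of such records could avoid looping forever on an empty record is to validate each record and fail loudly instead of retrying. The sketch below is only an illustration based on the four record shapes quoted above; `parse_record`, `BadRecord`, and `KNOWN_ACTIONS` are hypothetical names, not part of fs_mirror.

```python
import ast

# Actions seen in the sample records above; the real tool may have more.
KNOWN_ACTIONS = {"CREATE", "FWRITE", "DELETE", "MOVE"}


class BadRecord(ValueError):
    """Raised for a record that cannot be applied to the mirror."""


def parse_record(record):
    """Split 'ACTION:argument' into (action, paths), rejecting empty records."""
    if not record:
        raise BadRecord("empty record")
    action, sep, arg = record.partition(":")
    if not sep or action not in KNOWN_ACTIONS or not arg:
        raise BadRecord("malformed record: %r" % (record,))
    if action == "MOVE":
        # The MOVE argument is the repr of a (source, destination) tuple.
        try:
            paths = ast.literal_eval(arg)
        except (ValueError, SyntaxError, TypeError):
            raise BadRecord("malformed MOVE record: %r" % (record,))
        if not (isinstance(paths, tuple) and len(paths) == 2
                and all(isinstance(p, str) for p in paths)):
            raise BadRecord("malformed MOVE record: %r" % (record,))
        return action, paths
    return action, (arg,)
```

With this shape, the caller can catch `BadRecord`, log the serial number, and stop or resynchronize rather than requesting the same record again indefinitely.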

    I restarted fs_mirror from the point where it broke, but the problem
    remained. After debugging, I found that the problem occurs at the
    same serial number every time (I use Berkeley DB for the record log
    (wmLog), and the serial numbers are the keys), so I suspect that the
    problem is in BDB, but I don't know how to test this and locate the
    exact cause.
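
A simple first check for a log keyed by serial number is to walk the expected range and report which serials are missing and which hold empty values. The sketch below assumes only that the log behaves like a mapping with string keys (the traceback further down shows a string key, '6854'); `find_bad_serials` is a hypothetical helper, and a plain dict stands in for the database in the usage example.

```python
def find_bad_serials(log, first, last):
    """Return (missing, empty) serial numbers in the inclusive range [first, last]."""
    missing, empty = [], []
    for serial in range(first, last + 1):
        key = str(serial)  # assumption: keys are decimal strings
        try:
            value = log[key]
        except KeyError:
            missing.append(serial)
            continue
        if not value:
            empty.append(serial)
    return missing, empty


# A plain dict standing in for the record log.
sample_log = {"6852": "CREATE:/a", "6853": "", "6855": "DELETE:/a"}
print(find_bad_serials(sample_log, 6852, 6855))
```

Running the same scan against the live file and against a copy would show whether the damage is in the file itself or only appears after copying.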

    I tried to open the original db file in Python:
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/lib/python2.5/bsddb/__init__.py", line 223, in __getitem__
        return _DeadlockWrap(lambda: self.db[key])  # self.db[key]
      File "/usr/lib/python2.5/bsddb/dbutils.py", line 62, in DeadlockWrap
        return function(*_args, **_kwargs)
      File "/usr/lib/python2.5/bsddb/__init__.py", line 223, in <lambda>
        return _DeadlockWrap(lambda: self.db[key])  # self.db[key]
    KeyError: '6854'

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/lib/python2.5/bsddb/__init__.py", line 223, in __getitem__
        return _DeadlockWrap(lambda: self.db[key])  # self.db[key]
      File "/usr/lib/python2.5/bsddb/dbutils.py", line 62, in DeadlockWrap
        return function(*_args, **_kwargs)
      File "/usr/lib/python2.5/bsddb/__init__.py", line 223, in <lambda>
        return _DeadlockWrap(lambda: self.db[key])  # self.db[key]
    KeyError: '6854'

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/lib/python2.5/bsddb/__init__.py", line 278, in first
        rv = _DeadlockWrap(self.dbc.first)
      File "/usr/lib/python2.5/bsddb/dbutils.py", line 62, in DeadlockWrap
        return function(*_args, **_kwargs)
    _bsddb.DBNotFoundError: (-30990, 'DB_NOTFOUND: No matching key/data pair found')

    Even after I stopped the mirrord daemon, the errors remained.

    Then I copied the database file elsewhere and opened the copy: its
    length is 0, and getitem raises the same errors as above.

    Why does this happen when I copy the db file? And in particular, why
    is len() 0?!

    All I could do was restart mirrord to rebuild the BDB data file, and
    so far the problem has not occurred again.

    I don't know why an occasional problem like this happens. Is there
    anyone familiar with BDB who can give me some advice?

    Thanks.
     
    Roc Zhou, Oct 31, 2007
    #2