Benchmark Setup

Benchmark       Description
Berkeley DB     Embedded database library with built-in replication
NYU benchmarks  Distributed file system based on FUSE

Building recording system

TODO

Berkeley DB

A brief description of Berkeley DB.

Install Berkeley DB

  1. Get the Berkeley DB source code from its official download page.

  2. Decompress the source code to $BDB_ROOT.

  3. Go to $BDB_ROOT/build_unix and run the following commands:

    1. ../dist/configure --enable-cxx --enable-stl --enable-sql

    2. make

    3. sudo make install

  4. Add the Berkeley DB libraries to the dynamic linker's search path: create the file /etc/ld.so.conf.d/BerkeleyDB.conf containing the single line /usr/local/BerkeleyDB.5.2/lib/, then run sudo ldconfig to refresh the linker cache.

Play with Berkeley DB

Let's try to build $BDB_ROOT/examples/c/ex_rep, an example replication application built on Berkeley DB.

First, go to $BDB_ROOT/examples/c/ex_rep/ and create a Makefile as follows:

CC=gcc
CFLAGS=-I/usr/local/BerkeleyDB.5.2/include/ -I$(BDB_ROOT)/src
LDFLAGS=-L/usr/local/BerkeleyDB.5.2/lib/
# Libraries are listed after the object files so the linker can resolve their symbols
LDLIBS=-ldb-5.2
OBJS=rep_base.o rep_msg.o rep_net.o rep_common.o

.PHONY: all

all: ex_reg_mgr

ex_reg_mgr: $(OBJS)
	${CC} ${LDFLAGS} $^ ${LDLIBS} -o $@

rep_base.o: base/rep_base.c
	${CC} ${CFLAGS} -c base/rep_base.c

rep_msg.o: base/rep_msg.c
	${CC} ${CFLAGS} -c base/rep_msg.c

rep_net.o: base/rep_net.c
	${CC} ${CFLAGS} -c base/rep_net.c

rep_common.o: common/rep_common.c
	${CC} ${CFLAGS} -c common/rep_common.c

Then try make.

A brief description of ex_rep_mgr.

Create a shell script run.sh as follows:

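# Stop any instances left over from a previous run
killall ex_reg_mgr 2>/dev/null
# Each replication site needs its own environment home directory
rm -rf data
mkdir -p data/master data/client
./ex_reg_mgr -M -h data/master -l 127.0.0.1:6000 -r 127.0.0.1:6001 -n 2 &
./ex_reg_mgr -C -h data/client -l 127.0.0.1:6001 -r 127.0.0.1:6000 -n 2 &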

Then run sh run.sh, which starts a replication system with two nodes.

Build Berkeley DB with LLVM

Berkeley DB uses libtool, a standard shell script widely used in Linux builds, to find its compilers. To build Berkeley DB with LLVM, we need to replace all gcc/g++ compilers and linkers with llvm-gcc/llvm-g++/llvm-ld. This can be done with the following steps.

Potential BDB example applications for testing

See the Berkeley DB reference for details.

Application    Description
c/ex_rep/      A toy stock quote server using DB's single-master, multiple-client replication, with communication over TCP.
c/ex_thread.c  Threaded application with multiple readers and writers. Refer to this.
transapp.cs    Transaction failure, recovery, and deadlock
txn            Distributed transactions

NYU Benchmark

The NYU benchmark comes from the labs of Professor Jinyang Li's Distributed Systems course at New York University. Lab instructions are available on Jinyang's lab webpage.

Installation

First, install the required libraries.

sudo apt-get install encfs libfuse2 libfuse-dev fuse-utils

Download Jinyang's lab package and decompress it to get a directory named lab8, which contains solutions to Jinyang's course lab from eight students. Decompress one solution, for example crm350-lab8.tgz, to obtain a solution directory. Rename it to crm_lab to tell it apart from the others. In crm_lab, run make and check that every project compiles. You may have to compile each of the following targets separately. (You may hit a compilation error caused by an ambiguous function call; if so, fix it and build again.)

  • make extent_server

  • make lock_server

  • make lock_tester

  • make rsm_tester

  • make yfs_client

  • make

For the usage of these executables, refer to Jinyang's webpage.

Testing

The lab only supports running … We see several potential issues to test in Jinyang's lab.

  1. Message loss and retransmissions (network partitions)

  2. Lock server failures

  3. Extent server consistency

Random Testing

  1. First, we want to create a scenario with multiple clients and replicas, where the clients keep performing random file operations.

  2. Second, we want to enable the following failures and recoveries, which occur at random but within limits set by a specification.

    1. Message omission

    2. Network partition

    3. Node failure

  3. Third, we need a checker to verify application states.
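The first two steps above could be driven by a seeded schedule generator, so that a failing run can be replayed. The following is only a minimal sketch under our own assumptions: the names EventKind, Event, FailureSpec and make_schedule are hypothetical and do not come from the lab code, and the "specification" is reduced to a failure probability plus a cap on the number of injected failures.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <string>
#include <vector>

// Hypothetical event kinds: four file operations plus the three failure types.
enum class EventKind {
    CreateFile, WriteFile, ReadFile, RemoveFile,     // client file operations
    MessageOmission, NetworkPartition, NodeFailure   // injected failures
};

struct Event {
    EventKind kind;
    int node;          // client issuing the operation, or node hit by the failure
    std::string file;  // target file name; empty for failure events
};

// Hypothetical "specification" that bounds the injected failures.
struct FailureSpec {
    double probability;  // chance that a step injects a failure
    int max_failures;    // hard cap on failures in one schedule
};

bool is_failure(EventKind k) {
    return k == EventKind::MessageOmission ||
           k == EventKind::NetworkPartition ||
           k == EventKind::NodeFailure;
}

// Build a random schedule. The same seed and parameters give the same schedule.
std::vector<Event> make_schedule(unsigned seed, std::size_t steps, int nodes,
                                 int files, const FailureSpec& spec) {
    assert(nodes > 0 && files > 0);
    std::mt19937 rng(seed);
    std::uniform_real_distribution<double> coin(0.0, 1.0);
    std::uniform_int_distribution<int> pick_node(0, nodes - 1);
    std::uniform_int_distribution<int> pick_file(0, files - 1);
    std::uniform_int_distribution<int> pick_op(0, 3);       // file operations
    std::uniform_int_distribution<int> pick_failure(4, 6);  // failure kinds

    std::vector<Event> schedule;
    schedule.reserve(steps);
    int failures = 0;
    for (std::size_t i = 0; i < steps; ++i) {
        Event e;
        e.node = pick_node(rng);
        if (failures < spec.max_failures && coin(rng) < spec.probability) {
            e.kind = static_cast<EventKind>(pick_failure(rng));
            ++failures;
        } else {
            e.kind = static_cast<EventKind>(pick_op(rng));
            e.file = "f" + std::to_string(pick_file(rng));
        }
        schedule.push_back(e);
    }
    return schedule;
}
```

A driver would walk the schedule, issue each file operation against the mounted yfs directory of the chosen client, and apply each failure event with whatever injection mechanism is available. The checker from the third step is not covered by this sketch.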

Running the apps

$ ./start.sh 0 3
$ ./stop.sh

Run ./start.sh LOSSY NUM, where LOSSY seems to specify whether failures are injected and NUM specifies the number of replicas. For example, try ./start.sh 0 3.

After running start.sh, be sure to run stop.sh to unmount yfs from your system. You may need root permission to do this: sudo ./stop.sh.

Another way to run the yfs lab is to use rsm_tester.pl, for example:

$ ./rsm_tester.pl 3

Bug Report

#  Branch  Code             Description                              Root Cause
1  crm     fuse.cc          Deadlock when two clients try to         Control flow bypasses releaselock after
                            delete the same file at the same time    acquirelock. It happens when two replicas
                                                                     concurrently remove one file.
2  crm     rand_tester.cpp  Unhandled file operation failures        A client asks whether a file exists and the
                                                                     server says yes. Another client then removes
                                                                     the file, so the first client fails to open it.
3  crm     UNKNOWN          Somebody keeps sending “out of bound”    UNKNOWN
                            tickets
4  crm     UNKNOWN          Deadlock after killing one lock_server   UNKNOWN


The fix for bug # is guarded by the flag FIX_BUG_#. To build with the fix enabled, compile the code using $ g++ -DFIX_BUG_# …

Code segment for bug1

if (yfs_client::OK != (ret = yfs->acquirelock(parent_inum))) {
	fuse_reply_err(req, EPIPE);
	goto release_one;
}

...

if (!found) {
	fuse_reply_err(req,ENOENT);
	//	BUG: this return skips releaselock(parent_inum) and can lead to deadlock;
	//	it should be "goto release_one" instead
	return;
} else {
	if (yfs_client::OK != yfs->acquirelock(thisinum)) {
		fuse_reply_err(req,EPIPE);
		goto release_both;
	}
	...
}

release_one:
yfs->releaselock(parent_inum);
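One way to rule out this class of bug, as an alternative to the goto labels, is a scope guard that releases the lock on every return path. The sketch below is not the lab's code: FakeLockClient is a hypothetical stand-in for yfs_client that only tracks which inums are locked, and ScopedLock and remove_entry are names we made up to mirror the control flow of the segment above.

```cpp
#include <cassert>
#include <cerrno>
#include <set>

// Hypothetical stand-in for yfs_client; it only tracks which inums are locked.
struct FakeLockClient {
    std::set<unsigned long long> held;
    bool acquirelock(unsigned long long inum) { return held.insert(inum).second; }
    void releaselock(unsigned long long inum) {
        assert(held.count(inum) == 1);  // releasing a lock we do not hold is a bug
        held.erase(inum);
    }
};

// Scope guard: acquires in the constructor and releases in the destructor,
// so early returns cannot skip the release.
class ScopedLock {
public:
    ScopedLock(FakeLockClient& client, unsigned long long inum)
        : client_(client), inum_(inum), ok_(client.acquirelock(inum)) {}
    ~ScopedLock() { if (ok_) client_.releaselock(inum_); }
    ScopedLock(const ScopedLock&) = delete;
    ScopedLock& operator=(const ScopedLock&) = delete;
    bool ok() const { return ok_; }
private:
    FakeLockClient& client_;
    unsigned long long inum_;
    bool ok_;
};

// Hypothetical routine with the same shape as the unlink path above.
// Returns 0 on success or an errno-style code.
int remove_entry(FakeLockClient& client, unsigned long long parent_inum,
                 unsigned long long this_inum, bool found) {
    ScopedLock parent_lock(client, parent_inum);
    if (!parent_lock.ok()) return EPIPE;
    if (!found) return ENOENT;  // early return; the destructor still releases the parent lock
    ScopedLock child_lock(client, this_inum);
    if (!child_lock.ok()) return EPIPE;
    // ... the actual removal would go here ...
    return 0;
}
```

With this structure, the early return in the "not found" branch no longer leaves parent_inum locked, because the guard's destructor runs whenever the function exits.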

Code segment for bug2

...
if (stat(filename.c_str(), &fst) != -1)
{
	...
	return;
}

FILE *file = fopen(filename.c_str(), "w");
//	BUG: fopen may return NULL here, and that failure is not handled
fprintf(file, "%s\n", gen_string(10).c_str());
fclose(file);
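A possible way to handle the failure is to check the result of fopen and report the error to the caller instead of writing through a null pointer. The helper below is a minimal sketch; write_line is a hypothetical name that does not appear in rand_tester.cpp, and returning an errno-style code is our assumption about how the tester would want to see the failure.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdio>
#include <string>

// Hypothetical helper: write one line to a file, reporting failures
// instead of crashing. Returns 0 on success or an errno-style code.
int write_line(const std::string& filename, const std::string& line) {
    assert(!filename.empty());
    FILE* file = std::fopen(filename.c_str(), "w");
    if (file == NULL) {
        // The open can fail, for example if another client changed the
        // directory between the stat() check and this call.
        return errno != 0 ? errno : EIO;
    }
    int err = 0;
    if (std::fprintf(file, "%s\n", line.c_str()) < 0) {
        err = errno != 0 ? errno : EIO;
    }
    if (std::fclose(file) != 0 && err == 0) {
        err = errno != 0 ? errno : EIO;
    }
    return err;
}
```

The random tester could then treat a nonzero return value as an expected outcome of a concurrent removal rather than as a crash.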