/* Header names inferred from the calls made below. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/time.h>

/* *********************************************************
License:

syncbench: a program to help determine optimal i/o
synchronization mechanisms

written by Mark Travis
Copyright (c) 2004 Mark Travis

Permission to use, copy, modify, and distribute this software and its
documentation for any purpose and without fee is hereby granted,
provided that the above copyright notice appear in all copies and that
both that copyright notice and this permission notice appear in
supporting documentation, and that the name of the author not be used
in advertising or publicity pertaining to distribution of the software
without specific, written prior permission. The author makes no
representations about the suitability of this software for any
purpose. It is provided "as is" without express or implied warranty.

GOAL:
To help "Determine optimal fdatasync/fsync, O_SYNC/O_DSYNC options"
from the PostgreSQL TODO list (http://developer.postgresql.org/todo.php)

WHAT THIS DOES:
Writes sequential chunks of data to a file and synchronizes with the
I/O device in a variety of ways. Gives results in microseconds.

EDIT BEFORE COMPILING:
There are 6 macros which must be defined in order for this program to
compile.

FILE_SYNC and OPEN_SYNC define how file contents are to be sync'd, and
each platform has various options.

FILE_SYNC defines the function call made to sync after each write().
Most platforms should support fsync(2) at a minimum. fdatasync(2)
should be faster on platforms which support it.

OPEN_SYNC defines the flag to be set in open(2) for synchronizing all
writes without explicitly calling fsync(2) or its kind. Most platforms
should support O_SYNC at a minimum. O_DSYNC should be faster on
platforms which support it. And some environments have neither. I
could only find O_FSYNC on FreeBSD 4.10.
Anyway, reading man pages for open(2), fsync(2), and fdatasync(2) is
probably a good idea before setting FILE_SYNC and OPEN_SYNC.
****************************************************** */

/*
#define FILE_SYNC(X) fdatasync(X)
#define OPEN_SYNC O_DSYNC
*/

/* **********************************************************
The other 4 macros are CHUNKSIZE, CHUNKS, FILESIZE_MULTIPLIER, and
SLEEP.

CHUNKSIZE is the size in bytes of each chunk of data written in the
test. CHUNKS is the number of chunks written. All writes are
sequential.

Before the test is run, a file is created, filled with "A" characters,
fsync(2)'d, and then closed. Then we sleep for SLEEP seconds before
proceeding with the actual benchmark run. The sleep is to let the I/O
device quiesce if it wants to.

The size of the file created is FILESIZE_MULTIPLIER times CHUNKSIZE
times CHUNKS. So if FILESIZE_MULTIPLIER exceeds 1 then the file will be
bigger than the amount of data written to it. Having the file size
equal to or exceeding the amount of data written should help to
simulate real-world WAL behavior.

Extending the size of a file requires extra work for the filesystem to
perform. To learn the impact of that, go ahead and set
FILESIZE_MULTIPLIER to less than 1 if you want.
******************************************************* */

/*
#define CHUNKSIZE (8 * 1024)
#define CHUNKS (2 * 1024)
#define FILESIZE_MULTIPLIER 2
#define SLEEP 5
*/

/* *******************************************************
Nothing else should need to be modified from this point, but please
keep reading for building and running instructions.

BUILDING:
It should be simple as long as the right tools are in your $PATH
(cc, etc.)

Linux and *BSD seem to be happy with this:
    cc syncbench.c -o syncbench

Solaris 8 requires -lrt in order to support fdatasync:
    cc syncbench.c -lrt -o syncbench

Other O/S's you'll have to figure out on your own.

RUNNING:
It takes two arguments.

The first is the name of the file to use for the test.
If the file doesn't exist already, it will be created. If it does
exist already, then it will be truncated before writing.

The second argument is the mode in which file data is synchronized to
disk. It can be one of buffered, filesync and opensync.

buffered means no syncing takes place. Unless your filesystem is
mounted synchronously this should be the fastest option by far.

filesync executes the function defined by FILE_SYNC above after each
chunk of data is written.

opensync open()'s the file with the flag defined in OPEN_SYNC.

PostgreSQL supports 5 methods of sync'ing if you count "fsync=false".
These methods are in postgresql.conf under the "WRITE AHEAD LOG"
section. Here's how to use this tool to try and simulate those
methods:

PostgreSQL Method:  Syncbench
fsync=false         run with 2nd argument=buffered
fsync               #define FILE_SYNC(X) fsync(X)      2nd argument=filesync
fdatasync           #define FILE_SYNC(X) fdatasync(X)  2nd argument=filesync
open_sync           #define OPEN_SYNC O_SYNC           2nd argument=opensync
open_datasync       #define OPEN_SYNC O_DSYNC          2nd argument=opensync

Obviously, if the platform doesn't support those things for whatever
reason then the program won't build.

RESULTS:
The number of microseconds between writes starting and ending is
displayed. Time taken to pre-populate, open(), or close() the datafile
is not included in this calculation.

Obviously, the results will vary based on the amount of data written,
synchronization options used, hardware, O/S, filesystem, etc.

It's a good idea to try to run this on systems which are as idle as
possible -- especially the I/O device(s) being tested.
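
EXAMPLE (illustrative only):
As a rough sketch of how the elapsed-time figure described under
RESULTS can be computed, the helper below turns two gettimeofday(2)
samples into microseconds. The name elapsed_usec is hypothetical and is
not part of syncbench; the only assumption is that both samples were
taken with gettimeofday(2).

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

// Hypothetical helper (not part of syncbench): microseconds elapsed
// between two gettimeofday(2) samples.
static long long elapsed_usec(const struct timeval *before,
                              const struct timeval *after)
{
    long long sec, usec;

    assert(before != NULL && after != NULL);

    // Widen to long long before multiplying so that long runs cannot
    // overflow a 32-bit intermediate value.
    sec  = (long long)after->tv_sec  - (long long)before->tv_sec;
    // May be negative when the second sample has the smaller
    // microsecond field; the sum below still comes out right.
    usec = (long long)after->tv_usec - (long long)before->tv_usec;

    return sec * 1000000LL + usec;
}
```

Sampling gettimeofday(2) into tv_before just before the first write and
into tv_after just after the last one, then passing both to the helper,
would give the kind of figure described above.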
******************************************************* */

int main(int argc, char *argv[])
{
    int fd, n, bm_openflag, bm_do_filesync;
    char *buf;
    struct timeval tv_before, tv_after;
    struct timezone tz_garbage;

    if ( argc != 3 ) {
        printf("usage: %s <filename> <buffered|filesync|opensync>\n", argv[0] );
        exit(1);
    }

    bm_openflag=0;
    bm_do_filesync=0;

    /* Compare against the full keyword length; the mode is named
       "buffered" everywhere else in this file. */
    if ( !strncmp( argv[2], "buffered", strlen("buffered") ) ) {
        printf("test uses %s\n", argv[2]);
    } else if ( !strncmp( argv[2], "filesync", strlen("filesync") ) ) {
        printf("test uses %s\n", argv[2]);
        bm_do_filesync=1;
    } else if ( !strncmp( argv[2], "opensync", strlen("opensync") ) ) {
        printf("test uses %s\n", argv[2]);
        bm_openflag=OPEN_SYNC;
    } else {
        puts("Second argument must be one of buffered filesync opensync");
        exit(1);
    }

    printf("Starting test with FILESIZE: %i, CHUNKSIZE: %i, CHUNKS: %i\n",
           FILESIZE_MULTIPLIER * CHUNKSIZE * CHUNKS, CHUNKSIZE, CHUNKS );

    fd = open( argv[1], O_WRONLY | O_CREAT | O_TRUNC, 0666 );
    if ( fd == -1 ) {
        perror("Can't create data file");
        exit(1);
    }

    buf = malloc(CHUNKSIZE);
    if ( buf == NULL ) {
        puts("malloc choked for some reason. Bye!");
        exit(1);
    }

    /* **************************************************************
    Make sure that the whole file is not made up of zero bytes. I seem
    to recall that *NIX filesystems tend not to allocate actual blocks
    for zero-filled regions of a file. If so, the first real write to
    such a region would force a metadata update, which might cause
    performance issues. Anyway, pre-populate the file with "A" to hedge
    against that.

    It may be desirable to pre-populate just like WAL is pre-populated.
    Or maybe this doesn't matter.
    *************************************************************** */
    memset(buf, 'A', CHUNKSIZE);

    for ( n=0; n