Disk Configuration Files
Author
Timo Bingmann (2013-2014)

A main feature of STXXL is its ability to access multiple disks in parallel. To use it, you must define the disk configuration in a text file, using the syntax described below. If no file is found at the locations below, STXXL will by default create a 1000 MiB file in /var/tmp/stxxl on Unix or in the user's temp directory on Windows.

On Linux/Unix systems, STXXL looks for a disk configuration file in the following locations, in order of precedence:

Warning
On many Linux distributions the $HOSTNAME variable is not exported. For the host specific configuration to work, you must add "export HOSTNAME" to your shell configuration (.bashrc).

On Windows systems, STXXL looks for a disk configuration file in the following directories:

Note
In a default Windows 7 installation, %APPDATA% is C:\Users\<username>\AppData\Roaming
You can visit your %APPDATA% directory by simply entering "%APPDATA%" in the Windows Explorer address/location line.

Disk Configuration File Format

Each line of the configuration file describes a disk. Lines starting with '#' are comments.

A disk description uses the following format:

disk=<path>,<capacity>,<fileio> <options>

Description of the parameters:

Example:

disk=/data01/stxxl,500G,syscall unlink
disk=/data02/stxxl,300G,syscall unlink

On Windows, one usually uses different disk drives and wincall.

disk=c:\stxxl.tmp,700G,wincall delete
disk=d:\stxxl.tmp,200G,wincall delete

On Linux you can try to take advantage of NCQ + Kernel AIO queues:

disk=/data01/stxxl,500G,linuxaio unlink
disk=/data02/stxxl,300G,linuxaio unlink
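To make the line format above concrete, here is a minimal sketch of how one disk= line could be split into its fields. It is not STXXL's own parser: the DiskLine struct and the parse_disk_line and trim functions are hypothetical names introduced for illustration, the capacity is kept as plain text rather than converted to bytes, and paths containing commas are assumed not to occur.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical holder for the fields of one "disk=" line (not an STXXL type).
struct DiskLine {
    std::string path;      // <path>
    std::string capacity;  // <capacity>, kept as text, e.g. "500G"
    std::string fileio;    // <fileio>, e.g. "syscall"
    std::string options;   // everything after <fileio>, e.g. "unlink"
};

// Remove leading and trailing spaces and tabs.
static std::string trim(const std::string& s) {
    const char* ws = " \t";
    const std::size_t b = s.find_first_not_of(ws);
    if (b == std::string::npos) return "";
    const std::size_t e = s.find_last_not_of(ws);
    return s.substr(b, e - b + 1);
}

// Split "disk=<path>,<capacity>,<fileio> <options>" into its fields.
// Throws std::invalid_argument if the prefix or a comma is missing.
DiskLine parse_disk_line(const std::string& line) {
    const std::string prefix = "disk=";
    if (line.compare(0, prefix.size(), prefix) != 0)
        throw std::invalid_argument("line does not start with disk=");

    const std::size_t c1 = line.find(',', prefix.size());
    const std::size_t c2 =
        (c1 == std::string::npos) ? std::string::npos : line.find(',', c1 + 1);
    if (c2 == std::string::npos)
        throw std::invalid_argument("expected <path>,<capacity>,<fileio>");

    DiskLine d;
    d.path = trim(line.substr(prefix.size(), c1 - prefix.size()));
    d.capacity = trim(line.substr(c1 + 1, c2 - c1 - 1));

    // The third field holds the fileio name, optionally followed by options.
    const std::string rest = trim(line.substr(c2 + 1));
    const std::size_t sp = rest.find_first_of(" \t");
    d.fileio = rest.substr(0, sp);
    if (sp != std::string::npos) d.options = trim(rest.substr(sp + 1));
    return d;
}
```

Calling parse_disk_line("disk=/data01/stxxl,500G,syscall unlink") fills the fields with /data01/stxxl, 500G, syscall and unlink. Comment lines starting with '#' would have to be skipped by the caller before parsing.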

Recommended: File System XFS or Raw Block Devices

The library benefits from direct transfers from user memory to disk, which avoid superfluous copies. We recommend using the XFS file system, which gives good read and write performance for large files. Note that file creation on XFS is somewhat slower, so disk files should be precreated for optimal performance.

If the filesystem's only use is to store one large STXXL disk file, we also recommend adding the following options to the mkfs.xfs command to gain maximum performance:

$ mkfs.xfs -d agcount=1 -l size=512b

The following filesystems have been reported not to support direct I/O: tmpfs, glusterfs. By default, STXXL will first try to use direct I/O (O_DIRECT open flag). If that fails, it will print a warning and open the file without O_DIRECT.
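The try-then-fall-back behavior described above can be sketched with plain POSIX calls. This is an illustration of the pattern only, not STXXL's implementation: the function name open_prefer_direct, the warning text and the file mode 0600 are assumptions, and the #ifdef guard simply skips the first attempt on platforms that do not define O_DIRECT.

```cpp
#include <fcntl.h>
#include <unistd.h>
#include <cstdio>

// Open `path` for reading and writing, preferring direct I/O.
// Returns a file descriptor, or -1 if every attempt fails.
// `used_direct` reports whether the descriptor was opened with O_DIRECT.
int open_prefer_direct(const char* path, bool& used_direct) {
    const int flags = O_RDWR | O_CREAT;
#ifdef O_DIRECT
    // First attempt: bypass the page cache.
    const int fd = ::open(path, flags | O_DIRECT, 0600);
    if (fd >= 0) {
        used_direct = true;
        return fd;
    }
    std::fprintf(stderr,
                 "warning: opening %s with O_DIRECT failed, retrying without it\n",
                 path);
#endif
    // Fallback: ordinary buffered open.
    used_direct = false;
    return ::open(path, flags, 0600);
}
```

On a file system such as tmpfs the first open is expected to fail, so the function would return a buffered descriptor with used_direct set to false.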

Note
It is also possible to use raw disk devices with syscall.
Just use disk=/dev/sdb1 or similar. This will of course overwrite all data on the partitions! The I/O performance of raw disks is generally more stable and slightly higher than with file systems.
disk=/dev/sdb1,0,syscall raw_device
The raw_device flag is only for verification; STXXL automatically detects raw block devices and their size.

Log Files

STXXL produces two kinds of log files, a message and an error log. By setting the environment variables STXXLLOGFILE and STXXLERRLOGFILE, you can configure the location of these files. The default values are stxxl.log and stxxl.errlog, respectively.
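As a small illustration of this lookup, the sketch below resolves a log file name from an environment variable and falls back to a default when the variable is unset. The helper env_or_default is a hypothetical name, not part of STXXL, and how STXXL itself treats an empty variable is not covered here.

```cpp
#include <cstdlib>
#include <string>

// Return the value of environment variable `name`,
// or `fallback` if the variable is not set.
std::string env_or_default(const char* name, const std::string& fallback) {
    const char* value = std::getenv(name);
    return (value != nullptr) ? std::string(value) : fallback;
}
```

With this helper, env_or_default("STXXLLOGFILE", "stxxl.log") and env_or_default("STXXLERRLOGFILE", "stxxl.errlog") yield the file names documented above when neither variable is set.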

Precreating External Memory Files

To get maximum performance, you can precreate the disk files described in the configuration file before running STXXL applications. A precreation utility is included in the set of STXXL utilities in stxxl_tool. Run this utility for each disk you have defined in the disk configuration file:

$ stxxl_tool create_files <capacity> <full_disk_filename...>
# for example:
$ stxxl_tool create_files 1GiB /data01/stxxl

User-Supplied disk_config Structures

With STXXL >= 1.4.0, the library can also be configured via the user application.

All disk configuration is managed by the stxxl::config class, which contains a list of stxxl::disk_config objects. Each stxxl::disk_config object encapsulates one disk= line from a config file, or one allocated disk.

The disk configuration must be supplied to the STXXL library before any other function calls, because the stxxl::config object must be filled before any external memory blocks are allocated by stxxl::block_manager.

#include <stxxl.h>

int main()
{
    // get uninitialized config singleton
    stxxl::config * cfg = stxxl::config::get_instance();

    // create a disk_config structure: path, size in bytes (100 MiB), fileio
    stxxl::disk_config disk1("/tmp/stxxl.tmp", 100 * 1024 * 1024, "syscall unlink");
    disk1.direct = stxxl::disk_config::DIRECT_ON; // force O_DIRECT

    // add disk to config
    cfg->add_disk(disk1);

    // add another disk, described by a config file line
    cfg->add_disk(stxxl::disk_config("disk=/tmp/stxxl-2.tmp, 10 GiB, syscall unlink"));

    // ... add more disks

    // use STXXL library as usual ...

    return 0;
}