Gfarm file system

Grid Datafarm for Petascale Data Intensive Computing

Grid Datafarm is a petascale data-intensive computing project initiated in Japan. The project is a collaboration among the High Energy Accelerator Research Organization (KEK), the National Institute of Advanced Industrial Science and Technology (AIST), the University of Tokyo, and the Tokyo Institute of Technology. Its goal is to build a peta- to exascale parallel file system that exploits the local storage of PCs spread across a worldwide Grid.

Introduction

Gfarm

Gfarm is a reference implementation of the Grid Datafarm architecture, designed for global petascale data-intensive computing. It provides the Gfarm Grid file system, a shared file system for clusters or Grids that can scale up to petascale storage and deliver scalable I/O bandwidth and scalable parallel processing.


The Gfarm Grid file system is a virtual file system that integrates the local disks of compute/filesystem nodes. It consists of many compute/filesystem nodes and a Gfarm metadata server node. Each compute/filesystem node runs the Gfarm file system daemon (gfsd), which handles remote file operations with access control in the Gfarm file system, as well as file replication, fast invocation, and node resource status monitoring. The Gfarm metadata server node manages Gfarm file system metadata and parallel process information; it runs the Gfarm job manager (gfmd) and the file system metadata server (slapd).
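The division of labour between the metadata server node and the compute/filesystem nodes can be illustrated with a small toy model. This is only a sketch and not Gfarm code: the class and function names (FileSystemNode, MetadataServer, write_file, replicate, read_file) are hypothetical, and the "read from the least-loaded replica holder" rule is an assumption made for illustration, not Gfarm's documented scheduling policy.

```python
from dataclasses import dataclass, field


@dataclass
class FileSystemNode:
    """Toy stand-in for a compute/filesystem node (the role gfsd plays)."""
    name: str
    load: float = 0.0                          # hypothetical resource-status value
    disk: dict = field(default_factory=dict)   # local disk: path -> bytes


class MetadataServer:
    """Toy stand-in for the metadata server node: records which nodes hold a file."""

    def __init__(self):
        self.replicas = {}  # path -> list of node names holding a copy

    def register(self, path, node):
        names = self.replicas.setdefault(path, [])
        if node.name not in names:
            names.append(node.name)

    def locate(self, path):
        return list(self.replicas.get(path, []))


def write_file(meta, node, path, data):
    """Store data on one node's local disk and record it in the metadata."""
    node.disk[path] = data
    meta.register(path, node)


def replicate(meta, src, dst, path):
    """Copy a file between nodes and record the new replica."""
    dst.disk[path] = src.disk[path]
    meta.register(path, dst)


def read_file(meta, nodes, path):
    """Read from the least-loaded node holding a replica (assumed policy)."""
    holders = [nodes[name] for name in meta.locate(path)]
    if not holders:
        raise FileNotFoundError(path)
    best = min(holders, key=lambda n: n.load)
    return best.name, best.disk[path]
```

In this model, writing a file to a busy node, replicating it to an idle node, and then calling read_file returns the copy from the idle node, which is the sense in which replicas spread over many local disks can scale aggregate I/O bandwidth.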

For further information about scalable I/O performance, reliable file access, and so on, see the introduction of Gfarm.

GfarmFS-FUSE

GfarmFS-FUSE enables you to mount a Gfarm filesystem in userspace.
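The practical effect of a FUSE mount is that programs can use ordinary file calls on the mounted directory instead of a Gfarm-specific API. The sketch below shows that idea with plain Python file I/O. The function name roundtrip is hypothetical, and a temporary directory stands in for the mount point, since the actual mount path and mount command depend on the installation and are not given here.

```python
import os
import tempfile


def roundtrip(mount_point, relative_path, payload):
    """Write a file under mount_point and read it back with ordinary file calls."""
    target = os.path.join(mount_point, relative_path)
    os.makedirs(os.path.dirname(target), exist_ok=True)
    with open(target, "wb") as f:
        f.write(payload)
    with open(target, "rb") as f:
        return f.read()


# Assumption: a temporary directory stands in for a real Gfarm mount point.
with tempfile.TemporaryDirectory() as fake_mount:
    data = roundtrip(fake_mount, "demo/hello.txt", b"hello")
    assert data == b"hello"
```

On a system where a Gfarm file system is actually mounted, the same function would be called with the real mount point in place of the temporary directory.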

Document

Downloads

Installation and Configuration

Basic Command

Problems

FAQ
