BeeGFS is a scalable network-storage platform designed to be distributed, resilient and highly configurable while delivering good performance and high reliability. Administrators can control virtually all aspects of the system, and a command line interface is used to monitor and control the cluster.
- BeeGFS (formerly FhGFS) is a parallel file system, developed and optimized for high-performance computing. BeeGFS includes a distributed metadata architecture for scalability and flexibility reasons. Its most important aspect is data throughput. BeeGFS was originally developed at the Fraunhofer Center for High Performance Computing in Germany by a team around Sven Breuner, who later became the CEO of ThinkParQ, the spin-off company that was founded in 2014 to maintain BeeGFS and offer professional services.
- BeeGFS is the leading parallel cluster file system, developed with a strong focus on performance and designed for very easy installation and management. If I/O intensive workloads are your problem, BeeGFS is the solution.
- 1 Terminology
- 2 Installation
- 3 Server tuning and advanced features
- 4 See also
|Node Type and Description||Packages|
|Management Server (one node)
|Metadata Server (at least one node)
|Storage Server (at least one node)
|InfluxDB / Grafana based Monitoring Server (optional)
|BeeGFS utilities for administrators
In addition to the free and open-source packages described here, BeeGFS also offers a number of Enterprise Features and Professional Support, which include:
- High Availability
- Quota Enforcement
- Access Control Lists (ACLs)
- Storage Pools
- Burst buffer function with BeeOND
Example cluster deployment
The following hardware configuration will be used in this example:
||Management Server and Monitoring (optional) Server|
Install it with the package AUR on the management node
The management service needs to know where it can store its data. It will only store some node information like connectivity data, so it will not require much storage space and its data access is not performance critical. Thus, this service is typically not running on a dedicated machine.
storeMgmtdDirectory = /mnt/beegfs/beegfs-mgmtd
Start and enable beegfs-mgmtd.service on the management node:
# systemctl start beegfs-mgmtd.service
# systemctl enable beegfs-mgmtd.service
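Settings such as storeMgmtdDirectory are plain `key = value` lines, so they can also be applied from a script. The following is a minimal sketch, not a BeeGFS tool: the function name set_beegfs_option is hypothetical, and it assumes GNU sed and values that do not contain the `|` character.

```shell
#!/bin/sh
# set_beegfs_option FILE KEY VALUE
# Hypothetical helper (not shipped with BeeGFS).
# Rewrites an existing "KEY = ..." line in FILE, or appends one if KEY is absent.
set_beegfs_option() {
    file=$1
    key=$2
    value=$3
    if grep -Eq "^[[:space:]]*${key}[[:space:]]*=" "$file"; then
        # "|" is used as the sed delimiter because values are usually paths.
        sed -i -E "s|^[[:space:]]*${key}[[:space:]]*=.*|${key} = ${value}|" "$file"
    else
        printf '%s = %s\n' "$key" "$value" >> "$file"
    fi
}
```

For the management service this could be called as `set_beegfs_option /etc/beegfs/beegfs-mgmtd.conf storeMgmtdDirectory /mnt/beegfs/beegfs-mgmtd`, assuming the management configuration file follows the same naming scheme as /etc/beegfs/beegfs-mon.conf.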
Install the package AUR on the management/monitoring node 192.168.0.1. The monitoring service, beegfs-mon, collects statistics from the system and provides them to the user through the time series database InfluxDB. For visualization of the data, beegfs-mon provides predefined Grafana panels that can be used out of the box.
To set up beegfs-mon, you need to edit the configuration file /etc/beegfs/beegfs-mon.conf. If you have everything installed on the same host, you only need to specify the management host:
sysMgmtdHost = localhost
If the database runs on another host, the client for example, or you need to use a different database port or name, you also need to modify the corresponding entries:
dbHostName = node04
dbHostPort = 9096
dbDatabase = beegfs_mon_client
Start and enable beegfs-mon.service on the management/monitoring node:
# systemctl start beegfs-mon.service
# systemctl enable beegfs-mon.service
Configuration of default Grafana panels
# cd /etc/beegfs/grafana
# ./import-dashboards default
Accessing Grafana panels
/mnt/beegfs/beegfs-mgmtd for management servers and /mnt/beegfs/beegfs-mon for monitoring servers.
Install the package AUR on the metadata server(s), i.e.
The metadata service needs to know where it can store its data and where the management service is running. Typically, one will have multiple metadata services running on different machines.
sysMgmtdHost = node01
storeMetaDirectory = /mnt/beegfs/beegfs-meta
Start and enable beegfs-meta.service on the metadata node:
# systemctl start beegfs-meta.service
# systemctl enable beegfs-meta.service
Install the package AUR on the storage server(s), i.e.
The storage service needs to know where it can store its data and how to reach the management server. Typically, one will have multiple storage services running on different machines and/or multiple storage targets (e.g. multiple RAID volumes) per storage service.
sysMgmtdHost = node01
storeStorageDirectory = /mnt/beegfs/beegfs-storage
Start and enable beegfs-storage.service on the storage node:
# systemctl start beegfs-storage.service
# systemctl enable beegfs-storage.service
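When several services are configured on different machines, it can be handy to read back which management host and store directory a given configuration file actually names. The sketch below assumes the `key = value` layout shown above; get_beegfs_option is a hypothetical helper name, not a BeeGFS command.

```shell
#!/bin/sh
# get_beegfs_option FILE KEY
# Hypothetical helper (not shipped with BeeGFS).
# Prints the value of every uncommented "KEY = value" line in FILE.
get_beegfs_option() {
    awk -v key="$2" '
        /^[[:space:]]*#/ { next }            # skip comment lines
        index($0, "=") > 0 {
            k = substr($0, 1, index($0, "=") - 1)
            v = substr($0, index($0, "=") + 1)
            gsub(/^[[:space:]]+|[[:space:]]+$/, "", k)   # trim the key
            gsub(/^[[:space:]]+|[[:space:]]+$/, "", v)   # trim the value
            if (k == key) print v
        }
    ' "$1"
}
```

A call such as `get_beegfs_option /etc/beegfs/beegfs-storage.conf storeStorageDirectory` would then print the configured directory; the file name is an assumption based on the naming of /etc/beegfs/beegfs-mon.conf.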
The client service needs to know where it can reach the management server.
sysMgmtdHost = node01
The client service needs to know where it can mount the cluster storage, as well as the location of the client configuration file.
Load the kernel module and its dependencies.
# modprobe beegfs
Start and enable beegfs-helperd.service on the client node:
# systemctl start beegfs-helperd.service
# systemctl enable beegfs-helperd.service
Start and enable beegfs-client.service on the client node:
# systemctl start beegfs-client.service
# systemctl enable beegfs-client.service
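Once the client service is running, the cluster storage should appear in the kernel's mount table. A minimal sketch for listing such mount points follows; it assumes the file system type is reported as `beegfs`, and beegfs_mountpoints is a hypothetical function name.

```shell
#!/bin/sh
# beegfs_mountpoints [MOUNTS_FILE]
# Hypothetical helper: prints the mount point of every entry whose file system
# type (third field) is "beegfs". Reads /proc/mounts unless a file is given.
beegfs_mountpoints() {
    awk '$3 == "beegfs" { print $2 }' "${1:-/proc/mounts}"
}
```

Running `beegfs_mountpoints` without arguments on the client node prints nothing if no BeeGFS file system is mounted.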
Install the package AUR.
Check the detected network interfaces and transport protocols from a client node with the following commands:
# beegfs-ctl --listnodes --nodetype=mgmt --nicdetails
node01 [ID: 1]
   Ports: UDP: 8008; TCP: 8008
   Interfaces:
   + enp0s31f6[ip addr: 192.168.0.1; type: TCP]

# beegfs-ctl --listnodes --nodetype=meta --nicdetails
node02 [ID: 2]
   Ports: UDP: 8005; TCP: 8005
   Interfaces:
   + eno1[ip addr: 192.168.0.2; type: TCP]

# beegfs-ctl --listnodes --nodetype=storage --nicdetails
node03 [ID: 3]
   Ports: UDP: 8003; TCP: 8003
   Interfaces:
   + eno1[ip addr: 192.168.0.3; type: TCP]

# beegfs-ctl --listnodes --nodetype=client --nicdetails
4E451-5DAEDCBF-node04 [ID: 4]
   Ports: UDP: 8004; TCP: 0
   Interfaces:
   + wlo1[ip addr: 192.168.0.4; type: TCP]
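On larger clusters the listing above gets long, and a condensed view of node, address and transport type is easier to scan. The following sketch filters the text output of the commands shown above; it assumes the layout of the sample output in this section, and list_node_addrs is a hypothetical function name.

```shell
#!/bin/sh
# list_node_addrs
# Hypothetical helper: reads "beegfs-ctl --listnodes ... --nicdetails" output
# on stdin and prints one "node address transport" line per interface.
list_node_addrs() {
    awk '
        /\[ID: [0-9]+\]/ { node = $1 }        # remember the current node name
        /ip addr:/ {
            rest = $0
            sub(/.*ip addr: */, "", rest)     # drop everything before the address
            ip = rest
            sub(/;.*/, "", ip)                # keep only the address
            transport = rest
            sub(/.*type: */, "", transport)   # keep only the transport type
            sub(/\].*/, "", transport)
            print node, ip, transport
        }
    '
}
```

It would be used as `beegfs-ctl --listnodes --nodetype=storage --nicdetails | list_node_addrs`.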
Server tuning and advanced features
- Explicitly install the package AUR, which provides the libbeegfs-ib.so shared object libraries.
- Enable support for RDMA-capable network hardware.
- Rebuild the client kernel module.