Automating Folding@Home Setup: Expediting Distributed Computing Research

Posted 4 years ago by Petr Bednar


For nearly two decades, volunteer distributed computing projects have helped advance scientific research and understanding of infections and hereditary diseases. Scientists need to understand how diseases work to formulate therapies to fight them. One key way to arrive at this understanding is to create models and run multitudes of molecular dynamics simulations. 

These simulations are huge and resource-intensive. Even with the help of big cloud providers and onsite clusters, scientists still need more computing resources to power them. Using client-server systems like BOINC, volunteers run pieces of these simulations, called work units, on otherwise unused CPU, GPU, and gaming system resources, then upload the results to the server.

We recently joined more than 700,000 computer owners who have donated their unused computing resources to one established and promising distributed computing project, Folding@home (FAH or F@H). While F@H’s initial focus was on protein folding, it has pivoted to biomedical problems such as Alzheimer’s disease, cancer, and now COVID-19. We were curious about F@H’s distributed computing model and decided to take a deeper look at the technology.

We decided to contribute to Folding@home with our headless server infrastructure. The setup wasn’t completely trivial, so we wrote this post to help others who want to contribute their servers at scale. The automated scripts described below can simplify the job.

Technical approach

We fully automated the client installation because the official installer requires a few interactive steps, such as setting a username and a team number. Doing this by hand on more than one server isn’t difficult, but it takes time. Our goal was to simplify deployment on Debian, Mint, and Ubuntu.

You can choose a username and, optionally, join a team, or you can “fold” anonymously. To create a username and get a passkey, you need to add your name and email here. F@H will send a passkey to your email. The passkey is unique to you and prevents other users from using your username. The benefit of having a username is that you can check your stats and compare your performance to others. You can also create your own team by completing a simple form here.

First, we need to avoid interactive actions. The best way to do this is a Bash script that takes the configuration values as arguments.

The script accepts the following arguments:

-u | --user = user name
-t | --team = team number (default: 0)
-p | --passkey = user passkey
-l | --power = folding power (default: medium)
-n | --password = password for remote control
-g | --gpu = use GPU (default: false)
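
To make the argument handling concrete, here is a minimal sketch of how a Bash script could parse these options. It is not taken from our installation script, which may do this differently; the function name parse_args, the FAH_* variable names, and the "Anonymous" default user name are assumptions made for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the option parsing; the real installation script may differ.

# Defaults follow the argument list above.
FAH_USER="Anonymous"   # assumption: name used when folding anonymously
FAH_TEAM="0"
FAH_PASSKEY=""
FAH_POWER="medium"
FAH_PASSWORD=""
FAH_GPU="false"

parse_args() {
  while [ "$#" -gt 0 ]; do
    case "$1" in
      -g|--gpu)
        # Boolean flag: takes no value.
        FAH_GPU="true"
        shift
        ;;
      -u|--user|-t|--team|-p|--passkey|-l|--power|-n|--password)
        # These options need a value; fail early instead of looping forever.
        if [ "$#" -lt 2 ]; then
          echo "Missing value for $1" >&2
          return 1
        fi
        case "$1" in
          -u|--user)     FAH_USER="$2" ;;
          -t|--team)     FAH_TEAM="$2" ;;
          -p|--passkey)  FAH_PASSKEY="$2" ;;
          -l|--power)    FAH_POWER="$2" ;;
          -n|--password) FAH_PASSWORD="$2" ;;
        esac
        shift 2
        ;;
      *)
        echo "Unknown argument: $1" >&2
        return 1
        ;;
    esac
  done
}
```

A script built this way would call parse_args "$@" near the top and then use the FAH_* variables for the rest of the installation. Calling it with no arguments leaves every default in place, which corresponds to the anonymous, CPU-only case.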

We used the official deb package version 7.4.4 (fahclient_7.4.4_amd64.deb) available on the F@H website.

The full installation script is available here: https://gitlab.com/snippets/1963653/raw

Example of running the script with arguments (replace the placeholders in angle brackets with your own values):

curl -s https://gitlab.com/snippets/1963653/raw | sudo bash -s -- -u <your_username> -t <team_number> -p <your_passkey> --power full -n Passw0rd --gpu

Example of running the script as an anonymous user and without a GPU:

curl -s https://gitlab.com/snippets/1963653/raw | sudo bash

To change client settings, edit the /etc/fahclient/config.xml file, then restart the client:

sudo /etc/init.d/FAHClient stop
sudo /etc/init.d/FAHClient start
or
sudo /etc/init.d/FAHClient restart
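
If you manage several servers, generating the file may be easier than editing it by hand. The following is a minimal sketch of that idea, not the code from our installation script. The function name render_fah_config is hypothetical, and the element names (user, team, passkey, power, gpu, password) are assumptions based on the layout of the v7 client's config.xml, so compare the output with the file on your own system before using it.

```bash
#!/usr/bin/env bash
# Sketch: print a FAHClient configuration built from six values.
# Assumption: element names follow the v7 client's config.xml layout.
# Values must not contain single quotes, since they are not escaped here.

render_fah_config() {
  local user="$1" team="$2" passkey="$3" power="$4" password="$5" gpu="$6"
  cat <<EOF
<config>
  <user v='${user}'/>
  <team v='${team}'/>
  <passkey v='${passkey}'/>
  <power v='${power}'/>
  <gpu v='${gpu}'/>
  <password v='${password}'/>
</config>
EOF
}
```

For example, render_fah_config alice 256122 "$PASSKEY" full Passw0rd true > config.xml writes the result to a local file, which you can review, copy to /etc/fahclient/config.xml, and then activate by restarting the client.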

FAHControl lets us control both our local FAHClient and remote deployments. To add a remote FAHClient, you only need the IP address of the server or device and the FAHClient password set in config.xml (or passed to our script with the -n argument).

Point system

After finishing a work unit (WU), you are awarded a specific number of points. Points are determined by the performance of each contributor’s folding hardware (CPU, GPU, etc.) relative to a reference benchmark machine (more info here).

This kind of simulation can also utilize the parallel architecture of modern GPUs. For certain types of calculations, a GPU provides 20-30 times more computing power than a CPU. In addition, the FAHClient is quite intelligent: when a laptop is running on battery power, the client is stopped automatically.

To verify that your deployment succeeded, find your team here and watch your stats increase: https://stats.foldingathome.org/team/

Profiq team stats can be found here – https://stats.foldingathome.org/team/256122

