Automating Folding@home Setup: Expediting Distributed Computing Research
Posted 3 years ago by Petr Bednar
For nearly two decades, volunteer distributed computing projects have helped advance scientific research and understanding of infections and hereditary diseases. Scientists need to understand how diseases work to formulate therapies to fight them. One key way to arrive at this understanding is to create models and run multitudes of molecular dynamics simulations.
These simulations are huge and resource-intensive. Even with the help of big cloud providers and onsite clusters, scientists still need more computing resources to power them. Using client-server systems like BOINC, volunteers run pieces of a simulation, called work units, on otherwise idle CPU, GPU, and gaming system resources, then upload the results to the server.
We recently joined more than 700,000 computer owners who have donated their unused computing resources to one established and promising distributed computing project, Folding@home (FAH or F@h). While Folding@home’s initial focus was on protein folding, it has pivoted to biomedical problems, such as Alzheimer’s disease and cancer, and now, COVID-19. We were curious about Folding@home’s distributed computing model, and decided to take a deeper look at the technology.
We decided to contribute to Folding@home with our headless server infrastructure. It wasn’t completely trivial, so we wrote this post to help those who want to contribute their own servers at scale. The set of automated scripts described below can simplify the job.
We fully automated the client installation process, because the official installation requires a few interactive steps, like setting a username, team number, etc. Doing this by hand on more than one server isn’t difficult, but it takes more time. Our goal was to simplify deployment for Debian/Mint/Ubuntu.
You can choose a username, and optionally, join a team or “fold” anonymously. To create a username and get a passkey, you need to add your name and email here. Folding@home will send a passkey to your email. The passkey is unique to you and prevents other users from using your username. The benefit of having a username is that you can check your stats and compare your performance to others. You can also create your own team by completing a simple form here.
First, we need to avoid interactive actions. The best way to do this is with a Bash script that takes the configuration values as arguments:
-u | --user = user name
-t | --team = team number (default: 0)
-p | --passkey = user passkey
-l | --power = folding power (default: medium)
-n | --password = password for remote control
-g | --gpu = use GPU (default: false)
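As an illustration, the option list above could be parsed along the following lines. This is a minimal sketch and not the published installer: the function name parse_args, the FAH_* variable names, and the "Anonymous" placeholder are our assumptions for the example.

```shell
# Hypothetical sketch of option parsing; not the published installer.

# Defaults mirror the option list above.
FAH_USER="Anonymous"   # assumption: placeholder name for anonymous folding
FAH_TEAM="0"
FAH_PASSKEY=""
FAH_POWER="medium"
FAH_PASSWORD=""
FAH_GPU="false"

parse_args() {
  while [ "$#" -gt 0 ]; do
    # First pass: handle the flag without a value and validate the rest.
    case "$1" in
      -g|--gpu)
        FAH_GPU="true"
        shift
        continue
        ;;
      -u|--user|-t|--team|-p|--passkey|-l|--power|-n|--password)
        if [ "$#" -lt 2 ]; then
          echo "Missing value for $1" >&2
          return 1
        fi
        ;;
      *)
        echo "Unknown option: $1" >&2
        return 1
        ;;
    esac
    # Second pass: store the value that follows the option.
    case "$1" in
      -u|--user)     FAH_USER="$2" ;;
      -t|--team)     FAH_TEAM="$2" ;;
      -p|--passkey)  FAH_PASSKEY="$2" ;;
      -l|--power)    FAH_POWER="$2" ;;
      -n|--password) FAH_PASSWORD="$2" ;;
    esac
    shift 2
  done
}
```

A script built this way would call parse_args "$@" once at the top and then read the FAH_* variables. Checking the argument count before shift 2 matters: without it, an option given without its value would leave the loop stuck on the same argument.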
We used the official deb package version 7.4.4 (fahclient_7.4.4_amd64.deb) available on the Folding@home website.
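One common way to install a deb package without prompts is to set DEBIAN_FRONTEND=noninteractive, which tells debconf to accept the package defaults instead of asking questions. The sketch below assumes the package has already been downloaded and that its prompts go through debconf; the helper names deb_filename and install_fahclient are hypothetical, and the published script may handle this step differently.

```shell
# Hypothetical helpers; the published script may do this differently.

# Build the package file name from a version and an architecture,
# e.g. 7.4.4 and amd64.
deb_filename() {
  printf 'fahclient_%s_%s.deb\n' "$1" "$2"
}

# Install a previously downloaded package without interactive prompts.
# Must be run as root. Assumes the package asks its questions via debconf.
install_fahclient() {
  deb_path="$1"
  if [ ! -f "$deb_path" ]; then
    echo "Package not found: $deb_path" >&2
    return 1
  fi
  DEBIAN_FRONTEND=noninteractive dpkg -i "$deb_path"
}
```

For example, install_fahclient "./$(deb_filename 7.4.4 amd64)" would install the package used in this post from the current directory.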
The full installation script is available here: https://gitlab.com/snippets/1963653/raw
Example run script with arguments:
curl -s https://gitlab.com/snippets/1963653/raw | sudo bash -s -- -u YOUR_USERNAME -t TEAM_NUMBER -p YOUR_PASSKEY --power full -n Passw0rd --gpu
Example run script as an anonymous user and without a GPU:
curl -s https://gitlab.com/snippets/1963653/raw | sudo bash
To change client settings, edit the /etc/fahclient/config.xml file, then restart the client:
sudo /etc/init.d/FAHClient stop
sudo /etc/init.d/FAHClient start
or
sudo /etc/init.d/FAHClient restart
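For orientation, here is a sketch of how a config.xml matching the script options could be generated. The element names follow the version 7 client's <name v='value'/> convention as we understand it and should be treated as an assumption; compare the output with a config.xml written by your own client before relying on it. The function name render_config is hypothetical.

```shell
# Hypothetical sketch: print a config.xml for the given settings.
# Element names are an assumption based on the v7 client's
# <name v='value'/> convention; verify against a generated config.xml.
# Values must not contain single quotes, which would break the XML.
render_config() {
  user="$1"
  team="$2"
  passkey="$3"
  power="$4"
  password="$5"
  gpu="$6"
  cat <<EOF
<config>
  <user v='${user}'/>
  <team v='${team}'/>
  <passkey v='${passkey}'/>
  <power v='${power}'/>
  <gpu v='${gpu}'/>
  <password v='${password}'/>
</config>
EOF
}
```

The function only prints to standard output, so the result can be reviewed first. Back up the existing /etc/fahclient/config.xml before replacing it, then restart the client so the new settings take effect.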
FAHControl lets us manage both our local FAHClient and remote deployments. To add a remote FAHClient, you only need the IP address of the server or device and the FAHClient password, which is set in config.xml (or through the -n argument of our script).
After finishing a Work Unit (WU), you are awarded a specific amount of points. Points are determined by the performance of each contributor’s folding hardware (CPU, GPU, etc.) relative to a reference benchmark machine (more info here).
This kind of simulation can also take advantage of the parallel architecture of modern GPUs. For certain types of calculations, a GPU provides 20-30 times more computing power than a CPU. In addition, the FAHClient is quite intelligent: when it runs on a laptop that is on battery power, the client stops automatically.
To verify that your deployment succeeded, find your team here and watch your stats grow: https://stats.foldingathome.org/team/
Profiq team stats can be found here – https://stats.foldingathome.org/team/256122