By Peter Unmack
First you have to establish an account. You can register here.
From memory, after you first set up an account you need to create a directory for your files. Then click on data, click on upload data, and add a name for the data file. This name is how the file gets listed; it can be different from the actual file name, but I keep it the same to avoid confusion. Upload your files; these are usually your data files plus any accessory files, such as partition details for RAxML. (I haven't found the multiple upload option to be much good the couple of times I've tried it.) Don't worry about any of the drop down boxes. Then hit save.
Go to tasks and click on new task the first time. For subsequent tasks I just clone an existing one and modify it, so that you only have to change a couple of settings. Name your task, then click somewhere on the page; that reloads the page and saves your task name (check that it is what you want before you hit save task and run, in case it didn't save). Select your data file, then select the analysis (RAxML, but not the blackbox version), then set the parameters.
Each option under parameters is linked to a help file, which most of the time is pretty straightforward about what you need to do and what the options are.
Max hours run sets how long the run will go before the system forces it to stop. The longer you set this, the longer the run takes to make it into the queue. 0.5 hours goes in almost straight away, 2 hours goes in relatively soon, and longer runs will often have some delay depending on how busy the server is. Most small and moderate sized datasets (e.g., 1-3 genes, <200 OTUs) will be done in under 5 minutes (for 1000 reps).
Most of the settings can be left as per the default values provided.
Enter Outgroup (one or more comma-separated outgroups, but no spaces), e.g. name1,name2,name3.
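If you keep your outgroup names in a list, a small helper can build the string for this box and catch stray spaces before you paste it in. This is only a sketch: the function name format_outgroups is made up for illustration and is not part of CIPRES or RAxML.

```python
def format_outgroups(names):
    """Join taxon names into one comma-separated string with no spaces.

    Hypothetical helper for illustration; not part of CIPRES or RAxML.
    """
    cleaned = [name.strip() for name in names]
    for name in cleaned:
        if not name:
            raise ValueError("empty outgroup name")
        if "," in name or any(ch.isspace() for ch in name):
            raise ValueError(f"outgroup name {name!r} contains a comma or whitespace")
    return ",".join(cleaned)


print(format_outgroups(["name1", " name2", "name3 "]))  # prints name1,name2,name3
```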
If you wish to use a constraint tree or mixed/partitioned model then select the file under that option (you have to have already uploaded it as a plain text file).
This is a simple example of the file you'd need to upload to run a model partitioned by codon position for a single gene with 1140 bp (the gene1codon1 label can be whatever you like; the rest has to be identical):
DNA, gene1codon1 = 1-1140\3
DNA, gene1codon2 = 2-1140\3
DNA, gene1codon3 = 3-1140\3
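If you'd rather not type these lines by hand, they are easy to generate. The sketch below reproduces the three lines above for a single gene; the function name codon_partitions and its arguments are made up for illustration, and it assumes the gene starts at codon position 1.

```python
def codon_partitions(gene, length, start=1):
    """Return one RAxML partition line per codon position for a single gene.

    Assumes the alignment column `start` is codon position 1 of the gene.
    """
    end = start + length - 1
    return [
        f"DNA, {gene}codon{pos} = {start + pos - 1}-{end}\\3"
        for pos in (1, 2, 3)
    ]


# Prints the same three lines as the 1140 bp example above.
print("\n".join(codon_partitions("gene1", 1140)))
```

Save the printed lines as a plain text file and upload that alongside your data file.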
This is a more complicated one for when codon positions aren't sequential (positions 1+2 are a single partition and position 3 is a separate partition):
DNA, gene1codon1 = 1-2183\3, 2-2183\3, 2184-2866\3, 2185-2866\3, 2867-2888\3, 2868-2888\3, 2889-3178\3, 2890-3178\3, 3179-4559\3, 3180-4559\3, 4560-6961\3, 4561-6961\3, 6962-8057\3, 6963-8057\3
DNA, gene1codon2 = 3-2183\3, 2186-2866\3, 2869-2888\3, 2891-3178\3, 3181-4559\3, 4562-6961\3, 6964-8057\3
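Long partition lines like these are easy to get wrong by hand, so it can help to generate them from the start and end of each gene. The sketch below rebuilds the two lines above from the seven segment boundaries that appear in them. The function name partition_12_vs_3 is made up for illustration, and it assumes every segment starts at codon position 1.

```python
def partition_12_vs_3(segments, label="gene1"):
    """Build two partition lines: codon positions 1+2 together, position 3 alone.

    `segments` is a list of (start, end) alignment columns, one per gene.
    Assumes each segment begins at codon position 1.
    """
    pos12, pos3 = [], []
    for start, end in segments:
        pos12.append(f"{start}-{end}\\3")      # codon position 1
        pos12.append(f"{start + 1}-{end}\\3")  # codon position 2
        pos3.append(f"{start + 2}-{end}\\3")   # codon position 3
    return [
        f"DNA, {label}codon1 = " + ", ".join(pos12),
        f"DNA, {label}codon2 = " + ", ".join(pos3),
    ]


# Segment boundaries read off the example above.
SEGMENTS = [(1, 2183), (2184, 2866), (2867, 2888), (2889, 3178),
            (3179, 4559), (4560, 6961), (6962, 8057)]

print("\n".join(partition_12_vs_3(SEGMENTS)))
```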
Hit Advanced Parameters if it isn't already shown.
Select the model to run. GTRCAT is better for larger datasets and is not recommended for anything with fewer than 50 OTUs. I've never seen any difference between the two models when I've tried both, but with ~50 or fewer OTUs I always use GTRGAMMA.
Under Configure Bootstrapping
The only thing I change here is to let RAxML halt bootstrapping automatically (you have to unset "Specify an Explicit Number of Bootstraps" first), and I use the default options suggested by RAxML. If you would rather set the number of bootstraps yourself, enter the value in the box above.
Change the value under "Enter a random seed value for rapid bootstrapping" for each run with the same dataset.
Hit Save Parameters
Hit Save & Run Task
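For orientation, the settings above correspond roughly to a standard RAxML 8 command line. The sketch below assembles one from the same choices and picks a fresh random seed when you don't supply one. Treat it as an illustration only: the flags are standard RAxML 8 options, but the executable name raxmlHPC, the file names, the use of one seed for both -p and -x, and the autoMRE stopping criterion are all assumptions, and the command CIPRES actually runs (recorded in stdout.txt) may differ.

```python
import random


def raxml_command(alignment, run_name, model="GTRGAMMA", outgroups=None,
                  partition_file=None, bootstraps="autoMRE", seed=None):
    """Assemble an illustrative RAxML 8 rapid-bootstrap command as a list.

    Hypothetical helper: nothing is executed, the list is only returned.
    """
    if seed is None:
        # New seed for every run; the 1..99999 range is an arbitrary choice.
        seed = random.randint(1, 99999)
    cmd = [
        "raxmlHPC",              # assumed executable name
        "-f", "a",               # rapid bootstrapping plus best ML tree search
        "-m", model,             # GTRGAMMA or GTRCAT
        "-s", alignment,         # data file
        "-n", run_name,          # suffix for the output files
        "-p", str(seed),         # parsimony random seed (same seed assumed)
        "-x", str(seed),         # rapid bootstrap random seed
        "-N", str(bootstraps),   # a number, or autoMRE to halt automatically
    ]
    if outgroups:
        cmd += ["-o", ",".join(outgroups)]  # comma-separated, no spaces
    if partition_file:
        cmd += ["-q", partition_file]       # partition file as shown earlier
    return cmd


print(" ".join(raxml_command("data.phy", "run1", outgroups=["name1", "name2"],
                             partition_file="partitions.txt")))
```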
When your task is finished (you get an email when it completes), go to the task list and hit view output. You can download either individual files or the whole thing. I usually grab stdout.txt, as that has all the details of the run, including the command, RAxML version number, likelihood score (right near the end of the file), how long the run took, etc. I also grab RAxML_bipartitions.result, as this has the best ML tree with the bootstrap values on it. I add .nwk as the file extension and then open it with MEGA (either double click it, or within MEGA go User tree>Display Newick Trees). I don't grab any of the other files, though it might be good to check stderr.txt for errors (it should be 0 kb if there are none).
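If you download the whole output folder, the checks above are easy to script. The sketch below warns when stderr.txt is not empty, does a rough sanity check on the tree file, and saves a copy with .nwk added. The function name collect_results and the folder layout are assumptions for illustration; the Newick check (balanced brackets and a closing semicolon) is a quick sanity test, not a full parser.

```python
import shutil
from pathlib import Path


def collect_results(output_dir):
    """Check a downloaded CIPRES output folder and copy the tree to a .nwk file.

    Hypothetical helper; assumes the files sit directly in `output_dir`.
    """
    out = Path(output_dir)

    # stderr.txt should be 0 kb when the run had no errors.
    stderr = out / "stderr.txt"
    if stderr.exists() and stderr.stat().st_size > 0:
        print(f"warning: {stderr} is not empty, check it for errors")

    tree = out / "RAxML_bipartitions.result"
    newick = tree.read_text().strip()
    if not newick.endswith(";") or newick.count("(") != newick.count(")"):
        raise ValueError(f"{tree} does not look like a complete Newick tree")

    # Keep the original file and write a copy that MEGA will open.
    target = tree.with_name(tree.name + ".nwk")
    shutil.copyfile(tree, target)
    return target
```

Calling collect_results("my_run_output") on a folder named my_run_output (a made-up name) would return the path to RAxML_bipartitions.result.nwk in that folder.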
Don't forget to cite the CIPRES resource in your paper, as well as RAxML!
Miller, M.A., Pfeiffer, W., and Schwartz, T. (2010) Creating the CIPRES Science Gateway for inference of large phylogenetic trees. In: Proceedings of the Gateway Computing Environments Workshop (GCE), 14 Nov. 2010, New Orleans, LA, pp. 1-8.
Stamatakis, A. (2014) RAxML Version 8: A tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics, 30, 1312–1313. https://academic.oup.com/bioinformatics/article-lookup/doi/10.1093/bioinformatics/btu033