Project preparation
To create, set up and run a typical ARIA project, the following steps are required:
Data conversion
ARIA's data-conversion routines read all data sources from a single XML file, "conversion.xml". To create an empty conversion template, use the following command:
aria2 --convert -t conversion.xml
"-t" means "template". This command creates a new text file, "conversion.xml", formatted in XML. It is a template containing all the essential information needed for data conversion.
The next step is to describe your data, for instance the data format. ARIA currently supports the following formats: ANSIG, NMRVIEW, XEASY. The proton and heteronucleus dimensions of the spectra must also be specified, as well as the location of the spectra. This is done by filling in the corresponding fields of the newly created "conversion.xml" file, using your favorite text editor.
Once you have saved the completed "conversion.xml" file, run ARIA again to perform the actual conversion:
aria2 --convert conversion.xml
This command first loads and parses the conversion file. It then reads in your data files (i.e. spectra, shift lists, sequence file), converts them into XML and stores them. The new XML data files can then be edited using an XML editor or a text editor.
In version 2.1, this conversion step can be bypassed by using a CCPN project directly. With this functionality, you select in the CCPN project the set of data you would like to import into ARIA. Conversely, the data generated by ARIA (restraints, coordinates) can be saved into a CCPN project. The CCPN import and export tools are available only from the graphical user interface.
The FormatConverter tool from CCPN can also be used to convert data files to the ARIA XML format.
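Since the conversion starts by parsing "conversion.xml", a malformed file stops it before any data is read. The following is a minimal sketch of a pre-check that can be run after editing the file by hand. The function names is_well_formed and convert are hypothetical helpers for this illustration and not part of ARIA; the only ARIA call used is the command shown above, and the sketch assumes that aria2 is on your PATH.

```python
import subprocess
import xml.etree.ElementTree as ET
from pathlib import Path


def is_well_formed(xml_bytes):
    """Return True if xml_bytes parses as XML (well-formedness only, no ARIA-specific checks)."""
    try:
        ET.fromstring(xml_bytes)
    except ET.ParseError:
        return False
    return True


def convert(conversion_xml="conversion.xml"):
    """Run 'aria2 --convert <file>' after checking that the file is well-formed XML.

    Hypothetical wrapper: raises FileNotFoundError if the file or the aria2
    command cannot be found, and CalledProcessError if aria2 exits with an error.
    """
    if not is_well_formed(Path(conversion_xml).read_bytes()):
        raise ValueError("%s is not well-formed XML" % conversion_xml)
    return subprocess.run(["aria2", "--convert", conversion_xml], check=True)
```

The check only catches syntax problems such as an unclosed tag; whether the fields contain sensible values is still verified by ARIA itself.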
aria2 --convert_ccpn conversion.xml
The project XML file
The complete definition of a project, i.e. data-sources, parameters for the minimization protocol, parameters for the other sub-modules of ARIA etc., is encapsulated in the project's XML-file ("project-xml").
If a project filename has been specified in the conversion XML file, a project XML file is generated during data conversion. In that case, the project-xml already references the converted data. Otherwise, you can create an (empty) project template XML file by invoking the command:
aria2 --project_template new_project.xml
The information about the NMR data can then be entered by hand, using a text editor, in the fields referencing your data. Alternatively, you can start the graphical user interface (GUI).
Additional information to put in the project xml file
It is assumed here that the project XML file was generated using the conversion utility of ARIA, and that the information about the input data is therefore correct. Also, the information given here concerns only a run with generic parameters, without specific options (spin-diffusion correction, network anchoring, symmetric homodimers). Descriptions of the input required by these options are given in the Tutorials folder.
If you decide to enter the information into the project XML file by hand, the fields that must be filled in first are described here.
A first set of information is located in the element project:
<project name="werner" version="1.0" author="terez" date="13 octobre 2005" description="test project"
comment="" references="" working_directory="./"
temp_root="tmp_dir/" run="1" file_root="hrdc" cache="yes"
cleanup="yes">
working_directory: Defines the root directory of your project. Every run is then stored in a separate sub-directory [working_directory]/runxxx.
temp_root: For every run, ARIA creates a directory [temp_root]/aria_temp.xxxxx to store all temporary (large) files (e.g. CNS output). If omitted, the current directory is used. When calculating on multiple machines, the temporary directory *must* be accessible from all machines.
run: Specifies the current run of a project. Every run is stored in a separate directory with the project directory as its root, i.e. [working_directory]/runxxx.
file_root: The file_root is used to build filenames. E.g., if the file_root is set to "my_structure", the structure PDB-files are called my_structure_1.pdb etc.
cache: If enabled, ARIA caches all XML data files (i.e. molecule definition and spectra). If a run is executed more than once, ARIA accesses the cache instead of reading the complete XML files; this considerably speeds up the reading process. If the original data (i.e. those stored in the run's local data directory [working_directory]/runxxx/data) are modified, the cache is automatically invalidated so that the data are read again from their original XML files.
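When the project XML file is edited by hand, it is easy to leave one of these attributes empty. The following is a minimal sketch that reads the attributes described above with Python's standard XML parser. It assumes that "project" is the root element of the file; the helper names and the choice to treat every attribute except temp_root as required are assumptions of this sketch, not ARIA's own validation.

```python
import xml.etree.ElementTree as ET

# The <project> element shown above, closed here so that it parses on its own.
SAMPLE = """<project name="werner" version="1.0" author="terez" date="13 octobre 2005" description="test project"
comment="" references="" working_directory="./"
temp_root="tmp_dir/" run="1" file_root="hrdc" cache="yes"
cleanup="yes"></project>"""

# Attributes described in this section.
DESCRIBED = ("working_directory", "temp_root", "run", "file_root", "cache")
# temp_root may be omitted: the current directory is then used.
OPTIONAL = {"temp_root"}


def project_settings(xml_text):
    """Return the described attributes of <project>; missing ones map to None."""
    root = ET.fromstring(xml_text)
    if root.tag != "project":
        raise ValueError("expected a <project> root element, got <%s>" % root.tag)
    return {name: root.get(name) for name in DESCRIBED}


def missing_settings(settings):
    """List non-optional attributes that are absent or empty (assumption of this sketch)."""
    return [name for name, value in settings.items()
            if name not in OPTIONAL and not value]


if __name__ == "__main__":
    settings = project_settings(SAMPLE)
    print(settings)
    print("still to fill in:", missing_settings(settings))
```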
In the element 'cns':
<cns local_executable="/bin/cns_solve"
keep_output="yes" keep_restraint_files="yes" create_psf_file="yes" generate_template="yes"
nonbonded_parameters="PROLSQ">
local_executable: The complete path of the CNS binary used for the CNS set-up (generation of PSF files and of extended structures).
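A wrong path in local_executable is a common reason for a failing set-up. As a hedged illustration, the sketch below checks that the attribute holds a complete path to an existing, executable file. The function name cns_executable_problem is hypothetical, and the check is not something ARIA is known to perform in this form.

```python
import os
import xml.etree.ElementTree as ET


def cns_executable_problem(cns_xml):
    """Return None if local_executable looks usable, otherwise a short message.

    cns_xml is the text of a self-contained <cns .../> element.
    """
    element = ET.fromstring(cns_xml)
    path = element.get("local_executable")
    if not path:
        return "local_executable is not set"
    if not os.path.isabs(path):
        return "local_executable is not a complete (absolute) path: %s" % path
    if not os.path.isfile(path):
        return "no such file: %s" % path
    if not os.access(path, os.X_OK):
        return "file is not executable: %s" % path
    return None


if __name__ == "__main__":
    # Path taken from the example above; adapt it to your own CNS installation.
    print(cns_executable_problem('<cns local_executable="/bin/cns_solve"/>'))
```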
The elements 'job_manager' and 'host' contain the parameters related to the generation of conformers on several computers:
<job_manager default_command="csh -f">
<host enabled="yes"
command="qsub -@ sge_options.sge -q queue_name"
executable="/bin/cns_solve"
n_cpu="2" use_absolute_path="yes"/>
<host enabled="yes"
command="ssh hostname csh"
executable="/bin/cns_solve"
n_cpu="2" use_absolute_path="yes"/>
</job_manager>
default_command: The job manager's default command for job dispatch. The default (and recommended) value is "csh -f".
enabled: Option to enable a given host.
command: Command to launch a job on the given host. Two possibilities are shown in the example above: using the batch system SGE, or a simple ssh command to the host. In the first case, no hostname should be given, as SGE distributes the CPU time among the users.
Note for qsub: the "--no-test" option is required when using qsub.
executable: The absolute path of the CNS binary on the given host. WARNING: if you are using qsub or another batch system that automatically distributes the CPU time among the users, you do not give any hostname, so the absolute path must be the same on all the computers you will use. Two ways to achieve this are: (i) install CNS on a disk exported to all computers, (ii) have exactly the same CNS installation on each computer.
If all calculations are run on the local computer, the syntax is:
<job_manager default_command="csh -f"/>
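To see at a glance how many processors a job_manager block makes available, the element can be read with a few lines of Python. This is only a sketch: the helper names are hypothetical, hosts are counted only when enabled="yes", and the fallback of one CPU for a host without n_cpu is an assumption of this sketch, not documented ARIA behaviour.

```python
import xml.etree.ElementTree as ET

# The job_manager example from above, with a placeholder hostname.
SAMPLE = """<job_manager default_command="csh -f">
<host enabled="yes"
command="qsub -@ sge_options.sge -q queue_name"
executable="/bin/cns_solve"
n_cpu="2" use_absolute_path="yes"/>
<host enabled="yes"
command="ssh hostname csh"
executable="/bin/cns_solve"
n_cpu="2" use_absolute_path="yes"/>
</job_manager>"""


def enabled_hosts(job_manager_xml):
    """Return one dict per <host> whose enabled attribute is "yes"."""
    root = ET.fromstring(job_manager_xml)
    hosts = []
    for host in root.findall("host"):
        if host.get("enabled") != "yes":
            continue
        hosts.append({
            "command": host.get("command"),
            "executable": host.get("executable"),
            # Assumption of this sketch: a missing n_cpu counts as one CPU.
            "n_cpu": int(host.get("n_cpu", "1")),
        })
    return hosts


def total_cpus(hosts):
    """Sum n_cpu over the given hosts."""
    return sum(host["n_cpu"] for host in hosts)


if __name__ == "__main__":
    hosts = enabled_hosts(SAMPLE)
    for host in hosts:
        print(host["command"], "->", host["executable"], "x", host["n_cpu"])
    print("CPUs available:", total_cpus(hosts))
```

For the local-only form, which contains no host element, enabled_hosts returns an empty list.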
How to start the graphical user interface (GUI)
You may use ARIA's graphical user interface (GUI), an XML editor or a text editor to display and edit your project. However, ARIA's GUI is intended to streamline the project set-up and also provides brief descriptions and help for most of the parameter settings. To start the GUI with the project "new_project.xml", use the command:
aria2 --gui new_project.xml
or
aria2 -g new_project.xml
If the project file is omitted, the GUI starts without loading any project.
Project setup
Once the project-xml file "new_project.xml" is complete, run ARIA to set up the project:
aria2 -s new_project.xml
This command reads in the XML-file "new_project.xml" and checks it for errors. If the project has been loaded successfully, the following actions are performed:
- Directory-tree creation: creates the full directory tree needed by ARIA and CNS.
- Data setup: copies all (XML) data files from their source locations into the local "data" directory of the directory tree.
- CNS setup: copies the CNS-specific files from their source location (ARIA's installation path) into the local directory "cns/...".
Details on the directory tree can be found on the page "The project directory tree".
If the project has already been set up, running the setup again skips all existing files. To enforce overwriting of existing files, use the option -f:
aria2 -sf new_project.xml
Forced setup overwrites/updates the following files:
- Data files in the project's local data-directory.
- CNS-specific files.
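The difference between a repeated setup and a forced setup can be pictured as a copy step that skips existing targets unless told otherwise. The sketch below illustrates that behaviour only; copy_data_files is a hypothetical function, "run1/data" is a made-up directory name, and none of this is ARIA's actual implementation.

```python
import shutil
import tempfile
from pathlib import Path


def copy_data_files(sources, data_dir, force=False):
    """Copy files into data_dir; existing targets are skipped unless force is set.

    Returns two lists of file names: (copied, skipped).
    """
    data_dir = Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)
    copied, skipped = [], []
    for source in map(Path, sources):
        target = data_dir / source.name
        if target.exists() and not force:
            skipped.append(target.name)
            continue
        shutil.copy2(source, target)
        copied.append(target.name)
    return copied, skipped


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "spectrum.xml"
        source.write_text("<spectrum/>")
        data = Path(tmp) / "run1" / "data"
        print(copy_data_files([source], data))              # first setup: the file is copied
        print(copy_data_files([source], data))              # repeated setup: the file is skipped
        print(copy_data_files([source], data, force=True))  # forced setup: the file is overwritten
```

A consequence of the skipping behaviour is that source files changed after the first setup do not reach the project's local data directory until the setup is forced.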