Non-standard residue definitions
Generalities
To handle non-standard atoms and residues such as ions or ligands, the user has to modify several of the CNS-specific scripts and definition files. The PSF file is usually generated by the CNS script "generate.inp". To set up the PSF file within the ARIA run, this script has to be patched manually and copied to the cns/protocols directory of the run's local directory tree.
Instead of modifying the "generate.inp" script, one can copy the PSF file and the "_template" PDB file to the "cns/begin" directory and disable the generation of these files via the GUI (node "CNS" in the "Structure Generation" sub-tree of the project tree). The definition files for methyl groups ("methyls.tbl") and prochiral groups ("setup_swap_list.tbl"), located in the same directory, should also be checked manually.
Additionally, the topology, parameter and linkage files have to be modified. Make sure that you define the correct bonds if you want to use torsion angle dynamics in the structure generation. User-specific files can be set via the GUI in the "Sequence" panel.
The Hetero-compound Information Centre might assist you in creating the necessary patches and modifications in the definition files.
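The alternative of placing a hand-made PSF file and "_template" PDB file in "cns/begin" can be scripted. The following is only a minimal sketch: the function name and the example file names are hypothetical, and it assumes that the files already carry the names ARIA expects for the run. Disabling the file generation in the GUI still has to be done separately.

```python
import shutil
from pathlib import Path


def stage_prebuilt_files(run_dir, psf_file, template_pdb):
    """Copy a hand-made PSF and template PDB file into <run_dir>/cns/begin.

    Hypothetical helper: it does not rename anything, so both files must
    already be named the way the ARIA run expects (assumption).
    """
    begin_dir = Path(run_dir) / "cns" / "begin"
    begin_dir.mkdir(parents=True, exist_ok=True)

    copied = []
    for source in (psf_file, template_pdb):
        source = Path(source)
        if not source.is_file():
            raise FileNotFoundError(f"missing input file: {source}")
        # copy2 keeps the modification time, which helps when comparing runs.
        copied.append(Path(shutil.copy2(source, begin_dir / source.name)))
    return copied
```

A call such as stage_prebuilt_files("run1", "peptide.psf", "peptide_template.pdb") would leave both files in run1/cns/begin; the paths here are placeholders.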
An example of introducing modified residues
To help you set up your own project, here is the description of a project on a peptide containing modified residues. This example was provided by Hélène Déméné and Gaetan Bellot, from the Centre de Biochimie Structurale (Montpellier, France). The full example is described in the Tutorials folder; only the data related to the chemical modifications are presented on this page.
The example is the study of a cyclic peptide interacting with G protein-coupled receptors (GPCRs); more details about the corresponding research can be found on the page of Hélène Déméné.
The peptide sequence, shown below, contains four modified amino acids: GLX (D-glutamine), NLE (norleucine), GLI (glycine before cyclisation) and CYC (cyclised cysteine).
GLI | GLX | VAL | LEU | ILE | PHE | ARG | GLU | ILE | HIS | ALA | SER | LEU | VAL | PRO | GLY | PRO | SER | GLU | ALA |
GLY | ARG | ARG | ARG | ARG | GLY | ARG | ARG | THR | GLY | SER | PRO | SER | GLU | GLY | ALA | HIS | VAL | SER | ALA |
ALA | NLE | ALA | LYS | THR | VAL | ARG | NLE | THR | CYC |
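Before editing any definition file, it can help to list which residue names in the sequence are not among the 20 standard amino acids, and where they occur. The sketch below is an illustration only; the names STANDARD_RESIDUES, SEQUENCE and find_modified_residues are chosen for this example and are not part of ARIA.

```python
# The 20 standard amino acids in three-letter code.
STANDARD_RESIDUES = {
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
}

# The peptide sequence of this example, copied from the table above.
SEQUENCE = """
GLI GLX VAL LEU ILE PHE ARG GLU ILE HIS ALA SER LEU VAL PRO GLY PRO SER GLU ALA
GLY ARG ARG ARG ARG GLY ARG ARG THR GLY SER PRO SER GLU GLY ALA HIS VAL SER ALA
ALA NLE ALA LYS THR VAL ARG NLE THR CYC
""".split()


def find_modified_residues(sequence, standard=STANDARD_RESIDUES):
    """Map each non-standard residue name to its 1-based sequence positions."""
    positions = {}
    for index, name in enumerate(sequence, start=1):
        if name not in standard:
            positions.setdefault(name, []).append(index)
    return positions


if __name__ == "__main__":
    for name, where in find_modified_residues(SEQUENCE).items():
        print(name, where)
```

For this peptide the four modified residues are found at positions 1 (GLI), 2 (GLX), 42 and 48 (NLE) and 50 (CYC).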
The following files have to be modified:
- CNS files contained in the cns/toppar directory (in the ARIA distribution as well as in the run):
  - topology libraries: topallhdg5.3.pro and topallhdg5.3.pep
  - parameter files parall*: these may have to be modified if new atom types or new potential terms are introduced.
- ARIA files:
  - in the directory src/py/data of the ARIA distribution: atomnames.xml and iupac.xml
  - in the directory src/py/legacy of the ARIA distribution: PseudoAtom.py, Nomenclature.py and AminoAcid.py
CNS Files
In the file topallhdg5.3.pro, the residues GLX, NLE, GLI and CYC were added:
Residue GLX
residue GLX
group
atom N type=NH1 charge=-0.36 end
atom HN type=H charge= 0.26 end
atom CA type=CH1E charge= 0.00 end
atom HA type=HA charge= 0.10 end
atom CB type=CH2E charge=-0.20 end
atom HB1 type=HA charge= 0.10 end
atom HB2 type=HA charge= 0.10 end
atom CG type=CH2E charge=-0.20 end
atom HG1 type=HA charge= 0.10 end
atom HG2 type=HA charge= 0.10 end
atom CD type=C charge= 0.48 end
atom OE1 type=O charge=-0.48 end
atom NE2 type=NH2 charge=-0.52 end
atom HE21 type=H charge= 0.26 end
atom HE22 type=H charge= 0.26 end
atom C type=C charge= 0.48 end
atom O type=O charge=-0.48 end
bond N HN
bond N CA bond CA HA
bond CA CB bond CB HB1 bond CB HB2
bond CB CG bond CG HG1 bond CG HG2
bond CG CD
bond CD OE1
bond CD NE2 bond NE2 HE21 bond NE2 HE22
bond CA C
bond C O
improper HA C N CB
improper CD CG OE1 NE2
improper NE2 CD HE21 HE22
improper CG CD NE2 HE21
improper HB1 HB2 CA CG
improper HG1 HG2 CB CD
dihedral CG CB CA N
dihedral CD CG CB CA
dihedral OE1 CD CG CB
end
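A simple sanity check on a new residue definition, which is not part of ARIA or CNS, is to sum the partial charges and compare the result with the net charge you intended. The sketch below parses the atom statements of GLX as listed above; the regular expression assumes the "atom NAME type=TYPE charge=VALUE end" layout used on this page, and the function names are hypothetical.

```python
import re

# Atom statements of residue GLX, copied from the definition above.
GLX_ATOMS = """
atom N type=NH1 charge=-0.36 end
atom HN type=H charge= 0.26 end
atom CA type=CH1E charge= 0.00 end
atom HA type=HA charge= 0.10 end
atom CB type=CH2E charge=-0.20 end
atom HB1 type=HA charge= 0.10 end
atom HB2 type=HA charge= 0.10 end
atom CG type=CH2E charge=-0.20 end
atom HG1 type=HA charge= 0.10 end
atom HG2 type=HA charge= 0.10 end
atom CD type=C charge= 0.48 end
atom OE1 type=O charge=-0.48 end
atom NE2 type=NH2 charge=-0.52 end
atom HE21 type=H charge= 0.26 end
atom HE22 type=H charge= 0.26 end
atom C type=C charge= 0.48 end
atom O type=O charge=-0.48 end
"""

# Assumed layout: atom NAME type=TYPE charge=VALUE end (optional space after "=").
ATOM_PATTERN = re.compile(
    r"atom\s+(\S+)\s+type=(\S+)\s+charge=\s*([-+]?\d*\.?\d+)\s+end",
    re.IGNORECASE,
)


def parse_atoms(text):
    """Return a list of (atom name, atom type, partial charge) tuples."""
    return [(name, atom_type, float(charge))
            for name, atom_type, charge in ATOM_PATTERN.findall(text)]


def net_charge(text):
    """Sum the partial charges, rounded to absorb floating-point noise."""
    return round(sum(charge for _, _, charge in parse_atoms(text)), 6)


if __name__ == "__main__":
    print(len(parse_atoms(GLX_ATOMS)), "atoms, net charge", net_charge(GLX_ATOMS))
```

The same functions can be applied to any other residue block by pasting its atom statements into a string.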
Residue NLE
residue NLE
group
atom N type=NH1 charge=-0.360 end
atom HN type=H charge= 0.260 end
atom CA type=CH1E charge= 0.000 end
atom HA type=HA charge= 0.100 end
atom CB type=CH2E charge=-0.200 end
atom HB1 type=HA charge= 0.100 end
atom HB2 type=HA charge= 0.100 end
atom CG type=CH2E charge=-0.200 end
atom HG1 type=HA charge= 0.100 end
atom HG2 type=HA charge= 0.100 end
atom CD type=CH2E charge=-0.200 end
atom HD1 type=HA charge= 0.100 end
atom HD2 type=HA charge= 0.100 end
atom CE type=CH3E charge=-0.200 end
atom HE1 type=HA charge= 0.100 end
atom HE2 type=HA charge= 0.100 end
atom HE3 type=HA charge= 0.100 end
atom C type=C charge= 0.480 end
atom O type=O charge=-0.480 end
bond N HN
bond N CA bond CA HA
bond CA CB bond CB HB1 bond CB HB2
bond CB CG bond CG HG1 bond CG HG2
bond CG CD bond CD HD1 bond CD HD2
bond CD CE bond CE HE1 bond CE HE2 bond CE HE3
bond CA C
bond C O
improper HA N C CB
improper HB1 HB2 CA CG
improper HG1 HG2 CB CD
improper HD1 HD2 CG CE
improper HE1 HE2 CD HE3
dihedral CG CB CA N
dihedral CD CG CB CA
dihedral CE CD CG CB
end
Residue GLI
residue GLI
group
atom N type=NH1 charge=-0.570 end
atom HN type=H charge= 0.370 end
atom CA type=CH2G charge= 0.200 end
atom HA1 type=HA charge= 0.000 end
atom HA2 type=HA charge= 0.000 end
atom C type=C charge= 0.500 end
atom O type=O charge=-0.500 end
bond N HN
bond N CA bond CA HA1 bond CA HA2
bond CA C
bond C O
DONO HN N
ACCE O C
improper HA1 HA2 N C
end
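Because torsion angle dynamics relies on the bonds being defined correctly, it can be worth checking a new residue block for bonds that name undeclared atoms and for atoms that are not connected to the rest of the residue. The following is a rough sketch under the assumption that the block uses the "atom" and "bond" statement layout shown on this page; check_bonds and parse_residue are hypothetical names, and the check does not replace a test run in CNS.

```python
import re
from collections import deque

# Residue GLI, copied from the definition above.
GLI_BLOCK = """
residue GLI
group
atom N type=NH1 charge=-0.570 end
atom HN type=H charge= 0.370 end
atom CA type=CH2G charge= 0.200 end
atom HA1 type=HA charge= 0.000 end
atom HA2 type=HA charge= 0.000 end
atom C type=C charge= 0.500 end
atom O type=O charge=-0.500 end
bond N HN
bond N CA bond CA HA1 bond CA HA2
bond CA C
bond C O
DONO HN N
ACCE O C
improper HA1 HA2 N C
end
"""


def parse_residue(text):
    """Return (atom names in declaration order, list of bonded pairs)."""
    atoms = re.findall(r"\batom\s+(\S+)\s+type=", text, flags=re.IGNORECASE)
    bonds = re.findall(r"\bbond\s+(\S+)\s+(\S+)", text, flags=re.IGNORECASE)
    return atoms, bonds


def check_bonds(text):
    """Return (atoms used in bonds but never declared, atoms not reachable
    from the first declared atom by following bonds)."""
    atoms, bonds = parse_residue(text)
    declared = set(atoms)
    undeclared = {name for bond in bonds for name in bond if name not in declared}

    neighbours = {name: set() for name in atoms}
    for first, second in bonds:
        if first in declared and second in declared:
            neighbours[first].add(second)
            neighbours[second].add(first)

    # Breadth-first search over the bond graph, starting at the first atom.
    reached = set()
    if atoms:
        reached.add(atoms[0])
        queue = deque([atoms[0]])
        while queue:
            current = queue.popleft()
            for other in neighbours[current]:
                if other not in reached:
                    reached.add(other)
                    queue.append(other)
    return undeclared, declared - reached
```

For the GLI block both returned sets are empty. Inter-residue links created by patches, such as a peptide bond or a cyclisation patch, are outside the scope of this per-residue check.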
Residue CYC
residue CYC
group
atom N type=NH1 charge=-0.36 end
atom HN type=H charge= 0.26 end
atom CA type=CH1E charge= 0.00 end
atom HA type=HA charge= 0.10 end
atom CB type=CH2E charge=-0.20 end
atom HB1 type=HA charge= 0.10 end
atom HB2 type=HA charge= 0.10 end
atom CG type=CH2E charge=-0.20 end
atom HG1 type=HA charge= 0.10 end
atom HG2 type=HA charge= 0.10 end
atom SG type=SM charge=-0.14 end
atom CD type=C charge= 0.48 end
atom OE1 type=O charge=-0.48 end
atom NE2 type=NH2 charge=-0.52 end
atom HE21 type=H charge= 0.26 end
atom HE22 type=H charge= 0.26 end
atom OC type=O charge=-0.29 end
atom C type=C charge= 0.48 end
bond N HN
bond N CA bond CA HA
bond CA CB bond CB HB1 bond CB HB2
bond CB SG bond SG CG bond CG HG1 bond CG HG2
bond CG CD bond CD OC
bond C OE1
bond C NE2 bond NE2 HE21 bond NE2 HE22
bond CA C
improper HA N C CB
improper HB1 HB2 CA SG
improper CG SG HG1 HG2
improper C CA OE1 NE2
improper NE2 C HE21 HE22
improper CA C NE2 HE21
angle CB SG CG
angle SG CG CD
dihedral SG CB CA N
dihedral CA CB SG CG
dihedral CB SG CG CD
dihedral SG CB CA C
dihedral SG CG CD OC
dihedral OE1 C CA CB
end
An additional patch residue (PRESidue) PEPC is used for the peptide cyclisation:
PRESidue PEPC
ADD BOND -CD +N
ADD ANGLE -CG -CD +N
ADD ANGLE -OC -CD +N
ADD ANGLE -CD +N +CA
ADD ANGLE -CD +N +HN
ADD DIHEdral -SG -CG -CD +N
ADD DIHEdral -CG -CD +N +CA
ADD IMPRoper -CG -CD +N +HN
ADD IMPRoper -OC -CD +N +CA
ADD IMPRoper -CG -CD +N +CA
end
File topallhdg5.3.pep
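The definition of the residues allowed at the N-terminal and C-terminal positions is modified. GLX and NLE are added to both lists, whereas CYC is declared only for the N-terminal patch (NTER) and GLI only for the C-terminal patch (CTER):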
first PROP tail + PRO end
first NTER tail + ALA end
first NTER tail + ARG end
first NTER tail + ASN end
first NTER tail + ASP end
first NTER tail + CYS end
first NTER tail + GLN end
first NTER tail + GLU end
first NTER tail + GLY end
first NTER tail + HIS end
first NTER tail + ILE end
first NTER tail + LEU end
first NTER tail + LYS end
first NTER tail + MET end
first NTER tail + PHE end
first NTER tail + PRO end
first NTER tail + SER end
first NTER tail + THR end
first NTER tail + TRP end
first NTER tail + TYR end
first NTER tail + VAL end
first NTER tail + GLX end
first NTER tail + NLE end
first NTER tail + CYC end
last CTER head - ALA end
last CTER head - ARG end
last CTER head - ASN end
last CTER head - ASP end
last CTER head - CYS end
last CTER head - GLN end
last CTER head - GLU end
last CTER head - GLY end
last CTER head - HIS end
last CTER head - ILE end
last CTER head - LEU end
last CTER head - LYS end
last CTER head - MET end
last CTER head - PHE end
last CTER head - PRO end
last CTER head - SER end
last CTER head - THR end
last CTER head - TRP end
last CTER head - TYR end
last CTER head - VAL end
last CTER head - GLX end
last CTER head - NLE end
last CTER head - GLI end
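Since the list above is highly repetitive, it can be generated rather than typed by hand. The sketch below reproduces the 47 statements shown above from three short residue lists; the function and variable names are hypothetical, and the statement layout is simply copied from this page.

```python
# Standard residues in the order used in the list above.
STANDARD = [
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
]

# Modified residues of this example, as declared on this page.
NTER_EXTRA = ["GLX", "NLE", "CYC"]  # GLI is deliberately not listed here
CTER_EXTRA = ["GLX", "NLE", "GLI"]  # CYC is deliberately not listed here


def linkage_statements(nter_residues, cter_residues):
    """Build the first/last statements in the layout shown above."""
    lines = ["first PROP tail + PRO end"]
    lines += [f"first NTER tail + {name} end" for name in nter_residues]
    lines += [f"last CTER head - {name} end" for name in cter_residues]
    return lines


if __name__ == "__main__":
    print("\n".join(linkage_statements(STANDARD + NTER_EXTRA,
                                       STANDARD + CTER_EXTRA)))
```

Keeping the extra residues in separate lists makes it easy to see at a glance which modified residues are allowed at which terminus.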
ARIA Files
The following files have to be modified in order to include the modified residues:
- src/py/data/atomnames.xml: atom names of all possible residues.
- src/py/data/iupac.xml: IUPAC atom names.
- src/py/legacy/PseudoAtom.py: definition of pseudo atoms.
- src/py/legacy/Nomenclature.py: rules for converting atom names, and definitions of methylene and methyl groups.
- src/py/legacy/AminoAcid.py: rules for converting residue names (one-letter and three-letter codes).
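With five files to edit, it is easy to forget one of them. A plain text search can at least report which files do not yet mention a given residue name. The sketch below makes no assumption about the internal format of the XML or Python files; it only looks for the residue name as a whole word, so a hit does not prove that the entry is complete. The function name is hypothetical.

```python
import re
from pathlib import Path

# The ARIA files listed above, relative to the root of the ARIA distribution.
ARIA_FILES = [
    "src/py/data/atomnames.xml",
    "src/py/data/iupac.xml",
    "src/py/legacy/PseudoAtom.py",
    "src/py/legacy/Nomenclature.py",
    "src/py/legacy/AminoAcid.py",
]


def files_missing_residue(paths, residue_name):
    """Return the paths whose text never contains residue_name as a whole word."""
    pattern = re.compile(rf"\b{re.escape(residue_name)}\b")
    missing = []
    for path in paths:
        text = Path(path).read_text(errors="replace")
        if not pattern.search(text):
            missing.append(str(path))
    return missing
```

Run from the root of the ARIA distribution, files_missing_residue(ARIA_FILES, "NLE") would list the files that still need an entry for norleucine.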
An example run with the cyclic peptide presented here is available on the tutorial page "Modified residues".