# cttrajectory module

This is where the cttrajectory shenanigans occur.

cttrajectory.py

This is where some stuff will be described

camparitraj.cttrajectory.testfunct()

This is a test function for documentation - do we build from source?

class camparitraj.cttrajectory.CTTrajectory(trajectory_filename=None, pdb_filename=None, TRJ=None, protein_grouping=None, pdblead=False)

CTTrajectory class that holds a single simulation trajectory object.

__init__(trajectory_filename=None, pdb_filename=None, TRJ=None, protein_grouping=None, pdblead=False)

CAMPARITraj trajectory object initializer.

CAMPARITraj will, by default, extract out the protein component from your trajectory automatically, which lets you ask questions about the protein only (i.e. without salt ions getting in the way).

Note that by default the mechanism by which individual proteins are identified is by cycling through the unique chains and determining if they are protein or not. You can also provide manual grouping via the protein_grouping option, which lets you define which residues should make up an individual protein. This can be useful if you have multiple proteins associated with the same chain, which happens in CAMPARI if you have more than 26 separate chains (i.e. every protein after the 26th is the ‘Z’ chain).

Parameters
• trajectory_filename (str) – Filename which contains the trajectory file of interest. Normally this is __traj.xtc or __traj.dcd.

• pdb_filename (str) – Filename which contains the pdb file associated with the trajectory of interest. Normally this is __START.pdb.

• protein_grouping (list of lists of ints, default=None) – Lets you manually define protein groups to be considered independently.

• pdblead (bool, default=False) – Lets you set the PDB file (which is normally ONLY used as a topology file) to be the first frame of the trajectory. This is useful when the first PDB file holds some specific reference information which you want to use (e.g. RMSD or Q).

• TRJ (MDTraj trajectory, default=None) – It is sometimes useful to redefine a trajectory and create a new CTTrajectory object from it. This could be done by writing the new trajectory to file, but that is extremely slow due to the I/O cost of writing to and reading from disk. If an mdtraj trajectory object is passed, it is used as the trajectory from which the CTTrajectory object is constructed.
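The protein_grouping option expects a list of lists of ints, one inner list per protein. As a minimal sketch of how such a list could be assembled when several proteins share a chain, the helper below builds consecutive residue-index groups from the number of residues in each protein. The name build_protein_grouping and the zero-based starting index are assumptions made for illustration; they are not part of CAMPARITraj.

```python
def build_protein_grouping(chain_lengths, first_residue=0):
    """Return consecutive residue-index groups, one list per protein.

    chain_lengths: number of residues in each protein, in file order.
    first_residue: index of the first residue (zero-based is an assumption).
    """
    grouping = []
    start = first_residue
    for length in chain_lengths:
        if length <= 0:
            raise ValueError("each protein needs at least one residue")
        grouping.append(list(range(start, start + length)))
        start += length
    return grouping


# Hypothetical system: three proteins of 3, 2 and 4 residues stored back to back.
grouping = build_protein_grouping([3, 2, 4])
print(grouping)
```

A list of this shape could then be supplied as protein_grouping=grouping when constructing a CTTrajectory. It is worth checking the residue numbering against the PDB file first.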

export_distanceMap(proteinID, filename)

Function which returns two matrices holding the mean and standard deviation of the complete set of inter-residue distances within a single protein molecule.

This explicitly defines the non-redundant map, so you only get a matrix with one half filled in.

Parameters
• proteinID (int) – The ID of the protein being considered, where the ID is the protein’s position in the self.proteinTrajectoryList list.

• filename (str) – Filename which your file should be saved to (.csv extension is added automatically).

Returns

• tuple (tuple containing distanceMap and STDMap)

• distanceMap (numpy matrix) – Is an [n x m] matrix where n and m are the number of proteinID1 residues and proteinID2 residues. Each position in the matrix corresponds to the mean distance between those two residues over the course of the simulation.

• stdMap (numpy matrix) – Is an [n x m] matrix where n and m are the number of proteinID1 residues and proteinID2 residues. Each position in the matrix corresponds to the standard deviation associated with the distances between those two residues.
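The exact layout of the exported file is not described here, so the following is only a sketch of the general idea: a matrix is written row by row to a CSV file, with the .csv extension appended to the name the caller provides. The helper write_matrix_csv and the example values are hypothetical.

```python
import csv
import os
import tempfile


def write_matrix_csv(matrix, filename):
    """Write a 2D list of floats to '<filename>.csv' and return the path."""
    path = filename + ".csv"  # extension appended automatically
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        for row in matrix:
            writer.writerow(["%.4f" % value for value in row])
    return path


# Hypothetical 3 x 3 non-redundant mean-distance map (lower half left as zeros).
mean_map = [
    [0.0, 3.8, 6.1],
    [0.0, 0.0, 3.8],
    [0.0, 0.0, 0.0],
]

out_dir = tempfile.mkdtemp()
csv_path = write_matrix_csv(mean_map, os.path.join(out_dir, "distance_map"))

# Read the file back to confirm the round trip.
with open(csv_path, newline="") as handle:
    rows = [[float(v) for v in row] for row in csv.reader(handle)]
```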

export_intraChainDistanceMap(proteinID1, proteinID2, filename, resID1=None, resID2=None)

Function which writes an intrachain distance map to a CSV file. Note this distanceMap is not returned, but can be generated by the get_intraChainDistanceMap() function.

This computes the (full) intermolecular distance map, whereas the get_distanceMap function computes the intramolecular distance map.

Obviously this only makes sense if your system has two separate protein objects defined, but in principle the output from:

get_intraChainDistanceMap(0,0)

would be the same as:

get_distanceMap(0)

This is actually a useful sanity check!

Parameters
• proteinID1 (int) – The ID of the first protein of the two being considered, where the ID is the protein’s position in the self.proteinTrajectoryList list.

• proteinID2 (int) – The ID of the second protein of the two being considered, where the ID is the protein’s position in the self.proteinTrajectoryList list.

• filename (str) – Filename which your file should be saved to (.csv extension is added automatically).

• resID1 (list of integers, default=None) – Is the list of residues from protein 1 we’re considering. If no option is provided assume we’re using all of the residues in protein 1.

• resID2 (list of integers, default=None) – Is the list of residues from protein 2 we’re considering. If no option is provided assume we’re using all of the residues in protein 2.

Returns

• tuple (tuple containing distanceMap and STDMap)

• distanceMap (numpy matrix) – Is an [n x m] matrix where n and m are the number of proteinID1 residues and proteinID2 residues. Each position in the matrix corresponds to the mean distance between those two residues over the course of the simulation.

• stdMap (numpy matrix) – Is an [n x m] matrix where n and m are the number of proteinID1 residues and proteinID2 residues. Each position in the matrix corresponds to the standard deviation associated with the distances between those two residues.

export_localCollapse(proteinID1, filename, windowSize=10, bins=np.arange(0, 10, 0.1))

Write a local collapse vector to file.

localCollapse calculates a vectorial representation of the radius of gyration along a polypeptide chain. This makes it very easy to determine where you see local collapse vs. local expansion.

Parameters
• proteinID1 (int) – The ID of the protein to be assessed

• filename (str) – Filename which your file should be saved to (.csv extension is added automatically)

• windowSize (int, default=10) – Size of the window over which conformations are examined. Default is 10.

• bins (np.arange or list) – A range of values (np.arange or list) spanning histogram bins. Default is np.arange(0, 10, 0.1).

Returns

tuple containing meanData and stdData. Histogrammed data is only included when bins is not None.

Return type

tuple
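To illustrate what a vectorial radius of gyration along the chain means, here is a minimal sketch under simplifying assumptions. It slides a window of window_size residues along the chain, computes an unweighted radius of gyration for each window in each frame, and reports the mean and standard deviation per window position. The function names, the one-point-per-residue input and the lack of mass weighting are assumptions. The library's own calculation may differ.

```python
import math
import statistics


def radius_of_gyration(points):
    """Unweighted radius of gyration of a list of (x, y, z) points."""
    n = len(points)
    center = tuple(sum(p[k] for p in points) / n for k in range(3))
    mean_sq = sum(math.dist(p, center) ** 2 for p in points) / n
    return math.sqrt(mean_sq)


def local_collapse(frames, window_size):
    """Mean and std of the windowed Rg at every window position along the chain."""
    n_res = len(frames[0])
    means, stds = [], []
    for start in range(n_res - window_size + 1):
        rg = [radius_of_gyration(frame[start:start + window_size])
              for frame in frames]
        means.append(statistics.fmean(rg))
        stds.append(statistics.pstdev(rg))  # population std (assumption)
    return means, stds


# Hypothetical rigid rod: four residues spaced 2 units apart, two identical frames.
rod = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (4.0, 0.0, 0.0), (6.0, 0.0, 0.0)]
mean_rg, std_rg = local_collapse([rod, rod], window_size=2)
```

Window positions with a small mean value correspond to locally compact stretches, and large values to locally expanded ones.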

get_distanceMap(proteinID)

Function which returns two matrices holding the mean and standard deviation of the complete set of inter-residue distances within a single protein molecule.

This explicitly defines the non-redundant map, so you only get a matrix with one half filled in.

Parameters

proteinID (int) – The ID of the protein being considered, where the ID is the protein’s position in the self.proteinTrajectoryList list.

Returns

• tuple (tuple containing distanceMap and STDMap)

• distanceMap (numpy matrix) – Is an [n x m] matrix where n and m are the number of proteinID1 residues and proteinID2 residues. Each position in the matrix corresponds to the mean distance between those two residues over the course of the simulation.

• stdMap (numpy matrix) – Is an [n x m] matrix where n and m are the number of proteinID1 residues and proteinID2 residues. Each position in the matrix corresponds to the standard deviation associated with the distances between those two residues.
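As a rough illustration of a non-redundant map, the sketch below computes the mean and standard deviation of every residue-residue distance over a set of frames and fills only the upper triangle. It assumes one representative position per residue and a population standard deviation. Neither assumption is confirmed by the text above, and distance_maps is a hypothetical name.

```python
import math
import statistics


def distance_maps(frames):
    """Return (mean_map, std_map) with only the upper triangle filled.

    frames: list of frames; each frame is a list of (x, y, z) positions,
            one per residue (which atom represents a residue is an assumption).
    """
    n = len(frames[0])
    mean_map = [[0.0] * n for _ in range(n)]
    std_map = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):  # j > i: non-redundant half only
            d = [math.dist(frame[i], frame[j]) for frame in frames]
            mean_map[i][j] = statistics.fmean(d)
            std_map[i][j] = statistics.pstdev(d)  # population std (assumption)
    return mean_map, std_map


# Two hypothetical frames of a three-residue chain lying on the x axis.
frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (6.0, 0.0, 0.0)],
]
mean_map, std_map = distance_maps(frames)
```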

get_intraChainDistanceMap(proteinID1, proteinID2, resID1=None, resID2=None)

Function which returns two matrices with the mean and standard deviation distances between the residues in resID1 from proteinID1 and resID2 from proteinID2

This computes the (full) intermolecular distance map, whereas the get_distanceMap function computes the intramolecular distance map.

Obviously this only makes sense if your system has two separate protein objects defined, but in principle the output from:

get_intraChainDistanceMap(0,0)

would be the same as:

get_distanceMap(0)

This is actually a useful sanity check!

Parameters
• proteinID1 (int) – The ID of the first protein of the two being considered, where the ID is the protein’s position in the self.proteinTrajectoryList list.

• proteinID2 (int) – The ID of the second protein of the two being considered, where the ID is the protein’s position in the self.proteinTrajectoryList list.

• resID1 (list of integers, default=None) – Is the list of residues from protein 1 we’re considering. If no option is provided assume we’re using all of the residues in protein 1.

• resID2 (list of integers, default=None) – Is the list of residues from protein 2 we’re considering. If no option is provided assume we’re using all of the residues in protein 2.

Returns

• tuple (tuple containing distanceMap and STDMap)

• distanceMap (numpy matrix) – Is an [n x m] matrix where n and m are the number of proteinID1 residues and proteinID2 residues. Each position in the matrix corresponds to the mean distance between those two residues over the course of the simulation.

• stdMap (numpy matrix) – Is an [n x m] matrix where n and m are the number of proteinID1 residues and proteinID2 residues. Each position in the matrix corresponds to the standard deviation associated with the distances between those two residues.
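The sanity check mentioned above can be sketched on a single frame. The example builds a full [n x m] map between two sets of residue positions and a non-redundant [n x n] map for one set, then confirms that they agree above the diagonal when a protein is compared with itself. Both helper names are hypothetical. Restricting the comparison to the upper triangle is an assumption, based on the non-redundant map having only one half filled in.

```python
import math


def full_distance_map(coords_a, coords_b):
    """Full [n x m] map of distances between two sets of residue positions."""
    return [[math.dist(a, b) for b in coords_b] for a in coords_a]


def upper_triangle_map(coords):
    """Non-redundant [n x n] map: only entries with j > i are filled."""
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) if j > i else 0.0
             for j in range(n)] for i in range(n)]


# Hypothetical three-residue chain (single frame).
chain = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (3.0, 4.0, 12.0)]

full = full_distance_map(chain, chain)  # protein 0 against itself
half = upper_triangle_map(chain)

# Sanity check: above the diagonal the two maps must agree.
agree = all(full[i][j] == half[i][j]
            for i in range(len(chain)) for j in range(i + 1, len(chain)))
print(agree)
```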

get_intrachain_interResidue_atomic_distance(proteinID1, proteinID2, R1, R2, A1='CA', A2='CA', stride=1, mode='atom', correctOffset=True)

Function which returns the distance between two specific atoms on two residues. The atoms selected are based on the ‘name’ field from the topology selection language. This defines a specific atom as defined by the PDB file. By default A1 and A2 are CA (C-alpha) but one can define any atom of interest.

We do not perform any sanity checking on the atom name - this gets really hard - so we have an explicit try/except block which will warn you that you’ve probably selected an illegal atom name from the residues.

Distance is returned in Angstroms.

Parameters
• R1 (int) – Residue index of first residue

• R2 (int) – Residue index of second residue

• A1 (str, default='CA') – Atom name of the atom in R1 we’re looking at

• A2 (str, default='CA') – Atom name of the atom in R2 we’re looking at

• stride (int, default=1) – Defines the spacing between frames to compare with, i.e. take every stride-th frame. Setting stride=1 means every frame is used, i.e. an all vs. all comparison, which would be ideal BUT may be slow.

• mode (str, default='atom') –

Mode allows the user to define different modes for computing atomic distance.

The default is ‘atom’ whereby a pair of atoms (A1 and A2) are provided. Other options are detailed below and are identical to those offered by mdtraj in compute_contacts.

• ’ca’ - same as setting ‘atom’ and A1=’CA’ and A2=’CA’, this uses the C-alpha atoms.

• ’closest’ - closest atom associated with each of the residues, i.e. this is the point of closest approach between the two residues.

• ’closest-heavy’ - same as ‘closest’, except only non-hydrogen atoms are considered.

• ’sidechain’ - closest atom where that atom is in the sidechain. Note this requires mdtraj version 1.8.0 or higher.

• ’sidechain-heavy’ - closest atom where that atom is in the sidechain and is heavy. Note this requires mdtraj version 1.8.0 or higher.

• correctOffset (bool, default=True) – Defines if we perform local protein offset correction or not. By default we do, but some internal functions may have already performed the correction and so don’t need to perform it again.
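The sketch below illustrates three of the ideas above on plain Python data: taking every stride-th frame, failing with a readable message when an atom name does not exist, and the difference between a named-atom distance and the ‘closest’ / ‘closest-heavy’ modes. It assumes coordinates in nanometers (the unit mdtraj uses) and converts to Angstroms. The frame layout, the function names and the crude rule that hydrogen names start with "H" are assumptions for illustration only.

```python
import math

NM_TO_ANGSTROM = 10.0  # 1 nm = 10 Angstroms


def atom_pair_distances(frames, atom1, atom2, stride=1):
    """Distance (Angstroms) between two atoms in every stride-th frame.

    frames: list of dicts mapping a hypothetical (residue, atom_name) key
            to an (x, y, z) position in nanometers.
    """
    distances = []
    for frame in frames[::stride]:
        try:
            p1, p2 = frame[atom1], frame[atom2]
        except KeyError as err:
            raise ValueError("atom %r not found; check the atom name"
                             % (err.args[0],)) from err
        distances.append(math.dist(p1, p2) * NM_TO_ANGSTROM)
    return distances


def closest_distance(frame, res1, res2, heavy_only=False):
    """Smallest atom-atom distance (Angstroms) between two residues in one frame."""
    def atoms(res):
        # Simplification: any atom whose name starts with "H" counts as hydrogen.
        return [pos for (r, name), pos in frame.items()
                if r == res and not (heavy_only and name.startswith("H"))]
    return min(math.dist(a, b)
               for a in atoms(res1) for b in atoms(res2)) * NM_TO_ANGSTROM


def make_frame(x):
    """Hypothetical two-residue frame; residue 1 sits at position x on the x axis."""
    return {
        (0, "CA"): (0.0, 0.0, 0.0),
        (0, "HA"): (0.1, 0.0, 0.0),
        (1, "CA"): (x, 0.0, 0.0),
        (1, "CB"): (x - 0.1, 0.0, 0.0),
    }


frames = [make_frame(0.4), make_frame(0.5), make_frame(0.6)]
ca_distances = atom_pair_distances(frames, (0, "CA"), (1, "CA"), stride=2)
closest_any = closest_distance(frames[0], 0, 1)
closest_heavy = closest_distance(frames[0], 0, 1, heavy_only=True)
```

With stride=2 only the first and third frames contribute, which is the trade-off between speed and sampling described for the stride parameter.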

num_proteins
try:
    ctutils.mkl_set_num_threads(MAXCORES)
    print("MKL will use %i threads" % (ctutils.mkl_get_max_threads()))
except Exception as e:
    print("MKL libraries not available [%s]" % str(e))