"""
CSB is a high-level, object-oriented library used to solve problems in the
field of Computational Structural Biology.


Introduction
============

The library is composed of a set of highly branched Python packages
(namespaces). Some of the packages are meant to be used directly by
clients (core library), while others are utility modules that support
the development of the library:

    1. Core class library -- object-oriented, granular, with an emphasis
       on design and clean interfaces. A Sequence is not a string, and a
       Structure is not a dict or list. Naming conventions matter.

    2. Application framework -- executable console applications
       ("protocols"), which consume objects from the core library.
       The framework ensures that each CSB application is also reusable
       and can be instantiated as a regular Python object without any
       ugly side effects (sys.exit() and friends). See L{csb.apps} for more
       details.

    3. Test framework -- built on top of the standard unittest as a thin
       wrapping layer. Provides some sugar like transparent management of
       test data files, and modular test execution. L{csb.test} will give
       you all the details.

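The reuse guarantee described in item 2 can be sketched as follows. This is
only an illustration of the pattern, not the actual L{csb.apps} API:
C{AppExit}, C{WordCountApp} and C{run} are hypothetical names, and the sketch
assumes that an application signals failure with an exception, so that only
the console entry point deals with exit codes.

```python
import sys


class AppExit(Exception):
    # Hypothetical exception carrying an exit code; raised instead of
    # terminating the interpreter from inside the application.
    def __init__(self, message, code=1):
        super(AppExit, self).__init__(message)
        self.code = code


class WordCountApp(object):
    # Hypothetical application: main() does the work and returns a result.
    def __init__(self, args):
        self.args = list(args)

    def main(self):
        if not self.args:
            raise AppExit('no input words given', code=2)
        return len(self.args)


def run(app_class, argv):
    # Console entry point: the only place where failures become exit codes.
    try:
        print(app_class(argv).main())
    except AppExit as ex:
        print(ex, file=sys.stderr)
        return ex.code
    return 0


# Used as a regular object: no process-level side effects.
count = WordCountApp(['a', 'b', 'c']).main()

# Used through the console entry point: a failure becomes an exit code.
status = run(WordCountApp, [])
```

With this split, a caller can embed the application in a larger program and
handle C{AppExit} itself, while a console script would pass the return value
of C{run} to the interpreter's exit function.
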
The core library is roughly composed of:

    - bioinformatics API: L{csb.bio}, which includes
      L{csb.bio.io}, L{csb.bio.structure}, L{csb.bio.sequence},
      L{csb.bio.hmm}

    - statistics API: L{csb.statistics}, L{csb.numeric}

    - utilities: L{csb.io}, L{csb.core}


Getting started
===============

Perhaps one of the most frequently used parts of the library is the
L{csb.bio.structure} module, which provides the L{Structure}, L{Chain},
L{Residue} and L{Atom} objects. You could easily build a L{Structure}
from scratch, but a far more common scenario is parsing a structure from
a PDB file using one of the L{AbstractStructureParser}s. All bio IO
objects, including the StructureParser factory, are defined in
L{csb.bio.io} and sub-packages:

    >>> from csb.bio.io.wwpdb import StructureParser
    >>> p = StructureParser("/some/file/pdb1x80.ent")
    >>> s = p.parse_structure()
    >>> print(s)
    <Structure: 1x80, 2 chains>

The C{parse_structure()} call returns a L{csb.bio.structure.Structure}
instance, which is a composite hierarchical object:

    >>> for chain_id in s.chains:
    ...     chain = s.chains[chain_id]
    ...     for residue in chain.residues:
    ...         for atom_id in residue.atoms:
    ...             atom = residue.atoms[atom_id]
    ...             print(atom.vector)

Some of the inner objects in this hierarchy behave just like dictionaries
(but are not):

    >>> s.chains['A']       # access chain A by ID
    <Chain A: Protein>
    >>> s['A']              # the same
    <Chain A: Protein>

Others behave like collections:

    >>> chain.residues[10]    # 1-based access to the residues in the chain
    <ProteinResidue [10]: PRO 10>
    >>> chain[10]             # 0-based, list-like access
    <ProteinResidue [11]: GLY 11>

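The difference between rank-based and index-based access can be illustrated
with a small stand-alone sketch. C{RankedList} and C{IndexedChain} are
hypothetical toy classes written for this example; they only mimic the two
access styles and are not the CSB implementation.

```python
class RankedList(object):
    # Hypothetical read-only container addressed by 1-based rank.
    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

    def __iter__(self):
        return iter(self._items)

    def __getitem__(self, rank):
        # Ranks start at 1, so rank 0 and anything past the end are invalid.
        if not 1 <= rank <= len(self._items):
            raise IndexError(rank)
        return self._items[rank - 1]


class IndexedChain(object):
    # Hypothetical chain exposing both access styles over the same data.
    def __init__(self, residue_names):
        self._names = list(residue_names)
        self.residues = RankedList(self._names)

    def __getitem__(self, index):
        # 0-based, list-like access: index i is the residue of rank i + 1.
        return self._names[index]


chain = IndexedChain(['MET', 'ALA', 'GLY'])
first_by_rank = chain.residues[1]     # 'MET'
second_by_index = chain[1]            # 'ALA', the residue of rank 2
```
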
But all entities are iterable because they inherit the C{items} iterator
from L{AbstractEntity}. The nested loop over chains, residues and atoms
can be shortened:

    >>> for chain in s.items:
    ...     for residue in chain.items:
    ...         for atom in residue.items:
    ...             print(atom.vector)

or even more:

    >>> from csb.bio.structure import Atom
    >>> for atom in s.components(klass=Atom):
    ...     print(atom.vector)

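How a filtered traversal of such a composite might work is sketched below.
The C{Toy*} classes are hypothetical and exist only for this example; the
sketch assumes a plain depth-first walk, which is not necessarily how
L{AbstractEntity} implements C{components}.

```python
class ToyEntity(object):
    # Hypothetical base class: every node exposes its children via `items`.
    def __init__(self, name, children=()):
        self.name = name
        self._children = list(children)

    @property
    def items(self):
        return iter(self._children)

    def components(self, klass=None):
        # Depth-first walk over all descendants, optionally filtered by class.
        for child in self.items:
            if klass is None or isinstance(child, klass):
                yield child
            for descendant in child.components(klass):
                yield descendant


class ToyStructure(ToyEntity):
    pass


class ToyChain(ToyEntity):
    pass


class ToyResidue(ToyEntity):
    pass


class ToyAtom(ToyEntity):
    pass


structure = ToyStructure('toy', [
    ToyChain('A', [ToyResidue('ALA', [ToyAtom('N'), ToyAtom('CA')])]),
    ToyChain('B', [ToyResidue('GLY', [ToyAtom('N')])]),
])

# Collect the names of all atoms, regardless of how deeply they are nested.
atom_names = [atom.name for atom in structure.components(klass=ToyAtom)]
# atom_names == ['N', 'CA', 'N']
```
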
You may also be interested in extracting a sub-chain from this structure:

    >>> s.chains['B'].subregion(3, 20)    # from positions 3 to 20, inclusive
    <Chain B: Protein>

or modifying it in some way. For example, to append a new residue, try:

    >>> from csb.bio.structure import ProteinResidue
    >>> from csb.bio.sequence import ProteinAlphabet
    >>> residue = ProteinResidue(401, ProteinAlphabet.ALA)
    >>> s.chains['A'].residues.append(residue)

Finally, you would probably want to save your structure back to a PDB file:

    >>> s.to_pdb('/some/file/name.pdb')


Where to go from here
=====================

If you want to dive into statistics, you could peek inside L{csb.statistics}
and its sub-packages. For example, L{csb.statistics.pdf} contains a collection
of L{probability density objects<csb.statistics.pdf.AbstractDensity>},
like L{Gaussian<csb.statistics.pdf.Normal>} or L{Gamma<csb.statistics.pdf.Gamma>}.

But chances are you would first like to try reading some files, so you could
start exploring L{csb.bio.io} right now. As we have already seen,
L{csb.bio.io.wwpdb} provides PDB L{Structure<csb.bio.structure.Structure>}
parsers, for example L{csb.bio.io.wwpdb.RegularStructureParser} and
L{csb.bio.io.wwpdb.LegacyStructureParser}.

L{csb.bio.io.fasta} is all about reading FASTA
L{Sequence<csb.bio.sequence.AbstractSequence>}s and
L{SequenceAlignment<csb.bio.sequence.AbstractAlignment>}s. Be sure to check out
L{csb.bio.io.fasta.SequenceParser}, L{csb.bio.io.fasta.SequenceAlignmentReader}
and L{csb.bio.io.fasta.StructureAlignmentFactory}.

If you are working with HHpred (L{ProfileHMM<csb.bio.hmm.ProfileHMM>}s,
L{HHpredHit<csb.bio.hmm.HHpredHit>}s), then L{csb.bio.io.hhpred} is for you.
This package provides L{csb.bio.io.hhpred.HHProfileParser} and
L{csb.bio.io.hhpred.HHOutputParser}, which read *.hhm and *.hhr files,
respectively.

Finally, if you want to make some nice plots with matplotlib, you may like the
clean object-oriented interface of our L{Chart<csb.io.plots.Chart>}. See
L{csb.io.plots} and maybe also L{csb.io.tsv} to get started.


Development
===========

When contributing code to CSB, please take into account the following:

    1. New features or bug fixes should always be accompanied by test cases.
       Also, always run the complete test suite before committing. For more
       details on this topic, see L{csb.test}.

    2. The source code of CSB must be cross-platform and cross-interpreter
       compatible. L{csb.core} and L{csb.io} will give you all necessary
       details on how to use the CSB compatibility layer.


License
=======

CSB is open source and distributed under the OSI-approved MIT license::

    Copyright (c) 2012 Michael Habeck

    Permission is hereby granted, free of charge, to any person obtaining
    a copy of this software and associated documentation files (the
    "Software"), to deal in the Software without restriction, including
    without limitation the rights to use, copy, modify, merge, publish,
    distribute, sublicense, and/or sell copies of the Software, and to
    permit persons to whom the Software is furnished to do so, subject to
    the following conditions:

    The above copyright notice and this permission notice shall be
    included in all copies or substantial portions of the Software.

    THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
    EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
    MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
    IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
    CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
    TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
    SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

"""

__version__ = '1.2.5.{revision}'


class Version(object):
    """
    CSB version number.
    """

    def __init__(self):

        version = __version__.split('.')

        if len(version) not in (3, 4):
            raise ValueError(version)

        self._package = __name__

        self._major = version[0]
        self._minor = version[1]
        self._micro = version[2]
        self._revision = None

        if len(version) == 4:
            self._revision = version[3]

    def __str__(self):
        return self.short

    def __repr__(self):
        return '{0.package} {0.full}'.format(self)

    @property
    def major(self):
        """
        Major version (huge, incompatible changes)
        @rtype: int
        """
        return int(self._major)

    @property
    def minor(self):
        """
        Minor version (significant, but compatible changes)
        @rtype: int
        """
        return int(self._minor)

    @property
    def micro(self):
        """
        Micro version (bug fixes and small enhancements)
        @rtype: int
        """
        return int(self._micro)

    @property
    def revision(self):
        """
        Build number (exact repository revision number)
        @rtype: int
        """
        try:
            return int(self._revision)
        except (TypeError, ValueError):
            # no revision at all, or a non-numeric placeholder: return it as is
            return self._revision

    @property
    def short(self):
        """
        Canonical three-part version number.
        """
        return '{0.major}.{0.minor}.{0.micro}'.format(self)

    @property
    def full(self):
        """
        Full version, including the repository revision number.
        """
        return '{0.major}.{0.minor}.{0.micro}.{0.revision}'.format(self)

    @property
    def package(self):
        return self._package
