  1. title: DNA Chisel - a versatile sequence optimizer
  2. url: https://edinburgh-genome-foundry.github.io/DnaChisel/
  3. hash_url: e8ebcfcc0dbd4336a82f618a1bd1a818
  4. <p>DNA Chisel (complete documentation <a class="reference external" href="https://edinburgh-genome-foundry.github.io/DnaChisel/">here</a>)
  5. is a Python library for optimizing DNA sequences with respect to a set of
  6. constraints and optimization objectives. It comes with over 15 classes of
  7. sequence specifications which can be composed to, for instance, codon-optimize
  8. genes, meet the constraints of a commercial DNA provider, avoid homologies
  9. between sequences, tune GC content, or all of this at once!</p>
  10. <p>DNA Chisel also allows users to define their own specifications in Python,
  11. making the library suitable for a large range of automated sequence design
  12. applications and complex custom design projects. It can be used as a Python
  13. library, through a command-line interface, or as a <a class="reference external" href="https://cuba.genomefoundry.org/sculpt_a_sequence">web application</a>.</p>
  14. <div class="section" id="example-of-use">
  15. <h2>Example of use</h2>
  16. <div class="section" id="defining-a-problem-via-scripts">
  17. <h3>Defining a problem via scripts</h3>
  18. <p>In this basic example we generate a random sequence and optimize it so that</p>
  19. <ul class="simple">
  20. <li><p>It will be rid of BsaI sites.</p></li>
  21. <li><p>GC content will be between 30% and 70% on every 50bp window.</p></li>
  22. <li><p>The reading frame at position 500-1400 will be codon-optimized for <em>E. coli</em>.</p></li>
  23. </ul>
  24. <p>Here is the code to achieve that:</p>
  25. <div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="kn">from</span> <span class="nn">dnachisel</span> <span class="kn">import</span> <span class="o">*</span>
  26. <span class="c1"># DEFINE THE OPTIMIZATION PROBLEM</span>
  27. <span class="n">problem</span> <span class="o">=</span> <span class="n">DnaOptimizationProblem</span><span class="p">(</span>
  28. <span class="n">sequence</span><span class="o">=</span><span class="n">random_dna_sequence</span><span class="p">(</span><span class="mi">10000</span><span class="p">),</span>
  29. <span class="n">constraints</span><span class="o">=</span><span class="p">[</span>
  30. <span class="n">AvoidPattern</span><span class="p">(</span><span class="s2">"BsaI_site"</span><span class="p">),</span>
  31. <span class="n">EnforceGCContent</span><span class="p">(</span><span class="n">mini</span><span class="o">=</span><span class="mf">0.3</span><span class="p">,</span> <span class="n">maxi</span><span class="o">=</span><span class="mf">0.7</span><span class="p">,</span> <span class="n">window</span><span class="o">=</span><span class="mi">50</span><span class="p">),</span>
  32. <span class="n">EnforceTranslation</span><span class="p">(</span><span class="n">location</span><span class="o">=</span><span class="p">(</span><span class="mi">500</span><span class="p">,</span> <span class="mi">1400</span><span class="p">))</span>
  33. <span class="p">],</span>
  34. <span class="n">objectives</span><span class="o">=</span><span class="p">[</span><span class="n">CodonOptimize</span><span class="p">(</span><span class="n">species</span><span class="o">=</span><span class="s1">'e_coli'</span><span class="p">,</span> <span class="n">location</span><span class="o">=</span><span class="p">(</span><span class="mi">500</span><span class="p">,</span> <span class="mi">1400</span><span class="p">))]</span>
  35. <span class="p">)</span>
  36. <span class="c1"># SOLVE THE CONSTRAINTS, OPTIMIZE WITH RESPECT TO THE OBJECTIVE</span>
  37. <span class="n">problem</span><span class="o">.</span><span class="n">resolve_constraints</span><span class="p">()</span>
  38. <span class="n">problem</span><span class="o">.</span><span class="n">optimize</span><span class="p">()</span>
  39. <span class="c1"># PRINT SUMMARIES TO CHECK THAT CONSTRAINTS PASS</span>
  40. <span class="k">print</span><span class="p">(</span><span class="n">problem</span><span class="o">.</span><span class="n">constraints_text_summary</span><span class="p">())</span>
  41. <span class="k">print</span><span class="p">(</span><span class="n">problem</span><span class="o">.</span><span class="n">objectives_text_summary</span><span class="p">())</span>
  42. </pre></div>
  43. </div>
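<p>To make the first two constraints concrete, here is a minimal plain-Python sketch of what they check. It does not use DnaChisel and is not how the library implements them. The helper names (<code>find_pattern_both_strands</code>, <code>gc_window_breaches</code>) are hypothetical, and the sketch assumes an uppercase sequence containing only A, C, G and T.</p>

```python
BSAI_SITE = "GGTCTC"  # BsaI recognition site


def reverse_complement(seq):
    """Return the reverse complement of an uppercase ACGT sequence."""
    complement = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(complement[base] for base in reversed(seq))


def find_pattern_both_strands(seq, pattern):
    """Return sorted start indices where the pattern, or its reverse
    complement, occurs in seq."""
    hits = set()
    for motif in {pattern, reverse_complement(pattern)}:
        start = seq.find(motif)
        while start != -1:
            hits.add(start)
            start = seq.find(motif, start + 1)
    return sorted(hits)


def gc_window_breaches(seq, mini=0.3, maxi=0.7, window=50):
    """Return start indices of windows whose GC fraction is outside
    the closed interval [mini, maxi]."""
    breaches = []
    for start in range(len(seq) - window + 1):
        chunk = seq[start:start + window]
        gc_fraction = (chunk.count("G") + chunk.count("C")) / window
        if not mini <= gc_fraction <= maxi:
            breaches.append(start)
    return breaches
```

<p>A sequence satisfies both constraints when the two functions return empty lists. The library additionally repairs the breaches it finds, which this sketch does not attempt.</p>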
  44. <p>DnaChisel implements advanced constraints such as the preservation of coding
  45. sequences, or the inclusion or exclusion of advanced patterns (see
  46. <a class="reference external" href="https://edinburgh-genome-foundry.github.io/DnaChisel/ref/builtin_specifications.html">this page</a>
  47. for an overview of available specifications), but it is also easy to implement
  48. your own constraints and objectives as subclasses of <code class="docutils literal notranslate"><span class="pre">dnachisel.Specification</span></code>.</p>
  49. </div>
  50. <div class="section" id="defining-a-problem-via-genbank-features">
  51. <h3>Defining a problem via Genbank features</h3>
  52. <p>You can also define a problem by directly annotating a Genbank record as follows:</p>
  53. <p align="center">
  54. <img alt="report" title="report" src="https://raw.githubusercontent.com/Edinburgh-Genome-Foundry/DnaChisel/master/docs/_static/images/example_sequence_map.png">
  55. <br><br>
  56. </p><p>In this record:</p>
  57. <ul class="simple">
  58. <li><p>Constraints (colored in blue in the illustration) are features of type
  59. <code class="docutils literal notranslate"><span class="pre">misc_feature</span></code> with a prefix <code class="docutils literal notranslate"><span class="pre">@</span></code> followed
  60. by the name of the constraint and its parameters, which are the same as in
  61. Python scripts.</p></li>
  62. <li><p>Optimization objectives (colored in yellow in the illustration) are features
  63. of type <code class="docutils literal notranslate"><span class="pre">misc_feature</span></code> with a prefix <code class="docutils literal notranslate"><span class="pre">~</span></code> followed by the name of the
  64. objective and its parameters.</p></li>
  65. </ul>
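<p>The labelling convention above can be illustrated with a small parser. This is a hypothetical sketch, not DnaChisel's Genbank parser: the function <code>parse_label</code> and its return format are assumptions made for illustration, and the labels the library accepts may differ (for example, shorthand names). Arguments are split on commas, so values that themselves contain commas, such as tuples, are not supported here.</p>

```python
import ast
import re

# "@" marks a constraint, "~" marks an objective; arguments are optional.
LABEL_PATTERN = re.compile(
    r"^(?P<prefix>[@~])(?P<name>\w+)(?:\((?P<args>.*)\))?$"
)


def _literal(text):
    """Convert text to a Python literal if possible, else keep the string."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return text


def parse_label(label):
    """Split a feature label into (role, name, args, kwargs).

    Returns None for labels that do not start with "@" or "~".
    """
    match = LABEL_PATTERN.match(label.strip())
    if match is None:
        return None
    role = "constraint" if match.group("prefix") == "@" else "objective"
    args, kwargs = [], {}
    raw_args = match.group("args")
    if raw_args:
        for item in raw_args.split(","):
            item = item.strip()
            if "=" in item:
                key, value = item.split("=", 1)
                kwargs[key.strip()] = _literal(value.strip())
            else:
                args.append(_literal(item))
    return role, match.group("name"), args, kwargs
```

<p>For instance, <code>parse_label("@EnforceGCContent(mini=0.3, maxi=0.7, window=50)")</code> yields the role <code>"constraint"</code>, the name <code>"EnforceGCContent"</code>, and the three keyword parameters, while an unprefixed label such as <code>"promoter"</code> yields <code>None</code>.</p>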
  66. <p>The file can be directly fed to the <a class="reference external" href="https://cuba.genomefoundry.org/sculpt_a_sequence">web app</a>
  67. or processed via the command line interface:</p>
  68. <div class="highlight-bash notranslate"><div class="highlight"><pre><span></span><span class="c1"># Output the result to "optimized_record.gb"</span>
  69. dnachisel annotated_record.gb optimized_record.gb
  70. </pre></div>
  71. </div>
  72. <p>Or via a Python script:</p>
  73. <div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="kn">from</span> <span class="nn">dnachisel</span> <span class="kn">import</span> <span class="n">DnaOptimizationProblem</span>
  74. <span class="n">problem</span> <span class="o">=</span> <span class="n">DnaOptimizationProblem</span><span class="o">.</span><span class="n">from_record</span><span class="p">(</span><span class="s2">"my_record.gb"</span><span class="p">)</span>
  75. <span class="n">problem</span><span class="o">.</span><span class="n">optimize_with_report</span><span class="p">(</span><span class="n">target</span><span class="o">=</span><span class="s2">"report.zip"</span><span class="p">)</span>
  76. </pre></div>
  77. </div>
  78. <p>By default, only the built-in specifications of DnaChisel can be used in the
  79. annotations; however, it is easy to add your own specifications to the Genbank
  80. parser, and build applications supporting custom specifications on top of
  81. DnaChisel.</p>
  82. </div>
  83. <div class="section" id="reports">
  84. <h3>Reports</h3>
  85. <p>DnaChisel also implements features for verification and troubleshooting, for
  86. instance by generating optimization reports:</p>
  87. <div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="n">problem</span><span class="o">.</span><span class="n">optimize_with_report</span><span class="p">(</span><span class="n">target</span><span class="o">=</span><span class="s2">"report.zip"</span><span class="p">)</span>
  88. </pre></div>
  89. </div>
  90. <p>Here is an example of a summary report:</p>
  91. <p align="center">
  92. <img alt="report" title="report" src="https://raw.githubusercontent.com/Edinburgh-Genome-Foundry/DnaChisel/master/docs/_static/images/report_screenshot.png">
  93. <br><br>
  94. </p></div>
  95. </div>
  96. <div class="section" id="how-it-works">
  97. <h2>How it works</h2>
  98. <p>DnaChisel hunts down every constraint breach and suboptimal region by
  99. recreating local versions of the problem around these regions. Each type of
  100. constraint can be locally <em>reduced</em> and solved in its own way, to ensure fast
  101. and reliable resolution.</p>
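<p>The idea of solving breaches locally can be sketched in a few lines of plain Python. This toy example is not DnaChisel's solver: it handles a single forbidden pattern, ignores all other constraints, and uses random point mutations, which is an assumption made for illustration. The function names are hypothetical.</p>

```python
import random


def find_breaches(seq, pattern):
    """Return (start, end) locations of every occurrence of the pattern."""
    locations = []
    start = seq.find(pattern)
    while start != -1:
        locations.append((start, start + len(pattern)))
        start = seq.find(pattern, start + 1)
    return locations


def resolve_locally(seq, pattern, rng, max_tries=1000):
    """Remove a forbidden pattern by mutating only inside breached regions.

    Each iteration locates the first remaining breach and changes one
    base inside it; the rest of the sequence is never touched.
    """
    bases = list(seq)
    for _ in range(max_tries):
        breaches = find_breaches("".join(bases), pattern)
        if not breaches:
            return "".join(bases)
        start, end = breaches[0]
        position = rng.randrange(start, end)
        alternatives = [b for b in "ACGT" if b != bases[position]]
        bases[position] = rng.choice(alternatives)
    raise RuntimeError("could not remove the pattern within max_tries")
```

<p>Calling <code>resolve_locally(sequence, "GGTCTC", random.Random(0))</code> returns a sequence of the same length in which only bases inside the original pattern occurrences have changed. Restricting the search to the breached region is what keeps each local problem small.</p>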
  102. <p>Below is an animation of the algorithm in action:</p>
  103. <p align="center">
  104. <img alt="DNA Chisel algorithm" title="DNA Chisel" src="https://raw.githubusercontent.com/Edinburgh-Genome-Foundry/DnaChisel/master/docs/_static/images/dnachisel_algorithm.gif">
  105. <br>
  106. </p></div>
  107. <div class="section" id="installation">
  108. <h2>Installation</h2>
  109. <p>You can install DnaChisel through pip:</p>
  110. <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">sudo</span> <span class="n">pip</span> <span class="n">install</span> <span class="n">dnachisel</span><span class="p">[</span><span class="n">reports</span><span class="p">]</span>
  111. </pre></div>
  112. </div>
  113. <p>The <code class="docutils literal notranslate"><span class="pre">[reports]</span></code> suffix will install some heavier libraries
  114. (Matplotlib, PDF reports, sequenticon) for report generation.
  115. You can omit it if you just want to use DNA Chisel to edit sequences and
  116. generate Genbank files (for any interactive use, reports are highly recommended).</p>
  117. <p>Alternatively, you can unzip the sources in a folder and type</p>
  118. <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">sudo</span> <span class="n">python</span> <span class="n">setup</span><span class="o">.</span><span class="n">py</span> <span class="n">install</span>
  119. </pre></div>
  120. </div>
  121. <p>Optionally, also install Bowtie to be able to use <code class="docutils literal notranslate"><span class="pre">AvoidMatches</span></code> (which
  122. removes short homologies with existing genomes). On Ubuntu:</p>
  123. <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">sudo</span> <span class="n">apt</span><span class="o">-</span><span class="n">get</span> <span class="n">install</span> <span class="n">bowtie</span>
  124. </pre></div>
  125. </div>
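<p>Before relying on <code>AvoidMatches</code>, a script can check whether Bowtie is reachable. The sketch below uses only the standard library and assumes the executable is named <code>bowtie</code> and is found via the <code>PATH</code>; the helper name is hypothetical and is not part of DnaChisel.</p>

```python
import shutil


def bowtie_available(executable="bowtie"):
    """Return True if the given executable can be found on the PATH."""
    return shutil.which(executable) is not None


if __name__ == "__main__":
    if bowtie_available():
        print("Bowtie found on PATH.")
    else:
        print("Bowtie not found on PATH.")
```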
  126. </div>
  127. <div class="section" id="more-biology-software">
  128. <h2>More biology software</h2>
  129. <a class="reference external image-reference" href="https://edinburgh-genome-foundry.github.io/"><img alt="https://raw.githubusercontent.com/Edinburgh-Genome-Foundry/Edinburgh-Genome-Foundry.github.io/master/static/imgs/logos/egf-codon-horizontal.png" src="https://raw.githubusercontent.com/Edinburgh-Genome-Foundry/Edinburgh-Genome-Foundry.github.io/master/static/imgs/logos/egf-codon-horizontal.png"></a>
  130. <p>DNA Chisel is part of the <a class="reference external" href="https://edinburgh-genome-foundry.github.io/">EGF Codons</a> synthetic biology software suite for DNA design, manufacturing and validation.</p>