|
- title: DNA Chisel - a versatile sequence optimizer
- url: https://edinburgh-genome-foundry.github.io/DnaChisel/
- hash_url: e8ebcfcc0dbd4336a82f618a1bd1a818
-
- <p>DNA Chisel (complete documentation <a class="reference external" href="https://edinburgh-genome-foundry.github.io/DnaChisel/">here</a>)
- is a Python library for optimizing DNA sequences with respect to a set of
- constraints and optimization objectives. It comes with over 15 classes of
- sequence specifications which can be composed to, for instance, codon-optimize
- genes, meet the constraints of a commercial DNA provider, avoid homologies
- between sequences, tune GC content, or all of this at once!</p>
- <p>DNA Chisel also allows users to define their own specifications in Python,
- making the library suitable for a large range of automated sequence design
- applications, and complex custom design projects. It can be used as a Python
- library, a command-line interface, or a <a class="reference external" href="https://cuba.genomefoundry.org/sculpt_a_sequence">web application</a>.</p>
- <div class="section" id="example-of-use">
- <h2>Example of use</h2>
- <div class="section" id="defining-a-problem-via-scripts">
- <h3>Defining a problem via scripts</h3>
- <p>In this basic example we generate a random sequence and optimize it so that</p>
- <ul class="simple">
- <li><p>It contains no BsaI sites.</p></li>
- <li><p>Its GC content is between 30% and 70% in every 50 bp window.</p></li>
- <li><p>The reading frame at positions 500-1400 is codon-optimized for <em>E. coli</em>.</p></li>
- </ul>
- <p>Here is the code to achieve that:</p>
- <div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="kn">from</span> <span class="nn">dnachisel</span> <span class="kn">import</span> <span class="o">*</span>
-
- <span class="c1"># DEFINE THE OPTIMIZATION PROBLEM</span>
-
- <span class="n">problem</span> <span class="o">=</span> <span class="n">DnaOptimizationProblem</span><span class="p">(</span>
- <span class="n">sequence</span><span class="o">=</span><span class="n">random_dna_sequence</span><span class="p">(</span><span class="mi">10000</span><span class="p">),</span>
- <span class="n">constraints</span><span class="o">=</span><span class="p">[</span>
- <span class="n">AvoidPattern</span><span class="p">(</span><span class="s2">"BsaI_site"</span><span class="p">),</span>
- <span class="n">EnforceGCContent</span><span class="p">(</span><span class="n">mini</span><span class="o">=</span><span class="mf">0.3</span><span class="p">,</span> <span class="n">maxi</span><span class="o">=</span><span class="mf">0.7</span><span class="p">,</span> <span class="n">window</span><span class="o">=</span><span class="mi">50</span><span class="p">),</span>
- <span class="n">EnforceTranslation</span><span class="p">(</span><span class="n">location</span><span class="o">=</span><span class="p">(</span><span class="mi">500</span><span class="p">,</span> <span class="mi">1400</span><span class="p">))</span>
- <span class="p">],</span>
- <span class="n">objectives</span><span class="o">=</span><span class="p">[</span><span class="n">CodonOptimize</span><span class="p">(</span><span class="n">species</span><span class="o">=</span><span class="s1">'e_coli'</span><span class="p">,</span> <span class="n">location</span><span class="o">=</span><span class="p">(</span><span class="mi">500</span><span class="p">,</span> <span class="mi">1400</span><span class="p">))]</span>
- <span class="p">)</span>
-
- <span class="c1"># SOLVE THE CONSTRAINTS, OPTIMIZE WITH RESPECT TO THE OBJECTIVE</span>
-
- <span class="n">problem</span><span class="o">.</span><span class="n">resolve_constraints</span><span class="p">()</span>
- <span class="n">problem</span><span class="o">.</span><span class="n">optimize</span><span class="p">()</span>
-
- <span class="c1"># PRINT SUMMARIES TO CHECK THAT CONSTRAINTS PASS</span>
-
- <span class="k">print</span><span class="p">(</span><span class="n">problem</span><span class="o">.</span><span class="n">constraints_text_summary</span><span class="p">())</span>
- <span class="k">print</span><span class="p">(</span><span class="n">problem</span><span class="o">.</span><span class="n">objectives_text_summary</span><span class="p">())</span>
- </pre></div>
- </div>
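- <p>To make the first two requirements concrete, here is a minimal plain-Python sketch of what
- they check. It is an illustration only and does not use or reproduce DnaChisel's internals: the
- helper names (<code class="docutils literal notranslate"><span class="pre">find_sites</span></code>, <code class="docutils literal notranslate"><span class="pre">gc_window_breaches</span></code>) are hypothetical, and the only
- biological fact assumed is the BsaI recognition sequence GGTCTC.</p>

```python
# Illustrative only: plain-Python checks, not DnaChisel's implementation.

BSAI_SITE = "GGTCTC"  # BsaI recognition sequence on the forward strand


def reverse_complement(seq):
    """Return the reverse complement of an uppercase ACGT string."""
    return seq.translate(str.maketrans("ATGC", "TACG"))[::-1]


def find_sites(seq, site=BSAI_SITE):
    """Return start indices where the site occurs on either strand."""
    patterns = {site, reverse_complement(site)}
    return [
        i
        for i in range(len(seq) - len(site) + 1)
        if seq[i : i + len(site)] in patterns
    ]


def gc_window_breaches(seq, mini=0.3, maxi=0.7, window=50):
    """Return start indices of windows whose GC fraction is out of bounds."""
    breaches = []
    for start in range(len(seq) - window + 1):
        chunk = seq[start : start + window]
        gc = (chunk.count("G") + chunk.count("C")) / window
        if not mini <= gc <= maxi:
            breaches.append(start)
    return breaches
```

- <p>A sequence satisfies both requirements when the two functions return empty lists.</p>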
- <p>DnaChisel implements advanced constraints such as the preservation of coding
- sequences, or the inclusion or exclusion of advanced patterns (see
- <a class="reference external" href="https://edinburgh-genome-foundry.github.io/DnaChisel/ref/builtin_specifications.html">this page</a>
- for an overview of available specifications). It is also easy to implement
- your own constraints and objectives as subclasses of <code class="docutils literal notranslate"><span class="pre">dnachisel.Specification</span></code>.</p>
- </div>
- <div class="section" id="defining-a-problem-via-genbank-features">
- <h3>Defining a problem via Genbank features</h3>
- <p>You can also define a problem by directly annotating a Genbank record as follows:</p>
- <p align="center">
- <img alt="report" title="report" src="https://raw.githubusercontent.com/Edinburgh-Genome-Foundry/DnaChisel/master/docs/_static/images/example_sequence_map.png">
- <br><br>
- </p><p>In this record:</p>
- <ul class="simple">
- <li><p>Constraints (colored in blue in the illustration) are features of type
- <code class="docutils literal notranslate"><span class="pre">misc_feature</span></code> with a prefix <code class="docutils literal notranslate"><span class="pre">@</span></code> followed
- by the name of the constraint and its parameters, which are the same as in
- Python scripts.</p></li>
- <li><p>Optimization objectives (colored in yellow in the illustration) are features
- of type <code class="docutils literal notranslate"><span class="pre">misc_feature</span></code> with a prefix <code class="docutils literal notranslate"><span class="pre">~</span></code> followed by the name of the
- objective and its parameters.</p></li>
- </ul>
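- <p>The prefix convention above can be sketched as a small label classifier. This is a hedged
- illustration, not DnaChisel's Genbank parser: the function <code class="docutils literal notranslate"><span class="pre">classify_label</span></code> and the exact label
- grammar (a prefix, a name, and an optional parenthesized parameter string) are assumptions made for
- the example.</p>

```python
import re

# Hypothetical label grammar for illustration: prefix, name, optional "(...)".
LABEL_RE = re.compile(r"^(?P<prefix>[@~])(?P<name>\w+)(?:\((?P<params>.*)\))?$")


def classify_label(label):
    """Split a feature label into (role, name, raw parameter string).

    Returns None when the label carries no '@' or '~' prefix.
    """
    match = LABEL_RE.match(label.strip())
    if match is None:
        return None
    role = "constraint" if match["prefix"] == "@" else "objective"
    return role, match["name"], match["params"] or ""


# Example labels (made up for this sketch).
for label in ["@AvoidPattern(BsaI_site)", "~CodonOptimize(species=e_coli)", "promoter"]:
    print(label, "->", classify_label(label))
```

- <p>Labels without a recognized prefix are treated as ordinary annotations and ignored by the sketch.</p>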
- <p>The file can be directly fed to the <a class="reference external" href="https://cuba.genomefoundry.org/sculpt_a_sequence">web app</a>
- or processed via the command line interface:</p>
- <div class="highlight-bash notranslate"><div class="highlight"><pre><span></span><span class="c1"># Output the result to "optimized_record.gb"</span>
- dnachisel annotated_record.gb optimized_record.gb
- </pre></div>
- </div>
- <p>Or via a Python script:</p>
- <div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="kn">from</span> <span class="nn">dnachisel</span> <span class="kn">import</span> <span class="n">DnaOptimizationProblem</span>
- <span class="n">problem</span> <span class="o">=</span> <span class="n">DnaOptimizationProblem</span><span class="o">.</span><span class="n">from_record</span><span class="p">(</span><span class="s2">"my_record.gb"</span><span class="p">)</span>
- <span class="n">problem</span><span class="o">.</span><span class="n">optimize_with_report</span><span class="p">(</span><span class="n">target</span><span class="o">=</span><span class="s2">"report.zip"</span><span class="p">)</span>
- </pre></div>
- </div>
- <p>By default, only the built-in specifications of DnaChisel can be used in the
- annotations. However, it is easy to add your own specifications to the Genbank
- parser and to build applications supporting custom specifications on top of
- DnaChisel.</p>
- </div>
- <div class="section" id="reports">
- <h3>Reports</h3>
- <p>DnaChisel also implements features for verification and troubleshooting,
- for instance the generation of optimization reports:</p>
- <div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="n">problem</span><span class="o">.</span><span class="n">optimize_with_report</span><span class="p">(</span><span class="n">target</span><span class="o">=</span><span class="s2">"report.zip"</span><span class="p">)</span>
- </pre></div>
- </div>
- <p>Here is an example of a summary report:</p>
- <p align="center">
- <img alt="report" title="report" src="https://raw.githubusercontent.com/Edinburgh-Genome-Foundry/DnaChisel/master/docs/_static/images/report_screenshot.png">
- <br><br>
- </p></div>
- </div>
- <div class="section" id="how-it-works">
- <h2>How it works</h2>
- <p>DnaChisel hunts down every constraint breach and suboptimal region by
- creating local versions of the problem around these regions. Each type of
- constraint can be locally <em>reduced</em> and solved in its own way, to ensure fast
- and reliable resolution.</p>
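- <p>The local-resolution idea can be illustrated with a toy example. The sketch below is not
- DnaChisel's solver: it handles a single forbidden pattern on one strand, considers only
- single-base edits inside the breaching site, and uses hypothetical function names. It only
- shows the principle of locating a breach and searching for a fix in its immediate neighborhood
- rather than over the whole sequence.</p>

```python
# Toy illustration of local breach resolution, not DnaChisel's algorithm.


def find_breaches(seq, pattern):
    """Return start indices of every occurrence of the forbidden pattern."""
    return [
        i for i in range(len(seq) - len(pattern) + 1) if seq.startswith(pattern, i)
    ]


def fix_one_breach(seq, pattern, start, n_breaches):
    """Try single-base edits inside the breaching site only.

    Accept the first edit that lowers the total number of breaches, so an
    edit that removes this site but creates another one is rejected.
    """
    for pos in range(start, start + len(pattern)):
        for base in "ACGT":
            if base == seq[pos]:
                continue
            candidate = seq[:pos] + base + seq[pos + 1 :]
            if len(find_breaches(candidate, pattern)) < n_breaches:
                return candidate
    raise ValueError("no single-base edit resolves this breach")


def resolve_locally(seq, pattern):
    """Remove all occurrences of the pattern, one local fix at a time."""
    breaches = find_breaches(seq, pattern)
    while breaches:  # each fix strictly reduces the breach count
        seq = fix_one_breach(seq, pattern, breaches[0], len(breaches))
        breaches = find_breaches(seq, pattern)
    return seq


fixed = resolve_locally("AAGGTCTCAAGGTCTCAA", "GGTCTC")
print(fixed, find_breaches(fixed, "GGTCTC"))
```

- <p>A real solver must also keep every other constraint satisfied while editing, for example by
- restricting edits in a coding region to synonymous codon changes.</p>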
- <p>Below is an animation of the algorithm in action:</p>
- <p align="center">
- <img alt="DNA Chisel algorithm" title="DNA Chisel" src="https://raw.githubusercontent.com/Edinburgh-Genome-Foundry/DnaChisel/master/docs/_static/images/dnachisel_algorithm.gif">
- <br>
- </p></div>
- <div class="section" id="installation">
- <h2>Installation</h2>
- <p>You can install DnaChisel with pip:</p>
- <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">sudo</span> <span class="n">pip</span> <span class="n">install</span> <span class="n">dnachisel</span><span class="p">[</span><span class="n">reports</span><span class="p">]</span>
- </pre></div>
- </div>
- <p>The <code class="docutils literal notranslate"><span class="pre">[reports]</span></code> suffix installs some heavier libraries
- (Matplotlib, PDF reports, sequenticon) for report generation.
- You can omit it if you only want to use DNA Chisel to edit sequences and
- generate Genbank files, although reports are highly recommended for any interactive use.</p>
- <p>Alternatively, you can unzip the sources in a folder and run:</p>
- <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">sudo</span> <span class="n">python</span> <span class="n">setup</span><span class="o">.</span><span class="n">py</span> <span class="n">install</span>
- </pre></div>
- </div>
- <p>Optionally, also install Bowtie to be able to use <code class="docutils literal notranslate"><span class="pre">AvoidMatches</span></code> (which
- removes short homologies with existing genomes). On Ubuntu:</p>
- <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">sudo</span> <span class="n">apt</span><span class="o">-</span><span class="n">get</span> <span class="n">install</span> <span class="n">bowtie</span>
- </pre></div>
- </div>
- </div>
-
- <div class="section" id="more-biology-software">
- <h2>More biology software</h2>
- <a class="reference external image-reference" href="https://edinburgh-genome-foundry.github.io/"><img alt="https://raw.githubusercontent.com/Edinburgh-Genome-Foundry/Edinburgh-Genome-Foundry.github.io/master/static/imgs/logos/egf-codon-horizontal.png" src="https://raw.githubusercontent.com/Edinburgh-Genome-Foundry/Edinburgh-Genome-Foundry.github.io/master/static/imgs/logos/egf-codon-horizontal.png"></a>
- <p>DNA Chisel is part of the <a class="reference external" href="https://edinburgh-genome-foundry.github.io/">EGF Codons</a> synthetic biology software suite for DNA design, manufacturing and validation.</p>
|