SWIFT finishing software

 

Finishing has long been the bottleneck in the sequencing process. Finishing draft sequences with existing software requires extensive and time-consuming human intervention. Identifying and resolving misalignments, choosing templates, and selecting appropriate chemistries are common tasks that require the attention of the finisher. To streamline this process, we have optimized a combination of software and laboratory tools that drastically reduces user intervention in finishing. Moreover, this strategy enables us to readily finish clones drafted at other sites. The strategy is based on three laboratory techniques that are tied together by software we have developed. The first technique is a standard application of in vitro transposon-based sequencing. The second is long-range PCR. The third is an approach that we call “universal template sequencing”.

 

The SWIFT software is designed to identify problem areas using input reads assembled by PHRAP and then generate a worklist to resolve those areas. The software also interacts with a relational database to maintain a log that tracks the status of finishing projects. The software is first run on a BAC clone to identify and tag possible regions of misassembly (i.e. high quality base discrepancies in repeat sequences or misoriented plasmid read pairs). It generates a list of misassembled reads and can automatically re-run PHRAP to remove high quality discrepancies. After the assembly is sorted, the software automatically identifies both the gaps requiring additional sequence data and the sequence regions considered to be of low quality (below a phred quality value of 30). The software may be run in two modes: custom oligos can be picked (using the Primer3 software from MIT/Whitehead) to resolve low quality regions and fill gaps by PCR or “universal template” sequencing, or plasmid subclones spanning the low quality regions and gaps can be picked for subsequent transposon-based sequencing. A worklist is then generated, which can be viewed and edited on the Finishing webpage. Sequencing is carried out and the new sequences are added to the assembly (either by a PHRAP re-assembly or, in the case of a repetitive clone, by adding a discrete file of reads to the existing assembly). The software may be run again if any gaps or low quality regions remain unresolved.
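The low quality scan described above can be illustrated with a minimal Python sketch. It is not the SWIFT implementation: the function name, the plain list of per-base consensus quality values, and the half-open interval convention are all assumptions made for illustration. Only the phred 30 cutoff comes from the text.

```python
from typing import List, Tuple

# Quality cutoff taken from the text: bases below phred 30 count as low quality.
Q_THRESHOLD = 30


def low_quality_regions(
    quals: List[int], threshold: int = Q_THRESHOLD
) -> List[Tuple[int, int]]:
    """Return half-open (start, end) index intervals where quality < threshold.

    `quals` is a hypothetical list of per-base consensus quality values;
    how SWIFT actually stores or reads qualities is not described in the text.
    """
    regions: List[Tuple[int, int]] = []
    start = None  # start index of the low quality run currently being scanned
    for i, q in enumerate(quals):
        if q < threshold:
            if start is None:
                start = i  # a new low quality run begins here
        elif start is not None:
            regions.append((start, i))  # the run ended at the previous base
            start = None
    if start is not None:
        # The consensus ended while still inside a low quality run.
        regions.append((start, len(quals)))
    return regions


if __name__ == "__main__":
    example = [40, 40, 20, 25, 40, 10]
    print(low_quality_regions(example))
```

Each interval returned this way could become one entry on a worklist, to be resolved either with a custom oligo or with a spanning plasmid subclone, depending on the mode chosen.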

 

In vitro plasmid transposition is performed using the Genome Priming System kit from New England Biolabs. To expedite this process, the arraying of plasmid clones and the subsequent transposition reaction set-up have been programmed on the Biomek robotic system. The transposed plasmids are then used to set up transformation reactions (in 96 well format) with chemically competent E. coli on the Biomek. A total of 24-48 colonies per transposed plasmid are then picked using the QPix colony picker and inoculated in 96 well plates for template preparation using a filter prep, which has also been programmed to run on the Biomek. The template DNA is then sequenced (using the two universal transposon-based primers) with Big Dye terminator chemistry. This method enables us to resolve multiple clustered low quality regions and to fill large gaps in one sequencing round with a minimal number of templates. Additionally, the insertion of the transposon into the sequencing template often helps to disrupt secondary structures that would otherwise impede sequencing reactions. For increased efficiency we often pool plasmids prior to transposition. Pooling reduces the cost of transposon-based finishing by a factor equal to the number of plasmids pooled. However, repetitive clones may necessitate individual plasmid transpositions to aid in assembly sorting.
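The pooling arithmetic can be made concrete with a short Python sketch. It assumes that cost scales with the number of transposition reactions, and the plate size and pool size used in the example are hypothetical values chosen only for illustration.

```python
import math


def transposition_reactions(n_plasmids: int, pool_size: int = 1) -> int:
    """Number of transposition reactions needed for n_plasmids.

    pool_size = 1 means every plasmid is transposed individually.
    Assumes one reaction per pool; the last pool may be only partly filled.
    """
    if n_plasmids < 0 or pool_size < 1:
        raise ValueError("n_plasmids must be >= 0 and pool_size must be >= 1")
    return math.ceil(n_plasmids / pool_size)


if __name__ == "__main__":
    plasmids = 96  # hypothetical: one full plate of plasmid subclones
    individual = transposition_reactions(plasmids)
    pooled = transposition_reactions(plasmids, pool_size=8)  # hypothetical pool size
    print(individual, pooled, individual / pooled)
```

The reduction equals the pool size exactly only when the number of plasmids divides evenly into pools; otherwise the partly filled final pool makes the saving slightly smaller.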

 

Long-range PCR targets regions of up to 20 kb. PCR primers are chosen to flank gaps and/or low quality regions. The targeted regions are amplified from either purified BAC DNA or genomic DNA, and the PCR products are subsequently subcloned and sequenced with the transposon method detailed above. This technique has the advantage of obviating the need for a plasmid subclone library when finishing clones drafted at other sites.
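One way to picture how neighboring gaps and low quality regions might share a single long-range PCR product is the following Python sketch. The greedy grouping rule and the function name are assumptions for illustration, not a description of how SWIFT selects targets. Only the 20 kb ceiling comes from the text, and the extra margin needed for the flanking primers is ignored here.

```python
from typing import List, Tuple

# Upper limit on a long-range PCR target, from the text (20 kb).
MAX_SPAN = 20_000


def group_targets(
    regions: List[Tuple[int, int]], max_span: int = MAX_SPAN
) -> List[Tuple[int, int]]:
    """Greedily merge (start, end) problem regions into PCR targets.

    A region joins the current target if the combined span stays within
    max_span; otherwise it starts a new target. A single region longer than
    max_span is returned unchanged and would need separate handling.
    """
    targets: List[Tuple[int, int]] = []
    for start, end in sorted(regions):
        if targets:
            t_start, t_end = targets[-1]
            merged_end = max(t_end, end)
            if merged_end - t_start <= max_span:
                targets[-1] = (t_start, merged_end)  # extend the current target
                continue
        targets.append((start, end))  # begin a new target
    return targets


if __name__ == "__main__":
    # Hypothetical consensus coordinates of gaps and low quality regions.
    problems = [(1000, 1500), (5000, 5200), (30000, 30500)]
    print(group_targets(problems))
```

Primers would then be picked just outside each merged interval, for example with Primer3, so that every amplicon covers all of the problem regions grouped into it.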

 

“Universal Template” sequencing: This title differentiates the process from universal primer sequencing methods, in which one strategy or another (shearing, ExoIII deletion, etc.) is used to bring the vector-borne universal sequencing primer adjacent to the targeted region. In universal template sequencing we use the same template (a BAC or pools of BACs) for all finishing reactions. The universal template is then paired with custom oligos. This approach has substantial workflow advantages over universal primer sequencing. Sequencing directly from high molecular weight fragments has historically been difficult. To circumvent this problem we made several changes to typical BAC sequencing protocols. Importantly, the BAC DNA is sheared into 0.5-1 kb fragments by nebulization. This sheared DNA is then used as the template for all sequencing reactions, thereby greatly reducing the reaction set-up time required by the traditional method of matching each oligo to a specified set of subclones. Custom oligos are ordered in a 96 well format to expedite re-suspension and reaction set-up. Sequencing reactions are performed with Big Dye terminator chemistry using cycling parameters modified for universal template sequencing.

To further streamline the “universal template” finishing technology we have experimented with pooling BAC DNA preps. We tested pools of 4, 6 and 8 BACs. We grow the BACs separately (this allows us to normalize the DNA yield of each clone) and mix the cultures prior to DNA isolation and purification. DNA is isolated, purified and nebulized using our normal conditions for direct BAC sequencing. Sequencing is carried out on the pools using 1x or 0.5x of the recommended amount of ABI Big Dye 3.1 mix.

 

A standard automated system such as ours will help to reduce the number of highly trained technicians needed to achieve high quality finished sequence.  We believe this strategy will greatly increase the efficiency of the finishing process in terms of cost, labor and speed of clone completion.