To produce a protein array, a team needed to automate the cloning of 1,000 genes amplified from the genomic DNA of a plant pathogen.
Protein array of plant pathogen effector proteins
An academic team of scientists from multiple institutions wanted to characterize the biological function of 1,000 candidate effector proteins of a plant pathogen. They had received a grant from the USDA to develop a protein microarray. After creating this valuable resource, the team would share it to perform various functional assays. Upon completion of the project, this protein microarray would become publicly available and shared with the scientific community.
Since the pathogen genome was available, it was technically possible to order the synthesis of all 1,000 genes. However, the cost of gene synthesis would have been prohibitive. A quick calculation revealed that cloning the genes would be cheaper.
While the team had standard molecular biology skills, they were not prepared to handle a project of this magnitude. They quickly realized that their current cloning method, which handled one gene at a time, would not be practical for 1,000 genes. Speed was critical, given the intense pressure to meet project deadlines. Delays in the cloning phase would have held up the rest of the scientific program, which hinged on using the protein microarray.
To begin with, laying out the entire process was key to the project's success. This involved customizing a Laboratory Information Management System (LIMS) data model to track all the samples generated in the project. For this, the team turned to GenoFAB for help. We at GenoFAB developed dashboards to track the progress of their project. Custom reporters closely monitored the status of each individual clone.
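To make the idea of per-clone tracking concrete, here is a minimal Python sketch of what such a data model could look like. It is an illustration only: the class names, status values, and the dashboard_summary helper are hypothetical and do not describe the actual LIMS schema used in the project.

```python
from collections import Counter
from dataclasses import dataclass, field
from enum import Enum


class CloneStatus(Enum):
    # Hypothetical stages, loosely following the workflow described in this article.
    PRIMERS_DESIGNED = "primers designed"
    PCR_DONE = "PCR done"
    SIZE_VERIFIED = "size verified"
    CLONED = "cloned"
    SEQUENCE_VERIFIED = "sequence verified"
    FAILED = "failed"


@dataclass
class CloneRecord:
    gene_id: str
    status: CloneStatus = CloneStatus.PRIMERS_DESIGNED
    history: list = field(default_factory=list)

    def advance(self, new_status: CloneStatus) -> None:
        """Record the previous status, then move the clone to a new one."""
        self.history.append(self.status)
        self.status = new_status


def dashboard_summary(records):
    """Count how many clones currently sit in each status."""
    return Counter(record.status for record in records)


if __name__ == "__main__":
    records = [CloneRecord(f"gene_{i:04d}") for i in range(1, 4)]
    records[0].advance(CloneStatus.PCR_DONE)
    records[1].advance(CloneStatus.FAILED)
    for status, count in dashboard_summary(records).items():
        print(f"{status.value}: {count}")
```

Keeping the status history on each record is one way to let a dashboard report both the current state of the collection and where individual clones stalled.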
Automated primer design
In order to succeed, the team needed a high-throughput, automated method for designing DNA primers. For one thing, they needed to design 2,000 primers with similar characteristics, such as melting temperature and overhang sequences. They also needed to design thousands of additional primers to verify the cloned sequences. Further, many of the genes were too long to sequence simply using a universal primer and required extra primers to complete the job.
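As a rough illustration of designing primers with similar melting temperatures and fixed overhangs, the Python sketch below extends each annealing region one base at a time until it reaches a target melting temperature. It is a simplification under stated assumptions: the melting temperature uses the Wallace rule of thumb (2 °C per A/T, 4 °C per G/C), which is only approximate and intended for short oligos, and the function names, length limits, target temperature, and overhang sequences are all hypothetical. It does not represent the method actually used in the project.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def wallace_tm(seq: str) -> float:
    """Rule-of-thumb melting temperature: 2 degrees per A/T, 4 per G/C."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return 2.0 * at + 4.0 * gc


def pick_annealing(template: str, target_tm: float = 60.0,
                   min_len: int = 18, max_len: int = 30) -> str:
    """Take the shortest 5' prefix of the template whose Tm reaches the target.

    Falls back to the longest allowed prefix if the target is never reached.
    """
    for length in range(min_len, max_len + 1):
        candidate = template[:length]
        if wallace_tm(candidate) >= target_tm:
            return candidate
    return template[:max_len]


def design_primer_pair(gene: str, fwd_overhang: str, rev_overhang: str,
                       target_tm: float = 60.0):
    """Return (forward, reverse) primers, each as overhang + annealing region."""
    gene = gene.upper()
    fwd = pick_annealing(gene, target_tm)
    rev = pick_annealing(reverse_complement(gene), target_tm)
    return fwd_overhang + fwd, rev_overhang + rev


if __name__ == "__main__":
    # Made-up gene fragment and placeholder overhangs, for illustration only.
    gene = "ATGGCTAGCAAGGGCGAGGAGCTGTTCACCGGGGTGGTGCCCATCCTGGTCGAGCTGGACTAA"
    forward, reverse = design_primer_pair(gene, "GGGG", "CCCC")
    print(forward)
    print(reverse)
```

Applying the same function to every gene in a list is what makes the approach scale to thousands of primers. A production tool would typically use a nearest-neighbor melting temperature model and screen for hairpins and primer dimers.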
Automating the cloning workflow
First, individual process steps were optimized to maximize stability and reproducibility. The goal of this preliminary work was to reduce the rate of failed PCR and cloning reactions. With so many reactions, even a low failure rate would quickly multiply costs and cause significant delays. Detailed Standard Operating Procedures were developed to allow laboratory technicians with limited molecular biology experience to contribute to the project.
As the project moved forward, it became necessary to automate the analysis of quality control data. At such a scale, it was simply not possible to "look" at the data one by one. Visual examination of all sequencing traces would have been excessively time consuming and would have lacked reproducibility. Thus, a rigorous automated data analysis process was necessary.
The first level of quality control was capillary electrophoresis performed on an Agilent TapeStation. One of the main benefits of using this instrument rather than performing traditional agarose-gel electrophoresis is that it can export electropherograms as spreadsheets. A custom script analyzed the electropherogram data and compared it to the expected size of the PCR product.
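The size check described above can be sketched in a few lines of Python. This is not the project's script: the CSV layout (columns named size_bp and intensity), the function names, and the 10% tolerance are assumptions made for illustration, and a real instrument export would need its own column mapping.

```python
import csv
import io


def main_peak_size(csv_text: str, size_col: str = "size_bp",
                   signal_col: str = "intensity") -> float:
    """Return the fragment size (bp) of the strongest peak in an exported table.

    The column names are hypothetical placeholders for the real export format.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        raise ValueError("electropherogram table is empty")
    strongest = max(rows, key=lambda row: float(row[signal_col]))
    return float(strongest[size_col])


def passes_size_check(observed_bp: float, expected_bp: float,
                      tolerance: float = 0.10) -> bool:
    """True if the observed size is within a relative tolerance of the expected size."""
    return abs(observed_bp - expected_bp) <= tolerance * expected_bp


if __name__ == "__main__":
    # Made-up data for illustration only.
    sample = "size_bp,intensity\n100,5\n1200,80\n1500,12\n"
    observed = main_peak_size(sample)
    print(observed, passes_size_check(observed, expected_bp=1250))
```

Reducing each reaction to a pass or fail value is what lets the results feed a tracking system instead of requiring someone to inspect every trace.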
Each cloned plasmid that passed this first level of quality control was then sent to a sequencing facility for a more rigorous verification. We developed a script to analyze the resulting sequencing reads by comparing them to the desired gene sequence.
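A simplified version of this comparison is sketched below in Python. It places each read on the reference at the ungapped position with the fewest mismatches, then checks that every read is within a mismatch limit and that the reads together cover the whole gene. The function names and thresholds are hypothetical, and the sketch ignores insertions, deletions, base quality scores, and reverse-strand reads, all of which a real pipeline would handle with a proper alignment tool.

```python
def best_ungapped_match(read: str, reference: str):
    """Return (offset, mismatches) for the best ungapped placement of a read.

    An 'N' in the read is treated as matching any base.
    """
    read, reference = read.upper(), reference.upper()
    if len(read) > len(reference):
        raise ValueError("read is longer than the reference")
    best = None
    for offset in range(len(reference) - len(read) + 1):
        window = reference[offset:offset + len(read)]
        mismatches = sum(1 for r, w in zip(read, window) if r != w and r != "N")
        if best is None or mismatches < best[1]:
            best = (offset, mismatches)
    return best


def verify_clone(reads, reference: str, max_mismatches: int = 0) -> bool:
    """True if every read agrees with the reference and the reads cover all of it."""
    covered = [False] * len(reference)
    for read in reads:
        offset, mismatches = best_ungapped_match(read, reference)
        if mismatches > max_mismatches:
            return False
        for position in range(offset, offset + len(read)):
            covered[position] = True
    return all(covered)


if __name__ == "__main__":
    # Made-up sequence for illustration only.
    reference = "ATGGCTAGCAAGGGCGAGGAGCTGTTCACCGGGGTGGTGCCC"
    reads = [reference[:25], reference[20:]]  # two overlapping reads
    print(verify_clone(reads, reference))
```

The coverage check reflects the point made earlier about long genes: a single read from a universal primer may match perfectly and still leave part of the insert unverified.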
The team produced a collection of 900 plasmids—more than 90% of the target genes—in under 4 months. The plasmid collection was shared with the team of collaborators, who were able to use it to produce the protein microarray.
The scripts developed for this project have been integrated into GenoFAB's plasmid construction data services.