Inactive
Notice ID: 36C24E19Q0055
The Department of Veterans Affairs research office is looking for sources capable of meeting the following requirements for Pilot Testing of Platform Computational Capabilities for Whole Genome Sequences (WGS). To assess the computational capabilities, including the types of tasks supported, scientific accuracy, run time, and cost, we propose a list of computational tasks based on about 2000 WGS. The VA has a requirement for both tasks A and B listed below and is trying to identify whether there are companies that can complete both A and B.

This notice is not a request for competitive proposals; however, any responsible source who believes it is capable of meeting the requirement may submit a capability statement to the contracting office no later than Monday 5/16/19 at 12 PM, EST. Interest/capability statements may be sent to Lynn Portman at lynn.portman@va.gov and should include company name, address, point of contact, business size, SDVOSB/VOSB status, and a description of whether the vendor can fulfill tasks A and B. This notice is to assist the Government in determining sources only. A solicitation is not currently available. If a solicitation is issued, all interested parties must respond to that solicitation announcement separately from the responses to this announcement.

Dataset:
Raw whole genome sequence data derived from blood samples of 2000 humans. Sequencing was performed by Illumina HiSeq, a short-read paired-end sequencing technology. Average genome coverage was aimed at 30X. In total, this sequencing effort resulted in 999 BAM files, each of 200 GB, amounting to 200 TB of sequencing data.

Desired Computational Tasks and Deliverables:
A comprehensive assessment should be performed on the following computational tasks, which cover upstream data processing to downstream bioinformatic analysis. Two phases are proposed; the respondent can provide plans for one or both tasks.

TASK A

A1. Germline Variant Calling
Computational tasks:
  - Alignment by BWA-MEM, and variant calling by GATK4, following the best practices guideline by the Broad Institute.
Deliverables:
  - Benchmarking result using NA12878 to demonstrate scientific accuracy
  - Full variant calling results, including:
    - Alignment file (e.g. BAM)
    - Variant calling results (VCF)
  - Run time
  - Cost

A2. Quality Assessment
Computational tasks:
  - Assess the quality of data at the levels of raw reads (FASTQ), alignments (BAM), and variants (VCF).
Deliverables:
  - QC results for:
    - File integrity
    - Raw reads:
      - Total number of reads
      - Base sequence quality
      - Read quality
      - GC content
    - Alignments:
      - Mapped reads
      - Paired reads
    - Variants:
      - Counts of SNV, Indels
      - SNV Ti/Tv
      - SNV Het/Hom
  - Run time
  - Cost

A3. Automating and Optimizing the Data Processing Tasks
Computational tasks:
  - Demonstrate the mechanisms to automate various data processing tasks, such as variant calling and quality assessment.
  - Demonstrate the mechanisms to track performances of various data processing steps and potentially optimize them.
Deliverables:
  - Description of the mechanisms
  - Proof-of-concept examples and results

TASK B

B1. Information Retrieval and Data Management
Computational tasks:
  - Demonstrate the mechanisms and capability of sharing data with other researchers, e.g. to enable dozens of researchers to simultaneously compute on the same data.
  - Query genotypes and allele frequencies for certain positions across a subset or the entire sequencing cohort.
  - Query metadata and phenotypes of the samples.
  - Demonstrate the mechanisms of tracking and auditing data usage.
Deliverables:
  - Description of the framework
  - Proof-of-concept examples and results
  - Run time
  - Cost

B2. Scalable Informatic Analysis
Computational tasks:
  - Demonstrate the types of analysis that the analytical framework is able to perform. The most common genomic analyses include functional annotation of DNA variants, GWAS, GO enrichment, pathway enrichment analysis, etc.
  - Demonstrate the scalability of the analytical framework on various computational tasks, and project run times on increasing numbers of genomes.
Deliverables:
  - Description of the framework
  - Proof-of-concept examples and results
  - Run time
  - Cost
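
For illustration only, the variant-level QC metrics named under A2 (counts of SNVs and indels, SNV Ti/Tv, SNV Het/Hom) could be computed along the following lines. This is a minimal sketch, not a method specified by the VA. It assumes a single-sample, uncompressed VCF in which GT is the first FORMAT subfield, and it skips multi-allelic, missing, and homozygous-reference calls. The names `variant_qc`, `rec`, and `EXAMPLE_VCF` are hypothetical, and the example records are made up for demonstration.

```python
BASES = set("ACGT")
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def variant_qc(vcf_lines):
    """Return SNV/indel counts plus SNV Ti/Tv and Het/Hom for one sample."""
    c = {"snv": 0, "indel": 0, "ti": 0, "tv": 0, "het": 0, "hom": 0}
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        ref, alt = fields[3].upper(), fields[4].upper()
        # Assumption: GT is the first colon-separated subfield of the sample column.
        alleles = fields[9].split(":")[0].replace("|", "/").split("/")
        if "," in alt or "." in alleles or set(alleles) == {"0"}:
            continue  # skip multi-allelic, missing, and homozygous-reference calls
        if ref in BASES and alt in BASES:
            c["snv"] += 1
            c["ti" if (ref, alt) in TRANSITIONS else "tv"] += 1
            c["het" if len(set(alleles)) > 1 else "hom"] += 1
        elif set(ref) <= BASES and set(alt) <= BASES and len(ref) != len(alt):
            c["indel"] += 1
    # Ratios are None when the denominator is zero.
    c["ti_tv"] = c["ti"] / c["tv"] if c["tv"] else None
    c["het_hom"] = c["het"] / c["hom"] if c["hom"] else None
    return c


def rec(pos, ref, alt, gt):
    """Build one tab-separated VCF record for a made-up sample (illustrative only)."""
    return "\t".join(["chr1", str(pos), ".", ref, alt, "50", "PASS", ".", "GT", gt])


# Made-up records: three transitions, one transversion, one deletion.
EXAMPLE_VCF = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSAMPLE",
    rec(100, "A", "G", "0/1"),
    rec(200, "C", "T", "1/1"),
    rec(300, "A", "C", "0/1"),
    rec(400, "AT", "A", "0/1"),
    rec(500, "G", "A", "0|1"),
]

if __name__ == "__main__":
    print(variant_qc(EXAMPLE_VCF))
```

On the made-up records, the sketch counts 4 SNVs and 1 indel, with Ti/Tv and Het/Hom both equal to 3.0. A real deliverable would more likely rely on established tooling and would need to handle multi-sample files, multi-allelic sites, and compressed input.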