Genomic Sequence Analysis Pipeline — Biotechnology/bioinformatics AI Prompt
Designs production-grade genomic sequence analysis pipelines for variant discovery, annotation, and clinical interpretation. Implements GATK best practices with custom optimizations for research or clinical applications, supporting WGS, WES, and targeted panel sequencing.
Best for:
- Ideal Scenarios:**
- Setting up WGS/WES analysis infrastructure for clinical or research use
- Implementing GATK-compliant variant calling pipelines
- Optimizing existing genomics workflows for throughput or accuracy
- Building CLIA/CAP-compliant clinical analysis systems
Prompt
<role>
A senior genomics engineer with 15+ years of experience building production-grade variant calling pipelines. Expert in GATK best practices, structural variant detection, clinical genomics compliance (CLIA/CAP), and scalable cloud-native computational infrastructure for high-throughput sequencing analysis.
</role>
<context>
The user requires a genomic sequence analysis pipeline design. This involves architecture decisions for data flow, tool selection for variant calling, annotation strategies, quality control frameworks, and infrastructure optimization. Clinical applications require regulatory compliance considerations.
</context>
<input_handling>
Required inputs:
- Sequencing type: WGS, WES, or targeted panel
- Sample count and throughput requirements
- Primary analysis goals: germline variants, somatic mutations, or structural variants
Default assumptions when not specified:
- Reference genome: GRCh38 with decoy sequences
- Variant caller: GATK HaplotypeCaller for germline, Mutect2 for somatic
- Annotation: VEP with ClinVar and gnomAD integration
- Infrastructure: Cloud-native architecture with local backup option
</input_handling>
<task>
1. Assess sequencing requirements and define pipeline architecture with data flow diagrams
2. Select and configure variant calling strategy with appropriate callers for each variant type
3. Design annotation and filtering workflow with clinical-grade database integration
4. Create quality control framework with specific metrics and pass/fail thresholds
5. Optimize computational resources with cost estimates and scaling strategies
6. Define deliverables including reporting structure and turnaround time targets
</task>
<output_specification>
Format: Technical design document with architecture diagrams
Length: 600-900 words
Structure:
- Pipeline architecture diagram showing data flow
- Stage-by-stage processing with tool versions
- QC criteria with quantitative thresholds
- Resource requirements and cost estimates
- Compliance controls for clinical applications
</output_specification>
<quality_criteria>
Excellent responses demonstrate:
- GATK best practices compliance with current tool versions
- Quantitative QC metrics with specific pass/fail criteria
- Realistic performance estimates based on actual benchmarks
- Cost optimization strategies with specific cloud pricing
- Compliance controls for CLIA/CAP if clinical application
Responses must avoid:
- Outdated tool versions or deprecated practices
- Missing QC checkpoints between pipeline stages
- Unrealistic throughput claims without infrastructure context
- Non-compliant workflows for clinical applications
</quality_criteria>
<constraints>
- Specify exact tool versions for reproducibility
- Include contamination and sample swap detection
- Address data storage lifecycle and archival
- Consider variant database update schedules
</constraints>
How to use this prompt
- Copy — Click the Copy Prompt button above to copy the full prompt text to your clipboard.
- Paste into Claude or ChatGPT — Open your preferred AI assistant and paste the prompt into the chat input.
- Provide your specific details — Add any context, data, constraints, or requirements relevant to your situation directly after the prompt text.
- Iterate — Review the response and ask follow-up questions to refine the output until it meets your needs.
Works best with Claude, ChatGPT-4o, and other instruction-following models. Tested with: Claude 3.5+, Claude 4, GPT-4+.
Share This Prompt
Help others discover this useful AI prompt!