Pipeline Design Architect

Tags: data pipeline, ETL, data architecture, streaming, batch processing
Compatible Models: GPT-4, Claude 3, Gemini Pro, GPT-3.5
Use Cases: pipeline design, data orchestration, ETL optimization, real-time processing
Last Updated: July 20, 2025

This prompt helps you design robust, scalable data pipelines that efficiently process data while maintaining quality, reliability, and performance.

Prompt

I'll help you design a robust data pipeline that meets your processing needs. Let me understand your requirements:

**Data characteristics:**
1. What are your data sources? (databases, APIs, files, streams)
2. What's the data volume and velocity? (GB/day, records/second)
3. What formats are you dealing with? (JSON, CSV, Parquet, etc.)
4. Is this batch, streaming, or hybrid processing?

**Processing requirements:**
5. What transformations are needed? (cleaning, aggregation, enrichment)
6. Any data quality requirements or validation rules?
7. What's the acceptable latency? (real-time, near real-time, daily)
8. What are your downstream consumers? (warehouse, analytics, applications)

**Technical context:**
9. What's your current tech stack or preferences?
10. Any scalability requirements? (expected growth)
11. What's your team's expertise level?
12. Any compliance or security requirements?

Based on your answers, I'll provide:

**PIPELINE ARCHITECTURE** - End-to-end design with component selection
**DATA FLOW DESIGN** - Detailed processing stages and transformations
**SCALABILITY STRATEGY** - How to handle growth and peak loads
**MONITORING & RELIABILITY** - Observability and error handling
**IMPLEMENTATION PLAN** - Phased approach with best practices

Share your pipeline requirements and let's build something reliable!
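To make the deliverables concrete, here is a minimal, hypothetical sketch of the kind of batch data flow the prompt might help you design for a small daily ETL job. Every file path, column name, and validation rule below is an illustrative assumption, not part of the prompt itself, and the Parquet step assumes pandas with a Parquet engine such as pyarrow is installed.

```python
# Minimal batch ETL sketch: extract -> validate -> transform -> load.
# Paths, columns, and rules are placeholders for illustration only.
import logging

import pandas as pd

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("orders_pipeline")


def extract(path: str) -> pd.DataFrame:
    """Read raw records from a CSV source."""
    return pd.read_csv(path)


def validate(df: pd.DataFrame) -> pd.DataFrame:
    """Apply simple data quality rules; quarantine bad rows instead of failing."""
    ok = df["order_id"].notna() & (df["amount"] > 0)
    bad = df[~ok]
    if not bad.empty:
        log.warning("quarantining %d invalid rows", len(bad))
        bad.to_csv("orders_rejected.csv", index=False)
    return df[ok]


def transform(df: pd.DataFrame) -> pd.DataFrame:
    """Clean and aggregate: daily revenue per customer."""
    df = df.copy()
    df["order_date"] = pd.to_datetime(df["order_date"]).dt.date
    return df.groupby(["order_date", "customer_id"], as_index=False)["amount"].sum()


def load(df: pd.DataFrame, path: str) -> None:
    """Write curated output in a columnar format for downstream analytics."""
    df.to_parquet(path, index=False)


if __name__ == "__main__":
    raw = extract("orders.csv")
    clean = validate(raw)
    curated = transform(clean)
    load(curated, "daily_revenue.parquet")
    log.info("processed %d raw rows into %d aggregates", len(raw), len(curated))
```

In a real design produced with this prompt, each stage would typically be split into separate orchestrated tasks (for example in Airflow or Dagster) with retries, alerting, and data quality checks appropriate to your stack.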