
Amazon Data Firehose

Amazon Data Firehose is a fully managed service for near-real-time data ingestion. It captures, transforms, and loads streaming data into data lakes, data warehouses, and analytics services, provisioning and scaling capacity automatically.

Processing your data
  • Data ingestion involves moving data from source systems into your chosen storage solution.
  • Use real-time ingestion when the data is needed immediately. Use batch ingestion when some latency is tolerable.

Core Benefits

  • Simplified Ingestion: Load streaming data into destinations such as Amazon S3, Amazon Redshift, and Amazon OpenSearch Service without writing delivery code.
  • Near Real-Time Delivery: Delivers data within seconds to minutes of arrival, depending on the configured buffer size and buffer interval, enabling timely analysis and monitoring.
  • Built-in Transformation: Supports on-the-fly record format conversion (e.g., JSON to Apache Parquet) and custom transformation with AWS Lambda.
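
To make the Lambda transformation step concrete, here is a minimal sketch of a transformation handler. Firehose invokes the function with a batch of base64-encoded records and expects each one back with the same `recordId`, a `result` of `Ok`, `Dropped`, or `ProcessingFailed`, and the re-encoded `data`. The `processed` field added below is a hypothetical transformation chosen only for illustration, and the sketch assumes each incoming record is a single JSON object.

```python
import base64
import json


def lambda_handler(event, context):
    """Transform each Firehose record and return it in the expected response shape."""
    output = []
    for record in event["records"]:
        try:
            # Firehose passes each record's payload base64-encoded.
            payload = json.loads(base64.b64decode(record["data"]))
            if not isinstance(payload, dict):
                raise ValueError("expected a JSON object")
        except (ValueError, KeyError):
            # Unparseable input: return the original data, flagged as failed.
            output.append({
                "recordId": record["recordId"],
                "result": "ProcessingFailed",
                "data": record.get("data", ""),
            })
            continue

        # Hypothetical transformation: mark the record as processed.
        payload["processed"] = True

        # Append a newline so delivered objects hold one JSON document per line.
        encoded = base64.b64encode((json.dumps(payload) + "\n").encode("utf-8"))
        output.append({
            "recordId": record["recordId"],
            "result": "Ok",
            "data": encoded.decode("utf-8"),
        })
    return {"records": output}
```

Returning failed records with `ProcessingFailed` rather than raising keeps one bad record from failing the whole batch.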

Role in Data Pipeline

Data Ingestion: Firehose acts as a simple, reliable bridge that loads streaming data from sources directly into storage or analytics destinations. It often serves as the delivery layer for Amazon Kinesis Data Streams, reading from a stream and writing to the destination.

Use Cases

  • IoT Data Loading: Streaming sensor data directly into an Amazon S3 data lake for archival and analysis.
  • Log Delivery: Delivering application logs from thousands of sources to Amazon OpenSearch Service for real-time monitoring.
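
For producers such as the IoT and log sources above, a common pattern is to send events through the `PutRecordBatch` API, which accepts at most 500 records per call. The following is a minimal sketch under stated assumptions: `client` is expected to be a boto3 Firehose client (for example `boto3.client("firehose")`), the stream name is supplied by the caller, and the helper names `to_records` and `send_in_batches` are hypothetical. It does not enforce the per-call payload size limit and leaves retries to the caller.

```python
import json

MAX_RECORDS_PER_CALL = 500  # PutRecordBatch accepts at most 500 records per request


def to_records(events):
    """Serialize dicts as newline-delimited JSON records."""
    return [{"Data": (json.dumps(e) + "\n").encode("utf-8")} for e in events]


def send_in_batches(client, stream_name, events):
    """Send events via PutRecordBatch; return the records reported as failed."""
    records = to_records(events)
    failed = []
    for start in range(0, len(records), MAX_RECORDS_PER_CALL):
        batch = records[start:start + MAX_RECORDS_PER_CALL]
        response = client.put_record_batch(
            DeliveryStreamName=stream_name, Records=batch
        )
        if response.get("FailedPutCount", 0) > 0:
            # RequestResponses lines up one-to-one with the submitted batch;
            # failed entries carry an ErrorCode.
            for record, result in zip(batch, response["RequestResponses"]):
                if "ErrorCode" in result:
                    failed.append(record)
    return failed
```

A partial failure does not raise an exception, so checking `FailedPutCount` and resending the returned records is what keeps delivery reliable on the producer side.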

Firehose is the simplest way to reliably load streaming data into AWS destinations, handling all the scaling, batching, and delivery logistics automatically.

Use case: Ideal for streaming ETL pipelines that need to reliably deliver data to storage or analytics tools with minimal management.
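
The batching mentioned above is governed by two buffering hints, a size and an interval, and a batch is delivered when either threshold is reached first. The toy model below illustrates that rule only. It is not how Firehose is implemented, the class name `ToyBuffer` is hypothetical, and unlike a real service it checks the elapsed time only when a new record arrives.

```python
class ToyBuffer:
    """Toy model of size-or-interval buffering; not Firehose's actual implementation."""

    def __init__(self, max_bytes, max_seconds):
        self.max_bytes = max_bytes
        self.max_seconds = max_seconds
        self.items = []
        self.size = 0
        self.opened_at = None  # arrival time of the first buffered record

    def add(self, record, now):
        """Buffer a record; return a flushed batch if a threshold is hit, else None."""
        if self.opened_at is None:
            self.opened_at = now
        self.items.append(record)
        self.size += len(record)
        too_big = self.size >= self.max_bytes
        too_old = now - self.opened_at >= self.max_seconds
        return self.flush() if (too_big or too_old) else None

    def flush(self):
        """Return the buffered records and reset the buffer."""
        batch, self.items, self.size, self.opened_at = self.items, [], 0, None
        return batch
```

Larger thresholds produce fewer, bigger objects at the destination at the cost of higher delivery latency; smaller thresholds do the reverse.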

Additional Resources