Introduction
InfluxDB 3 represents a complete rebuild of the core engine, designed to overcome the limitations of earlier versions and to support the growing needs of time series workloads.
This new architecture removes the practical limit on series cardinality, stores data in low-cost object storage, makes SQL a primary query language, and broadens analytical capabilities.
Built on open-source foundations, InfluxDB 3 uses the FDAP stack—Flight, DataFusion, Apache Arrow, and Parquet—with Rust at its core.
From Legacy Architecture to InfluxDB 3
Challenges of Previous Versions
- High-cardinality data was difficult to handle, leading to memory pressure and performance degradation.
- Storage systems were tied to local disk or SSDs, making historical data retention expensive and less scalable.
- Analytical queries were limited in flexibility and struggled with large, complex workloads.
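To make the cardinality problem concrete, the sketch below counts series as distinct combinations of measurement and tag set. The function name and the sample tags (host, container_id) are hypothetical and chosen only for illustration; this is not code from any InfluxDB version.

```python
def series_cardinality(points):
    """Count distinct series, where a series is a measurement plus its full tag set."""
    series = set()
    for measurement, tags in points:
        # Sort tag pairs so that tag order does not create a "new" series.
        series.add((measurement, tuple(sorted(tags.items()))))
    return len(series)


# Hypothetical workload: 3 hosts, each running 4 short-lived containers.
points = [
    ("cpu", {"host": f"host-{h}", "container_id": f"c-{h}-{c}"})
    for h in range(3)
    for c in range(4)
]

print(series_cardinality(points))  # 12
```

Every new container ID creates a new series, so any structure that keeps one entry per series grows with the churn of tag values rather than with the volume of data written.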
The New Foundation
- FDAP stack: A modern data architecture based on Flight, DataFusion, Arrow, and Parquet.
- Rust: Chosen for its memory safety, high performance, and concurrency model.
- Object storage: Enables cost-effective retention of cold data while keeping recent data hot in memory.
What’s New in InfluxDB 3
Key Features
- Unlimited cardinality: No restrictions on the number of unique tags or series, ensuring scalability for modern workloads.
- Native SQL support: SQL is now the primary query language, while InfluxQL remains available for compatibility.
- Performance improvements: Optimized ingestion, query execution, and caching layers for both hot and cold data.
- Object storage integration: Seamless support for low-cost, cloud-based storage backends.
- Ecosystem compatibility: Standard formats (Arrow, Parquet) make data easier to use with BI tools and analytics platforms.
Understanding the FDAP Stack
Components
- Apache Arrow: Provides a high-performance in-memory columnar format for efficient analytics.
- Apache Parquet: Enables compressed, optimized storage with metadata for fast filtering.
- DataFusion: A query engine that executes SQL directly on Arrow and Parquet datasets.
- Arrow Flight / FlightSQL: A transport protocol that delivers data in Arrow format across the network with minimal overhead.
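The benefit of a columnar layout can be illustrated without Arrow itself. The sketch below uses plain Python lists as a stand-in for Arrow arrays, an assumption made to keep the example free of dependencies; it is not the Arrow API, and the helper name to_columns is hypothetical.

```python
# Row-oriented: each record keeps all of its fields together.
rows = [
    {"location": "us-midwest", "temperature": 82.0},
    {"location": "us-east", "temperature": 76.0},
    {"location": "us-west", "temperature": 88.0},
]


def to_columns(rows):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# An aggregate over one field now reads a single list
# and never touches the other columns.
temps = columns["temperature"]
print(sum(temps) / len(temps))  # 82.0
```

Analytical queries typically read a few fields across many rows, which is the access pattern this layout favors.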
How It Works
Data arrives as line protocol and is buffered in memory as Arrow RecordBatches.
Periodically, the buffered data is written out as Parquet files for long-term storage in an object store.
Queries combine hot data from memory with cold data from Parquet, and Arrow Flight streams the results to clients in Arrow format.
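The flow described above can be sketched as a toy model. The class below is an illustration under stated assumptions, not InfluxDB code: Python lists stand in for Arrow RecordBatches and Parquet files, the flush is triggered by a row count rather than by time, and the names ToyStore and flush_threshold are hypothetical.

```python
class ToyStore:
    """Toy model of a hot buffer plus immutable cold chunks (not InfluxDB code)."""

    def __init__(self, flush_threshold=3):
        self.flush_threshold = flush_threshold
        self.hot = []   # recent rows held in memory: (timestamp, value)
        self.cold = []  # flushed chunks: {"min": ts, "max": ts, "rows": [...]}

    def write(self, timestamp, value):
        self.hot.append((timestamp, value))
        if len(self.hot) >= self.flush_threshold:
            self.flush()

    def flush(self):
        """Persist the hot buffer as one immutable chunk with min/max metadata."""
        if not self.hot:
            return
        rows = sorted(self.hot)
        self.cold.append({"min": rows[0][0], "max": rows[-1][0], "rows": rows})
        self.hot = []

    def query(self, start, end):
        """Return rows with start <= timestamp < end from cold and hot data."""
        result = []
        for chunk in self.cold:
            # Skip chunks whose time range cannot overlap the query window.
            if chunk["max"] < start or chunk["min"] >= end:
                continue
            result.extend(r for r in chunk["rows"] if start <= r[0] < end)
        result.extend(r for r in self.hot if start <= r[0] < end)
        return sorted(result)


store = ToyStore(flush_threshold=3)
for ts in range(1, 8):  # timestamps 1..7
    store.write(ts, ts * 10.0)

print(len(store.cold), len(store.hot))  # 2 1
print(store.query(4, 7))  # [(4, 40.0), (5, 50.0), (6, 60.0)]
```

The min/max check in query plays the role that file metadata plays for Parquet: chunks that cannot contain matching rows are skipped without being read.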
Practical Examples
1. Writing Data with Line Protocol
weather,location=us-midwest temperature=82 1672531200000000000
weather,location=us-east temperature=76 1672534800000000000
weather,location=us-west temperature=88 1672538400000000000
Each line includes a measurement (weather), a tag (location), a field (temperature), and a timestamp in nanoseconds.
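Lines like the ones above are usually generated by a client library. As a minimal sketch of what such a library does, the helper below builds a line from its parts and applies the basic line protocol escaping rules. The function names are hypothetical, and edge cases such as NaN values or unsigned integers are deliberately left out.

```python
def _escape(value, chars):
    """Backslash-escape each character in `chars` that occurs in `value`."""
    for ch in chars:
        value = value.replace(ch, "\\" + ch)
    return value


def _format_field(value):
    # bool must be checked before int, because bool is a subclass of int.
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"  # integer fields carry an "i" suffix
    if isinstance(value, float):
        return repr(value)
    # Strings are double-quoted; backslashes and quotes are escaped.
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_line_protocol(measurement, tags, fields, timestamp_ns):
    """Build one line: measurement,tag=value field=value timestamp."""
    if not fields:
        raise ValueError("line protocol requires at least one field")
    head = _escape(measurement, ", ")
    for key in sorted(tags):
        head += f",{_escape(key, ',= ')}={_escape(tags[key], ',= ')}"
    body = ",".join(
        f"{_escape(key, ',= ')}={_format_field(value)}" for key, value in fields.items()
    )
    return f"{head} {body} {timestamp_ns}"


line = to_line_protocol(
    "weather", {"location": "us-midwest"}, {"temperature": 82.0}, 1672531200000000000
)
print(line)  # weather,location=us-midwest temperature=82.0 1672531200000000000
```

The output writes the field as 82.0 where the example above writes 82; both are read as floating-point values, because only an i suffix marks an integer.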
2. Querying Data with SQL
SELECT location,
       AVG(temperature) AS avg_temp
FROM weather
WHERE time >= now() - interval '7 days'
GROUP BY location;
This query calculates the average temperature per location for the past week using the SQL engine inside InfluxDB 3.
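For readers less familiar with SQL aggregation, the following plain-Python sketch mirrors what the query computes: keep rows from the last seven days, group them by location, and average the temperature per group. It runs on an in-memory list, not on InfluxDB; the function name, the sample rows, and the fixed value of now are assumptions for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def avg_temp_by_location(rows, now, window=timedelta(days=7)):
    """Average temperature per location for rows with time >= now - window."""
    cutoff = now - window
    groups = defaultdict(list)
    for row in rows:
        if row["time"] >= cutoff:  # WHERE time >= now() - interval '7 days'
            groups[row["location"]].append(row["temperature"])  # GROUP BY location
    # AVG(temperature) for each group
    return {loc: sum(temps) / len(temps) for loc, temps in groups.items()}


now = datetime(2023, 1, 8, tzinfo=timezone.utc)  # fixed "now" so the result is reproducible
rows = [
    {"time": datetime(2023, 1, 7, tzinfo=timezone.utc), "location": "us-east", "temperature": 76.0},
    {"time": datetime(2023, 1, 6, tzinfo=timezone.utc), "location": "us-east", "temperature": 80.0},
    {"time": datetime(2023, 1, 5, tzinfo=timezone.utc), "location": "us-west", "temperature": 88.0},
    # Older than seven days, so the filter drops it.
    {"time": datetime(2022, 12, 25, tzinfo=timezone.utc), "location": "us-west", "temperature": 60.0},
]

print(avg_temp_by_location(rows, now))  # {'us-east': 78.0, 'us-west': 88.0}
```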
3. Using FlightSQL in Python
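import pyarrow.flight as fl

# Connect to the InfluxDB 3 Flight endpoint (host and port depend on your deployment)
client = fl.FlightClient("grpc+tcp://localhost:8082")

# SQL query to execute
sql_query = "SELECT * FROM weather LIMIT 5;"

# Describe the query as a Flight command and ask the server where to fetch results.
# Note: Flight SQL clients normally send a protobuf-encoded command, so a server
# may reject raw SQL bytes; a dedicated Flight SQL client library handles that encoding.
descriptor = fl.FlightDescriptor.for_command(sql_query.encode("utf-8"))
flight = client.get_flight_info(descriptor)

# Fetch the results from each endpoint and print them as a pandas DataFrame
for endpoint in flight.endpoints:
    reader = client.do_get(endpoint.ticket)
    table = reader.read_all()
    print(table.to_pandas())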
The Arrow FlightSQL API allows efficient transport of query results directly into Python as Arrow tables or Pandas DataFrames.
Why Rust?
- Performance: Rust provides predictable performance without garbage collection overhead.
- Memory safety: Ownership and borrow checking catch use-after-free and similar memory errors at compile time, reducing crashes.
- Concurrency: The same ownership rules prevent data races in safe Rust, which makes multi-threaded processing safer to write.
- Integration: Many Arrow and DataFusion components are already implemented in Rust, making it a natural fit.
Current State and Limitations
Available Today
- General availability of InfluxDB 3 with unlimited cardinality and SQL support.
- Integration with object storage for cost-efficient scaling.
- Improved dashboards and caching mechanisms for real-time analytics.
Ongoing Challenges
- SQL does not yet fully replace the Flux language and its task ecosystem.
- Queries against cold storage can be slower than queries against hot, in-memory data.
- Migration from earlier InfluxDB versions may require schema adjustments and data transformation.
The Road Ahead
- Apache Iceberg integration: Future releases will connect InfluxDB with lakehouse and warehouse platforms like Databricks and Snowflake.
- Better migration tools: Easier pathways for users transitioning from older versions.
- Optimizations for cold data: Faster query execution with advanced caching and indexing strategies.
- Extended analytics: Potential support for user-defined functions, triggers, and WASM-based computation.
Conclusion
The rebuilt engine in InfluxDB 3 marks a decisive step in the evolution of time series databases.
By leveraging Rust and the FDAP stack, the system now scales to unlimited cardinality, integrates with cloud-native storage, and positions SQL at the heart of its analytics.
While some challenges remain—such as migration and Flux replacement—the foundation is set for deeper integrations, faster analytics, and broader adoption across modern data ecosystems.