
Data Formats

Helio Additive exports data in various formats to support different workflows and use cases. This guide explains the different data formats you’ll encounter and how to work with them.

Apache Parquet

Apache Parquet is a columnar storage file format designed for efficient data storage and retrieval. Unlike traditional row-based formats (like CSV), Parquet stores data by columns, which offers several advantages for analytical workloads.

Parquet is particularly well-suited for storing simulation and optimization data because:

  • Efficient Compression: Columnar storage allows for better compression ratios, reducing file sizes significantly
  • Fast Query Performance: Reading only the columns you need is much faster than reading entire rows
  • Type Safety: Parquet preserves data types (integers, floats, timestamps, etc.), unlike CSV which treats everything as text
  • Schema Evolution: The format supports adding new columns without breaking existing readers
  • Cross-Platform: Works seamlessly with Python, R, Java, C++, and most data analysis tools

Helio Additive exports the following data in Parquet format:

  • Thermal simulation data: Temperature profiles and thermal quality indices
  • Layer-by-layer analysis: Detailed metrics for each print layer
  • Mesh data: 3D geometry and quality information
  • Contact data: Inter-layer bonding information

The most common way to work with Parquet files is programmatically, using Python libraries such as pandas and polars, or the arrow package in R:

Using Pandas:

import pandas as pd
# Read a Parquet file
df = pd.read_parquet('thermal-data.parquet')
# View the first few rows
print(df.head())
# Get column information
print(df.info())
# Filter and analyze
hot_zones = df[df['temperature'] > 250]
print(f"Found {len(hot_zones)} hot zones")

Using Polars (faster for large files):

import polars as pl
# Read a Parquet file
df = pl.read_parquet('thermal-data.parquet')
# Lazy evaluation for better performance
df_lazy = pl.scan_parquet('thermal-data.parquet')
result = df_lazy.filter(pl.col('temperature') > 250).collect()
Using R:

library(arrow)
# Read a Parquet file
df <- read_parquet('thermal-data.parquet')
# View the data
head(df)

You can also use command-line tools to inspect Parquet files:

# Install parquet-tools
pip install parquet-tools
# View schema
parquet-tools schema thermal-data.parquet
# View first few rows
parquet-tools head thermal-data.parquet
# Convert to CSV
parquet-tools csv thermal-data.parquet > output.csv

Here’s how Parquet compares to CSV for typical simulation data:

| Metric      | CSV                   | Parquet                  |
| ----------- | --------------------- | ------------------------ |
| File Size   | 500 MB                | 50 MB (10x smaller)      |
| Read Time   | 15 seconds            | 2 seconds (7x faster)    |
| Column Read | Must read entire file | Read only needed columns |
| Type Safety | All strings           | Native types preserved   |

CSV Format

For simpler use cases and better compatibility with spreadsheet software, we also provide CSV (Comma-Separated Values) exports:

  • Plot data: Visualization coordinates and quality metrics
  • Summary reports: High-level statistics and results
  • Export flexibility: Easy to open in Excel, Google Sheets, or any text editor

Use CSV when:

  • You need to quickly view data in Excel or Google Sheets
  • File sizes are small (< 10 MB)
  • You’re doing simple one-time analysis
  • You need human-readable data

Use Parquet when:

  • Working with large datasets (> 10 MB)
  • Performing repeated analysis or queries
  • Building automated data pipelines
  • Memory efficiency is important
  • You need to preserve exact data types

JSON Format

JSON (JavaScript Object Notation) is used for structured configuration and report data:

  • Simulation reports: Summary statistics and metadata
  • Configuration files: Settings and parameters
  • API responses: Structured data from our API

JSON is human-readable and widely supported across all programming languages.

import json

# Read a JSON report
with open('report.json', 'r') as f:
    report = json.load(f)

print(f"Simulation status: {report['status']}")
print(f"Quality score: {report['quality_score']}")
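For a fully self-contained round trip, you can write a report-style payload and read it back. The field values here are made up; real report schemas are described in the Schemas documentation:

```python
import json

# Hypothetical report payload mirroring the fields read above
report = {"status": "completed", "quality_score": 0.92}

# Write with indentation so the file stays human-readable
with open("report.json", "w") as f:
    json.dump(report, f, indent=2)

# Read it back
with open("report.json") as f:
    restored = json.load(f)
print(restored["status"])  # completed
```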
As a quick reference for choosing a format:

| Use Case                         | Recommended Format |
| -------------------------------- | ------------------ |
| Large-scale data analysis        | Parquet            |
| Quick inspection in spreadsheet  | CSV                |
| Configuration and metadata       | JSON               |
| 3D visualization                 | CSV (plot data)    |
| Data warehousing                 | Parquet            |
| Sharing with non-technical users | CSV                |

If you need assistance working with any of these data formats, please:

  • Check our Visualizing Data guide for code examples
  • Review the Schemas documentation for data structure details
  • Contact support at support@helioadditive.com