Quantity Extraction

RevitPy includes an extraction layer (revitpy.extract) for pulling measured quantities, material data, cost estimates, and structured exports from Revit elements. The module is designed around four core classes: QuantityExtractor, MaterialTakeoff, CostEstimator, and DataExporter.

Quick Start

The module provides convenience functions for the most common workflows:

from revitpy.extract import extract_quantities, material_takeoff, estimate_costs, export_data

# Extract quantities from a list of elements
quantities = extract_quantities(elements)

# Material takeoff with aggregation and classification
materials = material_takeoff(elements, aggregate=True, classify=True, classification_system="UniFormat")

# Cost estimation from extracted quantities
cost_summary = estimate_costs(quantities, cost_database={"Walls": 150.0, "Floors": 85.0})

# Export to CSV
from revitpy.extract import ExportConfig, ExportFormat
from pathlib import Path

config = ExportConfig(format=ExportFormat.CSV, output_path=Path("takeoff.csv"))
export_data([{"name": "Wall-1", "area": 25.5}], config=config)

QuantityExtractor

QuantityExtractor extracts measured quantities (area, volume, length, count, weight) from duck-typed element objects. Elements are expected to expose identifying attributes such as id, name, category, level, and system, along with the quantity attributes area, volume, length, count, and weight.
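
Because the extractor relies on duck typing, any object with the right attributes can stand in for a Revit element, which is handy for tests. The sketch below is a minimal illustration; StubElement is a hypothetical helper and is not part of revitpy.

```python
from dataclasses import dataclass


@dataclass
class StubElement:
    """Hypothetical stand-in for a Revit element; not part of revitpy."""

    id: int
    name: str
    category: str
    area: float = 0.0
    volume: float = 0.0
    length: float = 0.0
    level: str = ""
    system: str = ""


elements = [
    StubElement(id=1, name="Wall-1", category="Walls", area=25.5, volume=5.1, level="Level 1"),
    StubElement(id=2, name="Floor-1", category="Floors", area=100.0, volume=20.0, level="Level 1"),
]

# Objects shaped like these could then be handed to QuantityExtractor().extract(elements).
```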

Creating an Extractor

from revitpy.extract import QuantityExtractor

# Basic usage
extractor = QuantityExtractor()

# With an optional RevitContext
extractor = QuantityExtractor(context=my_context)

Extracting Quantities

The extract method returns a list of QuantityItem dataclasses. By default it extracts all quantity types. Pass quantity_types to limit extraction to specific types.

from revitpy.extract import QuantityExtractor, QuantityType

extractor = QuantityExtractor()

# Extract all quantity types
items = extractor.extract(elements)

# Extract only area and volume
items = extractor.extract(elements, quantity_types=[QuantityType.AREA, QuantityType.VOLUME])

Each returned QuantityItem contains the following fields:

Field	Type	Description
`element_id`	`Any`	The element’s `id` attribute
`element_name`	`str`	The element’s `name` attribute
`category`	`str`	The element’s `category` attribute
`quantity_type`	`QuantityType`	The type of quantity extracted
`value`	`float`	The numeric quantity value
`unit`	`str`	The unit of measurement (e.g. `"m2"`, `"m3"`)
`level`	`str`	The element’s `level` attribute (optional)
`system`	`str`	The element’s `system` attribute (optional)

Default units are assigned per quantity type but can be overridden per element using an attribute named {attr}_unit (e.g. area_unit).
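
One way to picture the override rule is a lookup that prefers the element's own {attr}_unit attribute and falls back to the default for that quantity type. This is a sketch of the idea, not the library's code; resolve_unit and DEFAULT_UNITS are hypothetical names, and the default units are copied from the QuantityType table in the Enums Reference.

```python
from types import SimpleNamespace

# Default units as listed in the QuantityType table of this document.
DEFAULT_UNITS = {"area": "m2", "volume": "m3", "length": "m", "count": "ea", "weight": "kg"}


def resolve_unit(element, attr: str) -> str:
    """Return the element's `{attr}_unit` override if present, else the default."""
    return getattr(element, f"{attr}_unit", DEFAULT_UNITS[attr])


# Hypothetical element that reports its area in square feet.
wall = SimpleNamespace(area=274.5, area_unit="ft2", volume=5.1)

area_unit = resolve_unit(wall, "area")      # override taken from wall.area_unit
volume_unit = resolve_unit(wall, "volume")  # no override, so the default applies
```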

Grouped Extraction

Use extract_grouped to extract and group results by an aggregation level in a single call:

from revitpy.extract import QuantityExtractor, QuantityType, AggregationLevel

extractor = QuantityExtractor()

# Group by category (default)
grouped = extractor.extract_grouped(elements)
# Returns: {"Walls": [QuantityItem, ...], "Floors": [QuantityItem, ...]}

# Group by level
grouped = extractor.extract_grouped(elements, group_by=AggregationLevel.LEVEL)

# Group by system with specific quantity types
grouped = extractor.extract_grouped(
    elements,
    group_by=AggregationLevel.SYSTEM,
    quantity_types=[QuantityType.LENGTH],
)
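
Conceptually, grouping buckets the extracted items by one of their attributes. The sketch below shows that idea with plain Python; group_by_attribute is a hypothetical helper, and the SimpleNamespace objects only imitate the QuantityItem fields used here.

```python
from collections import defaultdict
from types import SimpleNamespace


def group_by_attribute(items, attribute: str) -> dict:
    """Group items into lists keyed by the value of `attribute`."""
    groups = defaultdict(list)
    for item in items:
        groups[getattr(item, attribute, "")].append(item)
    return dict(groups)


# Stand-ins for QuantityItem objects (only the fields needed for grouping).
items = [
    SimpleNamespace(element_name="Wall-1", category="Walls", level="Level 1"),
    SimpleNamespace(element_name="Wall-2", category="Walls", level="Level 2"),
    SimpleNamespace(element_name="Floor-1", category="Floors", level="Level 1"),
]

by_category = group_by_attribute(items, "category")
by_level = group_by_attribute(items, "level")
```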

Summarizing Quantities

The summarize method sums values by quantity type:

items = extractor.extract(elements)
summary = extractor.summarize(items)
# Returns: {"area": 1250.5, "volume": 340.2, "count": 48.0}
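
The summary is a per-type total. As a rough illustration of the arithmetic, assuming each item carries a quantity type string and a numeric value, the reduction could look like this (summarize_items is a hypothetical stand-in, not the library method):

```python
from collections import defaultdict
from types import SimpleNamespace


def summarize_items(items) -> dict:
    """Sum item values per quantity type."""
    totals = defaultdict(float)
    for item in items:
        totals[item.quantity_type] += item.value
    return dict(totals)


# Stand-ins for QuantityItem objects.
items = [
    SimpleNamespace(quantity_type="area", value=25.5),
    SimpleNamespace(quantity_type="area", value=100.0),
    SimpleNamespace(quantity_type="volume", value=5.1),
]

summary = summarize_items(items)
```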

Async Extraction

For large element sets, use extract_async with an optional progress callback:

import asyncio
from revitpy.extract import QuantityExtractor

extractor = QuantityExtractor()

def on_progress(current: int, total: int) -> None:
    print(f"Processing {current}/{total}")

items = asyncio.run(extractor.extract_async(elements, progress=on_progress))

The async method yields control to the event loop every 100 elements to keep the UI responsive.
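
The batching pattern described here can be sketched with asyncio.sleep(0), which hands control back to the event loop. This is an illustration of the technique only: process_in_batches is a hypothetical function, the per-element work is a placeholder, and calling the progress callback once per element is an assumption.

```python
import asyncio


async def process_in_batches(elements, progress=None, batch_size=100):
    """Process elements one by one, yielding to the event loop after each batch."""
    results = []
    total = len(elements)
    for index, element in enumerate(elements, start=1):
        results.append(element * 2)  # placeholder for real per-element work
        if progress is not None:
            progress(index, total)
        if index % batch_size == 0:
            await asyncio.sleep(0)  # let other scheduled tasks run
    return results


progress_calls = []


def record_progress(current: int, total: int) -> None:
    progress_calls.append((current, total))


results = asyncio.run(process_in_batches(list(range(250)), progress=record_progress))
```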

MaterialTakeoff

MaterialTakeoff extracts material data from elements, aggregates quantities by material name, and classifies materials against industry standard systems.

Elements should expose material_name (or material), material_volume, material_area, and material_mass attributes. If the material-specific quantity attributes are absent, the extractor falls back to the generic volume and area attributes.
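
The fallback behaviour amounts to a two-step attribute lookup. The sketch below shows one plausible reading of it; both helper functions are hypothetical, and returning 0.0 when neither attribute exists is an assumption.

```python
from types import SimpleNamespace


def read_material_quantity(element, specific: str, generic: str) -> float:
    """Prefer the material-specific attribute, then the generic one, then 0.0."""
    value = getattr(element, specific, None)
    if value is None:
        value = getattr(element, generic, 0.0)
    return float(value)


def read_material_name(element) -> str:
    """Use material_name when present, otherwise fall back to material."""
    return getattr(element, "material_name", None) or getattr(element, "material", "")


# One element with material-specific data, one with only generic attributes.
detailed = SimpleNamespace(material_name="Concrete", material_volume=4.0, volume=5.0)
generic = SimpleNamespace(material="Steel", volume=1.5)

detailed_volume = read_material_quantity(detailed, "material_volume", "volume")
generic_volume = read_material_quantity(generic, "material_volume", "volume")
```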

Extracting Materials

from revitpy.extract import MaterialTakeoff

takeoff = MaterialTakeoff()
materials = takeoff.extract(elements)

for mat in materials:
    print(f"{mat.material_name}: {mat.volume} m3, {mat.area} m2, {mat.mass} kg")

Each MaterialQuantity dataclass contains:

Field	Type	Default	Description
`material_name`	`str`	–	Name of the material
`category`	`str`	–	Element category
`volume`	`float`	`0.0`	Total volume in cubic metres
`area`	`float`	`0.0`	Total area in square metres
`mass`	`float`	`0.0`	Total mass in kilograms
`classification_code`	`str`	`""`	Classification code (after classification)
`classification_system`	`str`	`""`	Classification system name (after classification)

Aggregating Materials

The aggregate method combines entries that share the same material_name, summing their volume, area, and mass:

materials = takeoff.extract(elements)
aggregated = takeoff.aggregate(materials)
# One entry per unique material name, with totals summed
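
To make the summing concrete, here is a small self-contained sketch of aggregation by material name. MaterialRow and aggregate_rows are hypothetical stand-ins that mirror only the numeric fields of MaterialQuantity; the figures are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class MaterialRow:
    """Hypothetical stand-in for MaterialQuantity (numeric fields only)."""

    material_name: str
    volume: float = 0.0
    area: float = 0.0
    mass: float = 0.0


def aggregate_rows(rows):
    """Combine rows that share a material name, summing volume, area, and mass."""
    totals = {}
    for row in rows:
        entry = totals.setdefault(row.material_name, MaterialRow(row.material_name))
        entry.volume += row.volume
        entry.area += row.area
        entry.mass += row.mass
    return list(totals.values())


rows = [
    MaterialRow("Concrete", volume=2.0, area=10.0, mass=4800.0),
    MaterialRow("Steel", volume=0.5, mass=3925.0),
    MaterialRow("Concrete", volume=3.5, area=12.0, mass=8400.0),
]

aggregated = aggregate_rows(rows)
```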

Classifying Materials

The classify method maps material names to standard classification codes using a built-in lookup table. Two systems are supported: UniFormat and MasterFormat.

materials = takeoff.extract(elements)
aggregated = takeoff.aggregate(materials)

# Classify with UniFormat (default)
classified = takeoff.classify(aggregated, system="UniFormat")

# Classify with MasterFormat
classified = takeoff.classify(aggregated, system="MasterFormat")

for mat in classified:
    print(f"{mat.material_name}: {mat.classification_code} ({mat.classification_system})")

Classification uses exact matching first, then partial matching on the lowercased material name. Built-in mappings include common materials such as concrete, steel, wood, masonry, glass, aluminum, gypsum, insulation, carpet, tile, paint, roofing, waterproofing, brick, stone, plaster, copper, asphalt, and gravel.

UniFormat Classification Codes (Built-in)

Material	UniFormat Code
Concrete	A1010
Steel	A1020
Wood	A1030
Masonry	A1040
Glass	B2020
Aluminum	B2010
Gypsum	C1010
Insulation	C1020
Roofing	B3010

MasterFormat Classification Codes (Built-in)

Material	MasterFormat Code
Concrete	03 00 00
Steel	05 00 00
Wood	06 00 00
Masonry	04 00 00
Glass	08 80 00
Aluminum	08 40 00
Gypsum	09 20 00
Insulation	07 20 00
Roofing	07 50 00
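
The exact-then-partial matching rule can be sketched as a two-pass lookup over the lowercased material name. The code below is an illustration under assumptions, not the library's implementation: classify_name is a hypothetical function, the table holds only the UniFormat codes listed above, and returning an empty string for unmatched names is assumed from the default shown for classification_code.

```python
# UniFormat codes copied from the built-in table listed in this document.
UNIFORMAT_CODES = {
    "concrete": "A1010",
    "steel": "A1020",
    "wood": "A1030",
    "masonry": "A1040",
    "glass": "B2020",
    "aluminum": "B2010",
    "gypsum": "C1010",
    "insulation": "C1020",
    "roofing": "B3010",
}


def classify_name(material_name: str, table: dict) -> str:
    """Return a code via exact match, then substring match, else an empty string."""
    key = material_name.lower()
    if key in table:  # pass 1: exact match on the lowercased name
        return table[key]
    for keyword, code in table.items():  # pass 2: keyword contained in the name
        if keyword in key:
            return code
    return ""


exact = classify_name("Concrete", UNIFORMAT_CODES)
partial = classify_name("Cast-in-Place Concrete, 4000 psi", UNIFORMAT_CODES)
unknown = classify_name("Unobtainium", UNIFORMAT_CODES)
```

With a substring pass, the order of the table matters when a name contains more than one keyword, so a real implementation may need a tie-breaking rule.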

Full Pipeline Example

from revitpy.extract import MaterialTakeoff

takeoff = MaterialTakeoff()
materials = takeoff.extract(elements)
aggregated = takeoff.aggregate(materials)
classified = takeoff.classify(aggregated, system="UniFormat")

for mat in classified:
    print(f"{mat.material_name} [{mat.classification_code}]: "
          f"volume={mat.volume:.2f}, area={mat.area:.2f}, mass={mat.mass:.2f}")

Or use the convenience function for the same result:

from revitpy.extract import material_takeoff

classified = material_takeoff(
    elements,
    aggregate=True,
    classify=True,
    classification_system="UniFormat",
)

CostEstimator

CostEstimator maps extracted quantities to unit costs from a pluggable cost database and produces itemized cost breakdowns with aggregated summaries.

Setting Up a Cost Database

The cost database maps category or material names (strings) to unit costs (floats). It can be supplied as a dict, or loaded from CSV, JSON, or YAML files.

from revitpy.extract import CostEstimator
from pathlib import Path

# From a dict
estimator = CostEstimator(cost_database={
    "Walls": 150.0,
    "Floors": 85.0,
    "Roofs": 200.0,
    "Doors": 500.0,
    "Windows": 750.0,
})

# From a file path (auto-detected by extension)
estimator = CostEstimator(cost_database=Path("costs.json"))

# Load a database after construction
estimator = CostEstimator()
estimator.load_database(Path("costs.csv"))

Supported file formats:

Format	Extension	Expected Structure
CSV	`.csv`	Columns: `name` or `category`, `unit_cost` or `cost`
JSON	`.json`	Top-level dict `{"name": cost}` or list of `{"name": ..., "unit_cost": ...}`
YAML	`.yaml`, `.yml`	Top-level dict mapping names to costs
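
To show what these structures look like in practice, the sketch below parses a CSV body and both JSON shapes into the same name-to-cost dict using only the standard library. The parsing functions are hypothetical illustrations of the expected file layouts, not the loader revitpy uses.

```python
import csv
import io
import json


def costs_from_csv(text: str) -> dict:
    """Read rows with a `name` or `category` column and a `unit_cost` or `cost` column."""
    costs = {}
    for row in csv.DictReader(io.StringIO(text)):
        name = row.get("name") or row.get("category")
        cost = row.get("unit_cost") or row.get("cost")
        costs[name] = float(cost)
    return costs


def costs_from_json(text: str) -> dict:
    """Accept either a top-level dict or a list of {"name", "unit_cost"} records."""
    data = json.loads(text)
    if isinstance(data, dict):
        return {name: float(cost) for name, cost in data.items()}
    return {entry["name"]: float(entry["unit_cost"]) for entry in data}


csv_costs = costs_from_csv("category,cost\nWalls,150.0\nFloors,85.0\n")
dict_costs = costs_from_json('{"Walls": 150.0, "Floors": 85.0}')
list_costs = costs_from_json('[{"name": "Roofs", "unit_cost": 200.0}]')
```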

Estimating Costs

Pass a list of QuantityItem objects (from QuantityExtractor) to the estimate method. The estimator looks up each unit cost by category name, trying an exact match first, then a case-insensitive match, then a partial match.

from revitpy.extract import QuantityExtractor, QuantityType, CostEstimator, AggregationLevel

extractor = QuantityExtractor()
quantities = extractor.extract(elements, quantity_types=[QuantityType.AREA])

estimator = CostEstimator(cost_database={"Walls": 150.0, "Floors": 85.0})
summary = estimator.estimate(quantities, aggregation=AggregationLevel.CATEGORY)

print(f"Total cost: ${summary.total_cost:,.2f}")
print(f"Currency: {summary.currency}")
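
The three-stage lookup can be pictured as successive passes over the database keys. The following is a sketch of that order only; find_unit_cost is a hypothetical function, and treating "partial match" as a substring test in either direction is an assumption.

```python
def find_unit_cost(category: str, cost_database: dict):
    """Look up a unit cost: exact, then case-insensitive, then substring match."""
    if category in cost_database:  # stage 1: exact key
        return cost_database[category]
    lowered = category.lower()
    for name, cost in cost_database.items():  # stage 2: ignore case
        if name.lower() == lowered:
            return cost
    for name, cost in cost_database.items():  # stage 3: substring either way
        if name.lower() in lowered or lowered in name.lower():
            return cost
    return None  # nothing matched


costs = {"Walls": 150.0, "Floors": 85.0}

exact_cost = find_unit_cost("Walls", costs)
case_cost = find_unit_cost("walls", costs)
partial_cost = find_unit_cost("Curtain Walls", costs)
missing_cost = find_unit_cost("Roofs", costs)
```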

CostSummary Structure

The estimate method returns a CostSummary dataclass:

Field	Type	Description
`items`	`list[CostItem]`	Itemized cost line items
`total_cost`	`float`	Grand total of all line items
`by_category`	`dict[str, float]`	Costs aggregated by element category
`by_system`	`dict[str, float]`	Costs aggregated by building system
`by_level`	`dict[str, float]`	Costs aggregated by level
`currency`	`str`	Currency code (default `"USD"`)

Each CostItem contains:

Field	Type	Description
`description`	`str`	Formatted as `"element_name - quantity_type"`
`quantity`	`float`	The quantity value
`unit`	`str`	The unit of measurement
`unit_cost`	`float`	The cost per unit
`total_cost`	`float`	`quantity * unit_cost`
`source`	`CostSource`	Where the cost data came from
`category`	`str`	Element category
`system`	`str`	Building system

Working with Cost Breakdowns

summary = estimator.estimate(quantities)

# Iterate line items
for item in summary.items:
    print(f"{item.description}: {item.quantity} {item.unit} "
          f"x ${item.unit_cost:.2f} = ${item.total_cost:.2f}")

# Category breakdown
for category, cost in summary.by_category.items():
    print(f"{category}: ${cost:,.2f}")

# Level breakdown
for level, cost in summary.by_level.items():
    print(f"{level}: ${cost:,.2f}")
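
The arithmetic behind these breakdowns is straightforward: each line total is quantity times unit cost, and the per-category figures and the grand total are sums of line totals. A worked sketch with plain dicts standing in for CostItem objects (the quantities are illustrative):

```python
from collections import defaultdict

# Plain dicts standing in for CostItem objects.
line_items = [
    {"category": "Walls", "quantity": 25.5, "unit_cost": 150.0},
    {"category": "Walls", "quantity": 10.0, "unit_cost": 150.0},
    {"category": "Floors", "quantity": 100.0, "unit_cost": 85.0},
]

by_category = defaultdict(float)
for item in line_items:
    item["total_cost"] = item["quantity"] * item["unit_cost"]  # line total
    by_category[item["category"]] += item["total_cost"]        # category rollup

total_cost = sum(by_category.values())  # grand total across categories
```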

DataExporter

DataExporter writes tabular data (lists of dicts) to CSV, JSON, Excel, or Parquet files, or returns it as plain dicts.

Export Configuration

All exports are configured through the ExportConfig dataclass:

Field	Type	Default	Description
`format`	`ExportFormat`	`ExportFormat.CSV`	Output format
`output_path`	`Path` or `None`	`None`	Destination file path (required for file formats)
`include_headers`	`bool`	`True`	Include column headers (CSV, Excel)
`decimal_places`	`int`	`2`	Rounding precision for float values (CSV, JSON)
`sheet_name`	`str`	`"Sheet1"`	Worksheet name (Excel only)

Exporting to CSV

from revitpy.extract import DataExporter, ExportConfig, ExportFormat
from pathlib import Path

exporter = DataExporter()
data = [
    {"name": "Wall-1", "category": "Walls", "area": 25.5, "cost": 3825.0},
    {"name": "Floor-1", "category": "Floors", "area": 100.0, "cost": 8500.0},
]

config = ExportConfig(
    format=ExportFormat.CSV,
    output_path=Path("quantities.csv"),
    include_headers=True,
    decimal_places=2,
)
output_path = exporter.export(data, config)
print(f"Exported to {output_path}")

Exporting to JSON

config = ExportConfig(
    format=ExportFormat.JSON,
    output_path=Path("quantities.json"),
    decimal_places=3,
)
output_path = exporter.export(data, config)

Exporting to Excel (Optional Dependency)

Excel export requires the openpyxl package. Install it with pip install openpyxl.

config = ExportConfig(
    format=ExportFormat.EXCEL,
    output_path=Path("quantities.xlsx"),
    sheet_name="Takeoff Data",
    include_headers=True,
)
output_path = exporter.export(data, config)

Exporting to Parquet (Optional Dependency)

Parquet export requires the pyarrow package. Install it with pip install pyarrow.

config = ExportConfig(
    format=ExportFormat.PARQUET,
    output_path=Path("quantities.parquet"),
)
output_path = exporter.export(data, config)

Returning Dicts

Use ExportFormat.DICT to return a shallow copy of the data as plain dicts, useful for passing into pandas or other downstream tools:

config = ExportConfig(format=ExportFormat.DICT)
rows = exporter.export(data, config)
# Returns: [{"name": "Wall-1", ...}, {"name": "Floor-1", ...}]
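
A shallow copy means top-level keys can be changed on the returned rows without touching the input, while nested objects remain shared. The sketch below demonstrates that distinction with plain Python; it assumes each row is copied with dict(row), which may differ from what the library does.

```python
# Input rows, one of which holds a nested list.
data = [{"name": "Wall-1", "tags": ["exterior"]}]

# Shallow copy: new dict per row, nested values still shared.
rows = [dict(row) for row in data]

rows[0]["name"] = "Wall-1 (edited)"      # changes only the copy
rows[0]["tags"].append("load-bearing")   # mutates the list shared with data
```

If downstream code mutates nested values, copy.deepcopy on the returned rows avoids the shared state.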

Direct Export Methods

DataExporter also exposes format-specific methods directly:

exporter = DataExporter()

# CSV
exporter.to_csv(data, Path("out.csv"), include_headers=True, decimal_places=2)

# JSON
exporter.to_json(data, Path("out.json"), decimal_places=2)

# Excel
exporter.to_excel(data, Path("out.xlsx"), sheet_name="Data", include_headers=True)

# Parquet
exporter.to_parquet(data, Path("out.parquet"))

# Dicts (passthrough)
rows = exporter.to_dicts(data)

Enums Reference

QuantityType

Value	String	Default Unit	Description
`AREA`	`"area"`	`m2`	Surface area
`VOLUME`	`"volume"`	`m3`	Volume
`LENGTH`	`"length"`	`m`	Length
`COUNT`	`"count"`	`ea`	Element count (defaults to 1 per element)
`WEIGHT`	`"weight"`	`kg`	Weight

AggregationLevel

Value	String	Description
`ELEMENT`	`"element"`	Group by individual element ID
`CATEGORY`	`"category"`	Group by element category
`LEVEL`	`"level"`	Group by building level
`SYSTEM`	`"system"`	Group by building system
`BUILDING`	`"building"`	Single group for the entire building

ExportFormat

Value	String	Requires
`CSV`	`"csv"`	Built-in
`JSON`	`"json"`	Built-in
`EXCEL`	`"excel"`	`openpyxl`
`PARQUET`	`"parquet"`	`pyarrow`
`DICT`	`"dict"`	Built-in

CostSource

Value	String	Description
`CSV_FILE`	`"csv_file"`	Loaded from a CSV file
`JSON_FILE`	`"json_file"`	Loaded from a JSON file
`YAML_FILE`	`"yaml_file"`	Loaded from a YAML file
`MANUAL`	`"manual"`	Supplied directly as a dict

End-to-End Example

A complete workflow from quantity extraction through cost estimation to export:

from pathlib import Path
from revitpy.extract import (
    QuantityExtractor,
    MaterialTakeoff,
    CostEstimator,
    DataExporter,
    ExportConfig,
    ExportFormat,
    QuantityType,
    AggregationLevel,
)

# Step 1: Extract quantities
extractor = QuantityExtractor()
quantities = extractor.extract(elements, quantity_types=[QuantityType.AREA, QuantityType.VOLUME])

# Step 2: Material takeoff
takeoff = MaterialTakeoff()
materials = takeoff.extract(elements)
materials = takeoff.aggregate(materials)
materials = takeoff.classify(materials, system="UniFormat")

# Step 3: Cost estimation
estimator = CostEstimator(cost_database=Path("unit_costs.json"))
summary = estimator.estimate(quantities, aggregation=AggregationLevel.CATEGORY)
print(f"Total estimated cost: ${summary.total_cost:,.2f}")

# Step 4: Export cost breakdown
report_rows = [
    {
        "description": item.description,
        "quantity": item.quantity,
        "unit": item.unit,
        "unit_cost": item.unit_cost,
        "total_cost": item.total_cost,
        "category": item.category,
    }
    for item in summary.items
]

exporter = DataExporter()
config = ExportConfig(
    format=ExportFormat.CSV,
    output_path=Path("cost_report.csv"),
    decimal_places=2,
)
exporter.export(report_rows, config)

Error Handling

The extraction layer defines specific exceptions for each subsystem:

Exception	Raised By	Description
`ExtractionError`	`MaterialTakeoff.extract`	General extraction failure
`QuantityError`	`QuantityExtractor.extract`	Quantity extraction failure for an element
`CostEstimationError`	`CostEstimator.load_database`, `CostEstimator.estimate`	Cost database or estimation failure
`ExportError`	`DataExporter.export` and format-specific methods	Export operation failure
`ScheduleError`	`ScheduleBuilder`	Schedule building failure

All exceptions include contextual attributes such as element_id, export_format, or output_path depending on the error type.
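
The contextual attributes are most useful when building a readable error report. The sketch below shows the handling pattern with a stand-in exception; StubExportError, its constructor signature, and the failing export function are all hypothetical and only imitate the attributes named above.

```python
from pathlib import Path


class StubExportError(Exception):
    """Hypothetical stand-in for an export error carrying context attributes."""

    def __init__(self, message, export_format=None, output_path=None):
        super().__init__(message)
        self.export_format = export_format
        self.output_path = output_path


def export_or_report(export, rows, output_path) -> str:
    """Run an export callable and turn a failure into a readable message."""
    try:
        return f"Exported to {export(rows, output_path)}"
    except StubExportError as exc:
        return f"Export to {exc.output_path} ({exc.export_format}) failed: {exc}"


def failing_export(rows, output_path):
    """Simulated export that always fails, for demonstration only."""
    raise StubExportError("disk full", export_format="csv", output_path=output_path)


message = export_or_report(failing_export, [{"name": "Wall-1"}], Path("out.csv"))
```

With the real library, the except clause would name the exception from the table above (for example ExportError) instead of the stand-in.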