Custom / in-house PIM mapping.

The custom or in-house path applies to manufacturers without a commercial PIM. Two source patterns are common: a custom relational database, or spreadsheets and Word documents paired with separate photometric files. Both lead to the same ULC record shape; the difference is the ETL surface.

If your master is a relational database

A typical schema:

Products table: one row per SKU or family, with columns for identity, category, basic dimensions, rating.
Attributes table (typed columns or key/value): for values that vary per product.
Categories table: hierarchical or flat.
Files / assets table: cutsheet PDFs, IES, LDT; stored by pointer (S3 key, SAN path), not by bytes.
Relationships table: accessory compatibility, replacement chains, kit composition.
Test reports table: lab attestations with doc references.

If your master is spreadsheets and Word documents

A typical setup:

One spreadsheet (or Word table) per product line, with one row per SKU or per orderable configuration.
Photometric files (IES, LDT) stored separately on a file share, named by SKU or product code.
Cutsheet PDFs stored separately, often versioned by filename suffix.

The conversion is deterministic: the open-source ulc from-sheet converter turns a workbook into validated records, no hand-authoring and no scripting. Fill the published workbook template (a tab per kind of data, one row per product, keyed by record_id; tabular data kept in Word goes into the same workbook), point each row at its IES, LDT, and cutsheet files, and run ulc from-sheet. The converter detects each record's authoring pattern (single-SKU cutsheet, configurator with applicability, per-IES with provenance, or per-foot linear scaling), computes the dual-unit companions, SHA-256 hashes, default provenance, and the index, then validates each record against the schema. It reads either a folder of CSVs or a native .xlsx.

The same command converts hundreds or thousands of SKUs in one pass. Records that fail validation are reported with the validator's findings as the punch list, so only passing records publish. Offline and repeatable.

The validator catches schema errors before publication. The taxonomy reference is the value-set authority for enum fields like CCT, CRI bracket, optic, and IP rating.

Core mapping patterns

Identity

SQL / ORM column	ULC path
`manufacturer`	`product_family.manufacturer.slug`
Family / model-code column	`product_family.catalog_model`
SKU / order-code column	`configuration.catalog_number`
`series`	`product_family.catalog_line`
Derived full slug	`record_id`

In-house category	`primary_category`	`mounting_types`
`recessed_downlight`	`downlight`	`["recessed_ceiling"]`
`linear_pendant`	`linear`	`["pendant"]`
`wallpack_exterior`	`bulkhead_wall_pack`	`["surface_wall"]`
`highbay_industrial`	`high_bay`	`["pendant", "surface_ceiling"]`
`bollard_outdoor`	`bollard`	`["surface_floor"]`

Dimensional fields

Convert single-unit columns to ULC dual-unit at emit time. Common columns:

DB column	ULC path
`overall_diameter_mm`	`product_family.physical_dimensions.overall_diameter`
`overall_length_mm`	`product_family.physical_dimensions.overall_length`
`recess_depth_mm`	`product_family.physical_dimensions.recess_depth`
`weight_kg`	`product_family.physical_dimensions.luminaire_mass`

Attestations

Common attestation programs the in-house schema should track: DLC QPL (DesignLights Consortium Qualified Products List, a lighting-specific energy attestation), ENERGY STAR Luminaires V2.0+, UL listings (UL 1598 and similar), and lumen-maintenance attestations (LM-80 / TM-21 derived L70 / L80 / L90 hours). The emitter maps each to a ULC attestations[] entry with program, status, and value_type populated per the schema.

Gotchas

Schemas without scenarios. If the PIM has one row per SKU without a scenario concept, synthesize scenarios from photometric data (one scenario per IES file). If there's only one IES per SKU, that's Pattern A (simplest).
Legacy free-text fields. In-house PIMs often have free-text fields mixing structured data ("90 CRI, 3000K, 120V"). Extract via regex during ETL, or flag for manual review.
Unit-of-measure inconsistency. Older schemas may mix metric and Imperial in the same column. Normalize during ETL.
Missing provenance history. If the PIM was hand-populated over years, default provenance to {source: "manufacturer_direct", method: "transcribed"} and plan a follow-up enrichment pass.
No change tracking. Without CDC, the emitter either runs full sweeps or polls updated_at columns.
Manual-edit drift in spreadsheets. Spreadsheet-based masters accumulate typos, inconsistent unit strings (3000K versus 3000 K), and bracket-vs-value mismatches. Normalize before emit; flag suspicious rows for manual review.
One row, multiple records. A spreadsheet row that bundles several CCTs in one cell (2700K/3000K/3500K) expands to multiple ULC records, one per scenario. The ETL must split.
Photometric file linkage. Spreadsheets typically reference IES/LDT files by relative path or filename. The emitter must resolve these to canonical URLs and SHA-256 hashes for the source_files[] block.

Full guide with scenario expansion SQL, S3 streaming patterns, migration path to a commercial PIM, and Python + SQLAlchemy transform skeleton: github.com/ulcspec/ULC/blob/main/mappings/pim/custom-pim.md