Generate ID

Infoveave Data Automation — Generation

Records arrived from a legacy system without an ID column populated. Every null cell in UserID gets a unique UUID — existing IDs stay exactly as they are.

Data arriving from multiple source systems frequently contains records without populated identifier columns: legacy system exports that leave the ID field blank for certain record types, merged datasets where one source lacks a primary key, or staging tables populated before auto-increment keys are assigned. Processing these records downstream without identifiers prevents reliable deduplication, joining, tracking, and auditing. Generate ID solves this by scanning the configured column and filling only the null or empty cells with UUID v4 identifiers, which are randomly generated 128-bit values formatted as 8-4-4-4-12 hexadecimal strings and are unique for all practical purposes. Rows that already have identifiers are untouched, so running the step on a dataset that already has some IDs will not overwrite any existing values. This makes it safe to use in incremental or re-run pipelines where some records may have already received IDs in a prior run.

Input: Tabular dataset with an identifier column that contains null or empty values in some rows. Existing non-null values are left unchanged and only the null gaps are filled with newly generated UUID v4 identifiers.

Output: Tabular dataset with the same column where all previously null or empty cells have been replaced with unique UUID v4 strings, while rows that already had non-null identifier values are unmodified.
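For readers who think in code, the fill-only-the-gaps behavior described above can be sketched in a few lines of Python. This is an illustration of the logic, not Infoveave's implementation: the function name `generate_id`, the list-of-dicts row format, and the rule that only `None` and the empty string count as "empty" are all assumptions made for the example.

```python
import uuid


def generate_id(rows, column):
    """Return a copy of `rows` with null/empty cells in `column` filled.

    Hypothetical helper: rows are plain dicts, and a cell is treated as
    empty only when it is None or "" (an assumption for this sketch).
    """
    filled = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are not mutated
        value = new_row.get(column)
        if value is None or value == "":
            new_row[column] = str(uuid.uuid4())  # random UUID v4 string
        filled.append(new_row)
    return filled


rows = [
    {"Name": "Alice Johnson", "UserID": "USR-001"},
    {"Name": "Bob Smith", "UserID": None},
    {"Name": "Carol Davis", "UserID": "USR-003"},
    {"Name": "Dave Wilson", "UserID": ""},
]

first_run = generate_id(rows, "UserID")
# A second pass finds no gaps, so every ID from the first run is kept.
second_run = generate_id(first_run, "UserID")
```

Because the second pass changes nothing, the sketch mirrors the re-run safety described above: IDs assigned in one run survive later runs.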

What Generate ID does

Automatically fill empty or null cells in an identifier column with unique UUID v4 values in Infoveave. Generate surrogate keys for records that arrived without a primary identifier from source systems — without NEWID() queries, uuid4() Python calls, or spreadsheet GUID formulas — while leaving existing non-null identifiers unchanged.

When to use Generate ID

  • Records arriving from a legacy source system or file export do not have a populated primary identifier column and need a unique ID assigned before loading into a target database or data warehouse that requires a non-null key
  • You are merging datasets from two sources where one source provides unique IDs and the other does not — Generate ID fills in IDs for the records from the ID-less source without affecting the records that already have IDs
  • You are rebuilding or backfilling a surrogate key column in a dataset where some historical records were imported without keys and need unique identifiers assigned before re-loading into the master table
  • You need a stable reference key for records in a pipeline that will be run incrementally — Generate ID only fills nulls, so re-running the pipeline on already-processed records will not reassign IDs to records that already have them

When to avoid it

  • You need to generate IDs for all rows unconditionally regardless of whether they already have a value — Generate ID only fills null/empty cells; to replace all IDs, clear the column first or use a full-column ID generation approach
  • You need sequential numeric identifiers (1, 2, 3...) rather than UUID format — Generate ID only produces UUID v4 format strings; for sequential IDs, use a row number or auto-increment step
  • You need to ensure that the same record always receives the same ID across multiple pipeline runs — UUID v4 values are random on each generation; for consistent hash-based surrogate keys from existing column values, use a SHA-256 hash of a stable attribute column instead
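The hash-based alternative mentioned in the last point can be sketched as follows. This is a minimal example assuming the stable attributes are available as strings. The helper name `stable_key`, the trim-and-lowercase normalization, the separator character, and the sample values are all hypothetical choices for illustration.

```python
import hashlib


def stable_key(*parts):
    """Derive a repeatable surrogate key from stable attribute values.

    Hypothetical helper: values are trimmed and lowercased before hashing,
    and joined with a unit-separator character so that ("ab", "") and
    ("a", "b") do not collide.
    """
    joined = "\x1f".join(str(p).strip().lower() for p in parts)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


# The same inputs always yield the same 64-character hex key.
key_run_1 = stable_key("Bob Smith", "bob@example.com")
key_run_2 = stable_key("Bob Smith", "bob@example.com")
```

Unlike a UUID v4, this key changes only when the underlying attribute values change, so it is only as stable as the columns it is derived from.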

Where it fits in your Infoveave automation

Generate ID is one step inside a multi-step Infoveave workflow. Chain it with other activities — no code, no manual hand-offs.

Connect: Load records from source systems, merged datasets, or legacy exports that may have null or empty identifier column values.
You are here: Generate IDs. Select the identifier column. Null and empty cells are filled with unique UUID v4 values. Existing non-null values are untouched.
Load: Load the dataset into the target database or data warehouse where all records now have a non-null identifier for indexing, joining, and auditing.

Build this workflow visually in Infoveave Data Automation — drag, connect, and schedule with no infrastructure setup.


How teams use Generate ID

Real scenarios where this transformation saves hours of manual work.

Retail

Assign Surrogate Keys to Imported Product Records Missing SKU

A retail data team imports product catalog records from a supplier's Excel export where some rows arrive without a SKU value in the ProductSKU column. The product data warehouse requires a non-null ProductSKU for all records before loading. Generate ID fills the null ProductSKU cells with UUID v4 values, creating internal surrogate identifiers for the no-SKU products. Existing rows with proper SKUs are unaffected.

Manufacturing

Backfill Missing Work Order IDs in Historical Production Records

A manufacturing company migrated legacy production records from a paper-based system. The WorkOrderID column is null for all historical records because paper work orders had no digital IDs. Generate ID fills the null WorkOrderID column with UUID v4 values in the migration pipeline. The team can now join historical production records to quality events using the newly assigned work order identifiers.

Finance

Generate Transaction Reference IDs for Records Arriving Without References

A financial data pipeline ingests transaction records from a partner institution where the TransactionRef column is null for a subset of transactions originating from channels that do not generate reference numbers. Before loading into the reconciliation database, Generate ID fills null TransactionRef cells with UUID identifiers, enabling all transactions to be tracked and audited with a unique reference regardless of origin channel.

See Generate ID in action

Input data (left) is transformed using the configuration below. The output table (right) is ready for dashboards or downstream steps.

Column: UserID

Input Data

Name | Email | UserID
Alice Johnson | [email protected] | USR-001
Bob Smith | [email protected] | (empty)
Carol Davis | [email protected] | USR-003
Dave Wilson | [email protected] | (empty)
Eve Thompson | [email protected] | (empty)

Output Data

Name | Email | UserID
Alice Johnson | [email protected] | USR-001
Bob Smith | [email protected] | 550e8400-e29b-41d4-a716-446655440000
Carol Davis | [email protected] | USR-003
Dave Wilson | [email protected] | 7f6e5d4c-3b2a-4f0e-9d8c-7b6a5f4e3d2c
Eve Thompson | [email protected] | 9a8b7c6d-5e4f-4a2b-8c0d-9e8f7a6b5c4d
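After this step the column can hold a mix of pre-existing IDs such as USR-001 and generated UUIDs, and a downstream check may need to tell them apart. The sketch below tests whether a value has the UUID v4 shape: 8-4-4-4-12 hexadecimal groups, with a version digit of 4 and a variant digit of 8, 9, a, or b. The function name `is_uuid_v4` is hypothetical and the check is not part of Infoveave.

```python
import re
import uuid

# 8-4-4-4-12 hex groups; third group starts with the version digit 4,
# fourth group starts with a variant digit of 8, 9, a, or b.
UUID_V4_PATTERN = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}",
    re.IGNORECASE,
)


def is_uuid_v4(value):
    """Return True if `value` is a string in canonical UUID v4 format."""
    return isinstance(value, str) and UUID_V4_PATTERN.fullmatch(value) is not None


print(is_uuid_v4("550e8400-e29b-41d4-a716-446655440000"))  # True
print(is_uuid_v4("USR-001"))                               # False
print(is_uuid_v4(str(uuid.uuid4())))                       # True
```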

Configuration

Key fields to configure in the Infoveave workflow builder. Full reference available in the documentation.

Column

Select the single column whose null or empty values will be filled with UUID v4 identifiers. The transformation scans the column and generates a new unique UUID for each row where the cell is null or empty. Rows where the cell already contains a non-null, non-empty value are left exactly as they are. Only one column is configured per Generate ID step — to fill nulls in multiple separate ID columns, use multiple Generate ID steps.


Part of Infoveave Data Automation

80+ transformations. Zero manual steps.

Generate ID is one of over 80 transformation activities available inside Infoveave workflows. Chain transformations together — no code, no exports, no waiting for IT.


© 2026 Noesys Software Pvt Ltd

Infoveave® is a product of Noesys

All Rights Reserved