
Normalize Columns

Infoveave Data Automation — Numeric

jOHN DOE becomes John Doe, leading spaces vanish, and accented é becomes e — all applied per column in one step so lookups and deduplication actually work.

Text data collected from forms, imports, and APIs arrives with inconsistent formatting that breaks matching, deduplication, and classification logic. A customer name stored as JOHN DOE in one system and john doe in another will not match in a join — even though they represent the same person. Category fields imported from different regional teams may use title case in some files and uppercase in others. Addresses with trailing spaces cause lookup mismatches. Normalize Columns applies a configurable set of text standardization rules to each column independently — case conversion, whitespace removal, special character stripping, and accent normalization — so the data that enters downstream matching, grouping, and classification steps is consistently formatted across all rows and sources.

Input: Tabular dataset with text columns containing inconsistently formatted string values such as mixed-case names, values with leading or trailing whitespace, special characters, or accented characters

Output: Tabular dataset with text columns standardized using the configured normalization techniques (case conversion, whitespace handling, special character stripping, or accent normalization), with original columns optionally preserved

What Normalize Columns does

Standardize text columns in Infoveave using title case, uppercase, lowercase, whitespace removal, accent normalization, and special character stripping. Prepare consistent text data for matching, deduplication, lookups, and classification pipelines without writing cleaning scripts.

When to use Normalize Columns

  • You are preparing a customer or master data file for deduplication or entity matching where name, city, and category columns arrive with mixed casing and inconsistent whitespace from different source systems, and consistent formatting is required before the matching algorithm runs
  • You are standardizing import data from multiple CSV files or API responses where text fields use different capitalization conventions — some use all-caps, some use title case, some use lowercase — and need a uniform format before loading to a reference dataset or running lookup joins
  • You are cleaning category or label columns that contain special characters, ampersands, or diacritical marks from international data sources, and these characters need to be removed or replaced with plain ASCII equivalents before the data can be used in filter conditions or grouping
  • You are string-matching address, product name, or job title columns across two datasets and need both sides to be formatted identically before the join so matches are not missed because of trivial formatting differences like an extra space or an accented character
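To illustrate the join scenario in the last bullet, here is a minimal Python sketch of why normalizing both sides matters. The records, the `match_key` helper, and the `join_on_name` helper are hypothetical and written for this example only; they are not part of Infoveave.

```python
# Hypothetical records from two systems; the names differ only in formatting.
crm = [{"name": "JOHN DOE ", "customer_id": 101}]
billing = [{"name": "john doe", "balance": 250.0}]


def match_key(value: str) -> str:
    """Trim, collapse inner whitespace runs, and lowercase (an assumed rule set)."""
    return " ".join(value.split()).lower()


def join_on_name(left, right, key_fn):
    """Inner-join two lists of dicts on key_fn(row['name'])."""
    index = {key_fn(row["name"]): row for row in right}
    joined = []
    for row in left:
        match = index.get(key_fn(row["name"]))
        if match is not None:
            joined.append({**match, **row})
    return joined


raw_matches = join_on_name(crm, billing, lambda s: s)   # exact comparison
clean_matches = join_on_name(crm, billing, match_key)   # normalized comparison
print(len(raw_matches), len(clean_matches))  # prints: 0 1
```

The exact comparison misses the record because of casing and a trailing space; the normalized comparison finds it.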

When to avoid it

  • You need to format numeric columns — not text — using comma separators or percentage notation; that is Numerical Formatting, which operates on numeric values rather than text strings
  • You need to find and replace specific string values within text — replacing one keyword with another — rather than applying a systematic formatting transformation; use Find and Replace for targeted value substitution
  • You need to extract a sub-string or split a text column on a delimiter — those are Extract Numbers, Split Column, or Find Text operations, not normalization of the full column value

Where it fits in your Infoveave automation

Normalize Columns is one step inside a multi-step Infoveave workflow. Chain it with other activities — no code, no manual hand-offs.

Connect: Load master data, import files, or any text dataset with inconsistently formatted name, category, or label columns
Normalize Columns (you are here): Assign normalization techniques to each text column, such as case conversion, whitespace handling, special character stripping, or accent normalization
Find and Replace: Apply targeted value corrections after normalization if specific strings need further substitution beyond case and character changes
Join or Match: Use normalized columns as join keys to match records across datasets with confidence that formatting differences no longer cause false non-matches
Filter or Group: Apply Filter on Values or aggregate by canonical category values now that all variants have been unified to a consistent format

Build this workflow visually in Infoveave Data Automation — drag, connect, and schedule with no infrastructure setup.


How teams use Normalize Columns

Real scenarios where this transformation saves hours of manual work.

Retail

Standardize Product Category Labels Across Regional Import Files

A retail data team ingests product catalog files from regional teams where the same category appears as IT & Tech in one file, it tech in another, and IT TECH in a third. Normalize Columns applies Remove Special Characters and Title Case to the category column, converting all three variants to It Tech. A subsequent Find and Replace step corrects It to IT, giving the final IT Tech label. The standardized category allows clean pivot and grouping operations across all regional files.
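The retail scenario can be approximated in a few lines of Python. This is a sketch under assumptions, not Infoveave's implementation: the function names are hypothetical, title casing follows Python's `str.title`, and collapsing the gap left by the removed ampersand is an assumption made so that all three variants converge.

```python
import re


def remove_special_characters(text: str) -> str:
    """Drop punctuation and symbols, then collapse the gaps they leave behind.

    The collapsing step is an assumption of this sketch.
    """
    stripped = re.sub(r"[^\w\s]", "", text).replace("_", "")
    return " ".join(stripped.split())


def title_case(text: str) -> str:
    return text.title()


variants = ["IT & Tech", "it tech", "IT TECH"]
normalized = [title_case(remove_special_characters(v)) for v in variants]
print(normalized)  # ['It Tech', 'It Tech', 'It Tech']

# Targeted correction, standing in for the Find and Replace step.
final = [re.sub(r"\bIt\b", "IT", v) for v in normalized]
print(final)  # ['IT Tech', 'IT Tech', 'IT Tech']
```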

Manufacturing

Normalize Supplier Names for Master Data Matching

A procurement team processes supplier invoices from multiple entry points where the same supplier name is recorded as ACME CORP., Acme Corp, and acme_corp. Normalize Columns applies Title Case, Remove Whitespace, and Remove Special Characters to the supplier name column, producing AcmeCorp from all three variants. This normalized form is used as the matching key against a supplier master data table, reducing false non-matches that previously required manual review.
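A rough Python equivalent of this supplier-name key is sketched below. The helper names are hypothetical, underscores are treated as special characters by assumption, and the result depends on the order in which the steps run.

```python
import re


def title_case(text: str) -> str:
    return text.title()


def remove_whitespace(text: str) -> str:
    return re.sub(r"\s+", "", text)


def remove_special_characters(text: str) -> str:
    # Treats underscores and periods as special characters (an assumption).
    return re.sub(r"[^A-Za-z0-9\s]", "", text)


def apply_in_order(text: str, steps) -> str:
    """Run each step on the output of the previous one."""
    for step in steps:
        text = step(text)
    return text


variants = ["ACME CORP.", "Acme Corp", "acme_corp"]

steps = [title_case, remove_whitespace, remove_special_characters]
keys = {apply_in_order(v, steps) for v in variants}
print(keys)  # {'AcmeCorp'}

# Same techniques, different order: word boundaries are gone before title casing.
reordered_steps = [remove_special_characters, remove_whitespace, title_case]
reordered = {apply_in_order(v, reordered_steps) for v in variants}
print(reordered)  # {'Acmecorp'}
```

Both orders collapse the three variants to a single key, but the key itself differs, so the same order should be used on both sides of a match.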

Finance

Remove Accents from Beneficiary Name Fields for SWIFT Compliance

A financial operations team processes international wire transfer records where beneficiary names include characters like é, ü, ñ, and ç from European and Latin American source systems. SWIFT payment message standards require plain ASCII character encoding. Normalize Columns applies Normalize Accents to the beneficiary name column, converting accented characters to their unaccented ASCII equivalents — é to e, ü to u — so processed records comply with the encoding requirements of the downstream payment messaging system.
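One common way to strip accents, shown here as a sketch rather than as the method Infoveave uses, is Unicode decomposition with Python's standard `unicodedata` module: each accented letter is split into a base letter plus combining marks, and the marks are dropped. The sample names are invented for illustration.

```python
import unicodedata


def normalize_accents(text: str) -> str:
    """Decompose characters (NFKD) and drop the combining accent marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


names = ["José Müller", "Begoña Gonçalves", "Łukasz"]
cleaned = [normalize_accents(n) for n in names]
print(cleaned[:2])  # ['Jose Muller', 'Begona Goncalves']

# Letters with no decomposition (such as Ł) pass through unchanged,
# so it is worth flagging values that are still not plain ASCII.
needs_review = [n for n in cleaned if not n.isascii()]
print(needs_review)  # ['Łukasz']
```

With this approach, characters that are not a base letter plus an accent survive the step, which is why the final ASCII check is included.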

See Normalize Columns in action

The input data is transformed using the following configuration. The output table is ready for dashboards or downstream steps.

Name column: Title Case
Description column: Uppercase
Category column: Remove Special Characters, Title Case
Include original: Enabled

Input Data

| ID | Name | Description | Category |
|----|------|-------------|----------|
| 1 | jOHN DOE | Software Engineer | IT & Tech |
| 2 | jane SMITH | Data Scientist | Analytics |
| 3 | mark_o'leary | Machine Learning | AI & ML |

Output Data

| ID | Name | Description | Category |
|----|------|-------------|----------|
| 1 | John Doe | SOFTWARE ENGINEER | IT Tech |
| 2 | Jane Smith | DATA SCIENTIST | Analytics |
| 3 | Mark O'Leary | MACHINE LEARNING | AI ML |

Configuration

Key fields to configure in the Infoveave workflow builder. Full reference available in the documentation.

Column Map

Assign one or more normalization techniques to each text column. Techniques can be combined — for example applying Remove Special Characters first, then Title Case — and are applied in the order they are configured. Each column gets its own independent set of techniques, so Name can use Title Case while Category uses Remove Special Characters and Title Case in the same step.
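The idea of a per-column, ordered technique list can be sketched in Python as a dictionary that maps each column to a list of technique names. The `TECHNIQUES` registry, the `column_map` layout, and `normalize_rows` are hypothetical names for this illustration; they do not reflect Infoveave's configuration format. Title casing here follows Python's `str.title`, so acronyms come out as `It`; the product's handling may differ.

```python
import re

# Hypothetical registry of technique name -> function.
TECHNIQUES = {
    "title_case": str.title,
    "uppercase": str.upper,
    # Strip symbols, then collapse the leftover gaps (an assumption).
    "remove_special_characters": lambda s: " ".join(re.sub(r"[^\w\s]", "", s).split()),
}

# Each column has its own ordered list of techniques.
column_map = {
    "Name": ["title_case"],
    "Description": ["uppercase"],
    "Category": ["remove_special_characters", "title_case"],
}


def normalize_rows(rows, column_map):
    """Apply each column's techniques in the configured order."""
    out = []
    for row in rows:
        new_row = dict(row)
        for column, technique_names in column_map.items():
            value = new_row[column]
            for name in technique_names:
                value = TECHNIQUES[name](value)
            new_row[column] = value
        out.append(new_row)
    return out


rows = [
    {"ID": 1, "Name": "jOHN DOE", "Description": "Software Engineer", "Category": "IT & Tech"},
    {"ID": 2, "Name": "jane SMITH", "Description": "Data Scientist", "Category": "Analytics"},
]
result = normalize_rows(rows, column_map)
for row in result:
    print(row)
```

Columns that are not listed in `column_map`, such as `ID`, pass through untouched.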

Normalization Techniques

Available options include:

  • Convert to lowercase (alice)
  • Convert to uppercase (ALICE)
  • Convert to title case (Alice Smith)
  • Capitalize first letter (Alice smith)
  • Trim whitespace (removes leading and trailing spaces)
  • Remove whitespace (deletes all spaces)
  • Remove special characters (strips punctuation and symbols like &, %, #, ')
  • Normalize accents (replaces é with e, ü with u, ñ with n)

Select the combination that produces the target format for each column.
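For readers who think in code, each of the eight options has a close analogue in Python's standard library. The mapping below is an approximation with hypothetical key names; edge cases (which characters count as "special", how acronyms are title-cased) may differ from what Infoveave does.

```python
import re
import unicodedata


def normalize_accents(s: str) -> str:
    """Decompose characters and drop combining accent marks."""
    decomposed = unicodedata.normalize("NFKD", s)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


TECHNIQUES = {
    "lowercase": str.lower,
    "uppercase": str.upper,
    "title_case": str.title,
    "capitalize_first": str.capitalize,
    "trim_whitespace": str.strip,
    "remove_whitespace": lambda s: re.sub(r"\s+", "", s),
    # \w is Unicode-aware, so accented letters and underscores are kept here.
    "remove_special_characters": lambda s: re.sub(r"[^\w\s]", "", s),
    "normalize_accents": normalize_accents,
}

print(TECHNIQUES["title_case"]("alice smith"))        # Alice Smith
print(TECHNIQUES["capitalize_first"]("alice smith"))  # Alice smith
print(TECHNIQUES["trim_whitespace"]("  Alice  "))     # Alice
print(TECHNIQUES["remove_special_characters"]("R&D #1"))  # RD 1
print(TECHNIQUES["normalize_accents"]("éüñ"))         # eun
```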

Include Original

Retain the source column alongside the normalized version. Useful for audit comparisons and when original values are needed as lookup references while normalized values are used as matching keys.
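The effect of keeping the original can be pictured with a short Python sketch. The `normalize_column` helper and the `_original` suffix are assumptions made for illustration; how Infoveave names the retained column is not stated here.

```python
def normalize_column(rows, column, fn, include_original=False, suffix="_original"):
    """Normalize one column, optionally keeping the source value alongside it.

    The suffix used for the retained column is a hypothetical naming choice.
    """
    out = []
    for row in rows:
        new_row = dict(row)
        if include_original:
            new_row[column + suffix] = row[column]
        new_row[column] = fn(row[column])
        out.append(new_row)
    return out


rows = [{"ID": 1, "Name": "jOHN DOE"}]
result = normalize_column(rows, "Name", str.title, include_original=True)
print(result)  # [{'ID': 1, 'Name': 'John Doe', 'Name_original': 'jOHN DOE'}]
```

The normalized value serves as the matching key, while the retained value stays available for audit comparison.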


Part of Infoveave Data Automation

80+ transformations. Zero manual steps.

Normalize Columns is one of over 80 transformation activities available inside Infoveave workflows. Chain transformations together — no code, no exports, no waiting for IT.


© 2026 Noesys Software Pvt Ltd

Infoveave® is a product of Noesys

All Rights Reserved