
Execute Python Script

Infoveave Data Automation — Advanced

When no built-in step does exactly what you need — write the logic yourself. The dataset arrives as df. Modify it with pandas. Return df. The pipeline continues.

Built-in transformation steps cover the most common data operations efficiently and without code. But every real-world data pipeline eventually encounters requirements that fall outside standard step coverage — a multi-condition classification that cannot be expressed in available filter steps, a custom weighted rolling calculation, a domain-specific regex pattern applied across multiple columns simultaneously, or an operation that combines three columns with conditional priority logic. Execute Python Script fills this gap by inserting a fully programmable pandas transformation step anywhere in the pipeline. The dataset at that point in the pipeline is passed in as a DataFrame named df, the script modifies it, and the resulting df is passed forward. This gives data engineers the full expressive power of Python and pandas for custom logic without breaking out of the pipeline architecture — all downstream steps continue to receive properly structured data from the script output.

Input: Any tabular dataset. The current pipeline DataFrame is injected into the Python script execution environment as the variable df, giving full pandas access to all columns and rows.

Output: The modified pandas DataFrame that the script leaves in df. It replaces the current dataset in the pipeline and is passed to subsequent transformation steps.

What Execute Python Script does

Run arbitrary Python code against the current pipeline dataset in Infoveave. Apply custom transformations, calculations, conditional logic, regex operations, multi-column derivations, or any pandas operation not covered by built-in steps — by writing Python that receives the dataset as a DataFrame (df) and returns the modified DataFrame.

When to use Execute Python Script

  • You need a transformation that combines multiple conditions, multiple columns, and custom logic that cannot be expressed with available built-in steps — such as a custom scoring formula, a domain-specific classification rule, or a pattern-based extraction using complex regex
  • You need to apply a specific pandas, numpy, or scipy operation that is not exposed as a built-in step — such as a rolling weighted average, a Z-score calculation, a custom log transformation with base-specific parameters, or a string operation using a highly specific regex pattern
  • You need to reshape or restructure the dataset in a way that built-in pivot, unpivot, or transpose steps do not support — Execute Python Script gives full DataFrame manipulation freedom including arbitrary merge, reshape, and multi-step transformation in one block
  • You are prototyping a new transformation logic that will eventually be requested as a built-in step, and you need it working in the pipeline immediately without waiting for the step to be added to the platform
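As an illustration of the second case above, the following sketch computes a Z-score and a weighted rolling average. The sample DataFrame stands in for the df that the pipeline would inject, and the column names (Day, Sales) and the weights are hypothetical, chosen only for the example.

```python
import numpy as np
import pandas as pd

# Stand-in for the DataFrame the pipeline injects as `df`.
# Column names and values are hypothetical.
df = pd.DataFrame({
    "Day": range(1, 7),
    "Sales": [10.0, 12.0, 11.0, 15.0, 14.0, 18.0],
})

# Z-score: distance from the column mean in units of the sample standard deviation.
df["SalesZ"] = (df["Sales"] - df["Sales"].mean()) / df["Sales"].std()

# Weighted rolling average over 3 rows; the most recent row carries the largest weight.
weights = np.array([0.2, 0.3, 0.5])  # hypothetical weights that sum to 1
df["SalesWMA"] = (
    df["Sales"]
    .rolling(window=len(weights))
    .apply(lambda window: float(np.dot(window, weights)), raw=True)
)
```

The first two rows of SalesWMA are empty (NaN) because a full three-row window is not yet available there; a downstream step or a later line in the script would need to decide how to handle them.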

When to avoid it

  • A built-in transformation step already covers the required operation — use the specific built-in step for reliability, performance, and configuration auditability; reserve Execute Python Script for genuinely custom logic
  • The transformation involves calling external APIs, writing to external databases, or accessing the network — Execute Python Script runs in a sandboxed environment and should not be used for network I/O or side effects; use dedicated connector or export steps for those operations
  • You need this logic to run reliably by non-technical users who cannot maintain Python code — custom script steps require Python knowledge to debug and update; for business-user-maintained pipelines, prefer configuration-driven built-in steps

Where it fits in your Infoveave automation

Execute Python Script is one step inside a multi-step Infoveave workflow. Chain it with other activities without manual hand-offs; only the script step itself requires code.

Connect: Prepare the dataset upstream with all required columns present and in the expected data types before the Python script step, so the script can immediately reference the columns by name.
Write Script (you are here): Write Python code that operates on df using standard pandas operations: create columns, apply functions, classify values, reshape, or compute custom metrics. Validate against a small sample row count before running on the full dataset.
Validate Output: Confirm the modified df has the expected columns, column names, and data types after the script step so downstream built-in steps receive the expected schema.
Continue Pipeline: Connect subsequent built-in transformation steps normally. The script output is a standard DataFrame compatible with all downstream Infoveave steps.
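The Validate Output step can also be approximated at the end of the script itself. The sketch below checks that expected columns exist and have the expected kind of data. The expected schema is hypothetical, the sample DataFrame stands in for the pipeline's df, and whether a raised exception halts the Infoveave step is an assumption rather than documented behavior.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype, is_string_dtype

# Stand-in for the script's final `df`; columns are hypothetical.
df = pd.DataFrame({
    "StudentID": ["S001", "S002"],
    "Score": [920.0, 750.0],
    "Category": ["High", "High"],
})

# Hypothetical expected schema: column name -> type check to apply.
expected_checks = {
    "StudentID": is_string_dtype,
    "Score": is_numeric_dtype,
    "Category": is_string_dtype,
}

# Collect columns that are absent, and columns present with the wrong kind of data.
missing = [col for col in expected_checks if col not in df.columns]
wrong_type = [
    col for col, check in expected_checks.items()
    if col in df.columns and not check(df[col])
]

# Fail loudly so a schema problem surfaces here rather than in a downstream step
# (assumes the platform reports script exceptions as a step failure).
if missing or wrong_type:
    raise ValueError(f"Schema check failed: missing={missing}, wrong_type={wrong_type}")
```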

Build this workflow visually in Infoveave Data Automation — drag, connect, and schedule with no infrastructure setup.


How teams use Execute Python Script

Real scenarios where this transformation saves hours of manual work.

Retail

Apply Custom Discount Classification Formula with Multi-Column Conditional Logic

A retail analytics team needs to classify each order into a DiscountTier based on a combination of OrderValue, CustomerTier, and SeasonalFlag — a three-way conditional formula that the available classification steps cannot express in one configuration. Execute Python Script implements the custom logic: orders over 500 from Gold customers during seasonal sales receive Tier 1; orders 200-500 from any tier receive Tier 2; others receive Tier 3. The resulting DiscountTier column is used directly in the margin analysis dashboard.
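One way this rule set could be written is with numpy.select, which evaluates conditions in priority order. This is a sketch, not the team's actual script: the sample rows, the boolean encoding of SeasonalFlag, the tier label strings, and the treatment of 200 and 500 as inclusive bounds are all assumptions.

```python
import numpy as np
import pandas as pd

# Stand-in for the pipeline's `df`; the rows are made up for illustration.
df = pd.DataFrame({
    "OrderValue": [650, 650, 300, 120, 500, 800],
    "CustomerTier": ["Gold", "Silver", "Gold", "Gold", "Bronze", "Gold"],
    "SeasonalFlag": [True, True, False, True, False, False],  # assumed boolean
})

# Tier 1: orders over 500 from Gold customers during seasonal sales.
tier1 = (df["OrderValue"] > 500) & (df["CustomerTier"] == "Gold") & df["SeasonalFlag"]

# Tier 2: orders from 200 to 500 for any customer tier (inclusive bounds assumed).
tier2 = df["OrderValue"].between(200, 500)

# Conditions are checked in order; rows matching neither fall through to Tier 3.
df["DiscountTier"] = np.select([tier1, tier2], ["Tier 1", "Tier 2"], default="Tier 3")
```

Note that under the rules as stated, an order over 500 that is not from a Gold customer in a seasonal sale lands in Tier 3, which is worth confirming with the business owner before relying on the output.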

Manufacturing

Compute Custom OEE Score with Weighted Sub-Component Averaging

A manufacturing analytics team calculates OEE using a domain-specific weighting formula that differs from standard OEE: Availability is weighted at 50%, Performance at 30%, and Quality at 20%. The built-in numeric column operations do not support weighted multi-column formulas in a single step. Execute Python Script implements the formula as df['OEE'] = df['Availability']*0.5 + df['Performance']*0.3 + df['Quality']*0.2, then classifies the result into bands using df['OEEBand'] = pd.cut(df['OEE'], ...). The two derived columns feed the production dashboard.
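A self-contained sketch of this calculation follows. The weighted formula is the one quoted above; the band edges and labels passed to pd.cut are hypothetical, since the original elides them, and the sample values assume each component is stored as a fraction between 0 and 1.

```python
import pandas as pd

# Stand-in for the pipeline's `df`; components assumed to be fractions in [0, 1].
df = pd.DataFrame({
    "Availability": [0.90, 0.70, 0.50],
    "Performance": [0.80, 0.60, 0.40],
    "Quality": [0.99, 0.90, 0.80],
})

# Weighted score as described in the scenario: 50% / 30% / 20%.
df["OEE"] = df["Availability"] * 0.5 + df["Performance"] * 0.3 + df["Quality"] * 0.2

# Hypothetical band edges and labels; the real thresholds are not given in the source.
df["OEEBand"] = pd.cut(
    df["OEE"],
    bins=[0.0, 0.6, 0.8, 1.0],
    labels=["Low", "Medium", "High"],
    include_lowest=True,  # so an OEE of exactly 0.0 still falls in the first band
)
```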

Finance

Derive Risk Score with Multi-Factor Conditional Scoring Logic

A bank's risk team needs a custom CreditRiskScore combining TransactionVelocity, AccountAge, and GeoRiskFlag with conditional priority weighting that varies based on whether the GeoRiskFlag is active. The built-in steps cannot express the priority switching logic. Execute Python Script implements the conditional scoring: high-geo-risk accounts get a boost added to base velocity score, while standard accounts use plain age-adjusted velocity. The resulting CreditRiskScore column is passed to the fraud alert pipeline as the primary risk input.
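The priority switch described here could be sketched with numpy.where. The source does not give the actual formula, so everything numeric below is hypothetical: the boost value, the age adjustment (velocity divided by one plus account age in years), and the sample rows exist only to show the branching pattern.

```python
import numpy as np
import pandas as pd

# Stand-in for the pipeline's `df`; values are made up for illustration.
df = pd.DataFrame({
    "TransactionVelocity": [40.0, 40.0, 10.0],
    "AccountAge": [1.0, 4.0, 9.0],        # assumed to be in years
    "GeoRiskFlag": [True, False, False],  # assumed boolean
})

GEO_BOOST = 25.0  # hypothetical boost for high-geo-risk accounts

# Hypothetical age adjustment: older accounts dampen the velocity signal.
age_adjusted = df["TransactionVelocity"] / (1.0 + df["AccountAge"])

# High-geo-risk accounts: base velocity plus boost. Others: age-adjusted velocity.
df["CreditRiskScore"] = np.where(
    df["GeoRiskFlag"],
    df["TransactionVelocity"] + GEO_BOOST,
    age_adjusted,
)
```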

See Execute Python Script in action

The input data is transformed using the code below. The resulting output table is ready for dashboards or downstream steps.

Code:
    df['Score'] = df['Score'].astype(float) * 10
    df['Category'] = df['Score'].apply(lambda x: 'High' if x > 300 else 'Low')

Input Data

StudentID | Score | Category
S001 | 92 | Math
S002 | 75 | Science
S003 | 63 | Math
S004 | 88 | Science
S005 | 41 | Math

Output Data

StudentID | Score | Category
S001 | 920 | High
S002 | 750 | High
S003 | 630 | High
S004 | 880 | High
S005 | 410 | High
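To reproduce this example outside the platform, the sketch below builds the input table by hand and then runs the same two lines of script. Constructing df manually is an assumption made for local testing; inside Infoveave the pipeline supplies df.

```python
import pandas as pd

# Local stand-in for the input table; in the pipeline, `df` is injected automatically.
df = pd.DataFrame({
    "StudentID": ["S001", "S002", "S003", "S004", "S005"],
    "Score": [92, 75, 63, 88, 41],
    "Category": ["Math", "Science", "Math", "Science", "Math"],
})

# The script from the example: scale the score, then overwrite Category
# with a High/Low label based on the scaled value.
df["Score"] = df["Score"].astype(float) * 10
df["Category"] = df["Score"].apply(lambda x: "High" if x > 300 else "Low")
```

Every scaled score exceeds 300, so every row is labeled High. The script also overwrites the original subject values in Category; writing the label to a new column instead would preserve them.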

Configuration

Key fields to configure in the Infoveave workflow builder. Full reference available in the documentation.

Code

Enter the Python script that transforms the dataset. The DataFrame is available as the variable df at the start of the script, so the script does not need to load the data. All standard pandas operations are available: column creation, value mapping, apply functions, filtering, aggregation, reshaping, and regex operations. When the script finishes, the variable df must hold the modified dataset; it is automatically passed as the output to the next pipeline step. Do not print or return df explicitly. Just ensure df holds the final state of the dataset after your transformations.
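A practical consequence of this convention is that operations which return a new DataFrame, such as filtering or aggregation, need their result assigned back to df. The sketch below illustrates this under the assumption that the platform reads whatever df refers to when the script ends; the sample data and column names are hypothetical.

```python
import pandas as pd

# Stand-in for the DataFrame the pipeline injects as `df`.
df = pd.DataFrame({
    "Region": ["North", "South", "North"],
    "Amount": [100, -5, 250],
})

# Filtering returns a new DataFrame; without the assignment, `df` would be unchanged.
df = df[df["Amount"] > 0]

# Aggregation also returns a new object, so assign it back to `df` as well.
df = df.groupby("Region", as_index=False)["Amount"].sum()

# No print or return is needed: `df` now holds the final dataset.
```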


Part of Infoveave Data Automation

80+ transformations. Zero manual steps.

Execute Python Script is one of over 80 transformation activities available inside Infoveave workflows. Chain transformations together — no code, no exports, no waiting for IT.


© 2026 Noesys Software Pvt Ltd

Infoveave® is a product of Noesys

All Rights Reserved