Infoveave Data Automation — Text & String
A full URL column becomes six structured columns — protocol, host, port, path, query, fragment — each ready for its own analysis axis without any string parsing code.
URLs in web logs, referral tracking, API response data, and link analysis datasets contain multiple structured pieces of information compressed into a single string. Filtering by host domain, aggregating traffic by URL path, or extracting query parameters all require breaking the URL into its components first. Split URL handles this decomposition automatically, following standard URL syntax: you map each component to a named output column, and Infoveave parses every row. For query parameter key-value extraction beyond the raw query string, chain Split HTTP Query after Split URL.
Break full URL strings into protocol, host, port, path, query, and fragment columns in Infoveave. Extract web metadata from URL columns for traffic analysis, path-level reporting, and query parameter investigation.
Split URL is one step inside a multi-step Infoveave workflow. Chain it with other activities — no code, no manual hand-offs.
Build this workflow visually in Infoveave Data Automation — drag, connect, and schedule with no infrastructure setup.
Real scenarios where this transformation saves hours of manual work.
A platform engineering team processes web server access logs where each record includes the full URL accessed by the user. Split URL decomposes each URL into protocol, host, port, and path columns. The team can then filter by protocol to audit HTTP-versus-HTTPS usage, group page view counts by top-level path to identify the most accessed service sections, and flag requests to unexpected host values that may indicate misrouting.
A digital marketing team processes campaign performance data where each impression or click record includes the full landing page URL with UTM parameters. Split URL separates the path into a dedicated column, allowing the team to aggregate campaign performance by landing page path independently of the query string. The raw query column is then passed to Split HTTP Query to extract individual UTM parameter values.
A marketing operations team audits a library of tracked links where each link record contains the full destination URL including protocol, host, and path. Split URL extracts each component into named columns. The team identifies URLs still using HTTP instead of HTTPS, links pointing to deprecated hosts, and redirects landing on unexpected path structures — all from column-level filters rather than string pattern matching on the raw URL.
The input data is transformed using the configuration below. The output table is ready for dashboards or downstream steps.
Input Data
| Employee ID | Name | url |
|---|---|---|
| E001 | John Doe | https://company.com:8080/employee?id=E001&name=John+Doe#profile |
| E002 | Marie Dupont | http://marketing.com/employee?id=E002&name=Marie+Dupont |
| E003 | Carlos Gomez | ftp://fileserver.com:21/download?id=E003&name=Carlos+Gomez |
Output Data
| Employee ID | Name | url | Protocol | Host | Port | Path | Query | Fragment |
|---|---|---|---|---|---|---|---|---|
| E001 | John Doe | https://company.com:8080/employee?id=E001&name=John+Doe#profile | https | company.com | 8080 | /employee | id=E001&name=John+Doe | profile |
| E002 | Marie Dupont | http://marketing.com/employee?id=E002... | http | marketing.com | | /employee | id=E002&name=Marie+Dupont | |
| E003 | Carlos Gomez | ftp://fileserver.com:21/download?id=E003... | ftp | fileserver.com | 21 | /download | id=E003&name=Carlos+Gomez | |
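For readers who want to check the tables by hand, the same decomposition can be approximated outside Infoveave. The sketch below uses Python's standard `urllib.parse.urlsplit`. It is an illustration only, not Infoveave's implementation, and the function name `split_url` is hypothetical.

```python
from urllib.parse import urlsplit


def split_url(url: str) -> dict:
    """Return the six URL components as strings; absent components become ''."""
    parts = urlsplit(url)
    return {
        "Protocol": parts.scheme,
        "Host": parts.hostname or "",
        "Port": "" if parts.port is None else str(parts.port),
        "Path": parts.path,
        "Query": parts.query,
        "Fragment": parts.fragment,
    }


# The three URLs from the Input Data table.
urls = [
    "https://company.com:8080/employee?id=E001&name=John+Doe#profile",
    "http://marketing.com/employee?id=E002&name=Marie+Dupont",
    "ftp://fileserver.com:21/download?id=E003&name=Carlos+Gomez",
]

for url in urls:
    print(split_url(url))
```

Rows without an explicit port or fragment yield empty strings, which matches the blank cells in the Output Data table.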
Key fields to configure in the Infoveave workflow builder. Full reference available in the documentation.
URL Column
Select the column containing the full URL strings to parse. Only one column can be selected. URLs should include the protocol scheme — http, https, ftp — for all components to be extracted correctly. URLs without a protocol may not produce a valid Protocol component.
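To see why the protocol matters, the sketch below parses the same address with and without a scheme using Python's standard library. This shows how one common parser behaves. Infoveave's handling of scheme-less URLs may differ, so treat it as an illustration of the general problem rather than the product's behavior.

```python
from urllib.parse import urlsplit

with_scheme = urlsplit("https://company.com/employee?id=E001")
without_scheme = urlsplit("company.com/employee?id=E001")

# With a scheme, the host and path are separated as expected.
print(with_scheme.scheme, with_scheme.hostname, with_scheme.path)

# Without a scheme, this parser finds no protocol and no host;
# the host text is absorbed into the path instead.
print(repr(without_scheme.scheme), without_scheme.hostname, without_scheme.path)
```

A simple pre-processing step, such as prefixing a default scheme to values that lack one, is one way to avoid this before the URL column reaches the split.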
Named component columns
Specify the output column name for each URL component: Protocol, Host, Port, Path, Query, and Fragment. You can omit any component you do not need by leaving that column name blank — no output column will be generated for omitted components. Each named column holds exactly one URL component per row, with empty values for rows where that component is not present in the URL.
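The mapping from components to output column names, including the "leave blank to omit" rule, can be sketched as follows. The function `split_url_row` and the `column_names` dictionary are hypothetical names for illustration and do not reflect Infoveave's actual configuration format.

```python
from urllib.parse import urlsplit


def split_url_row(row: dict, url_column: str, column_names: dict) -> dict:
    """Append one output column per named component; blank names are skipped."""
    parts = urlsplit(row[url_column])
    values = {
        "protocol": parts.scheme,
        "host": parts.hostname or "",
        "port": "" if parts.port is None else str(parts.port),
        "path": parts.path,
        "query": parts.query,
        "fragment": parts.fragment,
    }
    out = dict(row)  # keep the original columns untouched
    for component, name in column_names.items():
        if name:  # a blank name means the component is omitted
            out[name] = values[component]
    return out


row = {
    "Employee ID": "E002",
    "url": "http://marketing.com/employee?id=E002&name=Marie+Dupont",
}
# Only Protocol, Host, and Path are requested; the rest are left blank.
column_names = {
    "protocol": "Protocol",
    "host": "Host",
    "port": "",
    "path": "Path",
    "query": "",
    "fragment": "",
}

result = split_url_row(row, "url", column_names)
print(result)
```

In this sketch the result contains the original columns plus Protocol, Host, and Path, and no columns are created for the components whose names were left blank.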
Query column
The Query output column holds the raw query string — the portion after the question mark — as a single unparsed string like id=E001&name=John+Doe. This column does not split the query into individual key-value pairs. Pass this column into Split HTTP Query in a subsequent step if you need individual query parameters as separate named columns.
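The difference between the raw query string and individual key-value pairs can be illustrated with Python's standard `urllib.parse.parse_qsl`. This is a sketch of the general idea behind a follow-up query-splitting step. Whether Split HTTP Query decodes `+` and percent-escapes the same way is an assumption not confirmed here.

```python
from urllib.parse import parse_qsl

# The raw value as it would appear in the Query output column.
raw_query = "id=E001&name=John+Doe"

# Splitting on '&' and '=' yields one key-value pair per parameter.
# parse_qsl also decodes '+' as a space.
params = dict(parse_qsl(raw_query))
print(params)  # {'id': 'E001', 'name': 'John Doe'}
```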
Part of Infoveave Data Automation
Split URL is one of over 80 transformation activities available inside Infoveave workflows. Chain transformations together — no code, no exports, no waiting for IT.