bug: Block read_pickle and class definitions, restrict custom CSV field to read_csv() only #6257

Draft
chloebyun-wd wants to merge 2 commits into main from
bug/523/remote-code-execution-vulneratbility-in-csv-agent

chloebyun-wd commented Apr 20, 2026

The vulnerability

The CSV Agent node lets users supply a "Custom Pandas Read_CSV Code" field. This input is interpolated into pd.${customReadCSVFunc} and executed via Pyodide. A denylist blocks dangerous Python keywords, but pd.read_pickle() was not on it, so an attacker could deserialize a malicious pickle payload and achieve remote code execution.
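
To make the injection point concrete, here is a minimal sketch of how such an interpolation behaves. The function name `buildReadStatement` and the `df = ` prefix are assumptions for illustration; only the `pd.${customReadCSVFunc}` template comes from the description above.

```typescript
// Hypothetical reconstruction: only the "pd.${customReadCSVFunc}" template is from the PR text.
function buildReadStatement(customReadCSVFunc: string): string {
    // The user-supplied string is pasted verbatim after "pd."
    return `df = pd.${customReadCSVFunc}`;
}

// Intended use: a single read_csv call.
const benign = buildReadStatement("read_csv(csv_data)");

// A multi-line input keeps the first line valid and then appends arbitrary Python.
// The second line is a placeholder, not the real PoC payload.
const hostile = buildReadStatement('isnull("")\nattacker_code_here()');

console.log(benign);
console.log(hostile.split("\n").length); // 2: one harmless pandas call plus attacker-controlled code
```

Because the first line still parses as an ordinary pandas call, everything after the newline would run as regular Python. That is presumably why the PoC opens with isnull("").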

How the fix works

The fix adds two independent layers; either one alone blocks the PoC.

1. Allowlist for the user-supplied field

The field is meant for read_csv() calls only, so we now enforce exactly that:

  • Must start with read_csv(
  • Must be a single line (no newlines or semicolons)
  • Denylist still applied on top as a safety net

This rejects the PoC's isnull("") class MiniBytesIO: ... pd.read_pickle(...) payload at the very first check.
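
The allowlist checks above can be sketched roughly as follows. This is not the PR's actual code: the name `validateCustomReadCsv`, the result type, and the decision to trim surrounding whitespace are all assumptions. The denylist pass that the PR applies on top is left out here.

```typescript
// Hypothetical helper; name and return shape are assumptions, not the PR's implementation.
type ValidationResult = { ok: true } | { ok: false; reason: string };

function validateCustomReadCsv(input: string): ValidationResult {
    // Assumption: leading/trailing whitespace is tolerated and ignored.
    const code = input.trim();

    // Check 1: the field must be a read_csv( call and nothing else.
    if (!code.startsWith("read_csv(")) {
        return { ok: false, reason: "must start with read_csv(" };
    }

    // Check 2: no newlines or semicolons, so no second statement can follow.
    if (/[\r\n;]/.test(code)) {
        return { ok: false, reason: "must be a single statement on one line" };
    }

    return { ok: true };
}

console.log(validateCustomReadCsv("read_csv(csv_data, sep=';')".replace(";", ",")));
console.log(validateCustomReadCsv('isnull("")\nclass MiniBytesIO: pass'));
```

One side effect of a blanket semicolon ban is that a legitimate argument such as sep=";" would also be rejected, which the first example sidesteps by using a comma. Whether the real validator makes the same trade-off is not stated in the description.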

2. New denylist entries

Four patterns added, protecting both user input and LLM-generated code:

  • read_pickle — the direct RCE vector
  • pickle — the deserialization module
  • marshal — another unsafe deserializer
  • class definitions — used in the PoC to build a fake file-like object
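
As a rough illustration of the four additions, the sketch below expresses them as regular expressions. The PR's real patterns are not quoted in this thread, so every regex, the `ADDED_FORBIDDEN_PATTERNS` name, and the `findForbidden` helper are assumptions.

```typescript
// Illustrative patterns only; the actual regexes in pythonCodeValidator.ts are not shown here.
const ADDED_FORBIDDEN_PATTERNS: { pattern: RegExp; reason: string }[] = [
    { pattern: /\bread_pickle\b/, reason: "read_pickle" },
    { pattern: /\bpickle\b/, reason: "pickle module" },
    { pattern: /\bmarshal\b/, reason: "marshal module" },
    // Anchored to the start of a line so a column named "class" is not flagged.
    { pattern: /^\s*class\s+\w+/m, reason: "class definition" },
];

// Returns the reasons for every pattern that matches the given Python source.
function findForbidden(code: string): string[] {
    return ADDED_FORBIDDEN_PATTERNS
        .filter((entry) => entry.pattern.test(code))
        .map((entry) => entry.reason);
}

console.log(findForbidden("pd.read_pickle(buf)"));                  // ["read_pickle"]
console.log(findForbidden("class MiniBytesIO:\n    pass"));         // ["class definition"]
console.log(findForbidden('read_csv(csv_data, usecols=["class"])')); // []
```

The word boundaries keep the patterns from firing on longer identifiers that merely contain these words; because the underscore counts as a word character, the pickle pattern in this sketch does not match inside read_pickle, which is why that name gets its own entry.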

How the PoC is blocked

| What the PoC does | What stops it |
| --- | --- |
| Starts with isnull("") instead of read_csv( | Allowlist rejects it |
| Defines a class MiniBytesIO | Allowlist blocks newlines; denylist blocks class |
| Calls pd.read_pickle(...) | Denylist blocks read_pickle |
| References pickle data | Denylist blocks pickle |

gemini-code-assist (bot, Contributor) left a comment

Code Review

This pull request enhances security by introducing a specialized validator for custom CSV reading and expanding the list of forbidden Python patterns to prevent unsafe deserialization and class definitions. A security concern was raised regarding the regex for read_pickle, suggesting a broader match to prevent bypasses via variable assignment.
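
The concern about variable assignment can be illustrated with two candidate patterns. Both regexes are assumptions made for this example, since the reviewed pattern is not quoted in the thread; the sketch only shows why matching the bare name is broader than matching a call.

```typescript
// Both patterns are hypothetical; the regex under review is not shown in this thread.
const callOnly = /\bread_pickle\s*\(/;   // matches only when "(" follows the name
const anyReference = /\bread_pickle\b/;  // matches any mention of the name

// Aliasing the function first means the name never appears directly before "(".
const aliased = "f = pd.read_pickle\nf(payload)";

console.log(callOnly.test(aliased));      // false: the alias slips past a call-only pattern
console.log(anyReference.test(aliased));  // true: the bare reference is still caught
```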

Comment thread on packages/components/src/pythonCodeValidator.ts (outdated)