Feature and its Use Cases
Feature Request: Local Data Redaction for PHI (HIPAA Privacy Compliance)
Description
Currently, the DocPilot application sends raw, unredacted transcripts containing Protected Health Information (PHI)—such as patient names, dates, and medical identifiers—directly to the Google Gemini API for summarization.
Without a Business Associate Agreement (BAA) and an Enterprise API tier, transmitting raw PHI to public LLM endpoints is a severe privacy vulnerability and a likely HIPAA/GDPR violation. We need a mechanism that sanitizes the transcript locally on the device before it is sent over the network.
Proposed Solution
Implement a Local PII/PHI Redaction layer within the app:
- After Deepgram returns the transcript, intercept the String before passing it to the Gemini summarization function.
- Use a local Regex or lightweight NLP Named Entity Recognition (NER) pipeline (using google_mlkit_nlp or standard Dart Regex patterns) to detect names, dates, and phone numbers.
- Replace the detected entities with placeholders (e.g., [PATIENT_NAME], [DATE], [ID]).
- Send the sanitized, placeholder-injected string to Gemini to generate the summary/prescription.
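The regex path above can be sketched in Dart as follows. This is a minimal illustration, not production-grade PHI detection: the patterns (date, phone, and a hypothetical MRN-style identifier) are assumptions, and names in free text realistically require the NER pipeline rather than regex alone.

```dart
// Illustrative redaction rules; real coverage needs NER (e.g. google_mlkit_nlp)
// or a vetted PHI pattern set. Map insertion order is the application order.
final Map<String, RegExp> _patterns = {
  '[DATE]': RegExp(r'\b\d{1,2}[/-]\d{1,2}[/-]\d{2,4}\b'),
  '[PHONE]': RegExp(r'\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b'),
  // Hypothetical medical-record-number format, for illustration only.
  '[ID]': RegExp(r'\bMRN[-\s]?\d+\b', caseSensitive: false),
};

// Replace every match with its placeholder before any network call.
String redact(String transcript) {
  var sanitized = transcript;
  _patterns.forEach((placeholder, pattern) {
    sanitized = sanitized.replaceAll(pattern, placeholder);
  });
  return sanitized;
}

// e.g. redact('Seen on 01/02/2024, call 555-123-4567')
//      -> 'Seen on [DATE], call [PHONE]'
```

Running the date rule before the phone rule is safe here because the date pattern requires a `/` or `-` after at most two digits, so it cannot partially consume a phone number.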
Tasks & Architecture
- Create an AnonymizationService class in the services/ directory to hold the redaction logic.
- Integrate AnonymizationService into the ChatbotService / Gemini processing flow so only sanitized text reaches the network.
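One possible shape for this wiring is sketched below. AnonymizationService and ChatbotService are the classes named in this issue; the constructor and method names (sanitize, summarize, _callGemini) are assumptions for illustration, not the app's actual API.

```dart
// Sketch only: interception point between Deepgram output and the Gemini call.
class AnonymizationService {
  final Map<String, RegExp> rules; // placeholder -> pattern
  AnonymizationService(this.rules);

  // Replace every rule match with its placeholder.
  String sanitize(String transcript) {
    var out = transcript;
    rules.forEach((placeholder, pattern) {
      out = out.replaceAll(pattern, placeholder);
    });
    return out;
  }
}

class ChatbotService {
  final AnonymizationService anonymizer;
  ChatbotService(this.anonymizer);

  Future<String> summarize(String rawTranscript) {
    // Redaction happens here, so raw PHI never reaches the Gemini client.
    final sanitized = anonymizer.sanitize(rawTranscript);
    return _callGemini(sanitized);
  }

  // Stand-in for the existing Gemini summarization call.
  Future<String> _callGemini(String text) async => 'summary of: $text';
}
```

Injecting AnonymizationService through the ChatbotService constructor keeps the redaction rules swappable (regex now, NER later) without touching the Gemini flow.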
Alternatives Considered
- On-Device LLM (e.g., Gemma/Llama via ONNX): While 100% secure since no data leaves the device, compiling local models into Flutter is currently too heavy for the app size and performance budget. Local text redaction is the best immediate stop-gap.
Context
Medical applications demand high privacy standards. By adding a client-side redaction wall, we protect the patient's identity from third-party LLM data logging and make DocPilot significantly safer for real-world clinical usage.
Additional Context
No response
Code of Conduct