Register MaD library-modeling resources for rust and swift; address review feedback

Copilot · data-douser · web-flow · commit 0c68c4641e86 · 2026-04-24T20:05:36.000Z
Agent-Logs-Url: https://github.com/advanced-security/codeql-development-mcp-server/sessions/4266e55f-3c7d-4ab3-9bd9-338cdb43bbee
Co-authored-by: data-douser <70299490+data-douser@users.noreply.github.com>
diff --git a/server/dist/codeql-development-mcp-server.js b/server/dist/codeql-development-mcp-server.js
diff --git a/server/dist/codeql-development-mcp-server.js.map b/server/dist/codeql-development-mcp-server.js.map
diff --git a/server/src/prompts/data-extension-development.prompt.md b/server/src/prompts/data-extension-development.prompt.md
@@ -7,7 +7,7 @@ agent: agent
 Use this workflow to create CodeQL data extensions (Models-as-Data) for third-party libraries and frameworks. Data extensions let you customize taint tracking without writing QL code — you author YAML files that declare which functions are sources, sinks, summaries, barriers, or barrier guards.
 
 For format reference, read the MCP resource: `codeql://learning/data-extensions`
-For language-specific guidance: `codeql://languages/{{language}}/library-modeling`
+For language-specific guidance, read the corresponding `codeql://languages/<language>/library-modeling` resource. Available for: `cpp`, `csharp`, `go`, `java`, `javascript`, `python`, `ruby`, `rust`, `swift`.
 
 ## Workflow Checklist
 
diff --git a/server/src/resources/data-extensions-overview.md b/server/src/resources/data-extensions-overview.md
@@ -171,7 +171,8 @@ name: my-org/security-models
 version: 1.0.0
 dependencies:
   codeql/<language>-all: '*'
-dataExtensions: '*.yml'
+dataExtensions:
+  - 'ext/*.model.yml'
 ```
 
 ### Testing Extensions
diff --git a/server/src/resources/languages/rust_library_modeling.md b/server/src/resources/languages/rust_library_modeling.md
@@ -0,0 +1,95 @@
+# Customizing Library Models for Rust
+
+## Purpose
+
+Customize data-flow and taint analysis for Rust by modeling crates and libraries via data extensions (YAML) and model packs. This enables accurate flow tracking through third-party crates not included in CodeQL databases.
+
+For common guidance on data extensions (YAML structure, model packs, development workflow), see `codeql://learning/data-extensions`.
+
+> Rust MaD support in CodeQL is evolving; column layouts and supported predicates may change between CodeQL releases. Always cross-reference the upstream `codeql/rust-all` pack and the official [CodeQL docs for Rust](https://codeql.github.com/docs/codeql-language-guides/codeql-for-rust/) for the column layout in use by the CodeQL CLI version pinned in this repo.
+
+## Data Extensions Overview
+
+### Structure
+
+Data extensions use YAML format to extend CodeQL's knowledge of library behavior:
+
+```yaml
+extensions:
+  - addsTo:
+      pack: codeql/rust-all
+      extensible: <extensible-predicate>
+    data:
+      - <tuple1>
+      - <tuple2>
+```
+
+### Union Semantics
+
+- Multiple YAML files are combined
+- Rows are merged across files
+- Duplicates are automatically removed
+- Order of files doesn't matter
+
+## Model Format
+
+Rust uses a **MaD (Models as Data)** format keyed on **crate path** (`crate::module::Type::method`-style canonical paths) rather than the namespace/type/name/signature columns used by Java/C#/C++/Go. Rust tuples are typically shorter than those of the MaD-tuple-format languages and closer in spirit to the API-graph access-path style used by JavaScript/Python/Ruby, but the exact column layout is defined by the `codeql/rust-all` pack.
+
+## Extensible Predicates for Rust
+
+| Predicate      | Purpose                                                               |
+| -------------- | --------------------------------------------------------------------- |
+| `sourceModel`  | Model sources of tainted data (e.g. data read from network or env)    |
+| `sinkModel`    | Model sinks where tainted data is used unsafely                       |
+| `summaryModel` | Model flow through opaque library functions (taint or value flow)     |
+| `neutralModel` | Mark functions as having no dataflow impact (suppress generated rows) |
+
+Refer to `codeql/rust-all` (the `ext/*.model.yml` files in the upstream `codeql` repository under `rust/ql/lib/ext/`) for canonical examples of the exact tuple shape required by the current CodeQL CLI release.
+
+## Crate Path Column
+
+The crate path identifies a function or method by its fully qualified Rust path:
+
+- Free function: `tokio::fs::read_to_string`
+- Inherent method: `<std::path::PathBuf>::push`
+- Trait method: `<T as core::iter::Iterator>::next`
+- Generic types may need to be normalised (e.g. lifetime/type parameters elided) per the upstream pack's conventions.
+
+## Access Paths
+
+Rust models use access paths similar to those of other MaD languages, with `Argument[n]`, `Argument[self]`, `ReturnValue`, and (where supported) field/element selectors. Always check `codeql/rust-all` to confirm which selectors the current release supports.
+
+## Common Sink Kinds
+
+`command-injection`, `path-injection`, `sql-injection`, `request-forgery`, `url-redirection`, `code-injection`
+
+## Sample Model
+
+```yaml
+extensions:
+  - addsTo:
+      pack: codeql/rust-all
+      extensible: sinkModel
+    data:
+      - [
+          'repo:https://github.com/rust-lang/rust:std',
+          '<crate::process::Command>::arg',
+          'Argument[0]',
+          'command-injection',
+          'manual'
+        ]
+```
+
+> The exact column count and ordering above are **illustrative**; verify against the `codeql/rust-all` pack shipped with the CodeQL CLI version recorded in `.codeql-version`. A tuple with the wrong column count will fail to load (often silently).
+
+## Validation Workflow
+
+1. Place `*.model.yml` files in your model-pack directory (or under `.github/codeql/extensions/` for the single-repo path).
+2. Run `codeql_query_run` against a database that exercises the modelled APIs and confirm new findings appear (sources/sinks) or expected findings disappear (neutrals, plus barriers if the pack in use supports them).
+3. Add a unit test that exercises the new chain end-to-end using `codeql_test_run`.
+
+## Related Resources
+
+- `codeql://learning/data-extensions` — Common data extensions overview (both model formats)
+- `codeql://languages/rust/ast` — Rust AST class reference
+- [CodeQL for Rust](https://codeql.github.com/docs/codeql-language-guides/codeql-for-rust/) — Official Rust language guide
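
The Rust resource above warns that a tuple with the wrong column count fails to load, often silently. A minimal TypeScript sketch of a pre-flight check for that case follows. It assumes the YAML has already been parsed into a plain object. The `ExtensionFile` type and the `findBadRows` helper are hypothetical (they are not part of this PR or of the CodeQL CLI), and the expected count of 5 only mirrors the illustrative sample, so it must be taken from the `codeql/rust-all` pack actually in use.

```typescript
// Hypothetical shape of a parsed data-extension file (not part of the MCP server).
type Row = ReadonlyArray<string | boolean | number>;

interface Extension {
  addsTo: { pack: string; extensible: string };
  data: Row[];
}

interface ExtensionFile {
  extensions: Extension[];
}

// Returns one message per row whose length differs from the expected column
// count for its extensible predicate. Predicates without an entry in
// `expected` are skipped rather than guessed at.
function findBadRows(file: ExtensionFile, expected: Record<string, number>): string[] {
  const problems: string[] = [];
  for (const ext of file.extensions) {
    const want = expected[ext.addsTo.extensible];
    if (want === undefined) continue;
    ext.data.forEach((row, index) => {
      if (row.length !== want) {
        problems.push(
          `${ext.addsTo.extensible} row ${index}: expected ${want} columns, got ${row.length}`
        );
      }
    });
  }
  return problems;
}

// The first row copies the illustrative sample; the second drops the provenance column.
const rustSample: ExtensionFile = {
  extensions: [
    {
      addsTo: { pack: 'codeql/rust-all', extensible: 'sinkModel' },
      data: [
        ['repo:https://github.com/rust-lang/rust:std', '<crate::process::Command>::arg', 'Argument[0]', 'command-injection', 'manual'],
        ['repo:https://github.com/rust-lang/rust:std', '<crate::process::Command>::arg', 'Argument[0]', 'command-injection'],
      ],
    },
  ],
};

// Assumed column count (5) for sinkModel; verify against the pack before relying on it.
const rustProblems = findBadRows(rustSample, { sinkModel: 5 });
```

Running a check like this before `codeql_query_run` turns a silent load failure into an explicit message naming the offending row.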
diff --git a/server/src/resources/languages/swift_library_modeling.md b/server/src/resources/languages/swift_library_modeling.md
@@ -0,0 +1,107 @@
+# Customizing Library Models for Swift
+
+## Purpose
+
+Customize data-flow and taint analysis for Swift by modeling frameworks and libraries via data extensions (YAML) and model packs. This enables accurate flow tracking through third-party libraries not included in CodeQL databases.
+
+For common guidance on data extensions (YAML structure, model packs, development workflow), see `codeql://learning/data-extensions`.
+
+## Data Extensions Overview
+
+### Structure
+
+Data extensions use YAML format to extend CodeQL's knowledge of library behavior:
+
+```yaml
+extensions:
+  - addsTo:
+      pack: codeql/swift-all
+      extensible: <extensible-predicate>
+    data:
+      - <tuple1>
+      - <tuple2>
+```
+
+### Union Semantics
+
+- Multiple YAML files are combined
+- Rows are merged across files
+- Duplicates are automatically removed
+- Order of files doesn't matter
+
+## Model Format
+
+Swift uses a **MaD (Models as Data)** format with multi-column tuples that identify callables by module/type/name/signature — the same structural family as Java/Kotlin, C#, C/C++, and Go. Methods are keyed on Swift's module-qualified type and method names (e.g. `Foundation.URLRequest.init(url:)`).
+
+## Extensible Predicates for Swift
+
+| Predicate      | Purpose                                                               |
+| -------------- | --------------------------------------------------------------------- |
+| `sourceModel`  | Model sources of tainted data                                         |
+| `sinkModel`    | Model sinks where tainted data is used unsafely                       |
+| `summaryModel` | Model flow through opaque library functions/methods                   |
+| `barrierModel` | Model barriers (sanitizers) that stop taint flow                      |
+| `neutralModel` | Mark callables as having no dataflow impact (suppress generated rows) |
+
+Refer to `codeql/swift-all` (the `ext/*.model.yml` files under `swift/ql/lib/ext/` in the upstream `codeql` repository) for the canonical column layout used by the current CodeQL CLI release. A tuple with the wrong column count will fail to load (often silently).
+
+## Identifier Columns
+
+Swift models typically identify a callable by:
+
+- **module** — Swift module name (e.g. `Foundation`, `UIKit`, the package/target name for third-party code)
+- **type** — Type name (`""` for module-level free functions)
+- **subtypes** — Whether to apply to subtypes (`true`/`false`)
+- **name** — Method or function name (e.g. `init(url:)`, `data(using:)`)
+- **signature** — Parameter signature (`""` for any)
+
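+The sample model below also carries a sixth, normally empty column between the signature and the access path (named `ext` in the Java-style MaD layouts). The exact column count and order are defined by the `codeql/swift-all` pack, so always cross-check before authoring rows.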
+
+## Access Paths
+
+Swift access paths follow the same conventions as the other MaD-tuple languages:
+
+| Component        | Description                                     |
+| ---------------- | ----------------------------------------------- |
+| `Argument[n]`    | Argument at index n (0-based, excluding `self`) |
+| `Argument[self]` | The receiver of a method call                   |
+| `Parameter[n]`   | Parameter at index n (used by `summaryModel`)   |
+| `ReturnValue`    | Return value of a call                          |
+
+## Common Sink Kinds
+
+`command-injection`, `path-injection`, `sql-injection`, `request-forgery`, `url-redirection`, `code-injection`, `predicate-injection`
+
+## Sample Model
+
+```yaml
+extensions:
+  - addsTo:
+      pack: codeql/swift-all
+      extensible: sinkModel
+    data:
+      - [
+          'Foundation',
+          'NSPredicate',
+          false,
+          'init(format:argumentArray:)',
+          '',
+          '',
+          'Argument[0]',
+          'predicate-injection',
+          'manual'
+        ]
+```
+
+> The exact column count above is **illustrative**; verify against the `codeql/swift-all` pack shipped with the CodeQL CLI version recorded in `.codeql-version`.
+
+## Validation Workflow
+
+1. Place `*.model.yml` files in your model-pack directory (or under `.github/codeql/extensions/` for the single-repo path).
+2. Run `codeql_query_run` against a database that exercises the modelled APIs and confirm new findings appear (sources/sinks) or expected findings disappear (barriers/neutrals).
+3. Add a unit test that exercises the new chain end-to-end using `codeql_test_run`.
+
+## Related Resources
+
+- `codeql://learning/data-extensions` — Common data extensions overview (both model formats)
+- [CodeQL for Swift](https://codeql.github.com/docs/codeql-language-guides/codeql-for-swift/) — Official Swift language guide
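
The Swift sample above is a positional nine-element tuple, which makes it easy to misplace a column. The sketch below names each position and flattens it back into a row. The `SwiftSinkSpec` interface and `toSwiftSinkRow` helper are hypothetical, and the field order (including the `ext` name for the sixth column) is an assumption taken from the illustrative sample, not a confirmed `codeql/swift-all` layout.

```typescript
// Hypothetical named form of the nine-column sinkModel row shown in the sample.
interface SwiftSinkSpec {
  module: string;     // Swift module, e.g. 'Foundation'
  type: string;       // type name, '' for module-level free functions
  subtypes: boolean;  // whether the model also applies to subtypes
  name: string;       // method or function name, e.g. 'init(url:)'
  signature: string;  // parameter signature, '' for any
  ext: string;        // assumed name for the normally empty sixth column
  input: string;      // access path, e.g. 'Argument[0]'
  kind: string;       // sink kind, e.g. 'predicate-injection'
  provenance: string; // e.g. 'manual'
}

// Flattens the named spec into the positional tuple used under `data:`.
function toSwiftSinkRow(spec: SwiftSinkSpec): Array<string | boolean> {
  return [
    spec.module,
    spec.type,
    spec.subtypes,
    spec.name,
    spec.signature,
    spec.ext,
    spec.input,
    spec.kind,
    spec.provenance,
  ];
}

// Rebuilds the NSPredicate row from the sample model.
const predicateSinkRow = toSwiftSinkRow({
  module: 'Foundation',
  type: 'NSPredicate',
  subtypes: false,
  name: 'init(format:argumentArray:)',
  signature: '',
  ext: '',
  input: 'Argument[0]',
  kind: 'predicate-injection',
  provenance: 'manual',
});
```

Keeping the order in one helper means a layout change in the upstream pack needs a single edit rather than a pass over every row.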
diff --git a/server/src/resources/server-overview.md b/server/src/resources/server-overview.md
@@ -10,22 +10,22 @@ The CodeQL Development MCP Server wraps the CodeQL CLI and supporting utilities
 
 Read these resources via `resources/read` to deepen your understanding:
 
-| URI                                              | Purpose                                               |
-| ------------------------------------------------ | ----------------------------------------------------- |
-| `codeql://server/overview`                       | This guide — MCP server orientation                   |
-| `codeql://server/queries`                        | Bundled tools queries (PrintAST, PrintCFG, etc.)      |
-| `codeql://server/tools`                          | Complete default tool reference                       |
-| `codeql://server/prompts`                        | Complete prompt reference                             |
-| `codeql://learning/query-basics`                 | QL query writing reference (syntax, metadata, etc.)   |
-| `codeql://learning/test-driven-development`      | TDD theory and workflow for CodeQL                    |
-| `codeql://learning/data-extensions`              | Data extensions (Models-as-Data) overview and formats |
-| `codeql://templates/security`                    | Security query templates (multi-language)             |
-| `codeql://patterns/performance`                  | Performance profiling and optimization                |
-| `codeql://guides/query-unit-testing`             | Guide for creating and running CodeQL query tests     |
-| `codeql://guides/dataflow-migration-v1-to-v2`    | Migrating from v1 to v2 dataflow API                  |
-| `codeql://languages/{language}/ast`              | Language-specific AST class reference                 |
-| `codeql://languages/{language}/security`         | Language-specific security patterns                   |
-| `codeql://languages/{language}/library-modeling` | Language-specific library modeling (data extensions)  |
+| URI                                              | Purpose                                                                                                                                                                                 |
+| ------------------------------------------------ | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `codeql://server/overview`                       | This guide — MCP server orientation                                                                                                                                                     |
+| `codeql://server/queries`                        | Bundled tools queries (PrintAST, PrintCFG, etc.)                                                                                                                                        |
+| `codeql://server/tools`                          | Complete default tool reference                                                                                                                                                         |
+| `codeql://server/prompts`                        | Complete prompt reference                                                                                                                                                               |
+| `codeql://learning/query-basics`                 | QL query writing reference (syntax, metadata, etc.)                                                                                                                                     |
+| `codeql://learning/test-driven-development`      | TDD theory and workflow for CodeQL                                                                                                                                                      |
+| `codeql://learning/data-extensions`              | Data extensions (Models-as-Data) overview and formats                                                                                                                                   |
+| `codeql://templates/security`                    | Security query templates (multi-language)                                                                                                                                               |
+| `codeql://patterns/performance`                  | Performance profiling and optimization                                                                                                                                                  |
+| `codeql://guides/query-unit-testing`             | Guide for creating and running CodeQL query tests                                                                                                                                       |
+| `codeql://guides/dataflow-migration-v1-to-v2`    | Migrating from v1 to v2 dataflow API                                                                                                                                                    |
+| `codeql://languages/{language}/ast`              | Language-specific AST class reference                                                                                                                                                   |
+| `codeql://languages/{language}/security`         | Language-specific security patterns                                                                                                                                                     |
+| `codeql://languages/{language}/library-modeling` | Language-specific library modeling — registered for every CodeQL language that supports Models-as-Data (`cpp`, `csharp`, `go`, `java`, `javascript`, `python`, `ruby`, `rust`, `swift`) |
 
 ## Quick-Start Workflows
 
diff --git a/server/src/types/language-types.ts b/server/src/types/language-types.ts
@@ -28,6 +28,8 @@ import pythonSecurity from '../resources/languages/python_security_query_guide.m
 import rubyAst from '../resources/languages/ruby_ast.md';
 import rubyLibraryModeling from '../resources/languages/ruby_library_modeling.md';
 import rustAst from '../resources/languages/rust_ast.md';
+import rustLibraryModeling from '../resources/languages/rust_library_modeling.md';
+import swiftLibraryModeling from '../resources/languages/swift_library_modeling.md';
 
 export interface LanguageResource {
   language: string;
@@ -99,6 +101,15 @@ export const LANGUAGE_RESOURCES: LanguageResource[] = [
   },
   {
     language: 'rust',
-    astContent: rustAst
+    astContent: rustAst,
+    additionalResources: {
+      'library-modeling': rustLibraryModeling,
+    }
+  },
+  {
+    language: 'swift',
+    additionalResources: {
+      'library-modeling': swiftLibraryModeling,
+    }
   }
 ];
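
The two `additionalResources` entries added above are what make `codeql://languages/rust/library-modeling` and `codeql://languages/swift/library-modeling` resolvable. The sketch below shows how such entries could map to URIs. `LanguageResourceSketch` is a simplified stand-in for the real `LanguageResource` interface, `additionalResourceUris` is a hypothetical helper, and the markdown bodies are placeholder strings; the actual registration code is not shown in this diff.

```typescript
// Simplified stand-in for LanguageResource; markdown imports are replaced
// with placeholder strings so the sketch is self-contained.
interface LanguageResourceSketch {
  language: string;
  astContent?: string;
  additionalResources?: Record<string, string>;
}

// Mirrors the rust and swift entries touched by this commit.
const newLanguageEntries: LanguageResourceSketch[] = [
  {
    language: 'rust',
    astContent: '# Rust AST (placeholder)',
    additionalResources: { 'library-modeling': '# Rust models (placeholder)' },
  },
  {
    language: 'swift',
    additionalResources: { 'library-modeling': '# Swift models (placeholder)' },
  },
];

// Hypothetical helper: one URI per additional resource, following the
// codeql://languages/{language}/{resource} pattern from server-overview.md.
function additionalResourceUris(entries: LanguageResourceSketch[]): string[] {
  const uris: string[] = [];
  for (const entry of entries) {
    for (const key of Object.keys(entry.additionalResources || {})) {
      uris.push(`codeql://languages/${entry.language}/${key}`);
    }
  }
  return uris;
}

const newResourceUris = additionalResourceUris(newLanguageEntries);
```

Each entry contributes exactly one additional resource, which matches the test expectations later in this diff moving from 9 to 11 additional registrations and from 23 to 25 in total.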
diff --git a/server/test/src/resources/language-resources.test.ts b/server/test/src/resources/language-resources.test.ts
@@ -111,7 +111,7 @@ describe('Language Resources', () => {
     it('should register additional resources for all languages with additional content', () => {
       registerLanguageAdditionalResources(mockServer);
 
-      expect(mockServer.resource).toHaveBeenCalledTimes(9);
+      expect(mockServer.resource).toHaveBeenCalledTimes(11);
 
       const resourceCalls = (mockServer.resource as ReturnType<typeof vi.fn>).mock.calls;
       const resourceNames = resourceCalls.map((call: unknown[]) => call[0]);
@@ -125,6 +125,8 @@ describe('Language Resources', () => {
       expect(resourceNames).toContain('JAVASCRIPT Library Modeling');
       expect(resourceNames).toContain('PYTHON Library Modeling');
       expect(resourceNames).toContain('RUBY Library Modeling');
+      expect(resourceNames).toContain('RUST Library Modeling');
+      expect(resourceNames).toContain('SWIFT Library Modeling');
     });
 
     it('should register additional resources with correct URIs', () => {
@@ -146,8 +148,8 @@ describe('Language Resources', () => {
     it('should register all language resources', () => {
       registerLanguageResources(mockServer);
 
-      // 9 AST + 5 security + 9 additional = 23
-      expect(mockServer.resource).toHaveBeenCalledTimes(23);
+      // 9 AST + 5 security + 11 additional = 25
+      expect(mockServer.resource).toHaveBeenCalledTimes(25);
     });
 
     it('every registered handler should return non-empty content', async () => {