Commit 0c68c46

Copilot and data-douser authored
Register MaD library-modeling resources for rust and swift; address review feedback
Agent-Logs-Url: https://github.com/advanced-security/codeql-development-mcp-server/sessions/4266e55f-3c7d-4ab3-9bd9-338cdb43bbee
Co-authored-by: data-douser <70299490+data-douser@users.noreply.github.com>
1 parent e1287a3 commit 0c68c46

9 files changed

Lines changed: 259 additions & 28 deletions

server/dist/codeql-development-mcp-server.js

Lines changed: 18 additions & 3 deletions
Large diffs are not rendered by default.

server/dist/codeql-development-mcp-server.js.map

Lines changed: 3 additions & 3 deletions
Some generated files are not rendered by default.

server/src/prompts/data-extension-development.prompt.md

Lines changed: 1 addition & 1 deletion
@@ -7,7 +7,7 @@ agent: agent
 Use this workflow to create CodeQL data extensions (Models-as-Data) for third-party libraries and frameworks. Data extensions let you customize taint tracking without writing QL code — you author YAML files that declare which functions are sources, sinks, summaries, barriers, or barrier guards.

 For format reference, read the MCP resource: `codeql://learning/data-extensions`
-For language-specific guidance: `codeql://languages/{{language}}/library-modeling`
+For language-specific guidance, read the corresponding `codeql://languages/<language>/library-modeling` resource. Available for: `cpp`, `csharp`, `go`, `java`, `javascript`, `python`, `ruby`, `rust`, `swift`.

 ## Workflow Checklist

server/src/resources/data-extensions-overview.md

Lines changed: 2 additions & 1 deletion
@@ -171,7 +171,8 @@ name: my-org/security-models
 version: 1.0.0
 dependencies:
   codeql/<language>-all: '*'
-dataExtensions: '*.yml'
+dataExtensions:
+  - 'ext/*.model.yml'
 ```

 ### Testing Extensions
Lines changed: 95 additions & 0 deletions
@@ -0,0 +1,95 @@
# Customizing Library Models for Rust

## Purpose

Customize data-flow and taint analysis for Rust by modeling crates and libraries via data extensions (YAML) and model packs. This enables accurate flow tracking through third-party crates not included in CodeQL databases.

For common guidance on data extensions (YAML structure, model packs, development workflow), see `codeql://learning/data-extensions`.

> Rust MaD support in CodeQL is evolving; column layouts and supported predicates may change between CodeQL releases. Always cross-reference the upstream `codeql/rust-all` pack and the official [CodeQL docs for Rust](https://codeql.github.com/docs/codeql-language-guides/codeql-for-rust/) for the column layout in use by the CodeQL CLI version pinned in this repo.

## Data Extensions Overview

### Structure

Data extensions use YAML format to extend CodeQL's knowledge of library behavior:

```yaml
extensions:
  - addsTo:
      pack: codeql/rust-all
      extensible: <extensible-predicate>
    data:
      - <tuple1>
      - <tuple2>
```

### Union Semantics

- Multiple YAML files are combined
- Rows are merged across files
- Duplicates are automatically removed
- Order of files doesn't matter
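
The union behaviour listed above can be pictured with a small TypeScript sketch. It is only an illustration of set-style merging, not CodeQL's loader; `unionRows` and the sample tuples are hypothetical names and values chosen for this example.

```typescript
// One tuple from a `data:` list; cells are strings, booleans, or numbers.
type Row = ReadonlyArray<string | boolean | number>;

// Hypothetical helper: merges the rows that several files contribute to the
// same extensible predicate, treating the result as a set of tuples.
function unionRows(...files: Row[][]): Row[] {
  const seen = new Map<string, Row>();
  for (const rows of files) {
    for (const row of rows) {
      // JSON.stringify is a sufficient identity key for flat tuples of primitives.
      const key = JSON.stringify(row);
      if (!seen.has(key)) {
        seen.set(key, row);
      }
    }
  }
  return [...seen.values()];
}

// Two hypothetical files that share one identical row.
const exampleFileA: Row[] = [
  ['demo::read', 'ReturnValue', 'remote', 'manual'],
  ['demo::exec', 'Argument[0]', 'command-injection', 'manual'],
];
const exampleFileB: Row[] = [
  ['demo::exec', 'Argument[0]', 'command-injection', 'manual'],
];

// The shared row appears once, whichever file is listed first.
const exampleMerged = unionRows(exampleFileA, exampleFileB);
console.log(exampleMerged.length);
```

Keying on the whole tuple means two rows that differ in any column, including provenance, are kept as distinct entries.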

## Model Format

Rust uses a **MaD (Models as Data)** format keyed on **crate path** (`crate::module::Type::method`-style canonical paths) rather than the namespace/type/name/signature columns used by Java/C#/C++/Go. Tuples are typically shorter than the MaD-tuple-format languages and closer in spirit to the API-graph access-path style used by JavaScript/Python/Ruby — but the exact column layout is defined by the `codeql/rust-all` pack.

## Extensible Predicates for Rust

| Predicate      | Purpose                                                               |
| -------------- | --------------------------------------------------------------------- |
| `sourceModel`  | Model sources of tainted data (e.g. data read from network or env)    |
| `sinkModel`    | Model sinks where tainted data is used unsafely                       |
| `summaryModel` | Model flow through opaque library functions (taint or value flow)     |
| `neutralModel` | Mark functions as having no dataflow impact (suppress generated rows) |

Refer to `codeql/rust-all` (the `ext/*.model.yml` files in the upstream `codeql` repository under `rust/ql/lib/ext/`) for canonical examples of the exact tuple shape required by the current CodeQL CLI release.

## Crate Path Column

The crate path identifies a function or method by its fully qualified Rust path:

- Free function: `tokio::fs::read_to_string`
- Inherent method: `<std::path::PathBuf>::push`
- Trait method: `<T as core::iter::Iterator>::next`
- Generic types may need to be normalised (e.g. lifetime/type parameters elided) per the upstream pack's conventions.
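
To make the three path shapes above easier to tell apart when reviewing rows, here is a minimal TypeScript sketch. It is a string-shape heuristic only and says nothing about how CodeQL resolves paths; `classifyCratePath` and `PathShape` are hypothetical names introduced for this example.

```typescript
type PathShape = 'free-function' | 'inherent-method' | 'trait-method';

// Hypothetical helper: classifies a crate path purely by its textual shape.
function classifyCratePath(path: string): PathShape {
  // `<Type as Trait>::method`
  if (/^<.+ as .+>::\w+$/.test(path)) {
    return 'trait-method';
  }
  // `<Type>::method`
  if (/^<.+>::\w+$/.test(path)) {
    return 'inherent-method';
  }
  // Anything else is treated as a plain `a::b::function` path.
  return 'free-function';
}

// The three example paths from the list above.
const examplePaths = [
  'tokio::fs::read_to_string',
  '<std::path::PathBuf>::push',
  '<T as core::iter::Iterator>::next',
];
for (const p of examplePaths) {
  console.log(`${p} is classified as ${classifyCratePath(p)}`);
}
```

A check like this can catch a method path that was written without its angle brackets, but it cannot confirm that the path exists in the crate.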

## Access Paths

Rust models use access paths similar to other MaD languages, with `Argument[n]`, `Argument[self]`, `ReturnValue`, and (where supported) field/element selectors. Always validate against `codeql/rust-all` for which selectors are supported by the current release.

## Common Sink Kinds

`command-injection`, `path-injection`, `sql-injection`, `request-forgery`, `url-redirection`, `code-injection`

## Sample Model

```yaml
extensions:
  - addsTo:
      pack: codeql/rust-all
      extensible: sinkModel
    data:
      - [
          'repo:https://github.com/rust-lang/rust:std',
          '<crate::process::Command>::arg',
          'Argument[0]',
          'command-injection',
          'manual'
        ]
```

> The exact column count and ordering above is **illustrative**; verify against the `codeql/rust-all` pack shipped with the CodeQL CLI version recorded in `.codeql-version`. Authoring a tuple with the wrong column count will fail to load (often silently).
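
Because a wrong column count can go unnoticed, a pre-flight check on parsed rows may help. The TypeScript sketch below assumes the rows have already been parsed from YAML and that the caller supplies the expected column count after checking the upstream pack; `findArityProblems` is a hypothetical helper, not part of the CodeQL CLI or this server.

```typescript
type Tuple = ReadonlyArray<unknown>;

interface ArityProblem {
  index: number; // position of the offending row in the `data:` list
  expected: number;
  actual: number;
}

// Hypothetical helper: reports every row whose column count differs from
// the count the caller looked up for the target extensible predicate.
function findArityProblems(rows: Tuple[], expectedColumns: number): ArityProblem[] {
  return rows.flatMap((row, index) =>
    row.length === expectedColumns
      ? []
      : [{ index, expected: expectedColumns, actual: row.length }],
  );
}

// Usage with the five-column sample above plus a deliberately short row.
const exampleRows: Tuple[] = [
  [
    'repo:https://github.com/rust-lang/rust:std',
    '<crate::process::Command>::arg',
    'Argument[0]',
    'command-injection',
    'manual',
  ],
  ['<crate::process::Command>::arg', 'Argument[0]', 'command-injection', 'manual'],
];
for (const problem of findArityProblems(exampleRows, 5)) {
  console.log(`row ${problem.index}: expected ${problem.expected} columns, got ${problem.actual}`);
}
```

The expected count is passed in rather than hard-coded because, as noted above, the layout depends on the `codeql/rust-all` version in use.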

## Validation Workflow

1. Place `*.model.yml` files in your model-pack directory (or under `.github/codeql/extensions/` for the single-repo path).
2. Run `codeql_query_run` against a database that exercises the modelled APIs and confirm new findings appear (sources/sinks) or expected findings disappear (barriers/neutrals).
3. Add a unit test that exercises the new chain end-to-end using `codeql_test_run`.

## Related Resources

- `codeql://learning/data-extensions` — Common data extensions overview (both model formats)
- `codeql://languages/rust/ast` — Rust AST class reference
- [CodeQL for Rust](https://codeql.github.com/docs/codeql-language-guides/codeql-for-rust/) — Official Rust language guide
Lines changed: 107 additions & 0 deletions
@@ -0,0 +1,107 @@
# Customizing Library Models for Swift

## Purpose

Customize data-flow and taint analysis for Swift by modeling frameworks and libraries via data extensions (YAML) and model packs. This enables accurate flow tracking through third-party libraries not included in CodeQL databases.

For common guidance on data extensions (YAML structure, model packs, development workflow), see `codeql://learning/data-extensions`.

## Data Extensions Overview

### Structure

Data extensions use YAML format to extend CodeQL's knowledge of library behavior:

```yaml
extensions:
  - addsTo:
      pack: codeql/swift-all
      extensible: <extensible-predicate>
    data:
      - <tuple1>
      - <tuple2>
```

### Union Semantics

- Multiple YAML files are combined
- Rows are merged across files
- Duplicates are automatically removed
- Order of files doesn't matter

## Model Format

Swift uses a **MaD (Models as Data)** format with multi-column tuples that identify callables by module/type/name/signature — the same structural family as Java/Kotlin, C#, C/C++, and Go. Methods are keyed on Swift's module-qualified type and method names (e.g. `Foundation.URLRequest.init(url:)`).

## Extensible Predicates for Swift

| Predicate      | Purpose                                                               |
| -------------- | --------------------------------------------------------------------- |
| `sourceModel`  | Model sources of tainted data                                         |
| `sinkModel`    | Model sinks where tainted data is used unsafely                       |
| `summaryModel` | Model flow through opaque library functions/methods                   |
| `barrierModel` | Model barriers (sanitizers) that stop taint flow                      |
| `neutralModel` | Mark callables as having no dataflow impact (suppress generated rows) |

Refer to `codeql/swift-all` (the `ext/*.model.yml` files under `swift/ql/lib/ext/` in the upstream `codeql` repository) for the canonical column layout used by the current CodeQL CLI release. Authoring a tuple with the wrong column count will fail to load (often silently).

## Identifier Columns

Swift models typically identify a callable by:

- **module** — Swift module name (e.g. `Foundation`, `UIKit`, the package/target name for third-party code)
- **type** — Type name (`""` for module-level free functions)
- **subtypes** — Whether to apply to subtypes (`true`/`false`)
- **name** — Method or function name (e.g. `init(url:)`, `data(using:)`)
- **signature** — Parameter signature (`""` for any)

The exact column count and order is defined by the `codeql/swift-all` pack — always cross-check before authoring rows.

## Access Paths

Swift access paths follow the same conventions as the other MaD-tuple languages:

| Component        | Description                                     |
| ---------------- | ----------------------------------------------- |
| `Argument[n]`    | Argument at index n (0-based, excluding `self`) |
| `Argument[self]` | The receiver of a method call                   |
| `Parameter[n]`   | Parameter at index n (used by `summaryModel`)   |
| `ReturnValue`    | Return value of a call                          |

## Common Sink Kinds

`command-injection`, `path-injection`, `sql-injection`, `request-forgery`, `url-redirection`, `code-injection`, `predicate-injection`

## Sample Model

```yaml
extensions:
  - addsTo:
      pack: codeql/swift-all
      extensible: sinkModel
    data:
      - [
          'Foundation',
          'NSPredicate',
          false,
          'init(format:argumentArray:)',
          '',
          '',
          'Argument[0]',
          'predicate-injection',
          'manual'
        ]
```

> The exact column count above is **illustrative**; verify against the `codeql/swift-all` pack shipped with the CodeQL CLI version recorded in `.codeql-version`.
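
Positional tuples are easy to misorder, so one option is to author rows from named fields and emit the tuple at the end. The TypeScript sketch below mirrors the nine-column layout of the sample above and nothing more; `SwiftSinkSpec` and `toSwiftSinkRow` are hypothetical names, and calling the sixth column `ext` is an assumption, since the Identifier Columns list does not name it.

```typescript
// Named fields for one sink row. The field called `ext` (sixth column in the
// sample) is an assumed name; confirm it against `codeql/swift-all`.
interface SwiftSinkSpec {
  module: string;
  type: string;
  subtypes: boolean;
  name: string;
  signature?: string;
  ext?: string;
  input: string;
  kind: string;
  provenance?: string;
}

type SwiftSinkRow = [string, string, boolean, string, string, string, string, string, string];

// Hypothetical helper: lays the named fields out in the sample's column order.
function toSwiftSinkRow(spec: SwiftSinkSpec): SwiftSinkRow {
  return [
    spec.module,
    spec.type,
    spec.subtypes,
    spec.name,
    spec.signature ?? '', // '' matches any signature
    spec.ext ?? '',
    spec.input,
    spec.kind,
    spec.provenance ?? 'manual',
  ];
}

// Rebuilds the sample row from named fields.
const exampleRow = toSwiftSinkRow({
  module: 'Foundation',
  type: 'NSPredicate',
  subtypes: false,
  name: 'init(format:argumentArray:)',
  input: 'Argument[0]',
  kind: 'predicate-injection',
});
console.log(JSON.stringify(exampleRow));
```

If the upstream layout changes, only `toSwiftSinkRow` needs updating, while the named specs stay readable.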

## Validation Workflow

1. Place `*.model.yml` files in your model-pack directory (or under `.github/codeql/extensions/` for the single-repo path).
2. Run `codeql_query_run` against a database that exercises the modelled APIs and confirm new findings appear (sources/sinks) or expected findings disappear (barriers/neutrals).
3. Add a unit test that exercises the new chain end-to-end using `codeql_test_run`.

## Related Resources

- `codeql://learning/data-extensions` — Common data extensions overview (both model formats)
- [CodeQL for Swift](https://codeql.github.com/docs/codeql-language-guides/codeql-for-swift/) — Official Swift language guide

server/src/resources/server-overview.md

Lines changed: 16 additions & 16 deletions
@@ -10,22 +10,22 @@ The CodeQL Development MCP Server wraps the CodeQL CLI and supporting utilities

 Read these resources via `resources/read` to deepen your understanding:

-| URI                                              | Purpose                                               |
-| ------------------------------------------------ | ----------------------------------------------------- |
-| `codeql://server/overview`                       | This guide — MCP server orientation                   |
-| `codeql://server/queries`                        | Bundled tools queries (PrintAST, PrintCFG, etc.)      |
-| `codeql://server/tools`                          | Complete default tool reference                       |
-| `codeql://server/prompts`                        | Complete prompt reference                             |
-| `codeql://learning/query-basics`                 | QL query writing reference (syntax, metadata, etc.)   |
-| `codeql://learning/test-driven-development`      | TDD theory and workflow for CodeQL                    |
-| `codeql://learning/data-extensions`              | Data extensions (Models-as-Data) overview and formats |
-| `codeql://templates/security`                    | Security query templates (multi-language)             |
-| `codeql://patterns/performance`                  | Performance profiling and optimization                |
-| `codeql://guides/query-unit-testing`             | Guide for creating and running CodeQL query tests     |
-| `codeql://guides/dataflow-migration-v1-to-v2`    | Migrating from v1 to v2 dataflow API                  |
-| `codeql://languages/{language}/ast`              | Language-specific AST class reference                 |
-| `codeql://languages/{language}/security`         | Language-specific security patterns                   |
-| `codeql://languages/{language}/library-modeling` | Language-specific library modeling (data extensions)  |
+| URI                                              | Purpose                                                                                                                                                                                 |
+| ------------------------------------------------ | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `codeql://server/overview`                       | This guide — MCP server orientation                                                                                                                                                     |
+| `codeql://server/queries`                        | Bundled tools queries (PrintAST, PrintCFG, etc.)                                                                                                                                        |
+| `codeql://server/tools`                          | Complete default tool reference                                                                                                                                                         |
+| `codeql://server/prompts`                        | Complete prompt reference                                                                                                                                                               |
+| `codeql://learning/query-basics`                 | QL query writing reference (syntax, metadata, etc.)                                                                                                                                     |
+| `codeql://learning/test-driven-development`      | TDD theory and workflow for CodeQL                                                                                                                                                      |
+| `codeql://learning/data-extensions`              | Data extensions (Models-as-Data) overview and formats                                                                                                                                   |
+| `codeql://templates/security`                    | Security query templates (multi-language)                                                                                                                                               |
+| `codeql://patterns/performance`                  | Performance profiling and optimization                                                                                                                                                  |
+| `codeql://guides/query-unit-testing`             | Guide for creating and running CodeQL query tests                                                                                                                                       |
+| `codeql://guides/dataflow-migration-v1-to-v2`    | Migrating from v1 to v2 dataflow API                                                                                                                                                    |
+| `codeql://languages/{language}/ast`              | Language-specific AST class reference                                                                                                                                                   |
+| `codeql://languages/{language}/security`         | Language-specific security patterns                                                                                                                                                     |
+| `codeql://languages/{language}/library-modeling` | Language-specific library modeling — registered for every CodeQL language that supports Models-as-Data (`cpp`, `csharp`, `go`, `java`, `javascript`, `python`, `ruby`, `rust`, `swift`) |

 ## Quick-Start Workflows


server/src/types/language-types.ts

Lines changed: 12 additions & 1 deletion
@@ -28,6 +28,8 @@ import pythonSecurity from '../resources/languages/python_security_query_guide.m
 import rubyAst from '../resources/languages/ruby_ast.md';
 import rubyLibraryModeling from '../resources/languages/ruby_library_modeling.md';
 import rustAst from '../resources/languages/rust_ast.md';
+import rustLibraryModeling from '../resources/languages/rust_library_modeling.md';
+import swiftLibraryModeling from '../resources/languages/swift_library_modeling.md';

 export interface LanguageResource {
   language: string;
@@ -99,6 +101,15 @@ export const LANGUAGE_RESOURCES: LanguageResource[] = [
   },
   {
     language: 'rust',
-    astContent: rustAst
+    astContent: rustAst,
+    additionalResources: {
+      'library-modeling': rustLibraryModeling,
+    }
+  },
+  {
+    language: 'swift',
+    additionalResources: {
+      'library-modeling': swiftLibraryModeling,
+    }
   }
 ];
];

server/test/src/resources/language-resources.test.ts

Lines changed: 5 additions & 3 deletions
@@ -111,7 +111,7 @@ describe('Language Resources', () => {
   it('should register additional resources for all languages with additional content', () => {
     registerLanguageAdditionalResources(mockServer);

-    expect(mockServer.resource).toHaveBeenCalledTimes(9);
+    expect(mockServer.resource).toHaveBeenCalledTimes(11);

     const resourceCalls = (mockServer.resource as ReturnType<typeof vi.fn>).mock.calls;
     const resourceNames = resourceCalls.map((call: unknown[]) => call[0]);
@@ -125,6 +125,8 @@ describe('Language Resources', () => {
     expect(resourceNames).toContain('JAVASCRIPT Library Modeling');
     expect(resourceNames).toContain('PYTHON Library Modeling');
     expect(resourceNames).toContain('RUBY Library Modeling');
+    expect(resourceNames).toContain('RUST Library Modeling');
+    expect(resourceNames).toContain('SWIFT Library Modeling');
   });

   it('should register additional resources with correct URIs', () => {
@@ -146,8 +148,8 @@ describe('Language Resources', () => {
   it('should register all language resources', () => {
     registerLanguageResources(mockServer);

-    // 9 AST + 5 security + 9 additional = 23
-    expect(mockServer.resource).toHaveBeenCalledTimes(23);
+    // 9 AST + 5 security + 11 additional = 25
+    expect(mockServer.resource).toHaveBeenCalledTimes(25);
   });

   it('every registered handler should return non-empty content', async () => {
