Commit c408450

fix: correct descriptions and formatting in data extension prompts
1 parent 336310f commit c408450

9 files changed

Lines changed: 68 additions & 17 deletions

.github/ISSUE_TEMPLATE/data-extension-create.yml

Lines changed: 3 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,5 @@
1-
name: Request new CodeQL Data Exension
2-
description: Request a new CodeQL query for detecting specific code patterns
1+
name: Request new CodeQL Data Extension
2+
description: Request a new CodeQL data extension (models-as-data) for an unmodeled library or framework
33
title: "[Data Extension Create]: "
44
labels: ["data-extension-create", "enhancement"]
55
body:
@@ -12,9 +12,8 @@ body:
1212
id: target-language
1313
attributes:
1414
label: Target Language
15-
description: Which programming language should this query target?
15+
description: Which programming language should this data extension target?
1616
options:
17-
- actions
1817
- cpp
1918
- csharp
2019
- go

.github/prompts/cpp_data_extension_development.prompt.md

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@ mode: agent
55
# C / C++ Data Extension
66

77
For general CodeQL data extension model development guidance, see [Common Data Extension Development](./data_extensions_development.prompt.md).
8-
For general CodeQL query development guidance, see [Common Query Development](./query_development.prompt.md).
8+
If you need to write a custom CodeQL query instead of a data extension, see [Common Query Development](./query_development.prompt.md).
99

1010
## C/C++-Specific Documentation
1111

.github/prompts/csharp_data_extension_development.prompt.md

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@ mode: agent
55
# C# Data Extension
66

77
For general CodeQL data extension model development guidance, see [Common Data Extension Development](./data_extensions_development.prompt.md).
8-
For general CodeQL query development guidance, see [Common Query Development](./query_development.prompt.md).
8+
If you need to write a custom CodeQL query instead of a data extension, see [Common Query Development](./query_development.prompt.md).
99

1010
## C#-Specific Documentation
1111

.github/prompts/data_extensions_development.prompt.md

Lines changed: 57 additions & 5 deletions
@@ -8,7 +8,7 @@ This prompt provides common guidance for developing CodeQL data extensions acros
88

99
## Product Documentation
1010

11-
- [Extending coverage for a repository](https://docs.github.com/en/code-security/how-tos/scan-code-for-vulnerabilities/manage-your-configuration/editing-your-configuration-of-default-setup#extending-coverage-for-a-repository) - `.github/codeql/extensions directory` for local model pack refrences (does not need a qlpack.yml)
11+
- [Extending coverage for a repository](https://docs.github.com/en/code-security/how-tos/scan-code-for-vulnerabilities/manage-your-configuration/editing-your-configuration-of-default-setup#extending-coverage-for-a-repository) - `.github/codeql/extensions directory` for local model pack references (does not need a qlpack.yml)
1212
- [Extending coverage for all repositories in an organization](https://docs.github.com/en/code-security/how-tos/scan-code-for-vulnerabilities/manage-your-configuration/editing-your-configuration-of-default-setup#extending-coverage-for-all-repositories-in-an-organization) - publishing model packs and referencing them globally (must be done by clicking a button in the UI)
1313
- [Creating a CodeQL model pack](https://docs.github.com/en/code-security/tutorials/customize-code-scanning/creating-and-working-with-codeql-packs?versionId=free-pro-team%40latest&productId=code-security&restPage=how-tos%2Cscan-code-for-vulnerabilities%2Cmanage-your-configuration%2Cediting-your-configuration-of-default-setup#creating-a-codeql-model-pack) - publishing a model pack + for dataExtensions via qlpack.yml
1414

@@ -17,7 +17,7 @@ This prompt provides common guidance for developing CodeQL data extensions acros
1717
CodeQL analysis can be customized by adding library models in data extension YAML files to recognize libraries and frameworks that are not supported by default.
1818
Model packs can be used to expand code scanning analysis at scale. Model packs use data extensions, which are implemented as YAML and describe how to add data for new dependencies. When a model pack is specified, the data extensions in that pack will be added to the code scanning analysis automatically.
1919

20-
Generally each language will allow customization of the following extensible prdicates:
20+
Generally each language will allow customization of the following extensible predicates:
2121

2222
- sourceModel - This is used to model sources of potentially tainted data. The `kind` of the sources defined using this predicate determine which **threat model** they are associated with (e.g., `remote`, `local`, `file`, `commandargs`). Different threat models can be used to customize the sources used in an analysis.
2323
- sinkModel - This is used to model sinks where tainted data may be used in a way that makes the code vulnerable. The `kind` identifies the vulnerability class (e.g., `sql-injection`, `command-injection`).
@@ -174,6 +174,58 @@ Naming convention: `<library>-<module>.model.yml` (lowercase, hyphen-separated).
174174

175175
All `.model.yml` files within a model pack are automatically picked up via the `dataExtensions` glob in `qlpack.yml` (e.g., `dataExtensions: models/**/*.yml`).
176176

177+
### Common Workflows
178+
179+
Data extensions support three primary workflows. An agent should follow the appropriate procedure end-to-end rather than jumping straight to YAML authoring.
180+
181+
#### Workflow 1: Creating a new `.model.yml`
182+
183+
1. **Identify the library to model** — review the library's API documentation or source code and classify public methods as sources, sinks, summaries, barriers, or barrier guards (see "What to Model in a Library" above)
184+
2. **Determine the correct format** — check whether the target language uses API Graph (Python, Ruby, JS/TS) or MaD (Java/Kotlin, C#, Go, C/C++) tuples (see "Two Model Formats" below)
185+
3. **Create the YAML file** — use the naming convention `<library>-<module>.model.yml` and the appropriate column format for the language
186+
4. **Place the file** — choose one of two paths depending on scope:
187+
- **Single repository:** Place the `.model.yml` directly in `.github/codeql/extensions/<pack-name>/` — no `qlpack.yml` is needed; Code Scanning picks up extensions from this directory automatically
188+
- **Model pack (reusable across repos):** Place the file under a pack directory (e.g., `languages/<language>/custom/src/`) with a `qlpack.yml` that declares `extensionTargets` and `dataExtensions`
189+
5. **Test locally** — run a targeted query against a sample database to confirm new findings appear (see "Model Pack / Data Extension Options" below for `--additional-packs` usage):
190+
```bash
191+
codeql query run \
192+
--database=/path/to/db \
193+
--additional-packs=<path-to-pack-dir> \
194+
--output=results.bqrs \
195+
-- path/to/RelevantQuery.ql
196+
```
197+
6. **Validate results** — decode and inspect results with `codeql bqrs decode`; confirm expected findings appear and no false positives are introduced
198+
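Reviewer note on step 3 of Workflow 1: the naming convention `<library>-<module>.model.yml` (lowercase, hyphen-separated) could be checked mechanically before a file is placed. The sketch below is only an illustration; the helper name `is_valid_model_filename` and the exact regular expression are assumptions about how to read the convention (at least two lowercase alphanumeric segments joined by hyphens), not something defined by CodeQL or by this repository.

```python
import re

# Assumed reading of "<library>-<module>.model.yml (lowercase, hyphen-separated)":
# two or more lowercase alphanumeric segments joined by hyphens, then ".model.yml".
MODEL_FILE_RE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)+\.model\.yml")


def is_valid_model_filename(filename: str) -> bool:
    """Return True if the bare file name follows the assumed naming convention."""
    return MODEL_FILE_RE.fullmatch(filename) is not None


if __name__ == "__main__":
    # Hypothetical file names, used only to exercise the check.
    for name in ("requests-sessions.model.yml", "Requests_Sessions.model.yml", "requests.yml"):
        print(name, is_valid_model_filename(name))
```

A stricter or looser pattern may suit a given pack; for instance, a project that allows single-segment names such as `requests.model.yml` would need to relax the hyphen requirement.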
199+
#### Workflow 2: Updating an existing `.model.yml`
200+
201+
1. **Find the existing model file** — check these locations in order:
202+
- `.github/codeql/extensions/` in the current repository
203+
- `languages/<lang>/custom/src/` in this template repository
204+
- Published model packs (search GHCR or your org's CodeQL pack registry)
205+
- **Note:** Models in upstream `codeql/<lang>-all` packs cannot be edited directly — create a custom model pack that adds new rows alongside the built-in models
206+
2. **Add new rows** to the appropriate extensible predicate section (`sinkModel`, `sourceModel`, `summaryModel`, etc.) — do not remove existing rows unless they are incorrect
207+
3. **Maintain consistency** — match the existing formatting, column count, and provenance values in the file
208+
4. **Re-test** — run the same query or test suite that covers the library to confirm:
209+
- Existing findings are unchanged (no regressions)
210+
- New coverage produces expected results
211+
5. **Bump the version** — if the model file lives in a published model pack, increment the `version` field in `qlpack.yml` before publishing
212+
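Reviewer note on step 5 of Workflow 2: incrementing the `version` field in `qlpack.yml` can be scripted. The following is a minimal sketch, assuming the file contains a top-level `version: X.Y.Z` line (optionally quoted) and that a patch-level bump is the appropriate increment; `bump_patch` is a hypothetical helper, and whether a change warrants a patch, minor, or major bump is left to the pack maintainer.

```python
import re

# Assumed format: a top-level "version: X.Y.Z" line, with optional matching quotes.
VERSION_LINE_RE = re.compile(
    r"^(version:\s*)(['\"]?)(\d+)\.(\d+)\.(\d+)\2[ \t]*$", re.MULTILINE
)


def bump_patch(qlpack_text: str) -> str:
    """Return the qlpack.yml text with the patch component of `version` incremented."""

    def _replace(match):
        prefix, quote, major, minor, patch = match.groups()
        return f"{prefix}{quote}{major}.{minor}.{int(patch) + 1}{quote}"

    new_text, count = VERSION_LINE_RE.subn(_replace, qlpack_text, count=1)
    if count == 0:
        raise ValueError("no top-level 'version: X.Y.Z' line found")
    return new_text
```

The function works on text rather than a parsed YAML document so that comments and key order in `qlpack.yml` are left untouched.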
213+
#### Workflow 3: Publishing a model pack to GHCR
214+
215+
1. **Ensure `qlpack.yml` is configured correctly:**
216+
```yaml
217+
name: <org>/<language>-<pack-name>
218+
version: 1.0.0
219+
library: true
220+
extensionTargets:
221+
codeql/<language>-all: '*'
222+
dataExtensions:
223+
- models/**/*.yml
224+
```
225+
2. **Run `codeql pack publish`** to push the pack to the GitHub Container Registry
226+
3. **Configure for org-wide Default Setup** — in the GitHub organization settings, navigate to Code security → Default setup → Model packs and add `<org>/<language>-<pack-name>` (see [Extending coverage for all repositories in an organization](https://docs.github.com/en/code-security/how-tos/find-and-fix-code-vulnerabilities/manage-your-configuration/editing-your-configuration-of-default-setup#extending-codeql-coverage-with-codeql-model-packs-in-default-setup))
227+
4. **For updates to an already-published pack** — increment the `version` in `qlpack.yml`, then re-run `codeql pack publish`; Default Setup will pick up the new version automatically based on the version range configured
228+
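Reviewer note on step 1 of Workflow 3: before running `codeql pack publish`, a quick pre-flight check can confirm that `qlpack.yml` contains the top-level keys shown in the example above. This is a rough sketch under stated assumptions: the key list is taken from that example rather than from an authoritative schema, `missing_top_level_keys` is a hypothetical helper, and the line-based scan is a stand-in for a real YAML parser.

```python
# Keys taken from the qlpack.yml example in Workflow 3; treating them as the
# expected set is an assumption, not an official CodeQL schema.
EXPECTED_KEYS = ("name", "version", "library", "extensionTargets", "dataExtensions")


def missing_top_level_keys(qlpack_text: str) -> list:
    """Return the expected top-level keys that do not appear in the given text."""
    present = set()
    for line in qlpack_text.splitlines():
        # Skip blank lines, indented (nested) lines, comments, and list items.
        if not line or line[0] in " \t#-":
            continue
        key, sep, _ = line.partition(":")
        if sep:
            present.add(key.strip())
    return [key for key in EXPECTED_KEYS if key not in present]


if __name__ == "__main__":
    # Hypothetical pack name used only for illustration.
    sample = (
        "name: my-org/java-my-pack\n"
        "version: 1.0.0\n"
        "extensionTargets:\n"
        "  codeql/java-all: '*'\n"
        "dataExtensions:\n"
        "  - models/**/*.yml\n"
    )
    print(missing_top_level_keys(sample))
```

An empty result only indicates that the keys are present; it does not validate their values or guarantee that publishing will succeed.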
177229
### Two Model Formats: API Graph vs MaD
178230

179231
CodeQL data extensions use one of two tuple formats depending on the language. Using the wrong format for a language will produce invalid extensions.
@@ -324,14 +376,14 @@ Enable selectively: `--threat-model commandargs --threat-model environment` enab
324376
- Use specific `local` subcategories (e.g., `"file"`, `"commandargs"`) when modeling local input mechanisms — be precise rather than using the generic `"local"` parent
325377
- When in doubt, use `"remote"` — it provides the broadest default coverage
326378

327-
### Query Quality Criteria
379+
### Model Quality Criteria
328380

329381
Your generated CodeQL models will be evaluated on:
330382

331383
1. **Code Quality**:
332384
- **Critical**: Extensions must be formatted without errors. Invalid extensions will fail the engine and have negative code quality.
333385
- **Important**: Minimize warning-level diagnostics (deprecated elements, style guide deviations)
334-
- **Best Practice**: Follow CodeQL naming conventions and idioms, provide comments with sensible organizaiton
386+
- **Best Practice**: Follow CodeQL naming conventions and idioms, provide comments with sensible organization
335387

336388
### Common Pitfalls
337389

@@ -341,7 +393,7 @@ Your generated CodeQL models will be evaluated on:
341393

342394
Access paths for data extensions are parsed using [shared/dataflow/codeql/dataflow/internal/AccessPathSyntax.qll](https://github.com/github/codeql/blob/main/shared/dataflow/codeql/dataflow/internal/AccessPathSyntax.qll)
343395

344-
For languages that support API Graphs as the access paths can be most easilly tested by:
396+
For languages that support API Graphs as the access paths can be most easily tested by:
345397

346398
1. creating a small codeql database with some sample code that has a full end to end flow for the suspected query
347399
2. writing/executing a sample codeql query using api graphs to verify with 100% certainty that the path to discover the suspected source/sink/summary is verified.

.github/prompts/go_data_extension_development.prompt.md

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@ mode: agent
55
# Go Data Extension
66

77
For general CodeQL data extension model development guidance, see [Common Data Extension Development](./data_extensions_development.prompt.md).
8-
For general CodeQL query development guidance, see [Common Query Development](./query_development.prompt.md).
8+
If you need to write a custom CodeQL query instead of a data extension, see [Common Query Development](./query_development.prompt.md).
99

1010
## Go-Specific Documentation
1111

.github/prompts/java_data_extension_development.prompt.md

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@ mode: agent
55
# Java / Kotlin Data Extension
66

77
For general CodeQL data extension model development guidance, see [Common Data Extension Development](./data_extensions_development.prompt.md).
8-
For general CodeQL query development guidance, see [Common Query Development](./query_development.prompt.md).
8+
If you need to write a custom CodeQL query instead of a data extension, see [Common Query Development](./query_development.prompt.md).
99

1010
## Java/Kotlin-Specific Documentation
1111

.github/prompts/javascript_data_extension_development.prompt.md

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@ mode: agent
55
# JavaScript / TypeScript Data Extension
66

77
For general CodeQL data extension model development guidance, see [Common Data Extension Development](./data_extensions_development.prompt.md).
8-
For general CodeQL query development guidance, see [Common Query Development](./query_development.prompt.md).
8+
If you need to write a custom CodeQL query instead of a data extension, see [Common Query Development](./query_development.prompt.md).
99

1010
## JavaScript/TypeScript-Specific Documentation
1111

.github/prompts/python_data_extension_development.prompt.md

Lines changed: 2 additions & 2 deletions
@@ -5,7 +5,7 @@ mode: agent
55
# Python Data Extension
66

77
For general CodeQL data extension model development guidance, see [Common Data Extension Development](./data_extensions_development.prompt.md).
8-
For general CodeQL query development guidance, see [Common Query Development](./query_development.prompt.md).
8+
If you need to write a custom CodeQL query instead of a data extension, see [Common Query Development](./query_development.prompt.md).
99

1010
## Python-Specific Documentation
1111

@@ -14,7 +14,7 @@ For general CodeQL query development guidance, see [Common Query Development](./
1414
- [Customizing Library Models for Python](https://codeql.github.com/docs/codeql-language-guides/customizing-library-models-for-python/)
1515
- Can also be found at [Customizing Library Models for Python Docs](https://github.com/github/codeql/blob/main/docs/codeql/codeql-language-guides/customizing-library-models-for-python.rst)
1616

17-
- [Using API graphs in Python](https://codeql.github.com/docs/codeql-language-guides/using-api-graphs-in-python/) - the acess paths input to the extension tuple are powered by API graphs
17+
- [Using API graphs in Python](https://codeql.github.com/docs/codeql-language-guides/using-api-graphs-in-python/) - the access paths input to the extension tuple are powered by API graphs
1818

1919
### API Graphs
2020

.github/prompts/ruby_data_extension_development.prompt.md

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@ mode: agent
55
# Ruby Data Extension
66

77
For general CodeQL data extension model development guidance, see [Common Data Extension Development](./data_extensions_development.prompt.md).
8-
For general CodeQL query development guidance, see [Common Query Development](./query_development.prompt.md).
8+
If you need to write a custom CodeQL query instead of a data extension, see [Common Query Development](./query_development.prompt.md).
99

1010
## Ruby-Specific Documentation
1111
