
Commit 9c5b6f7

Add data extension prompts, templates, and barrier/barrierGuard support (#42)
* Add data extension prompts, templates, and barrier/barrierGuard support

  Add comprehensive CodeQL data extension development guidance:
  - Common prompt with core principles, threat models, and CLI references
  - Language-specific prompts for C++, C#, Go, Java/Kotlin, JS/TS, Python, Ruby
  - Issue template and PR template for data extension workflow
  - barrierModel (sanitizers) and barrierGuardModel (validators) support across all languages (CodeQL 2.25.2+)
* chore: format data extension files with prettier
* fix: correct descriptions and formatting in data extension prompts
1 parent 8c13ba1 commit 9c5b6f7

10 files changed

Lines changed: 2435 additions & 0 deletions
Lines changed: 142 additions & 0 deletions
@@ -0,0 +1,142 @@
1+
name: Request new CodeQL Data Extension
2+
description: Request a new CodeQL data extension (models-as-data) for an unmodeled library or framework
3+
title: "[Data Extension Create]: "
4+
labels: ["data-extension-create", "enhancement"]
5+
body:
6+
- type: markdown
7+
attributes:
8+
value: |
9+
Thanks for requesting a new CodeQL data extension! This template helps Copilot Coding Agent understand your requirements.
10+
11+
- type: dropdown
12+
id: target-language
13+
attributes:
14+
label: Target Language
15+
description: Which programming language should this data extension target?
16+
options:
17+
- cpp
18+
- csharp
19+
- go
20+
- java
21+
- javascript
22+
- python
23+
- ruby
24+
default: 0
25+
validations:
26+
required: true
27+
28+
- type: input
29+
id: library-url
30+
attributes:
31+
label: Library Repository / Documentation URL
32+
description: "Link to the library's source code or API documentation. A GitHub repository URL is ideal — it allows the agent to browse the source code directly to identify sources, sinks, and summaries."
33+
placeholder: "e.g., https://github.com/databricks/databricks-sql-python"
34+
validations:
35+
required: true
36+
37+
- type: input
38+
id: extension-name
39+
attributes:
40+
label: Data Extension Name (Optional)
41+
description: "Extension name (e.g., databricks-sql.model.yml). Use <library>-<module>.model.yml naming. If the library has multiple modules/sub-packages (e.g., library-core, library-web, library-api), create separate model files per module."
42+
placeholder: "e.g., databricks-sql.model.yml, django-http.model.yml"
43+
validations:
44+
required: false
45+
46+
- type: textarea
47+
id: library-modules
48+
attributes:
49+
label: Library Modules / Components
50+
description: "If the library has distinct modules or sub-packages, list them here. Each module may become a separate model file (e.g., library-core.model.yml, library-web.model.yml). Include the import paths or package names."
51+
placeholder: |
52+
- databricks.sql (SQL connector: connect, cursor, execute)
53+
- databricks.sdk (SDK client: WorkspaceClient, jobs, clusters)
54+
- databricks.connect (Spark session bridge)
55+
validations:
56+
required: false
57+
58+
- type: textarea
59+
id: description
60+
attributes:
61+
label: Data Extension Description
62+
description: "Describe the library/framework to model. What methods are sources of untrusted data? What methods are security-sensitive sinks? What methods sanitize data (barriers) or validate data (barrier guards)? All applicable model types (sourceModel, sinkModel, summaryModel, barrierModel, barrierGuardModel, typeModel, neutralModel) will be generated automatically."
63+
placeholder: |
64+
Library: databricks-sql-connector
65+
- Sources: None (uses Flask request sources)
66+
- Sinks: cursor.execute(query) is a SQL injection sink
67+
- Summaries: connect() returns a connection, connection.cursor() returns a cursor
68+
- Barriers: db_escape(value) sanitizes output for SQL injection
69+
- Barrier Guards: is_safe_query(query) returns true when the query is safe against SQL injection
70+
71+
Docs: https://docs.databricks.com/...
72+
validations:
73+
required: true
74+
75+
- type: textarea
76+
id: examples
77+
attributes:
78+
label: Code Examples
79+
description: Provide sample end-to-end code that should be detected
80+
placeholder: |
81+
```java
82+
package org.example;
83+
84+
// Undertow is not supported out of the box
85+
import io.undertow.Undertow;
86+
import io.undertow.server.HttpHandler;
87+
import io.undertow.server.HttpServerExchange;
88+
import io.undertow.util.Headers;
89+
import java.util.Deque;
90+
import javax.crypto.Cipher;
91+
92+
public class App {
93+
public String getGreeting() {
94+
return "Hello World!";
95+
}
96+
97+
public static void main(String[] args) {
98+
System.out.println(new App().getGreeting());
99+
try {
100+
Runtime.getRuntime().exec("ls");
101+
Cipher rsanopad = Cipher.getInstance("RSA/ECB/NoPadding");
102+
} catch (Exception e) {
103+
System.out.println(e.getMessage());
104+
}
105+
106+
Undertow server = Undertow.builder()
107+
.addHttpListener(8080, "localhost")
108+
.setHandler(new HttpHandler() {
109+
@Override
110+
public void handleRequest(final HttpServerExchange exchange) throws Exception {
111+
String name = "world";
112+
Deque<String> res = exchange.getQueryParameters().get("namex"); // SOURCE
113+
if (res != null) {
114+
name = res.getFirst();
115+
}
116+
exchange.getResponseHeaders().put(Headers.CONTENT_TYPE, "text/html");
117+
exchange.getResponseSender().send("<html><body>Hello " + name + "</body></html>"); // SINK XSS
118+
}
119+
}).build();
120+
server.start();
121+
}
122+
}
123+
```
124+
validations:
125+
required: false
126+
127+
- type: input
128+
id: references
129+
attributes:
130+
label: Additional References (Optional)
131+
description: "Any other links — API docs, CWE references, related CodeQL queries, or security advisories."
132+
placeholder: "e.g., https://docs.databricks.com/sql/connector.html"
133+
validations:
134+
required: false
135+
136+
- type: checkboxes
137+
id: terms
138+
attributes:
139+
label: Code of Conduct
140+
options:
141+
- label: I agree to follow this project's Code of Conduct
142+
required: true
Lines changed: 83 additions & 0 deletions
@@ -0,0 +1,83 @@
1+
---
2+
name: 📦 New CodeQL Data Extension
3+
about: Pull request for creating a new CodeQL data extension model
4+
title: '[NEW DATA EXTENSION] '
5+
labels:
6+
- data-extension-create
7+
- enhancement
8+
---
9+
10+
## 📝 Data Extension Information
11+
12+
- **Language**: <!-- e.g., java, python, javascript -->
13+
- **Extension Name(s)**: <!-- e.g., databricks-sql.model.yml. Use <library>-<module>.model.yml naming. List all files if multiple modules. -->
14+
- **Extension Types**: <!-- sourceModel, sinkModel, summaryModel, barrierModel, barrierGuardModel, neutralModel, typeModel -->
15+
- **Target Library/Framework**: <!-- e.g., Undertow, Databricks SQL -->
16+
- **Library Modules Covered**: <!-- List the distinct modules/sub-packages modeled, one per model file. e.g., databricks.sql, databricks.sdk -->
17+
18+
## 🎯 Description
19+
20+
### What This Data Extension Models
21+
22+
<!-- Clear description of the library/framework being modeled and what sources, sinks, summaries, barriers (sanitizers), or barrier guards (validators) it adds -->
23+
24+
### Threat Model
25+
26+
<!-- e.g., remote, local (file, commandargs, database, environment, stdin, windows-registry) -->
27+
28+
### Example Vulnerable Code
29+
30+
```[language]
31+
// Code that should be detected with this data extension
32+
```
33+
34+
### Example Safe Code
35+
36+
```[language]
37+
// Code that should NOT be detected
38+
```
39+
40+
## 📦 Extension Details
41+
42+
### Extension YAML
43+
44+
<!-- Provide the data extension YAML content or a summary of the models added -->
45+
46+
```yaml
47+
extensions:
48+
- addsTo:
49+
pack: codeql/[language]-all
50+
extensible: sinkModel
51+
data:
52+
# - ["package","Member[...].Argument[0]","sink-kind"]
53+
```
54+
55+
### Access Path Explanation
56+
57+
<!-- Explain the access path(s) used and how they map to the target API -->
58+
59+
## 🧪 Testing
60+
61+
- [ ] Extension YAML resolves without errors
62+
- [ ] Database created with sample code (`codeql database create` or `codeql test extract`)
63+
- [ ] Single query verified with extension applied (`codeql query run --additional-packs=<model-pack-dir>`)
64+
- [ ] Unit tests pass with extension applied (`codeql test run --additional-packs=<model-pack-dir>`)
65+
- [ ] Positive test cases (vulnerable code detected)
66+
- [ ] Negative test cases (safe code not flagged)
67+
68+
## 📋 Checklist
69+
70+
- [ ] Extension YAML is valid and properly formatted
71+
- [ ] Extension placed in correct location (`languages/[language]/custom/src/`)
72+
- [ ] `qlpack.yml` includes `dataExtensions` configuration
73+
- [ ] Access paths verified via API graph queries
74+
- [ ] No false positives in test cases
75+
- [ ] Documentation/comments included in YAML
76+
77+
## 🔗 References
78+
79+
<!-- Links to library/framework docs, CWE, OWASP, or related queries -->
80+
81+
---
82+
83+
**Note**: This data extension was developed following CodeQL Models as Data best practices.
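
For illustration only (this is not part of the commit): a minimal sketch of what a Java model file for the Undertow example in the issue template might look like. It assumes the Java models-as-data tuple layout (package, type, subtypes, name, signature, ext, access path, kind, provenance), that `getResponseSender()` returns `io.undertow.io.Sender`, and that `remote` and `html-injection` are the appropriate source and sink kinds. The file name `undertow-server.model.yml` is hypothetical. Verify every row against the actual library and the CodeQL version in use.

```yaml
# undertow-server.model.yml (hypothetical file name, following <library>-<module>.model.yml)
extensions:
  # Source: query parameters read from the exchange are attacker-controlled.
  - addsTo:
      pack: codeql/java-all
      extensible: sourceModel
    data:
      # Assumption: tainting the whole returned map is precise enough here.
      - ["io.undertow.server", "HttpServerExchange", True, "getQueryParameters", "()", "", "ReturnValue", "remote", "manual"]

  # Sink: strings written to the response body can carry XSS payloads.
  - addsTo:
      pack: codeql/java-all
      extensible: sinkModel
    data:
      # Assumption: Sender lives in io.undertow.io and send(String) is the overload used.
      - ["io.undertow.io", "Sender", True, "send", "(String)", "", "Argument[0]", "html-injection", "manual"]
```

Depending on how the query tracks flow through `Map.get` and `Deque.getFirst`, the source row may need a more specific access path or additional `summaryModel` rows. The positive and negative test cases in the PR checklist are the way to confirm this.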
