Commit 3da1d26

Merge pull request #365 from nanotaboada/feat/copilot-token-efficiency
feat(copilot): implement token efficiency strategy (#364)
2 parents: 5fd85c6 + 5096724

3 files changed

Lines changed: 203 additions & 28 deletions

.github/copilot-instructions.md

Lines changed: 34 additions & 24 deletions
@@ -1,40 +1,41 @@
 # GitHub Copilot Instructions
 
-> **Token Efficiency Note**: This is a minimal pointer file (~500 tokens, auto-loaded by Copilot).
-> For complete operational details, reference: `#file:AGENTS.md` (~2,500 tokens, loaded on-demand)
-> For specialized knowledge, use: `#file:SKILLS/<skill-name>/SKILL.md` (loaded on-demand when needed)
+> **Token Budget**: Target 600, limit 650 (auto-loaded)
+> Details: `#file:AGENTS.md` (~2,550 tokens, on-demand)
+> Skills: `#file:SKILLS/<name>/SKILL.md` (on-demand)
 
-## 🎯 Quick Context
+## Quick Context
 
-**Project**: ASP.NET Core 8 REST API demonstrating layered architecture patterns
-**Stack**: .NET 8 (LTS) • EF Core 9 • SQLite • Docker • xUnit
-**Pattern**: Repository + Service Layer + AutoMapper + FluentValidation
-**Philosophy**: Learning-focused PoC emphasizing clarity and best practices
+ASP.NET Core 8 REST API with layered architecture
+**Stack**: .NET 8 LTS, EF Core 9, SQLite, Docker, xUnit
+**Pattern**: Repository + Service + AutoMapper + FluentValidation
+**Focus**: Learning PoC emphasizing clarity and best practices
 
-## 📐 Core Conventions
+## Core Conventions
 
 - **Naming**: PascalCase (public), camelCase (private)
 - **DI**: Primary constructors everywhere
 - **Async**: All I/O operations use async/await
 - **Logging**: Serilog with structured logging
 - **Testing**: xUnit + Moq + FluentAssertions
-- **Formatting**: CSharpier (opinionated)
+- **Formatting**: CSharpier
+- **Commits**: Subject ≤80 chars, include issue number (#123), body lines ≤80 chars, conventional commits
 
-## 🏗️ Architecture at a Glance
+## Architecture
 
 ```text
 Controller → Service → Repository → Database
      ↓           ↓
 Validation   Caching
 ```
 
-- **Controllers**: Minimal logic, delegate to services
-- **Services**: Business logic + caching with `IMemoryCache`
-- **Repositories**: Generic `Repository<T>` + specific implementations
-- **Models**: `Player` entity + Request/Response DTOs
-- **Validators**: FluentValidation for input structure (business rules in services)
+Controllers: Minimal logic, delegate to services
+Services: Business logic + `IMemoryCache` caching
+Repositories: Generic `Repository<T>` + specific implementations
+Models: `Player` entity + DTOs
+Validators: FluentValidation (structure only, business rules in services)
 
-## Copilot Should
+## Copilot Should
 
 - Generate idiomatic ASP.NET Core code with minimal controller logic
 - Use EF Core async APIs with `AsNoTracking()` for reads
@@ -44,14 +45,14 @@ Validation   Caching
 - Use primary constructors for DI
 - Implement structured logging with `ILogger<T>`
 
-## 🚫 Copilot Should Avoid
+## Copilot Should Avoid
 
 - Synchronous EF Core APIs
 - Controller business logic (belongs in services)
 - Static service/repository classes
 - `ConfigureAwait(false)` (unnecessary in ASP.NET Core)
 
-## Quick Commands
+## Quick Commands
 
 ```bash
 # Run with hot reload
@@ -66,12 +67,21 @@ docker compose up
 # Swagger: https://localhost:9000/swagger
 ```
 
-## 📚 Need More Detail?
+## Load On-Demand Files
 
-**For operational procedures**: Load `#file:AGENTS.md`
-**For Docker expertise**: *(Planned)* `#file:SKILLS/docker-containerization/SKILL.md`
-**For testing patterns**: *(Planned)* `#file:SKILLS/testing-patterns/SKILL.md`
+**Load `#file:AGENTS.md` when:**
+- "How do I run tests with coverage?"
+- "CI/CD pipeline setup or troubleshooting"
+- "Database migration procedures"
+- "Publishing/deployment workflows"
+- "Detailed troubleshooting guides"
+
+**Load `#file:SKILLS/<skill-name>/SKILL.md` (planned):**
+- Docker optimization: `docker-containerization/SKILL.md`
+- Testing patterns: `testing-patterns/SKILL.md`
+
+**Human-readable overview**: See `README.md` (not auto-loaded)
 
 ---
 
-💡 **Why this structure?** Copilot auto-loads this file on every chat (~500 tokens). Loading `AGENTS.md` or `SKILLS/` explicitly gives you deep context only when needed, saving 80% of your token budget!
+**Why this structure?** Base instructions (~600 tokens) load automatically. On-demand files (~2,550 tokens) load only when needed, saving 80% of tokens per chat.
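The "saving 80%" figure in the rewritten note can be sanity-checked with the same integer arithmetic the commit's `count-tokens.sh` uses (`SAVINGS=$((AGENTS_TOKENS * 100 / TOTAL))`). A minimal sketch, assuming the approximate token counts quoted in this diff (~650 auto-loaded, ~2,550 on-demand):

```python
# Savings = share of the combined token load that is skipped when the
# on-demand file (AGENTS.md) is not loaded. Integer division mirrors
# the bash $(( )) arithmetic in scripts/count-tokens.sh.
base_tokens = 650        # copilot-instructions.md, auto-loaded every chat
on_demand_tokens = 2550  # AGENTS.md, loaded only on explicit #file reference

total = base_tokens + on_demand_tokens
savings_percent = on_demand_tokens * 100 // total

print(f"Total if both loaded: {total} tokens")
print(f"Savings when AGENTS.md is skipped: {savings_percent}%")  # 79, i.e. ~80%
```

Integer division truncates to 79%, which the prose rounds to "80% of tokens per chat".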

AGENTS.md

Lines changed: 4 additions & 4 deletions
@@ -1,9 +1,9 @@
 # AGENTS.md
 
-> **Token Efficiency Note**: This file contains complete operational instructions (~2,500 tokens).
-> **Auto-loaded**: NO (load explicitly with `#file:AGENTS.md` when you need detailed procedures)
-> **When to load**: Complex workflows, troubleshooting, CI/CD setup, detailed architecture questions
-> **Related files**: See `#file:.github/copilot-instructions.md` for quick context (auto-loaded, ~500 tokens)
+> **Token Efficiency**: Complete operational instructions (~2,550 tokens).
+> **Auto-loaded**: NO (load explicitly with `#file:AGENTS.md` when needed)
+> **When to load**: Complex workflows, troubleshooting, CI/CD setup, detailed architecture
+> **Related files**: `#file:.github/copilot-instructions.md` (auto-loaded, ~650 tokens)
 
 ---
 

scripts/count-tokens.sh

Lines changed: 165 additions & 0 deletions
@@ -0,0 +1,165 @@
+#!/bin/bash
+# 📊 Token Counter for Copilot Instruction Files
+# Uses tiktoken (OpenAI's tokenizer) for accurate counting
+# Approximation: ~0.75 words per token (English text)
+
+set -e
+
+echo "📊 Token Analysis for Copilot Instructions"
+echo "=========================================="
+echo ""
+
+# Check if tiktoken is available
+if command -v python3 &> /dev/null; then
+    # Try to use tiktoken for accurate counting
+    if python3 -c "import tiktoken" 2>/dev/null; then
+        echo "✅ Using tiktoken (accurate Claude/GPT tokenization)"
+        echo ""
+    else
+        # tiktoken not found - offer to install
+        echo "⚠️ tiktoken not installed"
+        echo ""
+
+        # Detect non-interactive environment (CI/CD)
+        if [ ! -t 0 ] || [ -n "$CI" ] || [ -n "$CI_CD" ]; then
+            echo "🤖 Non-interactive environment detected (CI/CD)"
+            echo "📝 Using word-based approximation"
+            echo "   (To auto-install in CI, set AUTO_INSTALL_TIKTOKEN=1)"
+            echo ""
+            USE_APPROX=1
+        elif [ -n "$AUTO_INSTALL_TIKTOKEN" ]; then
+            echo "📥 Installing tiktoken (AUTO_INSTALL_TIKTOKEN=1)..."
+            if pip3 install tiktoken --quiet; then
+                echo "✅ tiktoken installed successfully!"
+                echo ""
+                # Re-run the script after installation
+                exec "$0" "$@"
+            else
+                echo "❌ Installation failed. Using word-based approximation instead."
+                echo ""
+                USE_APPROX=1
+            fi
+        else
+            echo "tiktoken provides accurate token counting for Claude/GPT models."
+            read -p "📦 Install tiktoken now? (y/n): " -n 1 -r
+            echo ""
+            if [[ $REPLY =~ ^[Yy]$ ]]; then
+                echo "📥 Installing tiktoken..."
+                if pip3 install tiktoken --quiet; then
+                    echo "✅ tiktoken installed successfully!"
+                    echo ""
+                    # Re-run the script after installation
+                    exec "$0" "$@"
+                else
+                    echo "❌ Installation failed. Using word-based approximation instead."
+                    echo ""
+                    USE_APPROX=1
+                fi
+            else
+                echo "📝 Using word-based approximation instead"
+                echo "   (Install manually: pip3 install tiktoken)"
+                echo ""
+                USE_APPROX=1
+            fi
+        fi
+    fi
+
+    # Only run tiktoken if it's available and we didn't set USE_APPROX
+    if [ -z "$USE_APPROX" ] && python3 -c "import tiktoken" 2>/dev/null; then
+
+        # Create temporary Python script
+        cat > /tmp/count_tokens.py << 'PYTHON'
+import tiktoken
+import sys
+
+# cl100k_base is used by GPT-4, Claude uses similar tokenization
+encoding = tiktoken.get_encoding("cl100k_base")
+
+file_path = sys.argv[1]
+with open(file_path, 'r', encoding='utf-8') as f:
+    content = f.read()
+
+tokens = encoding.encode(content)
+print(len(tokens))
+PYTHON
+
+        # Count tokens for each file
+        echo "📄 .github/copilot-instructions.md"
+        if [ -f ".github/copilot-instructions.md" ]; then
+            COPILOT_TOKENS=$(python3 /tmp/count_tokens.py .github/copilot-instructions.md 2>&1 | grep -v "ERROR:root:code for hash" | tail -1)
+            echo "   Tokens: $COPILOT_TOKENS"
+        else
+            echo "   ⚠️ File not found, skipping"
+            COPILOT_TOKENS=0
+        fi
+        echo ""
+
+        echo "📄 AGENTS.md"
+        if [ -f "AGENTS.md" ]; then
+            AGENTS_TOKENS=$(python3 /tmp/count_tokens.py AGENTS.md 2>&1 | grep -v "ERROR:root:code for hash" | tail -1)
+            echo "   Tokens: $AGENTS_TOKENS"
+        else
+            echo "   ⚠️ File not found, skipping"
+            AGENTS_TOKENS=0
+        fi
+        echo ""
+
+        # Calculate total
+        TOTAL=$((COPILOT_TOKENS + AGENTS_TOKENS))
+        echo "📊 Summary"
+        echo "   Base load (auto): $COPILOT_TOKENS tokens"
+        echo "   On-demand load:   $AGENTS_TOKENS tokens"
+        echo "   Total (if both):  $TOTAL tokens"
+        echo ""
+
+        # Check against target
+        TARGET=600
+        LIMIT=650
+        if [ $COPILOT_TOKENS -le $TARGET ]; then
+            echo "✅ copilot-instructions.md within target ($TARGET tokens)"
+        elif [ $COPILOT_TOKENS -le $LIMIT ]; then
+            echo "⚠️ copilot-instructions.md over target but within limit ($LIMIT tokens)"
+        else
+            echo "❌ copilot-instructions.md exceeds limit! Optimization required."
+        fi
+
+        # Calculate savings (guard against division by zero)
+        if [ $TOTAL -gt 0 ]; then
+            SAVINGS=$((AGENTS_TOKENS * 100 / TOTAL))
+            echo "💡 Savings: ${SAVINGS}% saved when AGENTS.md not needed"
+        else
+            echo "💡 Savings: 0% (no tokens to count)"
+        fi
+
+        # Cleanup
+        rm /tmp/count_tokens.py
+    fi
+else
+    echo "❌ Python3 not found"
+    echo "   Python 3 is required for token counting"
+    echo "   Install from: https://www.python.org/downloads/"
+    echo ""
+    exit 1
+fi
+
+# Fallback: word-based approximation
+if [ -n "$USE_APPROX" ]; then
+    echo "📄 .github/copilot-instructions.md"
+    WORDS=$(wc -w < .github/copilot-instructions.md | tr -d ' ')
+    APPROX_TOKENS=$((WORDS * 4 / 3))
+    echo "   Words: $WORDS"
+    echo "   Approx tokens: $APPROX_TOKENS"
+    echo ""
+
+    echo "📄 AGENTS.md"
+    WORDS=$(wc -w < AGENTS.md | tr -d ' ')
+    APPROX_TOKENS=$((WORDS * 4 / 3))
+    echo "   Words: $WORDS"
+    echo "   Approx tokens: $APPROX_TOKENS"
+    echo ""
+
+    echo "💡 Note: Run script again to install tiktoken for accurate counts"
+fi
+
+echo ""
+echo "=========================================="
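The script's fallback estimate, `APPROX_TOKENS=$((WORDS * 4 / 3))`, follows from the ~0.75 words-per-token rule of thumb stated in its header: tokens ≈ words / 0.75 = words × 4/3. A minimal Python sketch of the same arithmetic (the sample string is illustrative):

```python
def approx_tokens(text: str) -> int:
    """Approximate token count from word count, mirroring the shell fallback:
    tokens ~= words * 4 / 3, with integer division like bash $(( ))."""
    words = len(text.split())  # equivalent of `wc -w`: whitespace-delimited words
    return words * 4 // 3

sample = "Generate idiomatic ASP.NET Core code with minimal controller logic"
print(approx_tokens(sample))  # 9 words -> 12 approximate tokens
```

This is only a rough heuristic for English prose; the tiktoken path exists precisely because real tokenizers split on subwords, not whitespace.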
