Compare commits
3 commits
3065243f73
...
2b0f858b46
| Author | SHA1 | Date | |
|---|---|---|---|
| | 2b0f858b46 | | |
| | d3536c74a4 | | |
| | 34fd5a2bf3 | | |
22 changed files with 1170 additions and 99 deletions
104
CLAUDE.md
Normal file
@ -0,0 +1,104 @@
# WetGIT: Dutch legislation as code

## Project

Every Dutch law as a Markdown file, every amendment as a Git commit. The data comes from the official BWB (Basis Wettenbestand) XML sources.

## Stack

- **Python 3.12+** (pyproject.toml, setuptools)
- **FastAPI**: API + web frontend (module: `wetgit.api.app:app`)
- **Meilisearch v1.12**: full-text search
- **Qdrant v1.13**: semantic search (Mistral embeddings)
- **Forgejo**: self-hosted git (git.wetgit.nl)
- **Redis**: Celery broker
- **Nginx**: reverse proxy (3 vhosts: wetgit.nl, api.wetgit.nl, git.wetgit.nl)
- **Codeberg**: public mirror (codeberg.org/wetgit, daily push)

## API Endpoints

```
GET /health                                     → status, version, count
GET /api/v1/regelingen                          → list (filter: ?type=wet&status=geldend)
GET /api/v1/regelingen/{bwb_id}                 → metadata
GET /api/v1/regelingen/{bwb_id}/tekst           → full Markdown
GET /api/v1/regelingen/{bwb_id}/artikelen       → article list
GET /api/v1/regelingen/{bwb_id}/artikelen/{nr}  → article detail
GET /api/v1/regelingen/{bwb_id}/versies         → historical versions (git log)
GET /api/v1/regelingen/{bwb_id}/diff            → diff (?van=DATE&tot=DATE)
GET /api/v1/regelingen/{bwb_id}/domeinen        → compliance-domain classification
GET /api/v1/regelingen/{bwb_id}/referenties     → cross-references
GET /api/v1/zoeken                              → search (?q=term&mode=keyword|semantic|hybrid)
GET /api/v1/domeinen                            → list of compliance domains
GET /api/v1/feed.xml                            → Atom feed of changes
GET /api/docs                                   → Swagger UI
```

Rate limiting: 60 req/min by default, 30 req/min for search (slowapi).
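
The endpoint list can be exercised with a small read-only client. The sketch below uses only the standard library. It assumes the API is served under `https://api.wetgit.nl` (the vhost named in the stack list) and that responses are JSON. The helper names `build_url`, `search_url`, and `fetch_json` are illustrative and not part of the WetGIT codebase.

```python
"""Minimal read-only client sketch for the WetGIT API (standard library only)."""
import json
import urllib.parse
import urllib.request

# Assumption: the API is reachable under the api.wetgit.nl vhost.
BASE_URL = "https://api.wetgit.nl"

SEARCH_MODES = {"keyword", "semantic", "hybrid"}


def build_url(path: str, **params: str) -> str:
    """Join BASE_URL, an endpoint path, and optional query parameters."""
    query = urllib.parse.urlencode(params)
    return f"{BASE_URL}{path}" + (f"?{query}" if query else "")


def search_url(term: str, mode: str = "hybrid") -> str:
    """Build the URL for GET /api/v1/zoeken with a validated search mode."""
    if mode not in SEARCH_MODES:
        raise ValueError(f"unknown search mode: {mode}")
    return build_url("/api/v1/zoeken", q=term, mode=mode)


def fetch_json(url: str, timeout: float = 10.0):
    """GET a URL and decode the body as JSON (assumes a JSON response)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

A call such as `fetch_json(build_url("/api/v1/regelingen/BWBR0001840"))` would fetch the metadata for one regulation. A caller that loops over search queries should stay under the 30 req/min search limit.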

## Deployment

```bash
# Full deploy
ansible-playbook ansible/site.yml

# Selective
ansible-playbook ansible/site.yml --tags app      # code + deps + services
ansible-playbook ansible/site.yml --tags docker   # Docker stack
ansible-playbook ansible/site.yml --tags nginx    # nginx vhosts + SSL
ansible-playbook ansible/site.yml --tags forgejo  # Forgejo + Redis + cron

# Dry-run
ansible-playbook ansible/site.yml --check --diff
```

Server: dt-prod-01 via Tailscale (100.98.29.89), user: deploy.
Vault password: `ansible/.vault_pass`

**Important:** Data lives at `/opt/wetgit/app/rijk/` (env var `WETGIT_GIT_REPOS_DIR=/opt/wetgit/app`).

## Shared server rules

This server is shared with **dt-platform**. Do NOT:
- Modify global nginx.conf or system packages
- Use ports 8001 (dt-chatbot) or 8200 (grimoire)
- Touch /opt/dt-chatbot, /opt/grimoire, /opt/dt-skills-portal

## Development

```bash
# Enter nix shell (provides Python + dependencies)
nix develop

# Run tests
pytest

# Run locally
uvicorn wetgit.api.app:app --reload

# Lint
ruff check src/
black --check src/
```

## Key directories

```
src/wetgit/          # Application code
  api/               # FastAPI routes, search, models
  web/               # Jinja2 templates + static
  pipeline/          # BWB parser, SRU client, sync, indexer
  ai/                # Summaries, semantic search, alerts, domains, crossref
  cli/               # CLI tool
  tasks.py           # Celery app (sync, reindex, alerts)
ansible/             # Deployment playbooks
  roles/             # wetgit-forgejo, wetgit-app, wetgit-nginx
  group_vars/wetgit/ # main.yml (vars), vault.yml (secrets)
PRD/                 # Product Requirements Document
```

## Troubleshooting

- **Port 8002 in use after deploy:** `ssh deploy@100.98.29.89 "sudo fuser -k 8002/tcp && sudo systemctl restart wetgit"`
- **Meilisearch crashes:** Check the image version; v1.12 data is incompatible with v1.13
- **Ansible check mode fails on wetgit-web.conf:** Expected; the file is not actually written in check mode
BIN
PRD/WetGit-PRD-v1.0.docx
Normal file
Binary file not shown.
BIN
PRD/WetGit-PRD-v1.0.pdf
Normal file
Binary file not shown.
|
|
@ -19,6 +19,7 @@ backend_host: "127.0.0.1"
|
|||
# --- Domains ---
|
||||
server_name: "api.wetgit.nl"
|
||||
forgejo_domain: "git.wetgit.nl"
|
||||
web_domain: "wetgit.nl"
|
||||
|
||||
# --- Forgejo ---
|
||||
forgejo_port: 3000
|
||||
|
|
@ -30,6 +31,17 @@ forgejo_admin_email: coornhert@wetgit.nl
|
|||
redis_port: 6379
|
||||
redis_host: "127.0.0.1"
|
||||
|
||||
# --- Meilisearch ---
|
||||
meili_port: 7700
|
||||
meili_host: "127.0.0.1"
|
||||
meili_env: "development"
|
||||
meili_master_key: "{{ vault_meili_master_key | default('') }}"
|
||||
|
||||
# --- Qdrant ---
|
||||
qdrant_port: 6333
|
||||
qdrant_host: "127.0.0.1"
|
||||
qdrant_api_key: "{{ vault_qdrant_api_key | default('') }}"
|
||||
|
||||
# --- Celery ---
|
||||
celery_concurrency: 2
|
||||
|
||||
|
|
@ -42,7 +54,16 @@ codeberg_api_token: "{{ vault_codeberg_api_token | default('') }}"
|
|||
# --- AgentMail ---
|
||||
agentmail_api_key: "{{ vault_agentmail_api_key }}"
|
||||
|
||||
# --- Mistral AI ---
|
||||
mistral_api_key: "{{ vault_mistral_api_key }}"
|
||||
|
||||
# --- Local source path (for rsync deploy) ---
|
||||
local_src_dir: "{{ playbook_dir }}/../"
|
||||
|
||||
# --- Secrets (from vault.yml) ---
|
||||
# vault_agentmail_api_key
|
||||
# vault_forgejo_api_token
|
||||
# vault_codeberg_api_token (add when Codeberg account is ready)
|
||||
# vault_meili_master_key
|
||||
# vault_qdrant_api_key
|
||||
# vault_mistral_api_key
|
||||
|
|
|
|||
|
|
@ -1,14 +1,22 @@
|
|||
$ANSIBLE_VAULT;1.1;AES256
|
||||
35323237613730303463313335643433616238663932643630636530356461323433666435653436
|
||||
3433343462343538333335343165353538613435613962650a656166366364393564353733343561
|
||||
66643462313261643538653839393365643634376432373665653133383464313636633762366163
|
||||
6562336332396535390a333062323534373963356439353336633964383832313431623934653739
|
||||
37646339376338623536323336353931343039323263666265363763373266343533333236346635
|
||||
37656436623764393037393138343536313666613439666535656631313031343061346130376136
|
||||
64383164643466643162393537343265313632343432336238393030306164636434356463396434
|
||||
34656334383731326131393061333138643435366534333965376666393535316334396662633561
|
||||
61386636336438383563326565336635643663313934326333323939663637653531363261613733
|
||||
38646631333739303737616630663337663265616462346637326539306338613866313762306662
|
||||
38633066323936623233336631653836656531633839643739313966623065313931356630613134
|
||||
39636539643065663963626437383637643932633164306337626330623466313737623532366631
|
||||
6435
|
||||
63366162323335343538313162623831383134396331623163663630653637303866633539653338
|
||||
6230646432306464636466306161316164313533656430340a663833643334636437313133616434
|
||||
62366133353936633734353938323561303334626162383964633734613334373233363138306433
|
||||
3865303365356665620a333037376230306632613032303937383931366533346137633766303062
|
||||
33373962326562313731313030353936373837626435323265333636623631333432373962653561
|
||||
33616435623937313963316531313262346162303961303932383930333831303266393630386635
|
||||
66616463643333303834623932313032343638613333373362313439303436333137626638353062
|
||||
37353434383764633162373862316535626635353436353735346531366364343138623737383138
|
||||
65316363333565336631333333633263643130653965376235333163343335356361643866333661
|
||||
35306666373635646238393961356266623732363233646435393939646165623366326130303533
|
||||
32326366336337633232656435663230396636353164653563626534613433313437656238666539
|
||||
32393630333131376263336136653439393831353662383466346365303532663134623537313531
|
||||
38323739376434303261623235393338363938616535363738653631303737373566633763623862
|
||||
35666165636132356463366237393263626561343139343833373439383265303438633338323131
|
||||
62376364346134346636393330633134363631383234383766363234653565303733653032616230
|
||||
36623730376330343331303064383365366338643834663937356262663466353965313936316237
|
||||
38643933306163363634373236333761326437636434306565623261316430653565373431303064
|
||||
37623864386663613730306431363966323937613961633363343366643864613338326535353232
|
||||
31623835663466333336303434303765353233646531626132323933633835353638323038653763
|
||||
36663333323762653933633462346561313331633162303033646162643236353233363731613635
|
||||
61303164313262613763313231633635626638616366383961646465343163666232
|
||||
|
|
|
|||
|
|
@ -1,23 +1,86 @@
|
|||
---
|
||||
# WetGIT FastAPI application + Celery worker
|
||||
# Deploys to /opt/wetgit/backend with own venv and systemd services
|
||||
# Deploys to /opt/wetgit/backend via rsync from local checkout.
|
||||
#
|
||||
# Directories are created by wetgit-forgejo role (runs first).
|
||||
# This role only manages the FastAPI app and Celery worker.
|
||||
#
|
||||
# NOTE: Services are only enabled when application code exists.
|
||||
# On first deploy (no code yet), this role is effectively a no-op.
|
||||
# This role syncs source code, installs deps, and manages systemd services.
|
||||
|
||||
- name: Check if application code exists
|
||||
# --- Code deployment via rsync ---
|
||||
# NOTE: become: no is required on synchronize tasks because rsync
|
||||
# runs locally and connects to the remote via SSH directly.
|
||||
|
||||
- name: Sync application code to server
|
||||
ansible.posix.synchronize:
|
||||
src: "{{ local_src_dir }}/src/"
|
||||
dest: "{{ app_dir }}/backend/src/"
|
||||
delete: yes
|
||||
rsync_opts:
|
||||
- "--exclude=__pycache__"
|
||||
- "--exclude=*.pyc"
|
||||
become: no
|
||||
notify: restart wetgit
|
||||
|
||||
- name: Sync pyproject.toml
|
||||
ansible.posix.synchronize:
|
||||
src: "{{ local_src_dir }}/pyproject.toml"
|
||||
dest: "{{ app_dir }}/backend/pyproject.toml"
|
||||
become: no
|
||||
notify: restart wetgit
|
||||
|
||||
- name: Check if local templates directory exists
|
||||
stat:
|
||||
path: "{{ app_dir }}/backend/requirements.txt"
|
||||
register: app_code
|
||||
path: "{{ local_src_dir }}/templates"
|
||||
delegate_to: localhost
|
||||
register: local_templates
|
||||
become: no
|
||||
|
||||
- name: Sync web templates
|
||||
ansible.posix.synchronize:
|
||||
src: "{{ local_src_dir }}/templates/"
|
||||
dest: "{{ app_dir }}/backend/templates/"
|
||||
delete: yes
|
||||
rsync_opts:
|
||||
- "--exclude=__pycache__"
|
||||
become: no
|
||||
when: local_templates.stat.exists
|
||||
notify: restart wetgit
|
||||
|
||||
- name: Check if local static directory exists
|
||||
stat:
|
||||
path: "{{ local_src_dir }}/static"
|
||||
delegate_to: localhost
|
||||
register: local_static
|
||||
become: no
|
||||
|
||||
- name: Sync static assets
|
||||
ansible.posix.synchronize:
|
||||
src: "{{ local_src_dir }}/static/"
|
||||
dest: "{{ app_dir }}/backend/static/"
|
||||
delete: yes
|
||||
become: no
|
||||
when: local_static.stat.exists
|
||||
|
||||
- name: Set backend ownership
|
||||
file:
|
||||
path: "{{ app_dir }}/backend"
|
||||
owner: www-data
|
||||
group: www-data
|
||||
recurse: yes
|
||||
|
||||
# --- Python venv and dependencies ---
|
||||
|
||||
- name: Create Python venv
|
||||
command: python3 -m venv {{ app_dir }}/backend/venv
|
||||
args:
|
||||
creates: "{{ app_dir }}/backend/venv/bin/python"
|
||||
when: app_code.stat.exists
|
||||
|
||||
- name: Install application with API dependencies
|
||||
command: "{{ app_dir }}/backend/venv/bin/pip install --upgrade '.[api]'"
|
||||
args:
|
||||
chdir: "{{ app_dir }}/backend"
|
||||
register: pip_install
|
||||
changed_when: "'Successfully installed' in pip_install.stdout"
|
||||
notify: restart wetgit
|
||||
|
||||
- name: Set venv ownership
|
||||
file:
|
||||
|
|
@ -25,14 +88,8 @@
|
|||
owner: www-data
|
||||
group: www-data
|
||||
recurse: yes
|
||||
when: app_code.stat.exists
|
||||
|
||||
- name: Install Python dependencies
|
||||
pip:
|
||||
requirements: "{{ app_dir }}/backend/requirements.txt"
|
||||
virtualenv: "{{ app_dir }}/backend/venv"
|
||||
when: app_code.stat.exists
|
||||
notify: restart wetgit
|
||||
# --- Configuration ---
|
||||
|
||||
- name: Deploy environment file
|
||||
template:
|
||||
|
|
@ -43,6 +100,8 @@
|
|||
mode: "0600"
|
||||
notify: restart wetgit
|
||||
|
||||
# --- Systemd services ---
|
||||
|
||||
- name: Deploy WetGIT systemd service
|
||||
template:
|
||||
src: wetgit.service.j2
|
||||
|
|
@ -61,19 +120,18 @@
|
|||
mode: "0644"
|
||||
notify: restart wetgit-celery
|
||||
|
||||
# Only start services when app code is deployed
|
||||
- name: Enable and start WetGIT service
|
||||
systemd:
|
||||
name: wetgit
|
||||
enabled: yes
|
||||
state: started
|
||||
daemon_reload: yes
|
||||
when: app_code.stat.exists
|
||||
|
||||
- name: Enable and start Celery worker
|
||||
# Celery worker disabled — sync runs via cron, not Celery
|
||||
# Enable when wetgit.pipeline has a proper Celery app
|
||||
- name: Disable Celery worker (not yet configured)
|
||||
systemd:
|
||||
name: wetgit-celery
|
||||
enabled: yes
|
||||
state: started
|
||||
enabled: no
|
||||
state: stopped
|
||||
daemon_reload: yes
|
||||
when: app_code.stat.exists
|
||||
|
|
|
|||
|
|
@ -9,7 +9,7 @@ User=www-data
|
|||
Group=www-data
|
||||
WorkingDirectory={{ app_dir }}/backend
|
||||
EnvironmentFile={{ app_dir }}/backend/.env
|
||||
ExecStart={{ app_dir }}/backend/venv/bin/celery -A tasks worker --loglevel=info --concurrency={{ celery_concurrency }}
|
||||
ExecStart={{ app_dir }}/backend/venv/bin/celery -A wetgit.tasks worker --loglevel=info --concurrency={{ celery_concurrency }}
|
||||
Restart=always
|
||||
RestartSec=10
|
||||
|
||||
|
|
|
|||
|
|
@ -11,9 +11,28 @@ REDIS_URL=redis://{{ redis_host }}:{{ redis_port }}/0
|
|||
CELERY_BROKER_URL=redis://{{ redis_host }}:{{ redis_port }}/0
|
||||
CELERY_RESULT_BACKEND=redis://{{ redis_host }}:{{ redis_port }}/1
|
||||
|
||||
# Meilisearch
|
||||
MEILI_URL=http://{{ meili_host }}:{{ meili_port }}
|
||||
{% if meili_master_key | length > 0 %}
|
||||
MEILI_MASTER_KEY={{ meili_master_key }}
|
||||
{% endif %}
|
||||
|
||||
# Qdrant
|
||||
QDRANT_URL=http://{{ qdrant_host }}:{{ qdrant_port }}
|
||||
{% if qdrant_api_key | length > 0 %}
|
||||
QDRANT_API_KEY={{ qdrant_api_key }}
|
||||
{% endif %}
|
||||
|
||||
# Mistral AI
|
||||
MISTRAL_API_KEY={{ mistral_api_key }}
|
||||
|
||||
# AgentMail
|
||||
AGENTMAIL_API_KEY={{ agentmail_api_key }}
|
||||
|
||||
# Forgejo
|
||||
FORGEJO_URL=https://{{ forgejo_domain }}
|
||||
FORGEJO_API_TOKEN={{ forgejo_api_token }}
|
||||
|
||||
# Data
|
||||
WETGIT_DATA_DIR={{ data_dir }}
|
||||
WETGIT_GIT_REPOS_DIR={{ data_dir }}/git-repos
|
||||
WETGIT_GIT_REPOS_DIR={{ app_dir }}/app
|
||||
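
The environment file above makes `MEILI_MASTER_KEY` and `QDRANT_API_KEY` optional, because the Jinja conditionals omit them when the vault value is empty. The sketch below shows one way an application could read these variables. The `Settings` dataclass and `load_settings` function are hypothetical and not taken from the WetGIT code. The default hosts and ports mirror the group vars in this diff, and the repository directory default is the value documented in CLAUDE.md.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings container for the variables in the .env template."""
    meili_url: str
    meili_master_key: Optional[str]
    qdrant_url: str
    qdrant_api_key: Optional[str]
    git_repos_dir: Path


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from a mapping; missing or empty keys become None."""
    return Settings(
        meili_url=env.get("MEILI_URL", "http://127.0.0.1:7700"),
        meili_master_key=env.get("MEILI_MASTER_KEY") or None,
        qdrant_url=env.get("QDRANT_URL", "http://127.0.0.1:6333"),
        qdrant_api_key=env.get("QDRANT_API_KEY") or None,
        git_repos_dir=Path(env.get("WETGIT_GIT_REPOS_DIR", "/opt/wetgit/app")),
    )
```

Passing the mapping explicitly keeps the function easy to test without touching the process environment.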
|
|
|
|||
|
|
@ -9,7 +9,7 @@ User=www-data
|
|||
Group=www-data
|
||||
WorkingDirectory={{ app_dir }}/backend
|
||||
EnvironmentFile={{ app_dir }}/backend/.env
|
||||
ExecStart={{ app_dir }}/backend/venv/bin/uvicorn main:app --host {{ backend_host }} --port {{ backend_port }} --workers {{ backend_workers }}
|
||||
ExecStart={{ app_dir }}/backend/venv/bin/uvicorn wetgit.api.app:app --host {{ backend_host }} --port {{ backend_port }} --workers {{ backend_workers }}
|
||||
Restart=always
|
||||
RestartSec=5
|
||||
|
||||
|
|
|
|||
|
|
@ -1,5 +1,12 @@
|
|||
---
|
||||
- name: restart forgejo
|
||||
community.docker.docker_compose_v2:
|
||||
project_src: "{{ app_dir }}/docker"
|
||||
services:
|
||||
- forgejo
|
||||
state: restarted
|
||||
|
||||
- name: restart docker stack
|
||||
community.docker.docker_compose_v2:
|
||||
project_src: "{{ app_dir }}/docker"
|
||||
state: restarted
|
||||
|
|
|
|||
|
|
@ -45,19 +45,22 @@
|
|||
group: "{{ item.group }}"
|
||||
mode: "0755"
|
||||
loop:
|
||||
# Parents first (owned by root)
|
||||
- { path: "{{ app_dir }}", owner: root, group: root }
|
||||
- { path: "{{ data_dir }}", owner: root, group: root }
|
||||
# Forgejo directories (owned by wetgit user)
|
||||
- { path: "{{ app_dir }}/docker", owner: wetgit, group: wetgit }
|
||||
- { path: "{{ forgejo_data_dir }}", owner: wetgit, group: wetgit }
|
||||
- { path: "{{ forgejo_data_dir }}/gitea/conf", owner: wetgit, group: wetgit }
|
||||
- { path: "{{ data_dir }}/redis", owner: wetgit, group: wetgit }
|
||||
- { path: "{{ data_dir }}/meilisearch", owner: wetgit, group: wetgit }
|
||||
- { path: "{{ data_dir }}/qdrant", owner: wetgit, group: wetgit }
|
||||
- { path: "{{ app_dir }}/scripts", owner: wetgit, group: wetgit }
|
||||
- { path: "{{ app_dir }}/backups", owner: wetgit, group: wetgit }
|
||||
- { path: "{{ app_dir }}/logs", owner: wetgit, group: wetgit }
|
||||
- { path: "{{ app_dir }}/mirrors", owner: wetgit, group: wetgit }
|
||||
# Application directories (owned by www-data for FastAPI/Celery)
|
||||
- { path: "{{ app_dir }}", owner: root, group: root }
|
||||
- { path: "{{ app_dir }}/backend", owner: www-data, group: www-data }
|
||||
- { path: "{{ data_dir }}", owner: root, group: root }
|
||||
- { path: "{{ data_dir }}/git-repos", owner: www-data, group: www-data }
|
||||
|
||||
# --- Forgejo config ---
|
||||
|
|
@ -81,8 +84,8 @@
|
|||
dest: "{{ app_dir }}/docker/docker-compose.yml"
|
||||
owner: wetgit
|
||||
group: wetgit
|
||||
mode: "0644"
|
||||
notify: restart forgejo
|
||||
mode: "0640"
|
||||
notify: restart docker stack
|
||||
|
||||
- name: Start WetGIT Docker stack
|
||||
community.docker.docker_compose_v2:
|
||||
|
|
@ -141,7 +144,7 @@
|
|||
- name: Configure backup cron (weekly Sunday 02:00)
|
||||
cron:
|
||||
name: "wetgit-backup"
|
||||
user: root
|
||||
user: wetgit
|
||||
weekday: "0"
|
||||
hour: "2"
|
||||
minute: "0"
|
||||
|
|
@ -164,3 +167,16 @@
|
|||
hour: "5"
|
||||
minute: "0"
|
||||
job: "find {{ app_dir }}/logs -name '*.log' -mtime +30 -delete"
|
||||
|
||||
# --- IPv4 preference (Hetzner IPv6 causes timeouts to external APIs) ---
|
||||
# TODO: migrate to dt-platform's server role when appropriate
|
||||
|
||||
- name: Ensure IPv4 precedence in gai.conf
|
||||
lineinfile:
|
||||
path: /etc/gai.conf
|
||||
regexp: '^precedence\s+::ffff:0:0/96'
|
||||
line: "precedence ::ffff:0:0/96 100"
|
||||
create: yes
|
||||
owner: root
|
||||
group: root
|
||||
mode: "0644"
|
||||
|
|
|
|||
|
|
@ -40,6 +40,64 @@ services:
|
|||
networks:
|
||||
- wetgit
|
||||
|
||||
meilisearch:
|
||||
image: getmeili/meilisearch:v1.12
|
||||
container_name: wetgit-meilisearch
|
||||
restart: unless-stopped
|
||||
ports:
|
||||
- "{{ backend_host }}:{{ meili_port }}:7700"
|
||||
volumes:
|
||||
- {{ data_dir }}/meilisearch:/meili_data
|
||||
environment:
|
||||
- MEILI_ENV={{ meili_env }}
|
||||
{% if meili_master_key | length > 0 %}
|
||||
- MEILI_MASTER_KEY={{ meili_master_key }}
|
||||
{% endif %}
|
||||
- MEILI_LOG_LEVEL=WARN
|
||||
deploy:
|
||||
resources:
|
||||
limits:
|
||||
memory: 1G
|
||||
cpus: "2.0"
|
||||
reservations:
|
||||
memory: 256M
|
||||
cpus: "0.5"
|
||||
healthcheck:
|
||||
test: ["CMD", "curl", "-f", "http://localhost:7700/health"]
|
||||
interval: 30s
|
||||
timeout: 10s
|
||||
retries: 3
|
||||
networks:
|
||||
- wetgit
|
||||
|
||||
qdrant:
|
||||
image: qdrant/qdrant:v1.13.2
|
||||
container_name: wetgit-qdrant
|
||||
restart: unless-stopped
|
||||
ports:
|
||||
- "{{ backend_host }}:{{ qdrant_port }}:6333"
|
||||
volumes:
|
||||
- {{ data_dir }}/qdrant:/qdrant/storage
|
||||
{% if qdrant_api_key | length > 0 %}
|
||||
environment:
|
||||
- QDRANT__SERVICE__API_KEY={{ qdrant_api_key }}
|
||||
{% endif %}
|
||||
deploy:
|
||||
resources:
|
||||
limits:
|
||||
memory: 512M
|
||||
cpus: "1.0"
|
||||
reservations:
|
||||
memory: 128M
|
||||
cpus: "0.25"
|
||||
healthcheck:
|
||||
test: ["CMD-SHELL", "bash -c 'echo > /dev/tcp/localhost/6333'"]
|
||||
interval: 30s
|
||||
timeout: 10s
|
||||
retries: 3
|
||||
networks:
|
||||
- wetgit
|
||||
|
||||
networks:
|
||||
wetgit:
|
||||
name: wetgit-network
|
||||
|
|
|
|||
|
|
@ -3,7 +3,11 @@
|
|||
# IMPORTANT: Only adds vhost configs. Does NOT touch global nginx.conf
|
||||
# (managed by dt-platform's nginx role).
|
||||
#
|
||||
# Strategy: Deploy HTTP-only first → get SSL certs → deploy full HTTPS config.
|
||||
# Strategy:
|
||||
# 1. Check if SSL certs exist
|
||||
# 2. If no cert: deploy HTTP-only config → certbot → deploy HTTPS
|
||||
# 3. If cert exists: deploy HTTPS config directly
|
||||
# 4. Enable vhosts (symlinks) after config files exist
|
||||
|
||||
# --- Step 1: Check existing SSL certificates ---
|
||||
|
||||
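
The strategy in the comment block above can be summarised as a small decision function. This is only an illustration of the control flow, not how Ansible executes the role. The step labels are made-up names for the tasks in this file, and the no-certificate branch assumes certbot succeeds.

```python
def plan_vhost_steps(cert_exists: bool) -> list[str]:
    """Return the ordered steps for one vhost, given whether a cert already exists."""
    if cert_exists:
        # Certificate present: the HTTPS config can be deployed directly.
        return ["deploy_https_config", "enable_vhost", "reload_nginx"]
    # No certificate yet: serve the ACME challenge over HTTP first,
    # then obtain the certificate and switch to the HTTPS config.
    return [
        "deploy_http_only_config",
        "enable_vhost",
        "reload_nginx",
        "run_certbot",
        "recheck_cert",
        "deploy_https_config",
        "enable_vhost",
        "reload_nginx",
    ]
```

The early `reload_nginx` in the second branch corresponds to the `meta: flush_handlers` task: nginx has to serve the HTTP-only config before certbot can complete the webroot challenge.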
|
|
@ -17,7 +21,12 @@
|
|||
path: "/etc/letsencrypt/live/{{ forgejo_domain }}/fullchain.pem"
|
||||
register: ssl_cert_git
|
||||
|
||||
# --- Step 2: Deploy HTTP-only configs for domains without certs ---
|
||||
- name: Check if Web SSL certificate exists
|
||||
stat:
|
||||
path: "/etc/letsencrypt/live/{{ web_domain }}/fullchain.pem"
|
||||
register: ssl_cert_web
|
||||
|
||||
# --- Step 2: Deploy HTTP-only configs for domains that need new certs ---
|
||||
|
||||
- name: Deploy API HTTP-only vhost (pre-SSL)
|
||||
copy:
|
||||
|
|
@ -55,23 +64,51 @@
|
|||
when: not ssl_cert_git.stat.exists
|
||||
notify: reload nginx
|
||||
|
||||
# --- Step 3: Enable vhosts and reload nginx ---
|
||||
- name: Deploy Web HTTP-only vhost (pre-SSL)
|
||||
copy:
|
||||
content: |
|
||||
# Temporary HTTP-only config for SSL provisioning — managed by Ansible
|
||||
server {
|
||||
listen 80;
|
||||
listen [::]:80;
|
||||
server_name {{ web_domain }};
|
||||
location /.well-known/acme-challenge/ { root /var/www/certbot; }
|
||||
location / { return 503; }
|
||||
}
|
||||
dest: /etc/nginx/sites-available/wetgit-web.conf
|
||||
owner: root
|
||||
group: root
|
||||
mode: "0644"
|
||||
when: not ssl_cert_web.stat.exists
|
||||
notify: reload nginx
|
||||
|
||||
- name: Enable API vhost
|
||||
# --- Step 3: Enable vhosts that need new certs + reload for certbot ---
|
||||
|
||||
- name: Enable API vhost (pre-SSL)
|
||||
file:
|
||||
src: /etc/nginx/sites-available/wetgit-api.conf
|
||||
dest: /etc/nginx/sites-enabled/wetgit-api.conf
|
||||
state: link
|
||||
when: not ssl_cert_api.stat.exists
|
||||
notify: reload nginx
|
||||
|
||||
- name: Enable Forgejo vhost
|
||||
- name: Enable Forgejo vhost (pre-SSL)
|
||||
file:
|
||||
src: /etc/nginx/sites-available/wetgit-git.conf
|
||||
dest: /etc/nginx/sites-enabled/wetgit-git.conf
|
||||
state: link
|
||||
when: not ssl_cert_git.stat.exists
|
||||
notify: reload nginx
|
||||
|
||||
# Force handler to run now so nginx has the HTTP configs before certbot
|
||||
- name: Enable Web vhost (pre-SSL)
|
||||
file:
|
||||
src: /etc/nginx/sites-available/wetgit-web.conf
|
||||
dest: /etc/nginx/sites-enabled/wetgit-web.conf
|
||||
state: link
|
||||
when: not ssl_cert_web.stat.exists
|
||||
notify: reload nginx
|
||||
|
||||
# Force handler to run so nginx has the HTTP configs before certbot
|
||||
- name: Flush handlers (reload nginx for certbot)
|
||||
meta: flush_handlers
|
||||
|
||||
|
|
@ -97,7 +134,34 @@
|
|||
when: not ssl_cert_git.stat.exists
|
||||
register: certbot_git
|
||||
|
||||
# --- Step 5: Deploy full HTTPS configs ---
|
||||
- name: Obtain SSL certificate for {{ web_domain }}
|
||||
command: >
|
||||
certbot certonly --webroot
|
||||
-w /var/www/certbot
|
||||
-d {{ web_domain }}
|
||||
--non-interactive --agree-tos
|
||||
--email coornhert@wetgit.nl
|
||||
when: not ssl_cert_web.stat.exists
|
||||
register: certbot_web
|
||||
|
||||
# --- Step 5: Re-check SSL certs after certbot ---
|
||||
|
||||
- name: Re-check API SSL certificate
|
||||
stat:
|
||||
path: "/etc/letsencrypt/live/{{ server_name }}/fullchain.pem"
|
||||
register: ssl_cert_api_final
|
||||
|
||||
- name: Re-check Forgejo SSL certificate
|
||||
stat:
|
||||
path: "/etc/letsencrypt/live/{{ forgejo_domain }}/fullchain.pem"
|
||||
register: ssl_cert_git_final
|
||||
|
||||
- name: Re-check Web SSL certificate
|
||||
stat:
|
||||
path: "/etc/letsencrypt/live/{{ web_domain }}/fullchain.pem"
|
||||
register: ssl_cert_web_final
|
||||
|
||||
# --- Step 6: Deploy full HTTPS configs + enable vhosts ---
|
||||
|
||||
- name: Deploy API nginx vhost (full HTTPS)
|
||||
template:
|
||||
|
|
@ -106,6 +170,7 @@
|
|||
owner: root
|
||||
group: root
|
||||
mode: "0644"
|
||||
when: ssl_cert_api_final.stat.exists
|
||||
notify: reload nginx
|
||||
|
||||
- name: Deploy Forgejo nginx vhost (full HTTPS)
|
||||
|
|
@ -115,4 +180,37 @@
|
|||
owner: root
|
||||
group: root
|
||||
mode: "0644"
|
||||
when: ssl_cert_git_final.stat.exists
|
||||
notify: reload nginx
|
||||
|
||||
- name: Deploy Web nginx vhost (full HTTPS)
|
||||
template:
|
||||
src: wetgit-web.conf.j2
|
||||
dest: /etc/nginx/sites-available/wetgit-web.conf
|
||||
owner: root
|
||||
group: root
|
||||
mode: "0644"
|
||||
when: ssl_cert_web_final.stat.exists
|
||||
notify: reload nginx
|
||||
|
||||
# Enable all vhosts (idempotent — creates symlink if not exists)
|
||||
- name: Enable API vhost
|
||||
file:
|
||||
src: /etc/nginx/sites-available/wetgit-api.conf
|
||||
dest: /etc/nginx/sites-enabled/wetgit-api.conf
|
||||
state: link
|
||||
notify: reload nginx
|
||||
|
||||
- name: Enable Forgejo vhost
|
||||
file:
|
||||
src: /etc/nginx/sites-available/wetgit-git.conf
|
||||
dest: /etc/nginx/sites-enabled/wetgit-git.conf
|
||||
state: link
|
||||
notify: reload nginx
|
||||
|
||||
- name: Enable Web vhost
|
||||
file:
|
||||
src: /etc/nginx/sites-available/wetgit-web.conf
|
||||
dest: /etc/nginx/sites-enabled/wetgit-web.conf
|
||||
state: link
|
||||
notify: reload nginx
|
||||
|
|
|
|||
51
ansible/roles/wetgit-nginx/templates/wetgit-web.conf.j2
Normal file
|
|
@ -0,0 +1,51 @@
|
|||
# WetGIT frontend (wetgit.nl) — managed by WetGIT Ansible (not dt-platform)
|
||||
# Do NOT edit manually
|
||||
|
||||
server {
|
||||
listen 80;
|
||||
listen [::]:80;
|
||||
server_name {{ web_domain }};
|
||||
|
||||
location /.well-known/acme-challenge/ {
|
||||
root /var/www/certbot;
|
||||
}
|
||||
|
||||
location / {
|
||||
return 301 https://$host$request_uri;
|
||||
}
|
||||
}
|
||||
|
||||
server {
|
||||
listen 443 ssl http2;
|
||||
listen [::]:443 ssl http2;
|
||||
server_name {{ web_domain }};
|
||||
|
||||
ssl_certificate /etc/letsencrypt/live/{{ web_domain }}/fullchain.pem;
|
||||
ssl_certificate_key /etc/letsencrypt/live/{{ web_domain }}/privkey.pem;
|
||||
|
||||
# Security headers
|
||||
add_header Strict-Transport-Security "max-age=63072000; includeSubDomains" always;
|
||||
add_header X-Content-Type-Options "nosniff" always;
|
||||
add_header X-Frame-Options "SAMEORIGIN" always;
|
||||
add_header Referrer-Policy "strict-origin-when-cross-origin" always;
|
||||
|
||||
# Frontend proxy (same FastAPI app serves the web UI)
|
||||
location / {
|
||||
proxy_pass http://{{ backend_host }}:{{ backend_port }};
|
||||
proxy_set_header Host $host;
|
||||
proxy_set_header X-Real-IP $remote_addr;
|
||||
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
|
||||
proxy_set_header X-Forwarded-Proto $scheme;
|
||||
|
||||
proxy_read_timeout 120s;
|
||||
proxy_connect_timeout 10s;
|
||||
}
|
||||
|
||||
# Static assets (served by FastAPI/Starlette)
|
||||
location /static/ {
|
||||
proxy_pass http://{{ backend_host }}:{{ backend_port }}/static/;
|
||||
proxy_set_header Host $host;
|
||||
expires 1d;
|
||||
add_header Cache-Control "public";
|
||||
}
|
||||
}
|
||||
|
|
@ -13,7 +13,6 @@ classifiers = [
|
|||
"Development Status :: 2 - Pre-Alpha",
|
||||
"Intended Audience :: Developers",
|
||||
"Intended Audience :: Legal Industry",
|
||||
"License :: OSI Approved :: MIT License",
|
||||
"Programming Language :: Python :: 3.13",
|
||||
"Topic :: Text Processing :: Markup",
|
||||
]
|
||||
|
|
@ -34,6 +33,9 @@ api = [
|
|||
"uvicorn>=0.30",
|
||||
"celery>=5.4",
|
||||
"redis>=5.0",
|
||||
"markdown>=3.5",
|
||||
"jinja2>=3.1",
|
||||
"slowapi>=0.1.9",
|
||||
]
|
||||
dev = [
|
||||
"pytest>=8.0",
|
||||
|
|
@ -61,6 +63,9 @@ build-backend = "setuptools.build_meta"
|
|||
[tool.setuptools.packages.find]
|
||||
where = ["src"]
|
||||
|
||||
[tool.setuptools.package-data]
|
||||
"wetgit.web" = ["static/**/*", "templates/**/*"]
|
||||
|
||||
[tool.pytest.ini_options]
|
||||
testpaths = ["tests"]
|
||||
markers = [
|
||||
|
|
|
|||
|
|
@ -1,21 +1,30 @@
|
|||
"""Change alerts: send notifications when legislation changes.

Compares the current state with the previous one and sends an e-mail
with an AI-generated change summary via AgentMail.
Compares the current state with the previous one and sends notifications
via e-mail (AgentMail) and/or webhooks.

Supports domain filtering (NIS2, DORA, AVG, etc.) so that
subscribers only receive relevant changes.

Usage:
    python -m wetgit.ai.alerts --bwb-id BWBR0001840 --diff "..."
    python -m wetgit.ai.alerts --test     # Send a test alert
    python -m wetgit.ai.alerts --domains  # Show available domains
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import json
|
||||
import logging
|
||||
import os
|
||||
from datetime import date
|
||||
|
||||
import sys
|
||||
|
||||
import httpx
|
||||
|
||||
from wetgit.ai.domains import classify_regeling
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
MISTRAL_API_URL = "https://api.mistral.ai/v1/chat/completions"
|
||||
|
|
@ -36,22 +45,34 @@ def send_change_alert(
|
|||
titel: str,
|
||||
diff_text: str,
|
||||
recipients: list[str] | None = None,
|
||||
webhooks: list[str] | None = None,
|
||||
domain_filter: list[str] | None = None,
|
||||
mistral_api_key: str | None = None,
|
||||
agentmail_api_key: str | None = None,
|
||||
) -> bool:
|
||||
    """Generate a change summary and send an e-mail alert.
    """Generate a change summary and send notifications.

    Args:
        bwb_id: BWB identifier of the amended regulation.
        titel: Title of the regulation.
        diff_text: Git diff of the change.
        recipients: E-mail addresses (default: coornhert@wetgit.nl).
        webhooks: List of webhook URLs to send a POST to.
        domain_filter: Only send an alert if the regulation belongs to these domains.
            If None, always send.
        mistral_api_key: Mistral API key.
        agentmail_api_key: AgentMail API key.

    Returns:
        True if the alert was sent successfully.
        True if at least one notification was sent.
    """
|
||||
# Domein-filtering
|
||||
if domain_filter:
|
||||
matched_domains = classify_regeling(titel, diff_text)
|
||||
if not any(d in domain_filter for d in matched_domains):
|
||||
logger.info("Regeling %s matcht niet met domeinen %s, alert overgeslagen", bwb_id, domain_filter)
|
||||
return False
|
||||
|
||||
mistral_key = mistral_api_key or os.environ.get("MISTRAL_API_KEY", "")
|
||||
agentmail_key = agentmail_api_key or os.environ.get("AGENTMAIL_API_KEY", "")
|
||||
recipients = recipients or ["coornhert@wetgit.nl"]
|
||||
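
The domain filter in this hunk skips the alert when none of the regulation's domains appear in `domain_filter`. The sketch below reproduces that decision with a keyword-based classifier. The `DOMAINS` table and its keywords are hypothetical; the real `wetgit.ai.domains.classify_regeling` is not shown in this diff and may work differently. Only the `naam` and `keywords` field names are taken from the CLI code further down.

```python
from __future__ import annotations

# Hypothetical domain table; the real wetgit.ai.domains data is not shown here.
DOMAINS = [
    {"naam": "AVG", "keywords": ["persoonsgegevens", "verwerkingsverantwoordelijke"]},
    {"naam": "NIS2", "keywords": ["netwerk- en informatiesystemen", "cyberbeveiliging"]},
]


def classify(titel: str, tekst: str) -> list[str]:
    """Return the names of all domains with at least one keyword in the text."""
    haystack = f"{titel}\n{tekst}".lower()
    return [d["naam"] for d in DOMAINS if any(k in haystack for k in d["keywords"])]


def should_alert(matched: list[str], domain_filter: list[str] | None) -> bool:
    """Mirror the filter above: no filter means always alert."""
    if not domain_filter:
        return True
    return any(d in domain_filter for d in matched)
```

With this table, a regulation that only matches `AVG` is skipped for a subscriber filtering on `["NIS2", "DORA"]`, while a subscriber with no filter receives every change.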
|
|
@ -100,7 +121,7 @@ Dit is geen juridisch advies.
|
|||
"""
|
||||
|
||||
# Stap 4: Verstuur via AgentMail
|
||||
return _send_email(
|
||||
email_ok = _send_email(
|
||||
from_address="coornhert@wetgit.nl",
|
||||
to_addresses=recipients,
|
||||
subject=subject,
|
||||
|
|
@ -108,6 +129,23 @@ Dit is geen juridisch advies.
|
|||
agentmail_key=agentmail_key,
|
||||
)
|
||||
|
||||
    # Step 5: send webhooks. Start from False so that an empty webhook list
    # cannot make the function report success when the e-mail failed.
    webhook_ok = False
    matched_domains = classify_regeling(titel, diff_text)
    for url in (webhooks or []):
        webhook_ok = _send_webhook(url, {
            "event": "regeling.gewijzigd",
            "bwb_id": bwb_id,
            "titel": titel,
            "datum": date.today().isoformat(),
            "regels_toegevoegd": added,
            "regels_verwijderd": removed,
            "domeinen": matched_domains,
            "samenvatting": summary,
        }) or webhook_ok

    return email_ok or webhook_ok
|
||||
|
||||
|
||||
def _generate_change_summary(titel: str, diff_text: str, api_key: str) -> str | None:
|
||||
"""Genereer een AI-samenvatting van de wetswijziging."""
|
||||
|
|
@ -170,6 +208,23 @@ def _send_email(
|
|||
return False
|
||||
|
||||
|
||||
def _send_webhook(url: str, payload: dict) -> bool:
    """Send a webhook POST; return True unless the request fails or returns an error status."""
    try:
        resp = httpx.post(
            url,
            json=payload,
            headers={"Content-Type": "application/json", "User-Agent": "WetGIT/1.0"},
            timeout=10,
        )
        resp.raise_for_status()
        logger.info("Webhook verstuurd naar %s", url)
        return True
    except httpx.HTTPError as e:
        logger.error("Webhook fout voor %s: %s", url, e)
        return False
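The docstring of `send_change_alert` promises `True` when at least one notification was sent. A minimal sketch of that fan-out contract follows, with the HTTP call replaced by an injectable `send` callable so it runs without a network. `build_payload`, `deliver_all`, `fake_send`, and the example URLs are hypothetical names for illustration and are not part of the commit; the payload carries only a subset of the keys used in the diff.

```python
from typing import Callable


def build_payload(bwb_id: str, titel: str, added: int, removed: int) -> dict:
    """Assemble a payload with a subset of the keys the diff sends to webhooks."""
    return {
        "event": "regeling.gewijzigd",
        "bwb_id": bwb_id,
        "titel": titel,
        "regels_toegevoegd": added,
        "regels_verwijderd": removed,
    }


def deliver_all(urls: list[str], payload: dict, send: Callable[[str, dict], bool]) -> bool:
    """Attempt every URL; return True if at least one delivery succeeded."""
    # Build the full list first so a failure never skips the remaining URLs.
    results = [send(url, payload) for url in urls]
    return any(results)


def fake_send(url: str, payload: dict) -> bool:
    # Stand-in for the real HTTP POST: only hosts starting with "ok." accept.
    return url.startswith("https://ok.")


payload = build_payload("BWBR0001840", "Grondwet", added=3, removed=1)
print(deliver_all(["https://down.example", "https://ok.example"], payload, fake_send))
print(deliver_all([], payload, fake_send))
```

With an empty URL list `any` yields `False`, so the overall result then depends on the e-mail outcome alone.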
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
import argparse
|
||||
|
||||
|
|
@ -177,10 +232,19 @@ if __name__ == "__main__":
|
|||
|
||||
parser = argparse.ArgumentParser(description="WetGit change alerts")
|
||||
parser.add_argument("--test", action="store_true", help="Stuur een test-alert")
|
||||
parser.add_argument("--domains", action="store_true", help="Toon beschikbare domeinen")
|
||||
parser.add_argument("--bwb-id", default="BWBR0001840")
|
||||
parser.add_argument("--to", default="coornhert@wetgit.nl")
|
||||
parser.add_argument("--domain-filter", nargs="*", help="Filter op domeinen")
|
||||
parser.add_argument("--webhook", nargs="*", help="Webhook URLs")
|
||||
args = parser.parse_args()
|
||||
|
||||
if args.domains:
|
||||
from wetgit.ai.domains import list_domains
|
||||
for d in list_domains():
|
||||
print(f" {d['naam']}: {', '.join(d['keywords'][:3])}...")
|
||||
sys.exit(0)
|
||||
|
||||
if args.test:
|
||||
# Simuleer een wijziging in de Grondwet
|
||||
test_diff = """--- a/wet/grondwet/BWBR0001840/README.md
|
||||
|
|
|
|||
209
src/wetgit/ai/crossref.py
Normal file
|
|
@ -0,0 +1,209 @@
|
|||
"""Cross-referentie analyse — extract verwijzingen tussen regelingen.
|
||||
|
||||
Parseert wetteksten op verwijzingen naar andere regelingen en bouwt
|
||||
een doorzoekbare graaf op als JSON adjacency list.
|
||||
|
||||
Usage:
|
||||
python -m wetgit.ai.crossref --repo /path/to/rijk
|
||||
python -m wetgit.ai.crossref --repo /path/to/rijk --query BWBR0001840
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import json
|
||||
import logging
|
||||
import re
|
||||
from pathlib import Path
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
# Patronen voor verwijzingen naar andere regelingen
|
||||
# "de Telecommunicatiewet", "de Wet open overheid", "het Burgerlijk Wetboek"
|
||||
WET_REF_PATTERN = re.compile(
|
||||
r"(?:de|het|van de|in de|bij de|krachtens de|bedoeld in de)\s+"
|
||||
r"((?:Wet|Algemene wet|Wetboek|Boek|Grondwet|Besluit|Regeling|Verordening)"
|
||||
r"(?:\s+(?:op|van|tot|inzake|betreffende|ter))?"
|
||||
r"(?:\s+\w+){0,6}?)"
|
||||
r"(?=[,.\s;)])",
|
||||
re.IGNORECASE,
|
||||
)
|
||||
|
||||
# "artikel 6 van de AVG", "artikel 1.1, tweede lid"
|
||||
ARTIKEL_REF_PATTERN = re.compile(
|
||||
r"artikel(?:en)?\s+([\d.]+(?:\s*(?:,\s*[\d.]+|tot en met\s+[\d.]+))*)"
|
||||
r"(?:\s*,?\s*(?:eerste|tweede|derde|vierde|vijfde|zesde|zevende|achtste|negende|tiende)\s+lid)?"
|
||||
r"(?:\s+van\s+(?:de|het)\s+(.+?))?(?=[,.\s;)])",
|
||||
re.IGNORECASE,
|
||||
)
|
||||
|
||||
# BWB-ID verwijzingen (zeldzaam in tekst, maar soms in metadata)
|
||||
BWB_REF_PATTERN = re.compile(r"BWBR\d{7}")
|
||||
|
||||
|
||||
def extract_references(bwb_id: str, tekst: str) -> list[dict]:
|
||||
"""Extraheer verwijzingen uit een wettekst.
|
||||
|
||||
Args:
|
||||
bwb_id: BWB-ID van de bronregeling.
|
||||
tekst: Volledige Markdown tekst.
|
||||
|
||||
Returns:
|
||||
Lijst van dicts met bron, doel, type, en context.
|
||||
"""
|
||||
refs: list[dict] = []
|
||||
seen: set[str] = set()
|
||||
|
||||
# Zoek directe BWB-ID verwijzingen
|
||||
for match in BWB_REF_PATTERN.finditer(tekst):
|
||||
target_bwb = match.group(0)
|
||||
if target_bwb != bwb_id and target_bwb not in seen:
|
||||
seen.add(target_bwb)
|
||||
start = max(0, match.start() - 50)
|
||||
end = min(len(tekst), match.end() + 50)
|
||||
refs.append({
|
||||
"bron": bwb_id,
|
||||
"doel": target_bwb,
|
||||
"type": "bwb_id",
|
||||
"context": tekst[start:end].strip(),
|
||||
})
|
||||
|
||||
# Zoek wet-naam verwijzingen
|
||||
for match in WET_REF_PATTERN.finditer(tekst):
|
||||
wet_naam = match.group(1).strip().rstrip(".,;")
|
||||
if len(wet_naam) < 5 or wet_naam.lower() in ("wet", "wetboek", "besluit"):
|
||||
continue
|
||||
ref_key = wet_naam.lower()
|
||||
if ref_key not in seen:
|
||||
seen.add(ref_key)
|
||||
start = max(0, match.start() - 30)
|
||||
end = min(len(tekst), match.end() + 30)
|
||||
refs.append({
|
||||
"bron": bwb_id,
|
||||
"doel_naam": wet_naam,
|
||||
"type": "wet_naam",
|
||||
"context": tekst[start:end].strip(),
|
||||
})
|
||||
|
||||
return refs
|
||||
|
||||
|
||||
def build_reference_graph(repo_path: Path) -> dict:
|
||||
"""Bouw de volledige cross-referentie graaf.
|
||||
|
||||
Args:
|
||||
repo_path: Pad naar de wetgit/rijk repo.
|
||||
|
||||
Returns:
|
||||
Dict met nodes (regelingen) en edges (verwijzingen).
|
||||
"""
|
||||
index_path = repo_path / "index.json"
|
||||
if index_path.exists():
|
||||
data = json.loads(index_path.read_text(encoding="utf-8"))
|
||||
regelingen = data.get("regelingen", [])
|
||||
else:
|
||||
from wetgit.pipeline.indexer import generate_index
|
||||
regelingen = generate_index(repo_path)
|
||||
|
||||
# Bouw titel → bwb_id lookup
|
||||
titel_to_bwb: dict[str, str] = {}
|
||||
for r in regelingen:
|
||||
titel_to_bwb[r.get("titel", "").lower()] = r["bwb_id"]
|
||||
if r.get("citeertitel"):
|
||||
titel_to_bwb[r["citeertitel"].lower()] = r["bwb_id"]
|
||||
|
||||
all_refs: list[dict] = []
|
||||
edges: dict[str, list[str]] = {} # bwb_id → [verwezen bwb_ids]
|
||||
|
||||
for regeling in regelingen:
|
||||
bwb_id = regeling["bwb_id"]
|
||||
md_path = repo_path / regeling["pad"] / "README.md"
|
||||
if not md_path.exists():
|
||||
continue
|
||||
|
||||
tekst = md_path.read_text(encoding="utf-8")
|
||||
refs = extract_references(bwb_id, tekst)
|
||||
|
||||
targets: list[str] = []
|
||||
for ref in refs:
|
||||
# Probeer wet_naam te resolven naar bwb_id
|
||||
if ref["type"] == "wet_naam":
|
||||
doel_naam = ref["doel_naam"].lower()
|
||||
resolved = titel_to_bwb.get(doel_naam)
|
||||
if resolved:
|
||||
ref["doel"] = resolved
|
||||
targets.append(ref.get("doel", ref.get("doel_naam", "")))
|
||||
else:
|
||||
targets.append(ref["doel"])
|
||||
|
||||
all_refs.append(ref)
|
||||
|
||||
if targets:
|
||||
edges[bwb_id] = list(set(targets))
|
||||
|
||||
# Bereken inbound references
|
||||
inbound: dict[str, list[str]] = {}
|
||||
for src, dests in edges.items():
|
||||
for dest in dests:
|
||||
inbound.setdefault(dest, []).append(src)
|
||||
|
||||
logger.info(
|
||||
"Graaf: %d regelingen met verwijzingen, %d unieke edges",
|
||||
len(edges),
|
||||
sum(len(v) for v in edges.values()),
|
||||
)
|
||||
|
||||
return {
|
||||
"nodes": len(edges),
|
||||
"total_edges": sum(len(v) for v in edges.values()),
|
||||
"outbound": edges,
|
||||
"inbound": inbound,
|
||||
"references": all_refs,
|
||||
}
|
||||
|
||||
|
||||
def query_references(
|
||||
graph: dict, bwb_id: str, direction: str = "both",
|
||||
) -> dict:
|
||||
"""Query verwijzingen voor een specifieke regeling.
|
||||
|
||||
Args:
|
||||
graph: De cross-referentie graaf.
|
||||
bwb_id: BWB-ID om te querien.
|
||||
direction: "outbound" (verwijst naar), "inbound" (verwezen door), "both".
|
||||
|
||||
Returns:
|
||||
Dict met verwijzingen in de gevraagde richting.
|
||||
"""
|
||||
result: dict = {"bwb_id": bwb_id}
|
||||
|
||||
if direction in ("outbound", "both"):
|
||||
result["verwijst_naar"] = graph.get("outbound", {}).get(bwb_id, [])
|
||||
|
||||
if direction in ("inbound", "both"):
|
||||
result["verwezen_door"] = graph.get("inbound", {}).get(bwb_id, [])
|
||||
|
||||
return result
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
import argparse
|
||||
|
||||
logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")
|
||||
|
||||
parser = argparse.ArgumentParser(description="WetGit cross-referentie analyse")
|
||||
parser.add_argument("--repo", type=Path, required=True)
|
||||
parser.add_argument("--query", help="BWB-ID om te querien")
|
||||
parser.add_argument("--output", type=Path, help="Schrijf graaf naar JSON bestand")
|
||||
args = parser.parse_args()
|
||||
|
||||
graph = build_reference_graph(args.repo)
|
||||
print(f"Graaf: {graph['nodes']} regelingen, {graph['total_edges']} verwijzingen")
|
||||
|
||||
if args.output:
|
||||
with open(args.output, "w", encoding="utf-8") as f:
|
||||
json.dump(graph, f, ensure_ascii=False, indent=2)
|
||||
print(f"Geschreven: {args.output}")
|
||||
|
||||
if args.query:
|
||||
result = query_references(graph, args.query)
|
||||
print(json.dumps(result, ensure_ascii=False, indent=2))
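The graph in `crossref.py` is an outbound adjacency list that is then inverted to get inbound references. A small self-contained sketch of that idea, restricted to direct BWB-ID matches, is shown below. `outbound_refs` and `invert` are hypothetical helper names, and the sample texts and IDs are made up purely as test data.

```python
import re

BWB_REF_PATTERN = re.compile(r"BWBR\d{7}")  # same pattern as in the diff


def outbound_refs(bwb_id: str, tekst: str) -> list[str]:
    """Return the distinct BWB-IDs mentioned in tekst, excluding bwb_id itself."""
    seen: list[str] = []
    for target in BWB_REF_PATTERN.findall(tekst):
        if target != bwb_id and target not in seen:
            seen.append(target)
    return seen


def invert(outbound: dict[str, list[str]]) -> dict[str, list[str]]:
    """Turn a 'refers to' adjacency list into a 'referenced by' one."""
    inbound: dict[str, list[str]] = {}
    for src, dests in outbound.items():
        for dest in dests:
            inbound.setdefault(dest, []).append(src)
    return inbound


# Made-up snippets; the IDs only serve as sample keys.
texts = {
    "BWBR0000001": "Zie BWBR0000002 en BWBR0000003, alsmede BWBR0000002.",
    "BWBR0000002": "Deze regeling verwijst naar BWBR0000003 en BWBR0000002.",
}
outbound = {bwb_id: outbound_refs(bwb_id, tekst) for bwb_id, tekst in texts.items()}
inbound = invert(outbound)
print(outbound)
print(inbound)
```

Name-based references (`wet_naam`) need the extra title lookup step from `build_reference_graph` and are left out of this sketch.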
|
||||
112
src/wetgit/ai/domains.py
Normal file
|
|
@ -0,0 +1,112 @@
|
|||
"""Domein-classificatie voor change-alerts.
|
||||
|
||||
Classificeert regelingen naar compliance-domeinen (NIS2, DORA, AVG, etc.)
|
||||
op basis van keyword-matching in titel en tekst.
|
||||
|
||||
Domeinen zijn gedefinieerd als sets van zoektermen. Een regeling hoort bij
|
||||
een domein als minstens één term voorkomt in de titel of tekst.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import re
|
||||
|
||||
# Domein-definities: naam → keywords die matchen in titel/tekst
|
||||
DOMAINS: dict[str, list[str]] = {
|
||||
"nis2-cybersecurity": [
|
||||
"netwerk- en informatiebeveiliging",
|
||||
"NIS2",
|
||||
"cybersecurity",
|
||||
"cyberbeveiligingswet",
|
||||
"beveiligingsincident",
|
||||
"digitale weerbaarheid",
|
||||
"CSIRT",
|
||||
"Telecommunicatiewet",
|
||||
"Wet beveiliging netwerk- en informatiesystemen",
|
||||
],
|
||||
"dora-financieel": [
|
||||
"DORA",
|
||||
"digitale operationele veerkracht",
|
||||
"financiële sector",
|
||||
"Wet op het financieel toezicht",
|
||||
"DNB",
|
||||
"AFM",
|
||||
"ICT-risicobeheer",
|
||||
"Pensioenwet",
|
||||
"Bankwet",
|
||||
],
|
||||
"avg-privacy": [
|
||||
"persoonsgegevens",
|
||||
"AVG",
|
||||
"GDPR",
|
||||
"Uitvoeringswet Algemene verordening gegevensbescherming",
|
||||
"verwerking",
|
||||
"Autoriteit Persoonsgegevens",
|
||||
"gegevensbescherming",
|
||||
"privacy",
|
||||
"betrokkene",
|
||||
],
|
||||
"omgevingswet": [
|
||||
"Omgevingswet",
|
||||
"omgevingsplan",
|
||||
"omgevingsvergunning",
|
||||
"omgevingsvisie",
|
||||
"Besluit activiteiten leefomgeving",
|
||||
"Besluit kwaliteit leefomgeving",
|
||||
"milieueffectrapportage",
|
||||
],
|
||||
"arbeidsrecht": [
|
||||
"arbeidsovereenkomst",
|
||||
"Burgerlijk Wetboek Boek 7",
|
||||
"Arbeidstijdenwet",
|
||||
"Arbeidsomstandighedenwet",
|
||||
"Wet minimumloon",
|
||||
"Wet werk en zekerheid",
|
||||
"Ontslagrecht",
|
||||
"CAO",
|
||||
"Wet arbeid vreemdelingen",
|
||||
"WIA",
|
||||
"Werkloosheidswet",
|
||||
],
|
||||
"belastingrecht": [
|
||||
"Wet inkomstenbelasting",
|
||||
"Wet op de vennootschapsbelasting",
|
||||
"Wet op de omzetbelasting",
|
||||
"Algemene wet inzake rijksbelastingen",
|
||||
"Invorderingswet",
|
||||
"belastingplichtige",
|
||||
],
|
||||
}
|
||||
|
||||
|
||||
def classify_regeling(titel: str, tekst: str | None = None) -> list[str]:
    """Classify a regulation into compliance domains.

    Args:
        titel: Title of the regulation.
        tekst: Optional full text (for deeper matching).

    Returns:
        List of domain names the regulation belongs to.
    """
    search_text = titel.lower()
    if tekst:
        # The first 5000 characters are enough for domain detection.
        search_text += " " + tekst[:5000].lower()

    matched: list[str] = []
    for domain, keywords in DOMAINS.items():
        for kw in keywords:
            if kw.lower() in search_text:
                matched.append(domain)
                break

    return matched


def list_domains() -> list[dict[str, str | list[str]]]:
    """Return all available domains with their keywords."""
    return [
        {"naam": name, "keywords": keywords}
        for name, keywords in DOMAINS.items()
    ]
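`classify_regeling` matches keywords as case-insensitive substrings, so short acronyms such as "CAO" or "AFM" can also match inside longer words. The sketch below contrasts that behaviour with a word-boundary variant. This variant is an assumption for illustration, not what the commit does; `classify_substring` and `classify_word` are hypothetical names, the `DOMAINS` table is a small excerpt, and the functions take a single string rather than title plus text.

```python
import re

# Small excerpt of the DOMAINS table from the diff, for illustration only.
DOMAINS = {
    "arbeidsrecht": ["arbeidsovereenkomst", "CAO"],
    "dora-financieel": ["DORA", "AFM"],
}


def classify_substring(text: str) -> list[str]:
    """Substring matching, the approach classify_regeling takes."""
    lowered = text.lower()
    return [d for d, kws in DOMAINS.items() if any(kw.lower() in lowered for kw in kws)]


def classify_word(text: str) -> list[str]:
    """Hypothetical variant: accept a keyword only on word boundaries."""
    matched = []
    for domain, kws in DOMAINS.items():
        pattern = "|".join(re.escape(kw) for kw in kws)
        if re.search(rf"\b(?:{pattern})\b", text, re.IGNORECASE):
            matched.append(domain)
    return matched


titel = "Regeling afmetingen cacaoverpakkingen"
print(classify_substring(titel))  # both domains: "cao" and "afm" occur inside words
print(classify_word(titel))       # no domain: neither keyword stands alone
print(classify_word("Wet op de cao en de AFM"))
```

Word boundaries trade recall for precision: inflected or compound forms such as "cao-loon" still match because the hyphen is a boundary, but "cao's" style variants would need their own keywords only if the apostrophe form should be excluded.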
|
||||
|
|
@ -9,11 +9,16 @@ from __future__ import annotations
|
|||
import os
|
||||
from pathlib import Path
|
||||
|
||||
from fastapi import FastAPI, HTTPException, Query
|
||||
from fastapi import FastAPI, HTTPException, Query, Request
|
||||
from fastapi.middleware.cors import CORSMiddleware
|
||||
from slowapi import Limiter, _rate_limit_exceeded_handler
|
||||
from slowapi.errors import RateLimitExceeded
|
||||
from slowapi.util import get_remote_address
|
||||
|
||||
from wetgit import __version__
|
||||
from wetgit.api.data import RegelingStore
|
||||
from fastapi.responses import Response
|
||||
|
||||
from wetgit.api.models import (
|
||||
ArtikelDetail,
|
||||
ArtikelItem,
|
||||
|
|
@ -25,10 +30,13 @@ from wetgit.api.models import (
|
|||
ZoekResultaat,
|
||||
)
|
||||
|
||||
REPO_PATH = Path(os.environ.get("WETGIT_REPO", "/tmp/wetgit-index-test"))
|
||||
_git_repos = os.environ.get("WETGIT_GIT_REPOS_DIR", os.environ.get("WETGIT_REPO", "/data/wetgit/git-repos"))
|
||||
REPO_PATH = Path(_git_repos) / "rijk" if "WETGIT_GIT_REPOS_DIR" in os.environ else Path(_git_repos)
|
||||
MEILI_URL = os.environ.get("MEILI_URL", "http://127.0.0.1:7700")
|
||||
QDRANT_URL = os.environ.get("QDRANT_URL", "http://127.0.0.1:6333")
|
||||
|
||||
limiter = Limiter(key_func=get_remote_address, default_limits=["60/minute"])
|
||||
|
||||
app = FastAPI(
|
||||
title="WetGit API",
|
||||
description="Nederlandse wetgeving als code — REST API",
|
||||
|
|
@ -38,6 +46,9 @@ app = FastAPI(
|
|||
openapi_url="/api/openapi.json",
|
||||
)
|
||||
|
||||
app.state.limiter = limiter
|
||||
app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)
|
||||
|
||||
app.add_middleware(
|
||||
CORSMiddleware,
|
||||
allow_origins=["*"],
|
||||
|
|
@ -156,63 +167,182 @@ def get_diff(
|
|||
# --- Zoeken ---
|
||||
|
||||
@app.get("/api/v1/zoeken", response_model=list[ZoekResultaat])
|
||||
@limiter.limit("30/minute")
|
||||
def zoeken(
|
||||
request: Request,
|
||||
q: str = Query(..., min_length=2, description="Zoekterm"),
|
||||
type: str | None = Query(None, description="Filter op type"),
|
||||
mode: str = Query("keyword", description="Zoekmodus: keyword, semantic, of hybrid"),
|
||||
limit: int = Query(20, ge=1, le=100, description="Max resultaten"),
|
||||
) -> list[dict]:
|
||||
"""Doorzoek alle wetgeving. Modes: keyword (Meilisearch), semantic (Qdrant), hybrid (beide)."""
|
||||
from wetgit.api.search import MeiliSearch
|
||||
from wetgit.ai.semantic import SemanticSearch
|
||||
|
||||
semantic_results: list[dict] = []
|
||||
keyword_results: list[dict] = []
|
||||
|
||||
# Semantic search
|
||||
if mode in ("semantic", "hybrid"):
|
||||
from wetgit.ai.semantic import SemanticSearch
|
||||
sem = SemanticSearch(qdrant_url=QDRANT_URL)
|
||||
if sem.health():
|
||||
results = sem.search(q, limit=limit)
|
||||
semantic_results = sem.search(q, limit=limit)
|
||||
if mode == "semantic":
|
||||
return results
|
||||
return semantic_results
|
||||
|
||||
# Hybrid: combineer met keyword
|
||||
semantic_results = {r["artikel"]: r for r in results}
|
||||
# Keyword search (Meilisearch → grep fallback)
|
||||
if mode in ("keyword", "hybrid"):
|
||||
meili = MeiliSearch(url=MEILI_URL)
|
||||
if meili.health():
|
||||
filter_str = f'type = "{type}"' if type else None
|
||||
result = meili.search(q, filter_=filter_str, limit=limit)
|
||||
keyword_results = [
|
||||
{
|
||||
"bwb_id": hit["bwb_id"],
|
||||
"titel": hit.get("regeling_titel", ""),
|
||||
"artikel": f"Artikel {hit.get('artikel_nummer', '?')}",
|
||||
"context": hit.get("tekst", "")[:200],
|
||||
}
|
||||
for hit in result.get("hits", [])
|
||||
]
|
||||
else:
|
||||
# Fallback: grep-style zoeken
|
||||
for regeling in store.list_regelingen():
|
||||
if type and regeling.get("type") != type:
|
||||
continue
|
||||
tekst = store.get_tekst(regeling["bwb_id"])
|
||||
if tekst is None or q.lower() not in tekst.lower():
|
||||
continue
|
||||
current_artikel = ""
|
||||
for line in tekst.split("\n"):
|
||||
if line.startswith("### Artikel"):
|
||||
current_artikel = line.replace("### ", "")
|
||||
if q.lower() in line.lower() and current_artikel:
|
||||
keyword_results.append({
|
||||
"bwb_id": regeling["bwb_id"],
|
||||
"titel": regeling.get("titel", ""),
|
||||
"artikel": current_artikel,
|
||||
"context": line.strip()[:200],
|
||||
})
|
||||
if len(keyword_results) >= limit:
|
||||
break
|
||||
|
||||
from wetgit.api.search import MeiliSearch
|
||||
meili = MeiliSearch(url=MEILI_URL)
|
||||
if mode == "keyword":
|
||||
return keyword_results
|
||||
|
||||
# Probeer Meilisearch, fallback naar grep
|
||||
if meili.health():
|
||||
filter_str = f'type = "{type}"' if type else None
|
||||
result = meili.search(q, filter_=filter_str, limit=limit)
|
||||
# Hybrid: Reciprocal Rank Fusion (RRF)
|
||||
k = 60 # RRF constant
|
||||
scores: dict[str, float] = {}
|
||||
items: dict[str, dict] = {}
|
||||
|
||||
return [
|
||||
{
|
||||
"bwb_id": hit["bwb_id"],
|
||||
"titel": hit.get("regeling_titel", ""),
|
||||
"artikel": f"Artikel {hit.get('artikel_nummer', '?')}",
|
||||
"context": hit.get("tekst", "")[:200],
|
||||
}
|
||||
for hit in result.get("hits", [])
|
||||
]
|
||||
for rank, r in enumerate(semantic_results):
|
||||
key = f"{r['bwb_id']}_{r['artikel']}"
|
||||
scores[key] = scores.get(key, 0) + 1 / (k + rank + 1)
|
||||
items[key] = r
|
||||
|
||||
# Fallback: grep-style zoeken
|
||||
resultaten: list[dict] = []
|
||||
for regeling in store.list_regelingen():
|
||||
if type and regeling.get("type") != type:
|
||||
for rank, r in enumerate(keyword_results):
|
||||
key = f"{r['bwb_id']}_{r['artikel']}"
|
||||
scores[key] = scores.get(key, 0) + 1 / (k + rank + 1)
|
||||
if key not in items:
|
||||
items[key] = r
|
||||
|
||||
fused = sorted(scores.items(), key=lambda x: x[1], reverse=True)
|
||||
return [
|
||||
{**items[key], "score": round(score, 4)} for key, score in fused[:limit]
|
||||
]
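The hybrid mode above merges the semantic and keyword rankings with Reciprocal Rank Fusion: each result key gets the sum of 1 / (k + rank + 1) over the lists it appears in, with a 0-based rank and k = 60. A standalone sketch of just that scoring step follows; `rrf_fuse` is a hypothetical helper name and the keys are sample values in the `bwb_id_artikel` shape the endpoint uses.

```python
def rrf_fuse(rankings: list[list[str]], k: int = 60, limit: int = 10) -> list[tuple[str, float]]:
    """Fuse ranked lists of keys with Reciprocal Rank Fusion."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, key in enumerate(ranking):
            # rank is 0-based, so the top hit of a list contributes 1 / (k + 1).
            scores[key] = scores.get(key, 0.0) + 1 / (k + rank + 1)
    fused = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return fused[:limit]


semantic = ["BWBR0001840_Artikel 1", "BWBR0005537_Artikel 3:2"]
keyword = ["BWBR0005537_Artikel 3:2", "BWBR0001854_Artikel 310"]
fused = rrf_fuse([semantic, keyword])
for key, score in fused:
    print(f"{score:.4f}  {key}")
```

A key that appears in both lists outranks one that tops only a single list, which is the property that makes RRF usable without normalising the two engines' raw scores.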
|
||||
|
||||
|
||||
# --- Feed ---
|
||||
|
||||
@app.get("/api/v1/feed.xml", response_class=Response)
|
||||
def feed(limit: int = Query(50, ge=1, le=200, description="Max entries")) -> Response:
|
||||
"""Atom feed van recente wijzigingen in de wetgeving."""
|
||||
import subprocess
|
||||
from datetime import datetime
|
||||
|
||||
try:
|
||||
result = subprocess.run(
|
||||
["git", "log", f"--max-count={limit}", "--format=%H%n%ai%n%s%n---"],
|
||||
cwd=store.repo_path, capture_output=True, text=True, check=True,
|
||||
)
|
||||
except subprocess.CalledProcessError:
|
||||
return Response(content="<feed/>", media_type="application/atom+xml")
|
||||
|
||||
entries = []
|
||||
lines = result.stdout.strip().split("\n---\n")
|
||||
for block in lines:
|
||||
parts = block.strip().split("\n")
|
||||
if len(parts) < 3:
|
||||
continue
|
||||
tekst = store.get_tekst(regeling["bwb_id"])
|
||||
if tekst is None or q.lower() not in tekst.lower():
|
||||
continue
|
||||
current_artikel = ""
|
||||
for line in tekst.split("\n"):
|
||||
if line.startswith("### Artikel"):
|
||||
current_artikel = line.replace("### ", "")
|
||||
if q.lower() in line.lower() and current_artikel:
|
||||
resultaten.append({
|
||||
"bwb_id": regeling["bwb_id"],
|
||||
"titel": regeling.get("titel", ""),
|
||||
"artikel": current_artikel,
|
||||
"context": line.strip()[:200],
|
||||
})
|
||||
if len(resultaten) >= limit:
|
||||
return resultaten
|
||||
return resultaten
|
||||
        commit_hash, date_str, subject = parts[0], parts[1], parts[2]
        # %ai yields e.g. "2026-03-30 12:00:00 +0200". Keep the UTC offset:
        # dropping it and appending "Z" would mislabel local time as UTC.
        updated = datetime.strptime(date_str, "%Y-%m-%d %H:%M:%S %z").isoformat()
        entries.append(
            f'  <entry>\n'
            f'    <title>{_xml_escape(subject)}</title>\n'
            f'    <id>urn:wetgit:commit:{commit_hash}</id>\n'
            f'    <updated>{updated}</updated>\n'
            f'    <link href="https://git.wetgit.nl/wetgit/rijk/commit/{commit_hash}"/>\n'
            f'    <summary>{_xml_escape(subject)}</summary>\n'
            f'  </entry>'
        )
|
||||
|
||||
now = datetime.utcnow().strftime("%Y-%m-%dT%H:%M:%SZ")
|
||||
atom = (
|
||||
'<?xml version="1.0" encoding="utf-8"?>\n'
|
||||
'<feed xmlns="http://www.w3.org/2005/Atom">\n'
|
||||
f' <title>WetGIT — Wijzigingen in Nederlandse wetgeving</title>\n'
|
||||
f' <link href="https://api.wetgit.nl/api/v1/feed.xml" rel="self"/>\n'
|
||||
f' <link href="https://wetgit.nl"/>\n'
|
||||
f' <id>urn:wetgit:feed:wijzigingen</id>\n'
|
||||
f' <updated>{now}</updated>\n'
|
||||
f' <subtitle>Elke wet een Markdown-bestand, elke wijziging een Git-commit.</subtitle>\n'
|
||||
+ "\n".join(entries)
|
||||
+ "\n</feed>"
|
||||
)
|
||||
return Response(content=atom, media_type="application/atom+xml")
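The feed is built from `git log --format=%H%n%ai%n%s%n---` output, where each commit is a block of hash, date, and subject. The sketch below parses such output without calling git and shows one way to turn the `%ai` date into an RFC 3339 timestamp that keeps its UTC offset. `parse_log` and the `SAMPLE` text are hypothetical and serve only as illustration.

```python
from datetime import datetime

# Hand-written sample in the shape the git log format string produces.
SAMPLE = (
    "abc123\n2026-03-30 12:00:00 +0200\nWijziging Grondwet\n---\n"
    "def456\n2026-01-15 09:30:00 +0100\nWijziging Awb\n---\n"
)


def parse_log(stdout: str) -> list[dict]:
    """Split the log output into one dict per commit."""
    commits = []
    for block in stdout.strip().split("\n---\n"):
        parts = block.strip().split("\n")
        if len(parts) < 3:
            continue
        commit_hash, date_str, subject = parts[0], parts[1], parts[2]
        # %z parses the "+0200" offset, so isoformat() emits "+02:00".
        updated = datetime.strptime(date_str, "%Y-%m-%d %H:%M:%S %z").isoformat()
        commits.append({"hash": commit_hash, "updated": updated, "subject": subject})
    return commits


commits = parse_log(SAMPLE)
for commit in commits:
    print(commit["updated"], commit["subject"])
```

The last block still carries a trailing `---` line after the split, which is harmless because only the first three lines of each block are read.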
|
||||
|
||||
|
||||
def _xml_escape(s: str) -> str:
    """Escape XML special characters."""
    return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;").replace('"', "&quot;")
|
||||
|
||||
|
||||
# --- Domeinen ---
|
||||
|
||||
@app.get("/api/v1/domeinen")
|
||||
def list_domeinen() -> list[dict]:
|
||||
"""Lijst van beschikbare compliance-domeinen voor alerts."""
|
||||
from wetgit.ai.domains import list_domains
|
||||
return list_domains()
|
||||
|
||||
|
||||
@app.get("/api/v1/regelingen/{bwb_id}/domeinen")
|
||||
def get_regeling_domeinen(bwb_id: str) -> dict:
|
||||
"""Classificeer een regeling naar compliance-domeinen."""
|
||||
from wetgit.ai.domains import classify_regeling as classify
|
||||
regeling = store.get_regeling(bwb_id)
|
||||
if not regeling:
|
||||
raise HTTPException(status_code=404, detail=f"Regeling {bwb_id} niet gevonden")
|
||||
tekst = store.get_tekst(bwb_id)
|
||||
domeinen = classify(regeling.get("titel", ""), tekst)
|
||||
return {"bwb_id": bwb_id, "domeinen": domeinen}
|
||||
|
||||
|
||||
# --- Cross-referenties ---
|
||||
|
||||
@app.get("/api/v1/regelingen/{bwb_id}/referenties")
def get_referenties(
    bwb_id: str,
    richting: str = Query("both", description="outbound, inbound, of both"),
) -> dict:
    """Show cross-references from/to a regulation."""
    from wetgit.ai.crossref import extract_references

    regeling = store.get_regeling(bwb_id)
    if not regeling:
        raise HTTPException(status_code=404, detail=f"Regeling {bwb_id} niet gevonden")
    tekst = store.get_tekst(bwb_id)
    if tekst is None:
        # Keep the response shape identical to the non-empty case.
        return {"bwb_id": bwb_id, "referenties": [], "aantal": 0}
    # Note: `richting` is not applied here; only references extracted from
    # this regulation's own text (outbound) are returned.
    refs = extract_references(bwb_id, tekst)
    return {"bwb_id": bwb_id, "referenties": refs, "aantal": len(refs)}
|
||||
|
|
|
|||
|
|
@ -14,6 +14,7 @@ class RegelingMeta(BaseModel):
|
|||
type: str
|
||||
status: str
|
||||
datum_inwerkingtreding: str | None = None
|
||||
datum_verval: str | None = None
|
||||
bron: str
|
||||
pad: str
|
||||
artikelen: int
|
||||
|
|
|
|||
|
|
@ -73,6 +73,7 @@ def _parse_regeling(md_path: Path, repo_path: Path) -> dict | None:
|
|||
"type": meta.get("type", ""),
|
||||
"status": meta.get("status", ""),
|
||||
"datum_inwerkingtreding": meta.get("datum_inwerkingtreding"),
|
||||
"datum_verval": meta.get("datum_verval"),
|
||||
"bron": meta.get("bron", ""),
|
||||
"pad": rel_path,
|
||||
"artikelen": artikel_count,
|
||||
|
|
|
|||
109
src/wetgit/tasks.py
Normal file
|
|
@ -0,0 +1,109 @@
|
|||
"""Celery achtergrondtaken voor WetGIT.
|
||||
|
||||
Taken voor dagelijkse sync, alert-verwerking en indexering.
|
||||
|
||||
Usage:
|
||||
celery -A wetgit.tasks worker --loglevel=info
|
||||
celery -A wetgit.tasks beat --loglevel=info # voor scheduling
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import logging
|
||||
import os
|
||||
from pathlib import Path
|
||||
|
||||
from celery import Celery
|
||||
from celery.schedules import crontab
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
REDIS_URL = os.environ.get("CELERY_BROKER_URL", "redis://127.0.0.1:6379/0")
|
||||
RESULT_BACKEND = os.environ.get("CELERY_RESULT_BACKEND", "redis://127.0.0.1:6379/1")
|
||||
REPO_PATH = Path(os.environ.get("WETGIT_GIT_REPOS_DIR", "/data/wetgit/git-repos"))
|
||||
|
||||
app = Celery("wetgit", broker=REDIS_URL, backend=RESULT_BACKEND)
|
||||
|
||||
app.conf.update(
|
||||
task_serializer="json",
|
||||
accept_content=["json"],
|
||||
result_serializer="json",
|
||||
timezone="Europe/Amsterdam",
|
||||
task_track_started=True,
|
||||
beat_schedule={
|
||||
"daily-sync": {
|
||||
"task": "wetgit.tasks.daily_sync",
|
||||
"schedule": crontab(hour=3, minute=0),
|
||||
},
|
||||
"daily-index": {
|
||||
"task": "wetgit.tasks.reindex",
|
||||
"schedule": crontab(hour=3, minute=30),
|
||||
},
|
||||
},
|
||||
)
|
||||
|
||||
|
||||
@app.task(name="wetgit.tasks.daily_sync")
|
||||
def daily_sync() -> dict:
|
||||
"""Voer de dagelijkse sync uit (SRU delta-updates)."""
|
||||
from wetgit.pipeline.sync import run_sync
|
||||
|
||||
rijk_repo = REPO_PATH / "rijk"
|
||||
xml_cache = Path(os.environ.get("WETGIT_DATA_DIR", "/data/wetgit")) / "xml-cache"
|
||||
|
||||
result = run_sync(
|
||||
rijk_repo=rijk_repo,
|
||||
xml_cache=xml_cache,
|
||||
)
|
||||
logger.info("Sync voltooid: %s", result)
|
||||
return result
|
||||
|
||||
|
||||
@app.task(name="wetgit.tasks.reindex")
|
||||
def reindex() -> dict:
|
||||
"""Herindexeer Meilisearch en Qdrant."""
|
||||
meili_url = os.environ.get("MEILI_URL", "http://127.0.0.1:7700")
|
||||
qdrant_url = os.environ.get("QDRANT_URL", "http://127.0.0.1:6333")
|
||||
|
||||
rijk_repo = REPO_PATH / "rijk"
|
||||
results: dict = {}
|
||||
|
||||
# Meilisearch
|
||||
try:
|
||||
from wetgit.api.search import index_repo as meili_index
|
||||
results["meilisearch"] = meili_index(rijk_repo, meili_url)
|
||||
except Exception as e:
|
||||
logger.error("Meilisearch indexering mislukt: %s", e)
|
||||
results["meilisearch_error"] = str(e)
|
||||
|
||||
# Qdrant
|
||||
try:
|
||||
from wetgit.ai.semantic import index_repo as qdrant_index
|
||||
results["qdrant"] = qdrant_index(rijk_repo, qdrant_url)
|
||||
except Exception as e:
|
||||
logger.error("Qdrant indexering mislukt: %s", e)
|
||||
results["qdrant_error"] = str(e)
|
||||
|
||||
return results
|
||||
|
||||
|
||||
@app.task(name="wetgit.tasks.send_alert")
|
||||
def send_alert(
|
||||
bwb_id: str,
|
||||
titel: str,
|
||||
diff_text: str,
|
||||
recipients: list[str] | None = None,
|
||||
webhooks: list[str] | None = None,
|
||||
domain_filter: list[str] | None = None,
|
||||
) -> bool:
|
||||
"""Verstuur een change-alert als achtergrondtaak."""
|
||||
from wetgit.ai.alerts import send_change_alert
|
||||
|
||||
return send_change_alert(
|
||||
bwb_id=bwb_id,
|
||||
titel=titel,
|
||||
diff_text=diff_text,
|
||||
recipients=recipients,
|
||||
webhooks=webhooks,
|
||||
domain_filter=domain_filter,
|
||||
)
|
||||