rgonsal/comunidadhll

Fork 0

Files

devRaGonSa 0da8338ba8 Fix

2026-06-05 16:57:25 +02:00

7.7 KiB

Raw Blame History

Monthly Player Ranking Data Audit

Validation Date

2026-03-24

Scope

Auditoria tecnica del estado real de datos para un futuro ranking mensual de "mejores jugadores" usando:

codigo y esquema historico del backend
persistencia local en backend/data/hll_vietnam_dev.sqlite3
snapshots historicos ya generados en backend/data/snapshots/
discovery ya documentada de la fuente CRCON/scoreboard

No se implementa todavia ninguna formula de ranking, tabla nueva ni cambio de UI.

Evidence Reviewed

backend/app/historical_models.py
backend/app/historical_storage.py
backend/app/historical_ingestion.py
backend/app/historical_snapshots.py
backend/app/historical_snapshot_storage.py
backend/app/payloads.py
docs/historical-domain-model.md
docs/historical-data-quality-notes.md
docs/historical-crcon-source-discovery.md
docs/historical-coverage-report.md

Current Persisted State

Local SQLite currently contains:

historical_servers: 3
historical_matches: 9638
historical_players: 163506
historical_player_match_stats: 1062244
historical_ingestion_runs: 32

Coverage visible in the local database today:

comunidad-hispana-01: 8602 matches, from 2024-05-17T20:48:40Z to 2026-03-23T16:01:20Z
comunidad-hispana-02: 753 matches, from 2025-11-04T17:10:19Z to 2026-03-23T18:58:06Z
comunidad-hispana-03: 283 matches, from 2026-01-14T22:34:18Z to 2026-03-08T18:11:52Z

Important quality notes from the local dataset:

all historical_player_match_stats rows have populated values for kills, deaths, teamkills, time, KPM, KDA, combat, offense, defense, support, level and team side
85,270 / 163,506 players have SteamID; the rest currently depend on crcon-player:* identity, so identity continuity is usable but not equally strong for every player
all persisted matches have start/end timestamps, map and game mode
7,961 / 9,638 persisted matches currently have both allied/axis score

What Is Persisted Today

Match level

Persisted per match:

server
external match id
creation/start/end timestamps
map name, pretty name, game mode, image
allied score
axis score

Not persisted at match level:

raw full CRCON JSON payload
derived win/loss per player
any tactical event ledger

Player identity level

Persisted per player:

stable player key
display name
SteamID when available
source player id
first seen / last seen

Player per match level

Persisted per player-match row:

level
team side
kills
deaths
teamkills
time seconds
kills per minute
deaths per minute
kill/death ratio
combat
offense
defense
support

What Exists In CRCON Source But Is Not Persisted

The documented CRCON detail payload already exposes fields that the project does not currently store:

kills_by_type
kills_streak
longest_life_secs
shortest_life_secs
most_killed
death_by
weapons
death_by_weapons

These fields are visible in the source discovery, but the current upsert logic only persists the smaller normalized subset listed above.

What Was Not Confirmed As Available

The current repository evidence does not confirm any stable source fields for:

garrisons destroyed
outposts destroyed
direct duel history in a structured reusable form
tactical actions such as node building, dismantling or commander abilities

For direct encounters, the source does expose most_killed and death_by, but that is not the same thing as a complete duel graph and is not stored today.

Availability And Reliability Matrix

Metric / signal	Exists in source	Persisted today	Reliability for ranking	Extra work	V1?
Kills	Yes	Yes	High	None	Yes
Deaths	Yes	Yes	High	None	Yes
Support	Yes	Yes	High	None	Yes
Combat	Yes	Yes	Medium-High	Query only	Maybe
Offense	Yes	Yes	Medium-High	Query only	Maybe
Defense	Yes	Yes	Medium-High	Query only	Maybe
Teamkills	Yes	Yes	High as penalty signal	Query only	Maybe
Match count	Yes	Derivable	High	Query only	Yes
Time played	Yes	Yes	High	Query only	Yes
KPM	Yes	Yes	Medium-High if computed from totals, lower if averaging raw per-match KPM	Query only	Yes
KDA / KD ratio	Yes	Yes	Medium-High if computed from totals, lower if averaging raw per-match KDA	Query only	Yes
100+ kill matches	Derivable	Exposed in leaderboard	Medium	None	No
Win/loss context	Partially	Derivable from team side + scores when scores exist	Medium	Query and validation	Maybe
Weapons profile	Yes	No	Medium-Low for V1	New persistence/modeling	No
Kill streak / life metrics	Yes	No	Medium-Low for V1	New persistence/modeling	No
Direct encounters / duels	Partial only	No	Low today	New extraction plus modeling	No
Garrisons destroyed	Not confirmed	No	Unknown	Source validation first	No
OPs destroyed	Not confirmed	No	Unknown	Source validation first	No
Tactical impact composite	Partial proxies only	Partial	Medium after design work	Query/design	No for strict V1

Current Product Readiness

The backend is already able to expose monthly leaderboard snapshots, but only for these metrics:

kills
deaths
support
matches_over_100_kills

This means:

the project already supports a monthly ranking surface operationally
the current ranking surface is narrower than the real data persisted in SQLite
offense, defense, combat, KPM and KDA are available in the database but not yet wired as first-class monthly leaderboard metrics

Recommendation For Ranking V1

A realistic V1 should use only metrics already persisted with strong coverage and low modeling risk:

total kills
total support
KPM recomputed from SUM(kills) / SUM(time_seconds)
KDA recomputed from SUM(kills) / NULLIF(SUM(deaths), 0)
minimum participation gate based on matches played and/or minutes played
optional small penalty for teamkills

Why this is the safest V1:

no new ingestion is required
all needed raw fields already exist locally
the ranking can avoid inflated outliers by requiring minimum activity
KPM and KDA become more defensible when derived from totals, not from average of precomputed per-match ratios

Recommendation For Ranking V2

A stronger V2 can expand the model with already persisted but not yet surfaced signals:

offense
defense
combat
win/loss context derived from player side and match result when scores exist

V2 may also evaluate source-only fields if a later task decides to persist them:

weapons-based detail
kill streak and life-span signals
partial rivalry/encounter signals from most_killed and death_by

Metrics Not Recommended For Early Use

Not recommended for V1 and not yet defensible for a serious monthly ranking:

garrisons destroyed
OPs destroyed
duel ranking
generic "impact in match" as a single opaque score

Reason:

either the source availability is not confirmed
or the source exists but the project does not yet persist enough structure to make the metric auditable and stable

Final Conclusion

The repository already has enough persisted historical data for a credible monthly Top 3 V1 without touching ingestion:

kills
support
time played
deaths
teamkills
offense
defense
combat

The most realistic first release is a constrained monthly ranking based on volume plus efficiency, using only persisted fields and explicit participation thresholds. Tactical metrics such as garrisons, OPs and real duel graphs should stay out of scope until the source is revalidated and the missing structures are persisted deliberately.

7.7 KiB Raw Blame History