Some metrics can scream cheat
Rage behavior, impossible-looking snaps, or repeated obvious abuse can light up a demo quickly. Those are real signals, but they are not the whole problem.
NullCS is a behavioral review project for Counter-Strike 2 demos. It ranks suspicious players from structured demo signals and returns evidence that can be reviewed, especially in the harder cases where subtle cheating and strong legitimate play start to look closer than they should.

Some demo metrics do scream that something is wrong. The harder problem is the quieter behavior that only starts to separate once timing, context, and process are modeled together.
NullCS analyzes CS2 demos, builds behavior signals, and returns ranked review output that can still be explained under scrutiny.
The project studies whether suspicious behavior can be surfaced more reliably in real demos without flattening the problem into a single loud metric. Some cases are obvious. The harder ones are the lobbies where subtle assistance, strange timing, and strong legitimate play start to sit uncomfortably close together.
Cheater, normal, and pro slices in the current CS2 training stack.
One match expands into many encounter windows and control-path measurements.
The difficult cases are the ones where the evidence has to stay readable even when suspicious behavior and strong legitimate play start to overlap.
Demos are normalized into comparable event and encounter data before any ranking is produced.
The stack can follow usercmd-derived mouse behavior and how those inputs translate into view-angle movement, crosshair acceleration, aim collapse, and post-acquire settling.
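To make those channels concrete, here is a minimal Python sketch of deriving angular speed, acceleration, jerk, and a post-acquire settling measure from tick-aligned view-angle samples. The tick rate, window length, and function names are assumptions for illustration, not the project's actual code.

```python
import numpy as np

TICK_RATE = 64          # assumed tick rate for the example
DT = 1.0 / TICK_RATE

def control_path_channels(yaw: np.ndarray, pitch: np.ndarray) -> dict:
    """Derive basic control-path channels from tick-aligned view angles (degrees)."""
    # Unwrap yaw so 359 -> 1 degree crossings do not look like huge swings.
    yaw = np.degrees(np.unwrap(np.radians(yaw)))

    # Angular velocity, acceleration, and jerk of the combined view-angle path.
    dyaw = np.diff(yaw) / DT
    dpitch = np.diff(pitch) / DT
    speed = np.hypot(dyaw, dpitch)          # deg/s
    accel = np.diff(speed) / DT             # deg/s^2
    jerk = np.diff(accel) / DT              # deg/s^3

    return {"speed": speed, "accel": accel, "jerk": jerk}

def post_acquire_settling(speed: np.ndarray, acquire_tick: int,
                          window_ticks: int = 16) -> float:
    """Mean angular speed in a short window after target acquisition.

    Lower values mean the crosshair goes quiet quickly after the snap.
    """
    window = speed[acquire_tick: acquire_tick + window_ticks]
    return float(window.mean()) if len(window) else float("nan")
```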
The goal is a useful review surface: ranked players, evidence, and context that stays interpretable under scrutiny.
Obvious abuse is not the full problem. The harder review task is telling strong legitimate play apart from lower-visibility cheating without pretending one score can settle it.
High-ELO and pro players produce uncomfortable rounds too. A useful system has to stay quieter there than a noisy model would, or the output stops being actionable.
Aim assist, recoil assist, and information abuse often try to stay close enough to normal play to avoid obvious signatures. That is the gap NullCS is starting to break into.
The public version is meant to show the real system shape: demo input, behavior signals, ranked output, and evidence for review.
Counter-Strike 2 demos are parsed into event, engagement, and encounter structure instead of being judged from clip-level moments or surface stats.
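As a rough picture of that structure, the sketch below shows how parsed events might expand into padded encounter windows. The field names and the 64-tick padding are illustrative assumptions, not the real schema.

```python
from dataclasses import dataclass, field

@dataclass
class DemoEvent:
    """A single parsed game event (illustrative fields, not the real schema)."""
    tick: int
    kind: str                 # e.g. "player_hurt", "player_death", "weapon_fire"
    attacker: str
    victim: str | None = None

@dataclass
class Encounter:
    """A window of ticks around one engagement between two players."""
    attacker: str
    victim: str
    start_tick: int
    end_tick: int
    events: list[DemoEvent] = field(default_factory=list)

def build_encounters(events: list[DemoEvent], pad_ticks: int = 64) -> list[Encounter]:
    """Expand each death event into a padded encounter window."""
    encounters = []
    for ev in events:
        if ev.kind != "player_death" or ev.victim is None:
            continue
        window = Encounter(
            attacker=ev.attacker,
            victim=ev.victim,
            start_tick=ev.tick - pad_ticks,
            end_tick=ev.tick + pad_ticks,
        )
        window.events = [e for e in events
                         if window.start_tick <= e.tick <= window.end_tick]
        encounters.append(window)
    return encounters
```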
The current stack builds hundreds of player-level signals plus deeper encounter timing channels from usercmd-style mouse deltas, view-angle response, aim collapse, angular jerk, and recoil-settling behavior.
Models rank players inside a demo so standouts can be surfaced in context instead of pretending one metric can settle the case alone.
Scores are paired with reasons and benchmark context so the output can support review, especially when strong legitimate play and subtle cheats start to overlap.
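A hedged sketch of what ranked, review-facing output could look like: scores from a placeholder linear model, the features driving each score, and percentile context against a benchmark slice. None of the names here are the project's actual API.

```python
import pandas as pd

def rank_with_reasons(player_features: pd.DataFrame,
                      weights: pd.Series,
                      benchmark: pd.DataFrame,
                      top_reasons: int = 3) -> pd.DataFrame:
    """Rank players inside one demo and attach the features driving each score.

    player_features: one row per player, engineered feature columns.
    weights:         per-feature weights from some trained model (placeholder).
    benchmark:       reference distribution used for percentile context.
    """
    contributions = player_features * weights          # per-feature contribution
    scores = contributions.sum(axis=1)

    rows = []
    for player, contrib in contributions.iterrows():
        top = contrib.sort_values(ascending=False).head(top_reasons)
        reasons = [
            f"{feat}: {player_features.loc[player, feat]:.2f} "
            f"(p{(benchmark[feat] < player_features.loc[player, feat]).mean() * 100:.0f} "
            f"vs benchmark)"
            for feat in top.index
        ]
        rows.append({"player": player, "score": scores[player], "reasons": reasons})

    return pd.DataFrame(rows).sort_values("score", ascending=False)
```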
These plots are there to show how suspicious slices behave against legit and pro baselines, not to decorate the page.
This benchmark example is built from encounter-level mouse and crosshair-process aggregates. Normal players tend to be coarser and noisier, pros tend to be more efficient, and suspicious slices can start looking efficient in a different way: less corrective burst, less manual oversteer, and cleaner settling than expected for the difficulty of the encounter. That is the kind of control-path evidence NullCS is trying to surface.
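To make "corrective burst" and "oversteer" concrete, the sketch below counts direction reversals in yaw velocity and measures how far a flick swings past the target. Thresholds and definitions are illustrative assumptions, not the production feature definitions.

```python
import numpy as np

def direction_reversals(angular_velocity: np.ndarray, min_speed: float = 5.0) -> int:
    """Count sign flips in yaw velocity above a noise floor.

    Frequent flips suggest manual corrective bursts; very few flips on a hard
    encounter suggest an unusually clean control path.
    """
    active = angular_velocity[np.abs(angular_velocity) > min_speed]
    if len(active) < 2:
        return 0
    return int(np.sum(np.sign(active[:-1]) != np.sign(active[1:])))

def overshoot_ratio(angle_to_target: np.ndarray) -> float:
    """How far the crosshair swings past the target relative to the initial error.

    Human flicks tend to overshoot and correct; a ratio near zero on difficult
    encounters is the kind of 'too clean' settling worth surfacing.
    """
    initial_error = abs(angle_to_target[0])
    # A sign change means the crosshair crossed the target angle.
    crossed = np.where(np.sign(angle_to_target) != np.sign(angle_to_target[0]))[0]
    if len(crossed) == 0 or initial_error == 0:
        return 0.0
    worst_overshoot = np.max(np.abs(angle_to_target[crossed]))
    return float(worst_overshoot / initial_error)
```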


The strongest player signal in suspicious demos shifts upward, while held-out legit and pro slices stay compressed near zero. That is the core credibility test: separation without broad false-positive drift.
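Read as code, that test is simple: fix a review threshold and compare how often suspicious players cross it against how often held-out legit and pro players do. A minimal sketch, assuming per-player scores and the cheater/normal/pro labels are already available.

```python
import numpy as np

def separation_report(scores: np.ndarray, labels: np.ndarray,
                      review_threshold: float) -> dict:
    """Recall on suspicious players vs. false-positive rate on legit/pro holdouts.

    labels: "cheater", "normal", or "pro" per player (assumed labeling scheme).
    """
    suspicious = scores[labels == "cheater"]
    legit = scores[np.isin(labels, ["normal", "pro"])]
    return {
        "suspicious_flagged": float((suspicious >= review_threshold).mean()),
        "legit_flagged": float((legit >= review_threshold).mean()),
    }
```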

Suspicious demos move up and to the right, while legit and pro demos stay near the origin. That matters because the signal is not just one loud outlier; the top of the lobby is coherently louder.

These panels come from mouse-delta and crosshair-process aggregates built out of encounter windows. They show why control-path telemetry matters: suspicious slices are not just louder in score space, they behave differently in input and aim process too.

This is not a one-metric story. Usercmd-derived mouse behavior, quiet-after-acquire behavior, and angular-jerk features all show measurable lift by themselves before they are folded into the full ranking model.
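One way to measure that standalone lift is a per-feature ROC AUC against the cheater label, before any model combines channels. A minimal sketch, assuming scikit-learn and a labeled player-level feature table; the labeling and column layout are assumptions for the example.

```python
import pandas as pd
from sklearn.metrics import roc_auc_score

def single_feature_lift(features: pd.DataFrame, is_cheater: pd.Series) -> pd.Series:
    """ROC AUC of each feature on its own against the cheater label.

    0.5 means no lift; values well above 0.5 mean the channel separates
    suspicious slices by itself, before any model combines it with others.
    """
    lifts = {}
    for col in features.columns:
        auc = roc_auc_score(is_cheater, features[col])
        lifts[col] = max(auc, 1.0 - auc)   # direction-agnostic lift
    return pd.Series(lifts).sort_values(ascending=False)
```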

The data scale matters because a single CS2 match is not one row. It becomes many encounter windows, control-path sequences, player aggregates, and benchmark slices. That is what allows the model to study hard cases instead of just memorizing clips.
Raw demos are turned into structured events, behavior signals, and ranked review output.
Raw demos are turned into event, engagement, and encounter data. A single match expands into hundreds of encounter windows and thousands of tick-aligned measurements before ranking begins.
The current stack uses 449 player-level engineered features, plus encounter timing and control-path channels built from mouse delta, aim process, visibility transitions, and crosshair movement.
Models rank standouts inside a match and export evidence meant to support careful review, especially when the case is not obvious.
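As a rough picture of the aggregation step between those two stages, the sketch below collapses encounter-level control-path rows into the per-player feature table a ranking step would consume. Column names and aggregation choices are illustrative, not the actual 449-feature set.

```python
import pandas as pd

def player_feature_table(encounter_rows: pd.DataFrame) -> pd.DataFrame:
    """Aggregate per-encounter control-path channels into per-player features.

    encounter_rows is assumed to have one row per encounter with columns like
    'player', 'jerk_peak', 'settle_speed', 'overshoot_ratio', 'reaction_ms'.
    """
    channels = ["jerk_peak", "settle_speed", "overshoot_ratio", "reaction_ms"]
    aggregated = encounter_rows.groupby("player")[channels].agg(
        ["mean", "std", "median", "min"]
    )
    # Flatten the MultiIndex columns: ('jerk_peak', 'mean') -> 'jerk_peak_mean'.
    aggregated.columns = [f"{c}_{stat}" for c, stat in aggregated.columns]
    return aggregated
```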
The current public release is the local Windows desktop app. Load a Counter-Strike .dem file, run the pipeline on your PC, and review ranked players with supporting evidence.
Local demo intake, ranked players, review-facing reasons, score context, and supporting evidence panels.
Public beta. The desktop client is live, while the model and review workflow continue to be trained and polished.
NullCS is public as a real technical project now. The repository and benchmark pages are the best entry points into the current state of the work.
Some demo metrics can scream that something is wrong, but that does not automatically settle the case on its own.
The real pressure test is whether suspicious slices rise without inflating strong legitimate and pro-level play at the same time.
Research is still ongoing. The current result is serious progress, not a claim that the problem is solved.