The science, data and model

Pawlogue is built on what the science of cat vocalization actually supports, and on nothing we cannot stand behind. This page is the full, honest technical record: the evidence, the model, every dataset and its license, how the model was trained, the real numbers, and how we handle your data.

Last updated 2026-06-01. Cats first, dogs next.

1. What the science says

There is no universal cat language. Adult-to-human meowing is a behavior cats develop individually with their own owner, so a given meow means different things from one cat to the next. We did not take this on faith. We tested it on real labeled data (CatMeows, 440 meows from 21 cats), with a strict whole-cat holdout so no cat appears in both training and test.
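As a rough illustration of what a whole-cat holdout means in practice, here is a minimal Python sketch. It is not our training code: the function name `whole_cat_holdout`, the 25% test fraction, and the toy clip IDs are assumptions made for the example. The one property that matters is that clips are grouped by cat before the split, so a cat is either entirely in training or entirely in test.

```python
import random
from collections import defaultdict


def whole_cat_holdout(clips, test_fraction=0.25, seed=0):
    """Split (cat_id, clip) pairs so that each cat lands wholly in train or test.

    test_fraction and seed are illustrative defaults, not values from the source.
    """
    by_cat = defaultdict(list)
    for cat_id, clip in clips:
        by_cat[cat_id].append(clip)

    # Shuffle cats, not clips: the cat is the unit that gets held out.
    cat_ids = sorted(by_cat)
    random.Random(seed).shuffle(cat_ids)

    n_test = max(1, round(len(cat_ids) * test_fraction))
    held_out, kept = cat_ids[:n_test], cat_ids[n_test:]

    train = [(c, x) for c in kept for x in by_cat[c]]
    test = [(c, x) for c in held_out for x in by_cat[c]]
    return train, test


# Toy data: 21 hypothetical cats with 20 placeholder clips each.
clips = [(f"cat{i:02d}", f"clip_{i:02d}_{j}") for i in range(21) for j in range(20)]
train, test = whole_cat_holdout(clips)
train_cats = {c for c, _ in train}
test_cats = {c for c, _ in test}
```

A plain random split over clips would let the same cat appear on both sides, which would inflate the cross-cat numbers; grouping by cat is what makes the "stranger cat" result meaningful.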

80%

Accuracy reading YOUR cat's own meows, once Pawlogue learns them.

A 21-point jump over the cold start, with improvement for 12 of the 12 cats we tested. This per-cat personalization is the heart of the product, and it is proven on real data.

~70%

Reading mood (calm vs distress) on ANY cat, from day one. A real, cross-cat signal.

48.9%

Guessing a stranger cat's exact meaning is a coin flip (baseline 50.2%). That is why we learn YOUR cat instead of faking a universal translator.

The conclusion is the product: reading mood works on any cat from day one, and learning YOUR specific cat takes it to 80%. The base model is the credible cold start; the per-cat personalization is the real magic.

2. The model

Four small heads run entirely on your device, with no server and no audio leaving the phone by default.

Head	What it does	Test accuracy	Size	Type
Cat detector	Is this a cat sound at all, or noise to reject	~89 to 90%	365 KB (99 KB int8)	log-mel CNN
Dictionary	8-class sound and emotion read	71.9% (macro-F1 0.72)	3.7 KB	MFCC + logistic regression
Affect	Calm vs distress arousal	70.5%	1.6 KB	MFCC + logistic regression
Dog detector	Bark vs not-bark (dogs, v2)	80.4%	365 KB	log-mel CNN

All four heads are trained from scratch (no large pretrained backbone) to stay tiny and fully offline. The whole bundle is about 1.3 MB. Inference runs in the browser via ONNX Runtime Web and in native apps via ONNX Runtime Mobile.

Honesty check (parity): the audio feature math in the app was verified to match the Python training pipeline to within 3.8e-4 (log-mel) and 7e-3 (MFCC). The verdict you see in the app is the real model output, not an approximation.
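A parity check of this kind can be sketched in a few lines of Python. This is an illustration under stated assumptions, not our verification script: we assume "match to within" means the maximum absolute difference between the two feature vectors, and the helper names `max_abs_diff` and `parity_ok` and the toy values are hypothetical. Only the two tolerances come from the numbers above.

```python
# Tolerances taken from the parity figures quoted above.
LOG_MEL_TOL = 3.8e-4
MFCC_TOL = 7e-3


def max_abs_diff(reference, candidate):
    """Largest element-wise absolute difference between two equal-length sequences."""
    if len(reference) != len(candidate):
        raise ValueError("feature vectors must have the same length")
    return max(abs(r - c) for r, c in zip(reference, candidate))


def parity_ok(reference, candidate, tol):
    """True when the app-side features stay within tol of the training-side features."""
    return max_abs_diff(reference, candidate) <= tol


# Toy values standing in for one frame of features from each pipeline.
python_features = [0.1000, 0.2000, 0.3000]
app_features = [0.1001, 0.2002, 0.3000]
log_mel_matches = parity_ok(python_features, app_features, LOG_MEL_TOL)
```

Checking the worst-case element rather than an average is the stricter choice: a single badly mismatched band would fail the check even if the rest agree.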

The 8-class dictionary, per-class

Class	Meaning	F1	Clips
Content / relaxed	Low-arousal positive or relaxed (Happy + Resting merged)	0.67	25
Angry	High-arousal angry vocalization	0.83	15
Defensive	Backing off a threat, hiss-like guarding	0.76	15
Fighting	Active fight vocalization	0.65	15
Warning	Keep-back warning	0.63	10
Mating call	Estrus caterwaul	0.80	13
Mother call	Queen calling kittens (chirp/trill)	0.78	11
Hunting / prey chatter	Chatter aimed at prey	0.64	10

These numbers come from cross-validated holdout folds: 71.9% overall accuracy against a 21.9% guess baseline. The Paining class was dropped: it has too few clips to learn honestly, and pain is a clinical call we will not assert. Happy and Resting were merged into Content because the two classes blended together.
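The summary figures can be re-derived from the per-class table with a short Python check. One assumption is labeled here: we read the 21.9% "guess baseline" as always predicting the largest class, which is consistent with 25 of the 114 retained clips. The variable names are illustrative.

```python
# (F1, clips) per class, copied from the per-class table above.
per_class = {
    "Content / relaxed": (0.67, 25),
    "Angry": (0.83, 15),
    "Defensive": (0.76, 15),
    "Fighting": (0.65, 15),
    "Warning": (0.63, 10),
    "Mating call": (0.80, 13),
    "Mother call": (0.78, 11),
    "Hunting / prey chatter": (0.64, 10),
}

total_clips = sum(n for _, n in per_class.values())

# Assumed reading of "guess baseline": always predict the most common class.
majority_baseline = max(n for _, n in per_class.values()) / total_clips

# Macro-F1 is the unweighted mean of the per-class F1 scores.
macro_f1 = sum(f1 for f1, _ in per_class.values()) / len(per_class)

print(total_clips)                         # 114
print(round(majority_baseline * 100, 1))   # 21.9
print(round(macro_f1, 2))                  # 0.72
```

The 114 clips are the 124 open clips minus the dropped Paining class, and the macro-F1 of 0.72 matches the figure in the model table.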

3. Data and licenses

We cataloged 94 cat and dog sound datasets and pulled about 90 GB to disk. The datasets that actually feed the shipped models are below, with their licenses. We are explicit about this because a translator that hides its sources is a toy.

94

datasets cataloged

~90 GB

audio pulled to disk

~300k

clips processed

4

on-device models

Dataset	Used for	Clips	License
CatMeows (Zenodo 4008297)	Affect, the cross-cat science test	440 (21 cats)	CC BY 4.0
Cat Sound Classification V2 (open sample)	The 8-class dictionary	124 (10 classes)	CC BY 4.0
meow_dataset + liladhii cat meows	Detector cat-positives	~1,000+	mixed / unspecified
Cats vs Dogs Audio (Kaggle stealthtech)	Detector volume	1,050	CC BY 4.0
Audio Cats and Dogs (Kaggle mmoreaux)	Detector volume	277	CC BY-SA 3.0
ESC-50	Non-cat negatives (door, vacuum, etc.)	2,000	CC BY-NC 3.0
Barkopedia suite (ArlingtonCL2)	Dog detector and dog affect	~297,000	MIT

Full catalog spans CatMeows, Cat Sound V2, AudioSet label subsets (Meow, Purr, Hiss, Caterwaul, Bark), ESC-50, FSD50K, UrbanSound8K, the Barkopedia family, Freesound queries, and more. Most are CC BY 4.0 or MIT.

Commercial-license honesty: a few datasets used during development carry non-commercial licenses (ESC-50 and UrbanSound8K are both CC BY-NC). Before the paid launch we will retrain the negative classes on commercially clean sources only (CC BY, CC0, or MIT; for example Freesound CC0 clips and FSD50K under CC BY 4.0), so the shipped commercial model carries no NC-licensed training data. We are flagging this rather than hiding it.
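One way to make the commercial-clean rule mechanical is a small license triage over the dataset table. The Python below is a sketch under assumptions: the allow-list mirrors the licenses named above (CC BY, CC0, MIT), the `triage` function and its three buckets are hypothetical, and sending share-alike or unspecified licenses to a manual "review" bucket is our illustrative choice, not a stated policy.

```python
# Allow-list based on the licenses named above; exact version strings are assumed.
COMMERCIAL_OK = {"CC BY 4.0", "CC0", "MIT"}

# Dataset -> license, copied from the table in this section.
datasets = {
    "CatMeows": "CC BY 4.0",
    "Cat Sound Classification V2 (open sample)": "CC BY 4.0",
    "meow_dataset + liladhii cat meows": "mixed / unspecified",
    "Cats vs Dogs Audio": "CC BY 4.0",
    "Audio Cats and Dogs": "CC BY-SA 3.0",
    "ESC-50": "CC BY-NC 3.0",
    "Barkopedia suite": "MIT",
}


def triage(licenses):
    """Sort datasets into clean, excluded (non-commercial), and needs-review buckets."""
    clean, excluded, review = [], [], []
    for name, lic in licenses.items():
        if lic in COMMERCIAL_OK:
            clean.append(name)
        elif "-NC" in lic:
            excluded.append(name)   # non-commercial: never in the paid model
        else:
            review.append(name)     # share-alike or unclear: a human decides
    return clean, excluded, review


clean, excluded, review = triage(datasets)
```

Using an allow-list rather than a block-list means a dataset with an unfamiliar or missing license is never treated as clean by default.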

4. How it was trained

Split: 80% train, 10% validation, 10% test. The 10% test is held out of training and model selection, and we keep those exact clips for inspection.
Balanced batches: a class-balanced sampler so no class is starved during training.
No leakage: where recording or cat IDs exist, splits are grouped so the same recording never spans train and test. The dictionary, with only 124 clips, uses 5-fold cross-validation instead of a single split.
Front end: 16 kHz mono audio. The detectors use a 64-band log-mel spectrogram; the affect and dictionary heads use 85 MFCC features. The DSP runs client-side and was parity-checked against the training pipeline.
Evidence trail: every experiment (EXP-01 through EXP-06) is logged with its method, holdout numbers, and honest caveats.
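The class-balanced sampling step in the list above can be illustrated with inverse-frequency weights. This is a minimal Python sketch, not our sampler: the helper names, the choice to sample with replacement, and the toy label counts (borrowed from two dictionary classes) are assumptions.

```python
import random
from collections import Counter


def balanced_weights(labels):
    """Per-clip weights such that every class carries the same total probability."""
    counts = Counter(labels)
    n_classes = len(counts)
    return [1.0 / (n_classes * counts[y]) for y in labels]


def sample_batch(items, labels, batch_size, rng):
    """Draw a batch with replacement, favoring clips from rarer classes."""
    pairs = list(zip(items, labels))
    return rng.choices(pairs, weights=balanced_weights(labels), k=batch_size)


# Toy example: 25 clips of one class and 10 of another.
labels = ["content"] * 25 + ["warning"] * 10
items = [f"clip_{i}" for i in range(len(labels))]
weights = balanced_weights(labels)
batch = sample_batch(items, labels, batch_size=16, rng=random.Random(0))

# Total weight per class: each class should hold half of the probability mass.
mass = Counter()
for y, w in zip(labels, weights):
    mass[y] += w
```

With these weights a clip from the 10-clip class is drawn 2.5 times as often as a clip from the 25-clip class, so in expectation each batch sees the two classes equally.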

5. Your data and how the model improves

By default everything stays on your device. The model gets better for everyone only with data from owners who explicitly opt in. We ask once, clearly, and you can change your answer anytime.

Default: on-device only. Nothing about your cat leaves your phone.
Opt-in tier 1: share anonymized learning data (compact sound fingerprints plus the labels you confirm). These cannot be played back as audio. This is enough to sharpen the model.
Opt-in tier 2: also share short audio clips. This helps the most and is what lets us improve the core model, and it is entirely your choice.

Anonymized, never sold, deletable on request, with a clear consent record. See the privacy policy. This opt-in loop is the only way the universal base model improves over time, on top of the per-cat learning that already happens privately on your device.
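The three sharing levels can be pictured as a gate in front of any upload. The Python below is a sketch with hypothetical names throughout (`SharingTier`, `Observation`, `build_upload`, and the payload keys); it shows only the rule described above: nothing leaves by default, tier 1 sends fingerprints and confirmed labels, and tier 2 adds short audio clips.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class SharingTier(Enum):
    ON_DEVICE_ONLY = 0          # default: nothing leaves the phone
    FINGERPRINTS = 1            # opt-in tier 1
    FINGERPRINTS_AND_AUDIO = 2  # opt-in tier 2


@dataclass(frozen=True)
class Observation:
    fingerprint: tuple      # compact sound features, not playable as audio
    confirmed_label: str    # label the owner confirmed
    audio: bytes            # short raw clip, kept on device unless tier 2


def build_upload(obs: Observation, tier: SharingTier) -> Optional[dict]:
    """Return the payload allowed to leave the device, or None if nothing may."""
    if tier is SharingTier.ON_DEVICE_ONLY:
        return None
    payload = {"fingerprint": list(obs.fingerprint), "label": obs.confirmed_label}
    if tier is SharingTier.FINGERPRINTS_AND_AUDIO:
        payload["audio"] = obs.audio
    return payload


obs = Observation(fingerprint=(0.12, -0.40, 0.88), confirmed_label="hungry", audio=b"\x00\x01")
default_payload = build_upload(obs, SharingTier.ON_DEVICE_ONLY)
tier1_payload = build_upload(obs, SharingTier.FINGERPRINTS)
tier2_payload = build_upload(obs, SharingTier.FINGERPRINTS_AND_AUDIO)
```

Building the payload by adding fields per tier, rather than stripping fields from a full record, keeps the default path from ever touching the audio bytes.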

6. Honest limits and what is next

The 8-class dictionary is trained on only 124 open clips. The full Cat Sound V2 database (about 3,000 clips) is gated by its authors. With it, every class would likely pass 0.8 F1 and we could re-admit Paining and split Happy from Resting.
The cat detector is meow-trained, so loud non-meow cat sounds (a defensive yowl, a fight) can score low and get held back as "not sure" rather than mislabeled.
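Dog support is a v2 feature, and its numbers are directional for now: some dog labels lack per-dog grouping, so we cannot yet guarantee that the same dog never appears in both train and test.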
Confidence is always shown as a band (clean, most-likely, possibly, hard-to-tell), never a fake percentage, because a precise percentage on this data would not be honest.
Next: a commercially-clean retrain, the full Cat Sound V2 corpus if the authors grant it, and the opt-in flywheel feeding a stronger base model.
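The confidence bands mentioned in the list above could be produced by a simple threshold mapping. In this Python sketch the band names come from the text, but the cut-off values are hypothetical placeholders (this page does not state them), and `confidence_band` is an illustrative name.

```python
# Hypothetical cut-offs: the page names the bands but not their thresholds.
BANDS = [
    (0.85, "clean"),
    (0.65, "most-likely"),
    (0.45, "possibly"),
]


def confidence_band(probability: float) -> str:
    """Map a model probability to a coarse band instead of showing a percentage."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    for floor, name in BANDS:
        if probability >= floor:
            return name
    return "hard-to-tell"


shown = [confidence_band(p) for p in (0.92, 0.70, 0.50, 0.10)]
```

Showing a band rather than the raw number keeps the interface from implying more precision than a model trained on this little data can support.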
