Building a notification firewall

Why I stopped fighting Android's per-app notification settings and put a scoring engine in the middle instead.


Three categories of apps were eating my attention with notifications I couldn’t turn off without also turning off the ones I needed: cameras, the cat tracker, and a small zoo of Telegram bots. Per-app notification settings are too coarse. What worked was treating notifications as events, routing them through a small server, scoring them on context, and only interrupting me when the score crossed a threshold. This is the architecture, the scoring model, and the thing I learned about my own attention along the way.

The problem you can’t fix in Android Settings

In practice, Android gives you two dependable choices for an app’s notifications: allow all, or allow none. There’s an “important” channel slider, which most apps don’t implement, and a per-category mute, which most apps don’t honour. The granularity of attention you actually want (“tell me when the camera sees a person but not when it sees a leaf”) is not on the menu.

For most apps this is fine. For three categories on my phone (the cameras, the cat tracker, and the Telegram bots) it is a constant low-grade tax.

I tried muting individual notifications. I tried per-app schedules. I tried the cameras’ built-in “smart” filters, which are good in cities and confused in vineyards. None of it worked because the problem isn’t which apps should notify me. It’s which events from those apps should notify me, and the answer depends on time of day, recent history, and whether anything else is going on.

The fix has to live outside Android.

The architecture, in one sentence

Android forwards every notification from the watched apps to a small server. The server scores the event, applies pattern rules, and decides whether to drop it, file it for a digest, send a quiet alert, or wake me up. I see roughly 80% fewer notifications and miss roughly zero of the ones I cared about.

┌─────────────────────────────┐
│  Phone                      │
│                             │
│  Watched apps               │
│      │                      │
│      ▼                      │
│  Tasker (AutoNotification)  │
│      │                      │
│      ▼                      │
│  JSON payload over          │
│  outbound HTTPS             │
└───────────┬─────────────────┘
            │
            ▼
┌─────────────────────────────┐
│  Server (homelab)           │
│                             │
│  Webhook receiver           │
│      │                      │
│      ▼                      │
│  Normalise + store          │
│      │                      │
│      ▼                      │
│  Score (rules first,        │
│  LLM later)                 │
│      │                      │
│      ▼                      │
│  ┌─────┬─────┬─────┬─────┐  │
│  │ 0   │ 20  │ 50  │ 80  │  │
│  │ -19 │ -49 │ -79 │-100 │  │
│  └──┬──┴──┬──┴──┬──┴──┬──┘  │
│   drop digest quiet urgent  │
└──────────────┬──────────────┘
               │
               ▼
         back to phone
      (ntfy or Telegram)

The on-phone side is Tasker + AutoNotification, configured to forward only the apps I care about. The server side is a small workflow (originally n8n; lightweight enough to be a Python script). The return path is ntfy, which gives me back a clean notification on the phone — bypassing the noisy native one.
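The return path can be sketched as a single HTTP POST. This is a minimal sketch, not the actual workflow: the server URL and topic are placeholders, and the mapping from firewall action to ntfy priority is an assumption. It relies on ntfy accepting the message as the request body, with `Title` and `Priority` headers.

```python
import urllib.request

# Hypothetical endpoint: replace with your own ntfy server and topic.
NTFY_URL = "https://ntfy.example.com/firewall-alerts"

# Assumed mapping from firewall action to ntfy priority names.
PRIORITY_BY_ACTION = {"quiet": "low", "urgent": "urgent"}


def build_push(action: str, title: str, text: str) -> urllib.request.Request:
    """Build (but do not send) the ntfy publish request for one alert."""
    return urllib.request.Request(
        NTFY_URL,
        data=text.encode("utf-8"),
        headers={"Title": title, "Priority": PRIORITY_BY_ACTION[action]},
        method="POST",
    )


def send_push(request: urllib.request.Request) -> int:
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

Keeping request building separate from sending makes the mapping testable without a network connection.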

The schema, kept boring on purpose

The payload from phone to server is deliberately minimal:

{
  "device_id": "android-main",
  "captured_at": "2026-05-18T19:33:00+02:00",
  "package": "com.example.camera",
  "app_name": "Garden camera",
  "title": "Motion detected",
  "text": "Garden camera detected motion",
  "category": "alarm",
  "android_priority": "default"
}

No credentials, no tokens, no raw app database contents. The phone forwards what the notification said and the server reasons about it. Anything that requires reaching back into the app — opening the camera feed, querying the tracker for more detail — happens out of band, not over this webhook.

The deliberate boringness matters. The temptation, the first time you build this, is to enrich the payload with everything the phone knows. Don’t. Treat the webhook as an attention-budget interface; the data is the headline, not the article.
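One way to keep the payload boring is to have the server reject anything that isn't exactly these eight fields. The sketch below assumes the field names from the example payload; the `NotificationEvent` class and `parse_event` function are illustrative names, not part of the original workflow.

```python
import json
from dataclasses import dataclass, fields
from datetime import datetime


@dataclass(frozen=True)
class NotificationEvent:
    device_id: str
    captured_at: datetime
    package: str
    app_name: str
    title: str
    text: str
    category: str
    android_priority: str


def parse_event(raw: str) -> NotificationEvent:
    """Parse one webhook body, rejecting missing or unexpected fields."""
    data = json.loads(raw)
    expected = {f.name for f in fields(NotificationEvent)}
    if set(data) != expected:
        # Symmetric difference lists both missing and extra keys.
        raise ValueError(f"unexpected payload shape: {sorted(set(data) ^ expected)}")
    # Keep the timezone offset so time-of-day rules use the phone's local time.
    data["captured_at"] = datetime.fromisoformat(data["captured_at"])
    return NotificationEvent(**data)
```

Rejecting extra keys is a deliberate choice here: if the phone side ever starts sending more than the headline, the server fails loudly instead of quietly storing it.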

The scoring model

The scoring is what does the work. Four bands:

0-19   noise. store, do not surface.
20-49  useful, not urgent. include in next digest.
50-79  important. send a quiet alert.
80-100 urgent. interrupt.
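
The bands translate directly into a lookup. A minimal sketch, assuming integer scores; the action names are labels chosen for illustration:

```python
def action_for(score: int) -> str:
    """Map a score to one of the four bands, clamping to 0-100 first."""
    clamped = max(0, min(100, score))
    if clamped >= 80:
        return "urgent"   # interrupt
    if clamped >= 50:
        return "quiet"    # quiet alert
    if clamped >= 20:
        return "digest"   # include in next digest
    return "store"        # store, do not surface
```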

The factors going into the score:

Factor                        Effect
Source app                    Cameras and the tracker base higher than Telegram bots
Keywords in text              “failed”, “down”, “urgent”, “expired” → +20-30
Time of day                   After 23:00 or before 07:00: thresholds drop for safety-shaped events, rise for routine ones
Frequency burst               Same source + similar text 5× in 10 minutes → collapse to one event, score drops
Repetition after suppression  Event keeps coming after a digest entry → score creeps up
Unusual silence               Tracker hasn’t reported for 4× the expected interval → high score
Known benign pattern          Recurring 06:00 backup-completed → score caps low

A worked example:

event: garden camera, motion detected
time:  03:18
base score: 30 (camera, after midnight)
keyword bonus: 0
burst penalty: 0 (single event)
quiet-hours boost: +30 (camera, deep night)
unusual-pattern boost: +20 (no scheduled activity)
─────────
final: 80 → urgent alert

Same event, different time:

event: garden camera, motion detected
time:  15:42
base score: 15 (camera, daytime)
burst penalty: -10 (5th similar event in 10 min)
─────────
final: 5 → silent store

Same world. Same camera. Two different actions. The difference is context the camera app doesn’t have and shouldn’t have to compute.
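
The two worked examples are plain addition, which can be sketched as below. The adjustment names follow the worked examples; the per-keyword values are assumptions picked from inside the +20-30 range, and `ScoreBreakdown` and `keyword_bonus_for` are illustrative names.

```python
import re
from dataclasses import dataclass

# Assumed per-keyword bonuses, chosen within the +20-30 range.
KEYWORD_BONUS = {"failed": 30, "down": 30, "urgent": 25, "expired": 20}


def keyword_bonus_for(text: str) -> int:
    """Return the largest bonus among the keywords present in the text."""
    words = re.findall(r"[a-z]+", text.lower())
    return max((KEYWORD_BONUS[w] for w in words if w in KEYWORD_BONUS), default=0)


@dataclass
class ScoreBreakdown:
    base: int
    keyword_bonus: int = 0
    burst_penalty: int = 0          # zero or negative
    quiet_hours_boost: int = 0
    unusual_pattern_boost: int = 0

    def final(self) -> int:
        """Sum all adjustments and clamp the result to 0-100."""
        total = (self.base + self.keyword_bonus + self.burst_penalty
                 + self.quiet_hours_boost + self.unusual_pattern_boost)
        return max(0, min(100, total))


night = ScoreBreakdown(base=30, quiet_hours_boost=30, unusual_pattern_boost=20)
day = ScoreBreakdown(base=15, burst_penalty=-10)
print(night.final(), day.final())  # 80 5
```

Keeping each adjustment as a named field means a surprising score can be traced back to the factor that caused it.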

The pattern rules that pull the most weight

Burst collapse. Most noisy notifications come in bursts. The first one gets evaluated normally. The second through tenth get collapsed — the score on each subsequent one is capped, and they’re rolled into a digest entry that just says “garden camera fired 9 more times between 15:42 and 15:51.” Bursts almost always mean a leaf in front of the lens. Bursts almost never mean someone is at the door.
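
Burst collapse can be sketched with a sliding window per source. This is an illustration under two assumptions: “similar text” is approximated by an exact match after lowercasing and trimming, and the window is the 10 minutes from the factor list. `BurstTracker` and `digest_line` are hypothetical names.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=10)


class BurstTracker:
    def __init__(self, window: timedelta = WINDOW) -> None:
        self.window = window
        self._recent = defaultdict(deque)  # (source, text) -> event timestamps

    def observe(self, source: str, text: str, at: datetime) -> int:
        """Record an event; return how many matching events fall in the
        trailing window, including this one (1 means no burst)."""
        key = (source, text.strip().lower())
        times = self._recent[key]
        while times and at - times[0] > self.window:
            times.popleft()  # forget events older than the window
        times.append(at)
        return len(times)


def digest_line(source: str, extra: int, first: datetime, last: datetime) -> str:
    """Format the single digest entry that replaces the collapsed events."""
    return f"{source} fired {extra} more times between {first:%H:%M} and {last:%H:%M}"
```

A caller would score the event normally when `observe` returns 1 and cap the score for anything higher.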

Quiet hours. Between 23:00 and 07:00 the thresholds shift. Routine bot notifications (backup completed, sync OK) get suppressed harder. Camera and tracker events get amplified harder. The asymmetry matches my actual life: at 3am I want to know about the cat, not about backups.
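
The quiet-hours window wraps midnight, which is the one detail that is easy to get wrong. The sketch below models the threshold shift as a score adjustment, which has the same effect. The +30 matches the worked example for a camera at night; the -20 for routine sources and the source labels are assumptions.

```python
from datetime import time

QUIET_START = time(23, 0)
QUIET_END = time(7, 0)

# Assumed labels for the sources that get amplified at night.
SAFETY_SOURCES = {"camera", "tracker"}


def in_quiet_hours(t: time) -> bool:
    """True from 23:00 up to (not including) 07:00; the window wraps midnight."""
    return t >= QUIET_START or t < QUIET_END


def quiet_hours_adjustment(source_kind: str, t: time) -> int:
    """Boost safety-shaped sources and suppress routine ones at night."""
    if not in_quiet_hours(t):
        return 0
    return 30 if source_kind in SAFETY_SOURCES else -20
```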

Silence detection. This is the rule that earned its place after a near-miss. The tracker normally reports every 10 minutes during outside hours. If it goes silent for 40+ minutes during outside hours, that’s itself an event — the device might be dead, the cat might be out of coverage, something is up. The system scores silence the same way it scores noise.
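
Silence has to be detected by a periodic check, since no webhook arrives when nothing happens. A minimal sketch, assuming the check runs on a timer and that the caller knows whether it is currently “outside hours”; the synthetic event fields are illustrative.

```python
from datetime import datetime, timedelta
from typing import Optional

EXPECTED_INTERVAL = timedelta(minutes=10)
SILENCE_FACTOR = 4  # 4x the expected interval = 40 minutes


def silence_event(last_report: datetime, now: datetime,
                  outside_hours: bool) -> Optional[dict]:
    """Return a synthetic event if the tracker has been silent too long."""
    if not outside_hours:
        return None
    gap = now - last_report
    if gap < EXPECTED_INTERVAL * SILENCE_FACTOR:
        return None
    minutes = int(gap.total_seconds() // 60)
    return {
        "package": "firewall.synthetic",  # hypothetical marker for generated events
        "title": "Tracker silent",
        "text": f"No tracker report for {minutes} minutes",
    }
```

The returned dict can then go through the same scoring path as any forwarded notification.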

Repeat-after-suppression. If I’ve explicitly silenced an event (via a one-tap “shut up” button) and the same event fires again within the suppression window, the score creeps up. Either the event is more important than my first-pass dismissal allowed, or it’s a recurring annoyance worth promoting to a rule. Either way, it stops being silently ignored.
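
The “creeps up” behaviour can be sketched as a counter per suppressed event. The window length, step size, and cap below are all assumptions, as is the `SuppressionLog` name; the source only says the score rises when a silenced event keeps firing.

```python
from datetime import datetime, timedelta

SUPPRESSION_WINDOW = timedelta(hours=6)  # assumed window length
STEP = 10                                # assumed boost per repeat
MAX_BOOST = 40                           # assumed cap


class SuppressionLog:
    def __init__(self) -> None:
        self._entries = {}  # key -> (suppressed_at, repeats seen since)

    def suppress(self, key: str, at: datetime) -> None:
        """Record that the user dismissed this event."""
        self._entries[key] = (at, 0)

    def boost_for(self, key: str, at: datetime) -> int:
        """Return the extra score for an event that repeats after suppression."""
        entry = self._entries.get(key)
        if entry is None:
            return 0
        suppressed_at, repeats = entry
        if at - suppressed_at > SUPPRESSION_WINDOW:
            del self._entries[key]  # suppression expired, start fresh
            return 0
        repeats += 1
        self._entries[key] = (suppressed_at, repeats)
        return min(MAX_BOOST, repeats * STEP)
```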

What I learned about my own attention

Three things that surprised me, in roughly the order they surprised me.

Notification volume was not the problem. I had assumed the issue was the number of pings. It wasn’t. The issue was the cost of evaluation: every notification required a glance, a half-second decision (do I care?), and a return to whatever I was doing. Eighty pings of low-cost evaluation are worse than four pings of high-cost evaluation. The score-and-suppress system isn’t just reducing the number of notifications I see by 80%; it’s reducing the evaluation cost by close to 100%. Each notification that comes through has already been judged worth interrupting me for.

Quiet hours mattered more than I thought. A camera ping at 14:00 is an annoyance. A camera ping at 03:00 is an event. Same data; the time it arrives is part of its meaning. Most apps don’t model this and shouldn’t have to. The firewall does.

The digest entries are the surprising win. I thought the dropped/silent events would be the value. They’re not. The value is the digest — once or twice a day, a short list of “here’s what happened that didn’t quite warrant interrupting you.” I read it in two minutes and I have the context I’d have spent forty minutes collecting from individual notifications. The forty-to-two ratio is the real number.

What I’d build differently

Start with rules, add LLM later (and only sparingly). I was tempted to use an LLM to score notifications from the start. Bad call. LLMs are great at interpreting unusual events — but they’re slow, expensive per call, and they generate variance. A rule-based scorer is deterministic, debuggable, and gives you something to compare LLM scoring to if you ever do bolt one on. I have LLM scoring on the events that the rules can’t confidently classify (~5% of traffic). That’s enough.
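
The “rules first, LLM only for the leftovers” split can be sketched as a fallback. The rules here are toy examples, and the LLM is passed in as a plain callable so that no particular model or API is assumed; `rule_score` and `score_event` are illustrative names.

```python
from typing import Callable, Optional, Tuple

Event = dict


def rule_score(event: Event) -> Optional[int]:
    """Toy rules: return a score, or None when no rule matches confidently."""
    text = event.get("text", "").lower()
    if "backup completed" in text:
        return 5
    if "failed" in text:
        return 60
    return None


def score_event(event: Event, llm_score: Callable[[Event], int]) -> Tuple[int, str]:
    """Score with rules; call the (slow, costly) LLM only when rules abstain."""
    score = rule_score(event)
    if score is not None:
        return score, "rules"
    # Clamp the model's answer so variance cannot push it out of range.
    return max(0, min(100, int(llm_score(event)))), "llm"
```

Returning which path produced the score makes it easy to check that the LLM share of traffic stays small.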

Have a one-tap “shut up” button. When the system pings me and I don’t care, I need to be able to tell it so in one tap, from the notification itself. Otherwise I’ll just re-mute the source app and undo the whole project. The button doesn’t have to be sophisticated — it just has to write to a small “things I dismissed” log that feeds into next month’s rule tuning.
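
The “things I dismissed” log can be as small as an append-only JSON Lines file. A minimal sketch; the file format and field names are assumptions, and how the tap reaches the server is left out.

```python
import json
from collections import Counter
from datetime import datetime, timezone
from pathlib import Path


def record_dismissal(log_path: Path, package: str, title: str) -> None:
    """Append one dismissal to the log, one JSON object per line."""
    entry = {
        "dismissed_at": datetime.now(timezone.utc).isoformat(),
        "package": package,
        "title": title,
    }
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(entry) + "\n")


def dismissal_counts(log_path: Path) -> Counter:
    """Count dismissals per (package, title) for rule tuning."""
    counts = Counter()
    with log_path.open(encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                entry = json.loads(line)
                counts[(entry["package"], entry["title"])] += 1
    return counts
```

Sorting the counter with `most_common()` gives a shortlist of events that are candidates for a new suppression rule.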

Resist the temptation to make it a Big System. This whole thing is one webhook receiver, a scoring function, and an ntfy push. About 200 lines of code. The first version I built was four times the size, with a fancy dashboard and per-event analytics. I didn’t use any of it. The dashboard was for me to feel impressed with myself. The boring 200 lines is the thing that works.

The takeaway

The most important notification setting is not “which apps can ping me.” It’s “what kind of events from those apps deserve my attention right now.” Android doesn’t model this. The watched apps don’t model this. They can’t. Only a system that sits outside of them — and outside of you — can apply the context that makes a notification worth surfacing.

The phone is a piece of glass with a list of unread things on it. The list will grow. The question is who decides which entries on that list deserve a moment of your life. For two years I let the apps decide. For the last few months a 200-line scoring function has decided. Of the two, I prefer the second one. The cat hasn’t gone missing. The cameras still wake me when they should. The backups still complete. The phone is quieter.


The notification firewall isn’t published anywhere; it’s a personal project on a homelab. The pattern — phone forwards, server scores, server decides — is the publishable part.