---
title: "An overview of my threat model"
date: 2022-04-13T12:00:00+01:00
tags: [XMPP, Threat Model, Security, Privacy]
---

I was interested in knowing what kind of threat model people had when using
XMPP, so I asked on the newly created [XMPP-related community
forum][xmpp-lemmy], which runs on [Lemmy], a decentralized alternative to
Reddit built on ActivityPub. I had an idea of my own answer, but I didn't
realize it was going to be this long, so I decided to write it down here
instead. I'll be posting the [link there][comment].

[xmpp-lemmy]: https://community.xmpp.net/post/25
[Lemmy]: https://join-lemmy.org/
[comment]: https://community.xmpp.net/post/25/comment/31

Building a [threat model][threat-model] means identifying what and/or whom you
are trying to protect against. This allows you to take steps to ensure you are
actually protected against what you think you want to protect against. A
threat model is never final: it is meant to be refined and improved over time.

[threat-model]: https://en.wikipedia.org/wiki/Threat_model

I have two main use-cases and I'll go through one of them; the other is less
involved, though definitely influenced by this one. This is surely incomplete,
but it should still give a pretty good overview.

I started doing some activism in the past few years and I've had to adapt how
I communicate. It seems not many people in these groups are aware of the
amount of information that's recoverable by an attacker. I was surprised by
how little security culture there was, even though I wasn't practising much of
it myself before (because I didn't think I needed it, really). As you may have
guessed, this concerns a lot more than just instant messaging, but that is
what this article focuses on.

# The threat model

For this use-case, I want to make it hard for anybody to trace my actions
back to my civil identity, or to those of my friends. While I know this is never
going to be perfect, and the attacker here has way more resources than we
have, we do what is possible to reduce the impact on us. I am also aware that
many attacks are theoretical and may be used nowhere in practice, but that
doesn't mean we should ignore them either.

Online, I want to protect myself against passive state-level surveillance, but
also targeted surveillance to some extent. Offline, I need to protect the
devices I use. In case they are seized by the police, I want to prevent them
from getting too much information so they get less material to charge us with.
But if it gets to that point, there is a good chance they will be able to
link my different identities.

Some may think that with this threat model in mind I wouldn't trust the server
administrator, but this is a false dichotomy. What I don't want is my data
falling into the hands of an intruder, such as the police taking over the
server. For one, server admins are legally required to hand over encryption
passphrases in many jurisdictions; mistakes also happen, and breaking into a
server may not be so hard with the right amount of resources.

# How does this work with XMPP?

First, and this is not specific to XMPP: we don't use our civil identities, we
use pseudonyms. In these circles we mostly don't know each other's civil
identities, and knowing them wouldn't be useful anyway. It's the same online,
for example in the free software community, where there's no reason you'd
need this information.

We use [Tor], so the ISP and middle boxes don't know where we connect to, and
the XMPP server doesn't know where we connect from.

[Tor]: https://torproject.org

We create accounts on populated public XMPP servers, and connect to them using
TLS -- which has been the default for a long time now -- and use member-only /
private (non-public) rooms to talk together, with [OMEMO]. We don't know all
of the people in the room but there is some kind of trust chain.

[OMEMO]: https://xmpp.org/extensions/xep-0384.html

We're not verifying OMEMO [fingerprints] as we may not know everybody in the
room, and because changing devices or OMEMO keys makes for a painful user
experience when combined with fingerprint verification.

[fingerprints]: https://en.wikipedia.org/wiki/Public_key_fingerprint

On devices (PCs, smartphones), we use [full-disk encryption][FDE] where
possible. As we generally use second-hand phones, the feature is not always
available. A fairly generic piece of advice I give is to set a passphrase on
the OS and to clear client logs regularly. This can be configured in
Conversations on Android; I don't know about iOS clients.

[FDE]: https://en.wikipedia.org/wiki/Disk_encryption#Full_disk_encryption

The bottom line is: your smartphone is your weak point, even though most of us
have one because it's convenient. It is certainly the first thing that will
incriminate you, if it's not you or your friends doing so inadvertently.

# What would I like to improve in XMPP?

There are so many details that I have no clue about that could be used against
me to correlate my different identities.

I use multiple accounts on [Conversations], as well as [Dino] on the desktop
for this use-case. Randomizing connections to the various accounts could be
one thing to improve.

I don't use [Poezio] for anything other than my civil identity, because Poezio
isn't widely used, although the same may be true of Dino.

Currently in server logs, a few things can be used to identify a client, such
as the resource string, which clients set to something like
`clientname.randombits`, or the `disco#info` response, which lists the
capabilities of a client. Both are stored on the server, possibly for good
reasons, but that's still more information to identify somebody.

[Conversations]: https://conversations.im
[Dino]: https://dino.im
[Poezio]: https://poez.io
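
To illustrate why the `disco#info` response is identifying, here is a minimal
Python sketch of how a feature list collapses into a short fingerprint. It
assumes the hashing scheme of XEP-0115 (Entity Capabilities), simplified to
identities and features only; the client names and feature sets are made up
for the example and do not describe any real client.

```python
import base64
import hashlib


def caps_hash(identities, features):
    """Simplified XEP-0115-style hash over identities and features.

    identities: iterable of (category, type, lang, name) tuples.
    features: iterable of feature namespace strings.
    Extended service discovery forms are left out of this sketch.
    """
    parts = []
    # Sort so that the order in which a client lists things does not matter.
    for category, type_, lang, name in sorted(identities):
        parts.append(f"{category}/{type_}/{lang}/{name}<")
    for feature in sorted(set(features)):
        parts.append(f"{feature}<")
    digest = hashlib.sha1("".join(parts).encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii")


if __name__ == "__main__":
    # Hypothetical clients: same protocol, different feature sets.
    client_a = caps_hash(
        [("client", "pc", "", "ClientA 1.0")],
        ["http://jabber.org/protocol/disco#info", "urn:xmpp:carbons:2"],
    )
    client_b = caps_hash(
        [("client", "phone", "", "ClientB 2.3")],
        ["http://jabber.org/protocol/disco#info", "urn:xmpp:ping"],
    )
    print(client_a, client_b)
```

Two accounts that present the same hash are very likely running the same
client at the same version, which is one more data point for correlating
identities.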

I remember developers asking for the resource to be easily distinguishable for
debugging purposes. Having something à la [docker container
names][docker-names] should be good enough for this (a list of adjectives and
names combined into random `<adjective>_<name>`). I am not entirely sure what
to do about `disco#info` being stored.

[docker-names]: https://github.com/moby/moby/blob/master/pkg/namesgenerator/names-generator.go
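
A minimal Python sketch of what such a resource generator could look like.
The word lists and the `random_resource` name are hypothetical; a real client
would ship much longer lists.

```python
import secrets

# Hypothetical word lists, kept short for the example.
ADJECTIVES = ["brave", "calm", "eager", "gentle", "quiet", "witty"]
NAMES = ["curie", "hopper", "lovelace", "noether", "turing", "wing"]


def random_resource() -> str:
    """Return a resource of the form <adjective>_<name>, without a client name."""
    # secrets draws from the OS random source, so values are not predictable.
    return f"{secrets.choice(ADJECTIVES)}_{secrets.choice(NAMES)}"


if __name__ == "__main__":
    print(random_resource())
```

This only helps if clients share the same word lists: a list that is unique
to one client would identify it just as well as its name does.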

A good point for public servers is that they don't seem to store archives
forever anymore (since [GDPR]? Or maybe because of disk-space concerns). They
generally keep 2 weeks to 1 month of (encrypted) activity which, I'll grant
you, may be enough in some cases to incriminate someone, but it's probably
better than logs that go back to -infinity.

[GDPR]: https://en.wikipedia.org/wiki/GDPR

The roster is also stored as plaintext on the server and can easily be taken
by the police. An encrypted roster may not be as far off as we imagine.
Similar efforts have been made in Dovecot to encrypt the user mailbox with a
user-provided passphrase. This wouldn't prevent servers from recreating the
roster based on activity while the user is logged in, but that already
requires more effort and many wouldn't bother -- leaving this data unavailable
as plaintext by default.

On the client, I would like more private defaults. Tor support is a MUST.
Fortunately Conversations has it. It's possible to use Tor with Dino, but one
has to know how to set it up on their system, there's no way to enforce its
use, and the client doesn't show whether it's in use either. Poezio has the
same issue.

Storing logs forever is also one thing that I find annoying. It can be
configured in Conversations but retention is not limited by default. The
option is tucked away in Expert Settings, where automatic message deletion
defaults to `Never`.

Dino doesn't have any settings regarding logs. I'd have to clear them myself
by going through the SQLite database (pretty technical already). Poezio has a
`use_log` setting that stores every message (and presence depending on
config), and it's also `True` by default.
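
For the curious, this is roughly what pruning a client's SQLite log by hand
could look like in Python. It is only a sketch: the table and column names
are assumptions (I am not describing Dino's actual schema here), and the time
column is assumed to hold Unix timestamps in seconds. Check the real schema
and close the client before trying anything like this.

```python
import sqlite3
import time


def prune_messages(conn, table, time_column, max_age_days, now=None):
    """Delete rows older than max_age_days and return how many were removed.

    table and time_column are assumptions to check against the real schema.
    """
    now = time.time() if now is None else now
    cutoff = int(now - max_age_days * 86400)
    # Identifiers cannot be bound as parameters, so quote them explicitly.
    quoted_table = '"' + table.replace('"', '""') + '"'
    quoted_column = '"' + time_column.replace('"', '""') + '"'
    sql = f"DELETE FROM {quoted_table} WHERE {quoted_column} < ?"
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(sql, (cutoff,))
    return cur.rowcount


if __name__ == "__main__":
    # Demo on a throwaway in-memory database with a made-up schema.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE message (id INTEGER PRIMARY KEY, time INTEGER, body TEXT)"
    )
    now = 1_700_000_000
    conn.executemany(
        "INSERT INTO message (time, body) VALUES (?, ?)",
        [(now - 40 * 86400, "old"), (now - 86400, "recent")],
    )
    print(prune_messages(conn, "message", "time", max_age_days=30, now=now))
```

Note that SQLite does not necessarily overwrite deleted rows on disk right
away; running `VACUUM` afterwards rebuilds the file, and full-disk encryption
remains the real safety net.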

OMEMO interactions between non-contacts are a mess. Some servers have the
[`mod_block_strangers`] module deployed as an anti-spam measure: when a user
from such a server joins a private room, non-contacts will be prevented from
fetching their keys. Dino creates the OMEMO node as [only accessible by
contacts][dino-omemo] (to prevent deanonymization [in some Prosody
MUCs][prosody-pep]). And Conversations [doesn't allow sending encrypted
messages][conversations-omemo] if it doesn't have keys of all participants in
a private room.

[`mod_block_strangers`]: https://modules.prosody.im/mod_block_strangers.html
[dino-omemo]: https://github.com/dino/dino/issues/1139
[prosody-pep]: https://issues.prosody.im/1441
[conversations-omemo]: https://github.com/iNPUTmice/Conversations/issues/3081

I am not even talking about OMEMO implementations (using [OMEMO
0.3.0][OMEMO03]) which per the spec only encrypt the `<body/>` element in a
message, either leaking actual data depending on the feature used or greatly
restricting the feature set. This is fixed in the newer version of the spec,
but that version is deployed nowhere at the moment.

[OMEMO03]: https://xmpp.org/extensions/attic/xep-0384-0.3.0.html

I am also not talking about why XMPP and not, say, Signal or Telegram. I have
already covered this in part in other articles, but it may warrant its own
article at some point.

This article only scratches the surface. There are many more details that
would need to be ironed out. And of course implementations need to make
choices and can't answer every single use-case out there. I do wish Privacy
was more of a concern though.

Where has “Privacy by default” gone? Somebody bring it back please.