---
title: "An overview of my threat model"
date: 2022-04-13T12:00:00+01:00
tags: [XMPP, Threat Model, Security, Privacy]
---

I was interested in knowing what kind of threat model people had when using
XMPP, so I asked on the newly created [XMPP-related community
forum][xmpp-lemmy] -- which uses [Lemmy], a decentralized alternative to
Reddit built on ActivityPub. I had an answer in mind for myself, but I didn't
realize it was going to be this long, so I decided to write it down here
instead. I'll be posting the link there.

[xmpp-lemmy]: https://community.xmpp.net/post/25
[Lemmy]: https://join-lemmy.org/

Building a [threat model][threat-model] means identifying what and/or whom
you are trying to protect yourself against. This allows you to take steps to
ensure you are actually protected against what you think you want protection
from. A threat model is never final: it is meant to be refined and improved
over time.

[threat-model]: https://en.wikipedia.org/wiki/Threat_model

I have two main use-cases and I'll go through only one of them; the other is
less involved, though definitely influenced by this one. This is surely
incomplete, but it should still give a pretty good overview.

I started doing some activism over the past few years and I've had to adapt
how I communicate. It seems not many people in these groups are aware of the
amount of information that's recoverable by an attacker. I was surprised by
how little security culture there was, even though I wasn't practicing much
of it myself before (because I didn't think I needed it, really). As you may
have guessed, this concerns a lot more than just instant messaging, but that
is what this article focuses on.

# The threat model

For this use-case, I want to make it hard for anybody to trace my actions
back to my civil identity and those of my friends. While I know this is never
going to be perfect, and the attacker here has way more resources than we
have, we do what is possible to reduce the impact on us. I am also aware that
many attacks are theoretical and may never be used in practice, but that
doesn't mean we should ignore them either.

Online, I want to protect myself against passive state-level surveillance, but
also against targeted surveillance to some extent. Offline, I need to protect
the devices I use. In case they are seized by the police, I want to prevent
them from getting too much information, so they get less material to charge
us with. But if it gets to this, there's a good chance they will be able to
associate my different identities.

Some may think that with this threat model in mind I wouldn't trust the
server administrator, but this is a false dichotomy. What I don't want is my
data falling into the hands of an intruder, such as the police taking over
the server. For one, server admins are legally required to hand over
encryption passphrases in many jurisdictions; but also, mistakes happen, and
breaking into a server may not be that hard with the right amount of
resources.

# How does this work with XMPP?

First, and this is not specific to XMPP: we don't use our civil identities,
we use pseudonyms. In these circles we mostly don't know each other's civil
identities, and knowing them wouldn't be useful anyway. It's the same online,
for example in the free software community, where there's no reason you'd
need this information.

We use [Tor], so the ISP and middleboxes don't know where we connect to, and
the XMPP server doesn't know where we connect from.

[Tor]: https://torproject.org

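To make the Tor point more concrete, here is a minimal sketch of what a client hands to the local Tor SOCKS5 proxy when it opens a connection. It assumes Tor's default SOCKS listener on `127.0.0.1:9050`; the function names and the `example.org` server are made up for illustration and are not taken from any XMPP client. The point is that the hostname goes to the proxy as-is, so the client does not resolve it locally, and the XMPP server only ever sees a Tor exit address.

```python
import struct

# Assumption: Tor's default local SOCKS listener.
TOR_SOCKS_ADDRESS = ("127.0.0.1", 9050)


def socks5_greeting() -> bytes:
    """First bytes sent to the proxy: version 5, one method offered, no auth."""
    return b"\x05\x01\x00"


def socks5_connect_request(hostname: str, port: int) -> bytes:
    """Build a SOCKS5 CONNECT request that carries the hostname itself.

    Using address type 0x03 (domain name) lets the proxy resolve the name,
    so no DNS query for the XMPP server leaves the local machine.
    """
    host = hostname.encode("idna")
    if not 0 < len(host) < 256:
        raise ValueError("hostname must be 1 to 255 bytes long")
    if not 0 < port < 65536:
        raise ValueError("port out of range")
    # VER=5, CMD=1 (CONNECT), RSV=0, ATYP=3 (domain name), then length-prefixed
    # hostname and the port in network byte order.
    return b"\x05\x01\x00\x03" + bytes([len(host)]) + host + struct.pack("!H", port)


if __name__ == "__main__":
    # Hypothetical server name, 5222 being the usual XMPP client port.
    print(socks5_connect_request("example.org", 5222).hex())
```

This only builds the bytes; a real client would also read the proxy's replies and then start TLS over the resulting stream.
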
We create accounts on well-populated public XMPP servers, and connect to them
using TLS -- which has been the default for a long time now -- and use
members-only, private (non-public) rooms to talk together, with [OMEMO]. We
don't know all of the people in the room but there is some kind of trust
chain.

[OMEMO]: https://xmpp.org/extensions/xep-0384.html

We don't verify OMEMO [fingerprints], as we may not know everybody in the
room, and changing devices or OMEMO keys makes for a painful user experience
when combined with fingerprint verification.

[fingerprints]: https://en.wikipedia.org/wiki/Public_key_fingerprint

On devices (PCs, smartphones), we use [full-disk encryption][FDE] where
possible. As we generally use second-hand phones, the feature may not always
be available. A pretty generic piece of advice I give is to protect the OS
with a passphrase and to clear client logs regularly. Log clearing can be
configured in Conversations on Android; I don't know about iOS clients.

[FDE]: https://en.wikipedia.org/wiki/Disk_encryption#Full_disk_encryption

The bottom line is: your smartphone is your weak point, even though most of
us have one because it's convenient. It is certainly the first thing that
will incriminate you, unless you or your friends do so inadvertently first.

# What would I like to improve in XMPP?

There are so many details that I have no clue about that could be used against
me to correlate my different identities.

I use multiple accounts on [Conversations], as well as [Dino] on the desktop
for this use-case. Randomizing connections to the various accounts could be
one thing to improve.

I don't use [Poezio] for anything other than my civil identity, because Poezio
isn't widely used and would make me stand out, although that may also be the
case for Dino.

Currently in server logs, a few things can be used to identify a client, such
as the resource string set by the client to something similar to
`clientname.random`, or the `disco#info` which lists capabilities of a client.
Both are actually stored on the server for possibly good reasons, but that's
always more information to identify somebody.

[Conversations]: https://conversations.im
[Dino]: https://dino.im
[Poezio]: https://poez.io

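To illustrate why the `disco#info` data is identifying, here is a simplified sketch of the capabilities hash described in XEP-0115 (Entity Capabilities): identities and features are sorted, joined with `<`, hashed with SHA-1 and base64-encoded. It leaves out the extended data forms the XEP also covers, and the client names and feature lists below are invented for the example.

```python
import base64
import hashlib


def caps_hash(identities, features):
    """Simplified XEP-0115 verification string (extended data forms omitted).

    identities: iterable of (category, type, lang, name) tuples
    features:   iterable of feature namespace strings
    """
    parts = []
    for category, type_, lang, name in sorted(identities):
        parts.append(f"{category}/{type_}/{lang}/{name}<")
    for feature in sorted(features):
        parts.append(f"{feature}<")
    digest = hashlib.sha1("".join(parts).encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii")


if __name__ == "__main__":
    # Hypothetical clients: same software family, slightly different features.
    base_features = [
        "http://jabber.org/protocol/caps",
        "http://jabber.org/protocol/disco#info",
        "http://jabber.org/protocol/muc",
    ]
    client_a = caps_hash([("client", "pc", "", "SomeClient 1.0")], base_features)
    client_b = caps_hash(
        [("client", "pc", "", "SomeClient 1.0")],
        base_features + ["urn:xmpp:receipts"],
    )
    # Two different strings: enabling one more feature changes the hash.
    print(client_a, client_b)
```

Any difference in client name, version, or enabled features yields a different hash, so a stored hash works much like a browser fingerprint: it narrows down which client, and sometimes which configuration, an account uses.
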
I remember developers asking for the resource to be easily distinguishable for
debugging purposes. Having something à la [Docker container
names][docker-names] should be good enough for this (a list of adjectives and
a list of names, combined at random into `<adjective>_<name>`). I am not
entirely sure what to do about `disco#info` being stored.

[docker-names]: https://github.com/moby/moby/blob/master/pkg/namesgenerator/names-generator.go

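The `<adjective>_<name>` idea could look something like the sketch below. The word lists and the `random_resource` function are placeholders of mine, not taken from Docker or from any XMPP client; a real implementation would want much longer lists.

```python
import secrets

# Placeholder word lists, kept short for the example.
ADJECTIVES = ("brave", "calm", "eager", "gentle", "quiet", "swift")
NAMES = ("heron", "lynx", "maple", "otter", "river", "wren")


def random_resource() -> str:
    """Return a resource such as 'quiet_otter' that names no client software."""
    return f"{secrets.choice(ADJECTIVES)}_{secrets.choice(NAMES)}"


if __name__ == "__main__":
    print(random_resource())
```

A resource like this is still easy to tell apart in debug logs, which is what developers asked for, but it no longer says which client is behind it. Generating a fresh one per session would also avoid it becoming a stable identifier.
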
One good thing about public servers is that they don't seem to store archives
forever anymore (since [GDPR]? Or for disk-space concerns maybe). They will
generally keep two weeks to a month of (encrypted) activity which, I'll grant
you, may be enough in some cases to incriminate someone, but it's probably
better than logs that go back to -infinity.

[GDPR]: https://en.wikipedia.org/wiki/GDPR

The roster is also stored as plaintext on the server and can easily be taken
by the police. An encrypted roster may not be as far off as we imagine. There
have been similar efforts in Dovecot to encrypt user mailboxes with a
user-provided passphrase. This wouldn't prevent a server from rebuilding the
roster from activity while the user is logged in, but that already requires
more effort and many wouldn't bother, leaving this data unavailable as
plaintext by default.

On the client, I would like more private defaults. Tor support is a MUST.
Fortunately Conversations has it. It's possible to use Tor with Dino, but one
has to know how to set it up on their system, there's no way to enforce its
use, and the client doesn't show whether it's in use either. Same issue in
Poezio.

Storing logs forever is also one thing that I find annoying. Automatic
deletion can be configured in Conversations, but it's not enabled by default.
The option is hidden in the expert settings, where automatic message deletion
is set to `Never`.

Dino doesn't have any settings regarding logs. I'd have to clear them myself
by going through the SQLite database (pretty technical already). Poezio has a
`use_log` setting nowadays that stores every message (and presence depending
on config), and it's also `True` by default.

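For the manual route, clearing old history from a client's SQLite file could be sketched as below. This is a generic illustration, not Dino's actual schema: the `message` table and `time` column (Unix seconds) are assumptions to check against the real database, and the `purge_old_messages` helper is hypothetical. A real client database likely has related tables and search indexes that would need the same treatment.

```python
import sqlite3
import time


def purge_old_messages(db_path, max_age_days, table="message",
                       time_column="time", now=None):
    """Delete rows older than max_age_days and return how many were removed.

    Assumes `time_column` holds Unix timestamps in seconds (an assumption,
    verify it against the actual schema before running this on real data).
    """
    if not table.isidentifier() or not time_column.isidentifier():
        raise ValueError("table and column names must be plain identifiers")
    now = time.time() if now is None else now
    cutoff = int(now - max_age_days * 86400)
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits the DELETE on success, rolls back on error
            cursor = conn.execute(
                f'DELETE FROM "{table}" WHERE "{time_column}" < ?', (cutoff,)
            )
            removed = cursor.rowcount
        # Rebuild the file so deleted rows don't linger in free pages.
        conn.execute("VACUUM")
    finally:
        conn.close()
    return removed
```

Something like `purge_old_messages("/path/to/client.db", max_age_days=14)` would then be run with the client closed, ideally on a copy first.
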
OMEMO interactions between non-contacts are a mess. Some servers have the
[`mod_block_strangers`] module deployed as an anti-spam measure: when a user
from such a server joins a private room, non-contacts will be prevented from
fetching their keys. Dino creates the OMEMO node as [only accessible by
contacts][dino-omemo] (in an effort to prevent enumeration attacks). And
Conversations [doesn't allow sending encrypted messages][conversations-omemo]
if it doesn't have keys of all participants in a private room.

[`mod_block_strangers`]: https://modules.prosody.im/mod_block_strangers.html
[dino-omemo]: https://github.com/dino/dino/issues/1139
[conversations-omemo]: https://github.com/iNPUTmice/Conversations/issues/3081

I am not even talking about OMEMO implementations (using [OMEMO
0.3.0][OMEMO03]) which, per the spec, only encrypt the `<body/>` element of a
message, either leaking actual data depending on the feature used or greatly
restricting the feature set. This is fixed in the newer version of the spec,
but that version is deployed nowhere at the moment.

[OMEMO03]: https://xmpp.org/extensions/attic/xep-0384-0.3.0.html

I am also not talking about why XMPP and not, say, Signal or Telegram. I have
already covered this in part in other articles, but it may warrant its own
article at some point.

This article only scratches the surface. There are many more details that
would need to be ironed out. And of course implementations need to make
choices and can't answer every single use-case out there. I do wish privacy
was more of a concern though.

Where has “Privacy by default” gone? Somebody bring it back, please.