From 0695f312a509961ad24385777815aa269801dab6 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Maxime=20=E2=80=9Cpep=E2=80=9D=20Buquet?= <pep@bouah.net>
Date: Wed, 13 Apr 2022 17:01:48 +0200
Subject: [PATCH] new draft: threat-model
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Signed-off-by: Maxime “pep” Buquet <pep@bouah.net>
---
 content/posts/thread-model.md | 187 ++++++++++++++++++++++++++++++++++
 1 file changed, 187 insertions(+)
 create mode 100644 content/posts/thread-model.md

diff --git a/content/posts/thread-model.md b/content/posts/thread-model.md
new file mode 100644
index 0000000..317cc8c
--- /dev/null
+++ b/content/posts/thread-model.md
@@ -0,0 +1,187 @@
+---
+title: "An overview of my threat model"
+date: 2022-04-13T12:00:00+01:00
+tags: [XMPP, Threat Model, Security, Privacy]
+draft: true
+---
+
+I was interested in knowing what kind of threat model people had when using
+XMPP, so I asked on the newly created [XMPP-related community
+forum][xmpp-lemmy], which uses [Lemmy], a decentralized alternative to Reddit
+built on ActivityPub. I had a rough idea of my own answer, but I didn't
+realize it was going to be this long, so I decided to write it down here
+instead. I'll be posting the link there.
+
+[xmpp-lemmy]: https://community.xmpp.net/post/25
+[Lemmy]: https://join-lemmy.org/
+
+Building a [threat model][threat-model] means identifying what and/or whom you
+are trying to protect yourself against. This allows you to take steps to
+ensure you are actually protected against those threats. A threat model is
+never finished: it is meant to be refined and improved over time.
+
+[threat-model]: https://en.wikipedia.org/wiki/Threat_model
+
+I have two main use-cases and I'll go through one of them, the other one being
+less involved, though definitely influenced by this one. This is surely
+incomplete, but it should still give a pretty good overview.
+
+I started doing some activism in the past few years and I've had to adapt how
+I communicate. It seems not many people in these groups are aware of the
+amount of information that's recoverable by an attacker. I was surprised by
+how little security culture there was, even though I wasn't practising much
+of it myself before (because I didn't think I needed it, really). As you may
+have guessed, this concerns a lot more than just instant messaging, but that
+is what this article focuses on.
+
+# The threat model
+
+For this use-case, I want to make it hard for anybody to trace my actions
+back to my civil identity and those of my friends. While I know this is never
+going to be perfect, and the attacker here has way more resources than we
+have, we do what is possible to reduce the impact on us. I am also aware that
+many attacks are theoretical and may never be used in practice, but that
+doesn't mean we should ignore them either.
+
+Online, I want to protect myself against passive state-level surveillance, but
+also targeted surveillance to some extent. Offline, I need to protect the
+devices I use. If they are seized by the police, I want to limit how much
+information can be extracted, so there is less material to charge us with.
+But if it gets to this, there's a good chance they will be able to
+associate my different identities.
+
+Some may think that with this threat model in mind I wouldn't trust the server
+administrator, but this is a false dichotomy. What I don't want is my data
+falling into the hands of an intruder, such as the police taking over the
+server. For one, server admins are legally required to hand over encryption
+passphrases in many jurisdictions; for another, people make mistakes, and
+hacking into a server may not be so hard with the right amount of resources.
+
+# How does this work with XMPP?
+
+First, and this is not specific to XMPP: we don't use our civil identities, we
+use pseudonyms. In these circles we mostly don't know each other's civil
+identities, and we don't need to anyway. It's the same online, for example in
+the free software community, where there's no reason why you'd need this
+information.
+
+We use [Tor], so the ISP and middleboxes don't know where we connect to, and
+the XMPP server doesn't know where we connect from.
+
+[Tor]: https://torproject.org
+
+We create accounts on well-populated public XMPP servers, connect to them
+using TLS -- which has been the default for a long time now -- and talk in
+members-only, private (non-public) rooms with [OMEMO]. We don't know all of
+the people in the room, but there is some kind of chain of trust.
+
+[OMEMO]: https://xmpp.org/extensions/xep-0384.html
+
+We're not verifying OMEMO [fingerprints], as we may not know everybody in the
+room, and changing devices or OMEMO keys makes for a painful user experience
+when combined with fingerprint verification.
+
+[fingerprints]: https://en.wikipedia.org/wiki/Public_key_fingerprint
+
+On devices (PCs, smartphones), we use [full-disk encryption][FDE] where
+possible. As we generally use second-hand phones, the feature may not always
+be available. A fairly generic piece of advice I give is to set a passphrase
+on the OS and to clear client logs regularly. Log clearing can be configured
+in Conversations on Android; I don't know about iOS clients.
+
+[FDE]: https://en.wikipedia.org/wiki/Disk_encryption#Full_disk_encryption
+
+The bottom line is: your smartphone is your weak point, even though most of us
+have one because it's convenient. It is most likely the first thing that will
+incriminate you, if you or your friends don't do so inadvertently first.
+
+# What would I like to improve in XMPP?
+
+There are so many details that I have no clue about that could be used against
+me to correlate my different identities.
+
+I use multiple accounts on [Conversations], as well as [Dino] on the desktop
+for this use-case. Randomizing connections to the various accounts could be
+one thing to improve.
+
+I don't use [Poezio] for anything other than my civil identity, because Poezio
+isn't widely used. Though that may also be the case for Dino.
+
+Currently in server logs, a few things can be used to identify a client, such
+as the resource string set by the client to something similar to
+`clientname.random`, or the `disco#info` which lists capabilities of a client.
+Both are stored on the server, possibly for good reasons, but that is still
+more information that can be used to identify somebody.
+
+[Conversations]: https://conversations.im
+[Dino]: https://dino.im
+[Poezio]: https://poez.io
+
+I remember developers asking for the resource to be easily distinguishable for
+debugging purposes. Having something à la [Docker container
+names][docker-names] should be good enough for this (a list of adjectives and
+names combined at random into `<adjective>_<name>`). I am not entirely sure
+what to do about `disco#info` being stored.
+
+[docker-names]: https://github.com/moby/moby/blob/master/pkg/namesgenerator/names-generator.go
+
+A good point for public servers is that they don't seem to store archives
+forever anymore (since [GDPR]? Or for disk-space concerns maybe). They will
+generally keep 2 weeks to 1 month of (encrypted) activity which, I'll grant
+you, may be enough in some cases to incriminate someone, but it's probably
+better than logs that go back to -infinity.
+
+[GDPR]: https://en.wikipedia.org/wiki/GDPR
+
+The roster is also stored as plaintext on the server and can easily be taken
+by the police. An encrypted roster may not be as far off as we imagine: there
+have been similar efforts in Dovecot to encrypt the user mailbox with a
+user-provided passphrase. This wouldn't prevent servers from recreating it
+based on activity when logged in, but that already requires more effort and
+many wouldn't bother -- leaving this data unavailable as plaintext by default.
+
+On the client, I would like more private defaults. Tor support is a MUST.
+Fortunately Conversations has it. It's possible to use it with Dino, but one
+has to know how to set it up on their system, there's no way to enforce using
+Tor, and nothing shows whether it's in use either. Poezio has the same
+issue.
+
+Storing logs forever is also one thing that I find annoying. Automatic
+deletion can be configured in Conversations, but it's off by default: the
+option is hidden in the Expert settings and defaults to `Never`.
+
+Dino doesn't have any settings regarding logs. I'd have to clear them myself
+by going through the SQLite database (pretty technical already). Poezio has a
+`use_log` setting nowadays that stores every message (and presence depending
+on config), and it's also `True` by default.
+
+Interactions with OMEMO between non-contacts are a mess. Some servers have the
+[`mod_block_strangers`] module deployed as an anti-spam measure: when a user
+from such a server joins a private room, non-contacts will be prevented from
+fetching their keys. Dino creates the OMEMO node as [only accessible by
+contacts][dino-omemo] (in an effort to prevent enumeration attacks). And
+Conversations [doesn't allow sending encrypted messages][conversations-omemo]
+if it doesn't have keys of all participants in a private room.
+
+[`mod_block_strangers`]: https://modules.prosody.im/mod_block_strangers.html
+[dino-omemo]: https://github.com/dino/dino/issues/1139
+[conversations-omemo]: https://github.com/iNPUTmice/Conversations/issues/3081
+
+I am not even talking about OMEMO implementations (using [OMEMO
+0.3.0][OMEMO03]) which, per the spec, only encrypt the `<body/>` element of a
+message, either leaking actual data depending on the feature used or greatly
+restricting the feature set. This is fixed in the newer version of the spec,
+but that version is deployed nowhere at the moment.
+
+[OMEMO03]: https://xmpp.org/extensions/attic/xep-0384-0.3.0.html
+
+I am also not talking about why XMPP and not, say, Signal or Telegram. I have
+already talked about this in part in other articles, but that may warrant its
+own article at some point.
+
+This article only scratches the surface. There are many more details that
+would need to be ironed out. And of course implementations need to make
+choices and can't answer every single use-case out there. I do wish privacy
+was more of a concern though.
+
+Where has “Privacy by default” gone? Somebody bring it back, please.