From 0695f312a509961ad24385777815aa269801dab6 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Maxime=20=E2=80=9Cpep=E2=80=9D=20Buquet?=
Date: Wed, 13 Apr 2022 17:01:48 +0200
Subject: [PATCH] new draft: threat-model
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Signed-off-by: Maxime “pep” Buquet
---
 content/posts/thread-model.md | 187 ++++++++++++++++++++++++++++++++++
 1 file changed, 187 insertions(+)
 create mode 100644 content/posts/thread-model.md

diff --git a/content/posts/thread-model.md b/content/posts/thread-model.md
new file mode 100644
index 0000000..317cc8c
--- /dev/null
+++ b/content/posts/thread-model.md
@@ -0,0 +1,187 @@
+---
+title: "An overview of my threat model"
+date: 2022-04-13T12:00:00+01:00
+tags: [XMPP, Threat Model, Security, Privacy]
+draft: true
+---
+
+I was interested in knowing what kind of threat model people had when using
+XMPP, so I asked on the newly created [XMPP-related community
+forum][xmpp-lemmy] -- which uses [Lemmy], a decentralized alternative to
+Reddit built on ActivityPub. I had an answer of my own in mind, but I didn't
+realize it was going to be this long, so I decided to write it down here
+instead. I'll be posting the link there.
+
+[xmpp-lemmy]: https://community.xmpp.net/post/25
+[Lemmy]: https://join-lemmy.org/
+
+Building up a [threat model][threat-model] means identifying what and/or whom
+you are trying to protect yourself against. This allows you to take steps to
+ensure you are actually protected against what you think you want to be
+protected against. A threat model is meant to be refined and improved over
+time.
+
+[threat-model]: https://en.wikipedia.org/wiki/Threat_model
+
+I have two main use-cases and I'll go through one of them, the other one being
+less involved, even though definitely influenced by this one. This is surely
+incomplete but it should still give a pretty good overview.
+
+I started doing some activism in the past few years and I've had to adapt
+how I communicate.
+It seems not many people in these groups are aware of the amount of
+information that's recoverable by an attacker. I was surprised by how little
+security culture there was, even though I wasn't doing much of it myself
+before (because I didn't think I needed it, really). As you may have guessed,
+this concerns a lot more than just instant messaging, but that is what this
+article focuses on.
+
+# The threat model
+
+For this use-case, I want to make it hard for anybody to trace my actions
+back to my civil identity and those of my friends. While I know this is never
+going to be perfect, and the attacker here has way more resources than we
+have, we do what is possible to reduce the impact on us. I am also aware that
+many attacks are theoretical and may never be used in practice, but that
+doesn't mean we should ignore them either.
+
+Online, I want to protect myself against passive state-level surveillance, but
+also targeted surveillance to some extent. Offline, I need to protect the
+devices I use. In case they are seized by the police, I want to prevent them
+from getting too much information so they get less material to charge us with.
+But if it gets to this, chances are they will be able to associate my
+different identities.
+
+Some may think that with this threat model in mind I wouldn't trust the server
+administrator, but this is a false dichotomy. What I don't want is my data
+falling into the hands of an intruder, such as the police taking over the
+server. For one, server admins are legally required to hand over encryption
+passphrases in many jurisdictions. Mistakes also happen, and hacking into a
+server may not be so hard with the right amount of resources.
+
+# How does this work with XMPP?
+
+First, and this is not specific to XMPP: we don't use our civil identities, we
+use pseudonyms. In these circles we mostly don't know each other's civil
+identities, and it's not useful anyway.
+It's the same online, for example in the free software community, where
+there's no reason why you'd need this information.
+
+We use [Tor], so the ISP and middle boxes don't know where we connect to, and
+the XMPP server doesn't know where we connect from.
+
+[Tor]: https://torproject.org
+
+We create accounts on populated public XMPP servers, and connect to them using
+TLS -- which has been the default for a long time now -- and use member-only /
+private (non-public) rooms to talk together, with [OMEMO]. We don't know all
+of the people in the room but there is some kind of trust chain.
+
+[OMEMO]: https://xmpp.org/extensions/xep-0384.html
+
+We're not verifying OMEMO [fingerprints] as we may not know everybody in the
+room, and changing devices/OMEMO keys also hurts the user experience when
+combined with fingerprint verification.
+
+[fingerprints]: https://en.wikipedia.org/wiki/Public_key_fingerprint
+
+On devices (PCs, smartphones), we use [full-disk encryption][FDE] where
+possible. As we generally use second-hand phones, the feature may not be
+available all the time. A pretty generic piece of advice I give is to set a
+passphrase on the OS and to clear client logs regularly. Log clearing can be
+configured in Conversations on Android; I don't know about iOS clients.
+
+[FDE]: https://en.wikipedia.org/wiki/Disk_encryption#Full_disk_encryption
+
+The bottom line is: your smartphone is your weak point, even though most of us
+have one because it's convenient. It is certainly the first thing that will
+incriminate you, if it's not you or your friends doing so inadvertently.
+
+# What would I like to improve in XMPP?
+
+There are so many details that I have no clue about that could be used against
+me to correlate my different identities.
+
+I use multiple accounts on [Conversations], as well as [Dino] on the desktop
+for this use-case. Randomizing connections to the various accounts could be
+one thing to improve.
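
One possible reading of "randomizing connections" is to avoid bringing every account online at the same moment and in the same order, since simultaneous logins are an easy signal to correlate. The sketch below is only an illustration under that assumption: `plan_connections` is a hypothetical helper, not something Conversations or Dino provides, and the JIDs are placeholders.

```python
import random


def plan_connections(accounts, max_delay_s=600.0, rng=None):
    """Return (account, delay_in_seconds) pairs in a shuffled order.

    Each account is scheduled a random gap (0..max_delay_s) after the
    previous one, so the accounts never come online together.
    Hypothetical helper, for illustration only.
    """
    # SystemRandom draws from the OS entropy source rather than a seeded PRNG.
    rng = rng or random.SystemRandom()
    order = list(accounts)
    rng.shuffle(order)

    plan = []
    elapsed = 0.0
    for account in order:
        elapsed += rng.uniform(0.0, max_delay_s)
        plan.append((account, elapsed))
    return plan


if __name__ == "__main__":
    # Placeholder JIDs, not real accounts.
    for jid, delay in plan_connections(["a@example.org", "b@example.net"]):
        print(f"connect {jid} after {delay:.0f}s")
```

A client would still have to honour such a schedule on every reconnect, and random timing alone does nothing about the other identifiers a server can see.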
+
+I don't use [Poezio] for anything other than my civil identity, because Poezio
+isn't widely used and would make me stand out. That may also be the case for
+Dino, though.
+
+Currently in server logs, a few things can be used to identify a client, such
+as the resource string set by the client to something similar to
+`clientname.random`, or the `disco#info` which lists capabilities of a client.
+Both are actually stored on the server for possibly good reasons, but that's
+always more information to identify somebody.
+
+[Conversations]: https://conversations.im
+[Dino]: https://dino.im
+[Poezio]: https://poez.io
+
+I remember developers asking for the resource to be easily distinguishable for
+debugging purposes. Having something à la [docker container
+names][docker-names] should be good enough for this (a list of adjectives and
+names combined into random `<adjective>_<name>` strings). I am not entirely
+sure what to do about `disco#info` being stored.
+
+[docker-names]: https://github.com/moby/moby/blob/master/pkg/namesgenerator/names-generator.go
+
+A good point for public servers is that they don't seem to store archives
+forever anymore (since [GDPR]? Or for disk-space concerns maybe). They will
+generally have 2 weeks / 1 month of (encrypted) activity which, I grant you,
+may be enough in some cases to incriminate someone, but it's probably better
+than logs that go back to -infinity.
+
+[GDPR]: https://en.wikipedia.org/wiki/GDPR
+
+The roster is also stored as plaintext on the server and can easily be taken
+by the police. An encrypted roster may not be as far off as we imagine. There
+have been similar efforts in Dovecot to encrypt the user mailbox with a
+user-provided passphrase. This wouldn't prevent servers from recreating it
+based on activity when logged in, but that already requires more effort and
+many wouldn't bother -- leaving this data unavailable as plaintext by default.
+
+On the client, I would like more private defaults.
+Tor support is a MUST. Fortunately Conversations has it. It's possible to use
+Tor with Dino, but one has to know how to set it up on their system, there's
+no way to enforce its use, and nothing shows whether it's in use either. Same
+issue in Poezio.
+
+Storing logs forever is also one thing that I find annoying. Deleting them can
+be configured in Conversations but it's not the default. The option is hidden
+in the Expert Settings, where automatic message deletion is set to `Never`.
+
+Dino doesn't have any settings regarding logs. I'd have to clear them myself
+by going through the SQLite database (pretty technical already). Poezio has a
+`use_log` setting nowadays that stores every message (and presence depending
+on config), and it's also enabled by default.
+
+OMEMO between non-contacts is a mess. Some servers have the
+[`mod_block_strangers`] module deployed as an anti-spam measure: when a user
+from such a server joins a private room, non-contacts will be prevented from
+fetching their keys. Dino creates the OMEMO node as [only accessible by
+contacts][dino-omemo] (in an effort to prevent enumeration attacks). And
+Conversations [doesn't allow sending encrypted messages][conversations-omemo]
+if it doesn't have keys of all participants in a private room.
+
+[`mod_block_strangers`]: https://modules.prosody.im/mod_block_strangers.html
+[dino-omemo]: https://github.com/dino/dino/issues/1139
+[conversations-omemo]: https://github.com/iNPUTmice/Conversations/issues/3081
+
+I am not even talking about OMEMO implementations (using [OMEMO
+0.3.0][OMEMO03]) which per the spec only encrypt the `<body>` element in a
+message, leaking actual data depending on the feature used, or restricting the
+feature set greatly. This is fixed in the newer version of the spec but
+deployed nowhere at the moment.
+
+[OMEMO03]: https://xmpp.org/extensions/attic/xep-0384-0.3.0.html
+
+I am also not talking about why XMPP and not, say, Signal or Telegram.
+I have already talked about this in part in other articles but that may
+warrant its own article at some point.
+
+This article only scratches the surface. There are many more details that
+would need to be ironed out. And of course implementations need to make
+choices and can't answer every single use-case out there. I do wish privacy
+was more of a concern though.
+
+Where has “Privacy by default” gone? Somebody bring it back please.