This is bare-bones and is missing many features which we intend to add
in future commits, such as parsing from attributes whose names differ
from the field names and parsing into non-String types.
Well, not really, of course. All of this will make sense once we start
adding support for fields and non-struct types. Refactoring the code now
before we start to add actual member field parsing is much easier.
How do I know that this will work out? Well, my crystal ball knows it.
Don't believe me? Okay, ChatGPT told me ... Alright alright, I went
through the entire process of implementing this feature *twice* at this
point and have a pretty good idea of where to draw the abstraction lines
so that everything falls neatly into place. You'll have to trust me on
this one.
(Or, you know, check out old branches in my xmpp-rs repo. That might
work, too. `feature/derive-macro-streaming-full` might be a name to look
for if you dare.)
This is a large change and as such, it needs good motivation. Let me
remind you of the ultimate goal: we want a derive macro which generates
FromXml/IntoXml implementations for us, and that derive macro should be
usable from `xmpp_parsers` and other crates.
For that, any code generated by the derive macro mustn't depend on any
code in the `xmpp_parsers` crate, because you cannot name the crate you
are in portably (`xmpp_parsers::..` wouldn't resolve within
`xmpp_parsers`, and `crate::..` would point at other crates if the macro
was used in other crates).
We also want to interoperate with code already implementing
`TryFrom<Element>` and `Into<Element>` on structs. This ultimately
requires that we have an error type which is shared by the two
implementations and that error type must be declared in the `xso` crate
to be usable by the macros.
Thus, we port the error type over to use the type declared in `xso`.
This changes the structure of the error type greatly; I do not think
that `xso` should have to know about all the different types we are
parsing there and they don't deserve special treatment. Wrapping them in
a `Box<dyn ..>` seems more appropriate.
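
A minimal sketch of what such a type-erased error could look like. The
variant names, the `text_parse_error` helper and `parse_port` are
hypothetical and only illustrate the `Box<dyn ..>` approach; the actual
type in `xso` may look different.

```rust
use std::error::Error as StdError;
use std::fmt;

/// Hypothetical shared error type; it knows nothing about the concrete
/// types being parsed.
#[derive(Debug)]
pub enum Error {
    /// The element was not the one the parser expected.
    TypeMismatch,
    /// Any failure to parse text or attribute data, type-erased.
    TextParseError(Box<dyn StdError + Send + Sync + 'static>),
    /// A free-form error message.
    Other(&'static str),
}

impl Error {
    /// Wrap an arbitrary parse error without naming its type in `Error`.
    pub fn text_parse_error<E: StdError + Send + Sync + 'static>(e: E) -> Self {
        Error::TextParseError(Box::new(e))
    }
}

impl fmt::Display for Error {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Error::TypeMismatch => f.write_str("element type mismatch"),
            Error::TextParseError(e) => write!(f, "text parse error: {}", e),
            Error::Other(msg) => f.write_str(msg),
        }
    }
}

impl StdError for Error {
    fn source(&self) -> Option<&(dyn StdError + 'static)> {
        match self {
            Error::TextParseError(e) => {
                // Drop the Send + Sync bounds to match the trait signature.
                let inner: &(dyn StdError + 'static) = &**e;
                Some(inner)
            }
            _ => None,
        }
    }
}

/// Example user: the integer parse error is boxed, not special-cased.
pub fn parse_port(text: &str) -> Result<u16, Error> {
    text.trim().parse::<u16>().map_err(Error::text_parse_error)
}
```

With this shape, adding a new parsed type never requires touching the
error enum itself.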
From [XEP-0004: Data Forms](https://xmpp.org/extensions/xep-0004.html#protocol-field):
> ...
> The <field/> element MAY contain any of the following child elements:
>
> <desc/>
> The XML character data of this element provides a natural-language
> description of the field, intended for presentation in a
> user-agent (e.g., as a "tool-tip", help button, or explanatory text
> provided near the field). The <desc/> element SHOULD NOT contain
> newlines (the \n and \r characters), since layout is the
> responsibility of a user agent, and any handling of
> newlines (e.g., presentation in a user interface) is unspecified
> herein. (Note: To provide a description of a field, it
> is RECOMMENDED to use a <desc/> element rather than
> a separate <field/> element of type "fixed".)
> ...
This adds shims which provide FromXml and IntoXml implementations to
*all* macro-generated types in `xmpp_parsers`. Mind that this does not
cover all types in `xmpp_parsers`, but a good share of them.
This is another first step toward real, fully streamed parsing.
[gone](https://datatracker.ietf.org/doc/html/rfc6120#section-8.3.3.5) and [redirect](https://datatracker.ietf.org/doc/html/rfc6120#section-8.3.3.14) errors may include an alternative address.
> gone
>
> The recipient or server can no longer be contacted at this address,
> typically on a permanent basis (as opposed to the <redirect/> error
> condition, which is used for temporary addressing failures); the
> associated error type SHOULD be "cancel" and the error stanza SHOULD
> include a new address (if available) as the XML character data of the
> <gone/> element (which MUST be a Uniform Resource Identifier [URI] or
> Internationalized Resource Identifier [IRI] at which the entity can
> be contacted, typically an XMPP IRI as specified in [XMPP-URI]).
—
> redirect
>
> The recipient or server is redirecting requests for this information
> to another entity, typically in a temporary fashion (as opposed to
> the <gone/> error condition, which is used for permanent addressing
> failures); the associated error type SHOULD be "modify" and the error
> stanza SHOULD contain the alternate address in the XML character data
> of the <redirect/> element (which MUST be a URI or IRI with which the
> sender can communicate, typically an XMPP IRI as specified in
> [XMPP-URI](https://datatracker.ietf.org/doc/html/rfc6120#ref-XMPP-URI)).
Looking at [the spec](https://xmpp.org/extensions/xep-0004.html#protocol-field)
it seems valid not to have a `var` attribute set, at least for fields of type
`fixed` that is:
> If the element type is anything other than "fixed" (see below), it MUST
> possess a 'var' attribute that uniquely identifies the field in the context
> of the form (if it is "fixed", it MAY possess a 'var' attribute). The element
> MAY possess a 'label' attribute that defines a human-readable name for the field.
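
The rule quoted above can be sketched as follows. `FieldType`, `Field`
and `check_field` are hypothetical names for illustration, not the
actual `xmpp_parsers` data forms API, and only a few field types are
modelled.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FieldType {
    Fixed,
    Hidden,
    TextSingle,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Field {
    /// Optional, because fields of type "fixed" MAY omit it.
    pub var: Option<String>,
    pub type_: FieldType,
    pub label: Option<String>,
}

/// Accept a missing `var` only for fields of type "fixed".
pub fn check_field(
    var: Option<&str>,
    type_: FieldType,
    label: Option<&str>,
) -> Result<Field, &'static str> {
    if var.is_none() && type_ != FieldType::Fixed {
        return Err("non-fixed field must possess a 'var' attribute");
    }
    Ok(Field {
        var: var.map(str::to_owned),
        type_,
        label: label.map(str::to_owned),
    })
}
```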
Nightly Rust complains about `cfg(..)` tests against undeclared
features and other unknown cfgs. They need to be explicitly declared
now.
The nightly/stable features don't exist, so I removed them and
substituted the currently correct number for the single test where they
were used.
The `xmpprs_doc_build` cfg flag is now declared as expectable.
The usual stance of xmpp-rs is to get buggy implementations fixed
rather than to drop checks. In this particular case I think this is not
a good use of resources:
- The disco#info feature var conveys no actual information:
  If an implementation replies properly to a disco#info query, it is
  already implied that it supports the protocol.
- There are broken server implementations out there.
  A lot of them (all recent (>= 0.10 && < 0.13 AFAICT) Prosody IM
  instances). At this point in time, xmpp-rs is unable to query
  disco#info from MUCs hosted on such Prosody versions, except by
  workarounds (such as the one removed in this diff).
- XEP-0030 now features a note which reads:

  > Note: Some entities are known not to advertise the
  > `http://jabber.org/protocol/disco#info` feature within their
  > responses, contrary to this specification. Entities receiving
  > otherwise valid responses which do not include this feature SHOULD
  > infer the support.
The case would be different if there were no (deployed) implementations
which had this bug or if the bug actually had an effect on clients.
Especially the latter is not the case though, as pointed out above.
Hence, I conclude that this check is overly pedantic and the resources
(time, emotional energy of dealing with bugs, punching patches through
to stable distributions, etc. etc.) spent on getting this fixed would
be better invested elsewhere.
In addition, the workaround is extremely ugly and, even in the xmpp-rs
implementation, has no test coverage. Without test coverage of such an
implementation, it is bound to break in funny ways when xmpp-rs changes
the strings of its error messages (which is something one might do even
outside a breaking release).
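
The lenient behaviour argued for in this message could be sketched like
this. `supports_disco_info` and its `strict` flag are hypothetical and
do not mirror the actual parser code.

```rust
pub const DISCO_INFO_NS: &str = "http://jabber.org/protocol/disco#info";

/// Decide whether the peer supports disco#info, given the feature vars
/// found in an otherwise valid disco#info result.
///
/// In strict mode the feature must be advertised; in lenient mode a
/// valid reply is taken as proof of support, as the XEP-0030 note
/// suggests.
pub fn supports_disco_info(advertised: &[&str], strict: bool) -> bool {
    if strict {
        advertised.iter().any(|var| *var == DISCO_INFO_NS)
    } else {
        true
    }
}
```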
XEP-0068 is rather explicit that `FORM_TYPE` fields which are not
`type='hidden'` MUST be ignored (in most cases, see comments inside
the code for exceptions). The previous implementation returned an error
instead (and aborted parsing with that), which is obviously not
"ignoring".
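
A rough sketch of the "ignore instead of fail" behaviour. The types and
`extract_form_type` are hypothetical, and the exceptions mentioned above
are deliberately not modelled here.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FieldType {
    Hidden,
    TextSingle,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Field<'a> {
    pub var: &'a str,
    pub type_: FieldType,
    pub value: &'a str,
}

/// Return the form type, considering only hidden `FORM_TYPE` fields.
/// A `FORM_TYPE` field of any other type is skipped, not rejected.
pub fn extract_form_type<'a>(fields: &[Field<'a>]) -> Option<&'a str> {
    fields
        .iter()
        .find(|f| f.var == "FORM_TYPE" && f.type_ == FieldType::Hidden)
        .map(|f| f.value)
}
```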
[RFC 2426][1] says:
> The binary data MUST be encoded using the "B" encoding format.
> Long lines of encoded binary data SHOULD BE folded to 75 characters
> using the folding method defined in [MIME-DIR].
That implies that whitespace may occur in binval data and we thus must
be able to parse this correctly.
[1]: https://datatracker.ietf.org/doc/html/rfc2426#section-2.4.1
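
One way to cope with folded data is to drop all whitespace before
handing the text to a Base64 decoder. This is a sketch only;
`strip_base64_whitespace` is a hypothetical helper and the decoding
step itself is left to whichever decoder is in use.

```rust
/// Remove folding whitespace (spaces, tabs, CR, LF) from Base64 text so
/// that a strict decoder can process it afterwards.
pub fn strip_base64_whitespace(input: &str) -> String {
    input
        .chars()
        .filter(|c| !c.is_ascii_whitespace())
        .collect()
}
```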
Other additional checks are already gated by the absence of this
feature. As the MR to remove these checks altogether is still blocked,
this should serve at least as an intermediate solution to anyone
affected by buggy remote implementations.
This moves InnerJid into Jid and reformulates BareJid and FullJid in
terms of Jid.
Doing this has the key advantage that FullJid and BareJid can deref to
and borrow as Jid. This, in turn, has the advantage that they can be
used much more flexibly in HashMaps. However, this is (as we say in
Germany) future music; this commit only does the internal reworking.
Oh and also, it saves 20% memory on Jid objects.
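
To illustrate where this is heading, here is a simplified sketch of a
`BareJid` wrapping a `Jid`, with `Deref` and `Borrow` so that a map
keyed by `BareJid` can be queried with a `&Jid`. The struct layout and
constructors are made up for the example and are not the real `jid`
crate internals.

```rust
use std::borrow::Borrow;
use std::collections::HashMap;
use std::ops::Deref;

/// Simplified stand-in: just the normalized string form.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct Jid {
    normalized: String,
}

impl Jid {
    pub fn new(s: &str) -> Self {
        Jid { normalized: s.to_owned() }
    }
}

/// A `Jid` which is known to have no resource part.
/// Deriving Hash/Eq on the single field keeps them identical to those
/// of `Jid`, which is what `Borrow` requires.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct BareJid {
    inner: Jid,
}

impl BareJid {
    pub fn new(s: &str) -> Option<Self> {
        if s.is_empty() || s.contains('/') {
            return None;
        }
        Some(BareJid { inner: Jid::new(s) })
    }
}

impl Deref for BareJid {
    type Target = Jid;
    fn deref(&self) -> &Jid {
        &self.inner
    }
}

impl Borrow<Jid> for BareJid {
    fn borrow(&self) -> &Jid {
        &self.inner
    }
}

/// Look up an entry in a map keyed by `BareJid` using any `&Jid`.
pub fn lookup<'a>(map: &'a HashMap<BareJid, u32>, key: &Jid) -> Option<&'a u32> {
    map.get(key)
}
```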
Fixes #122 more thoroughly, or rather the original intent behind it.
This allows constructs like:
```rust
let residual = match Iq::try_from(stanza) {
    Ok(iq) => return handle_iq(..),
    Err(Error::TypeMismatch(_, _, v)) => v,
    Err(other) => return handle_parse_error(..),
};
// `stanza` was moved above; continue with what the mismatch gave back.
let residual = match Message::try_from(residual) {
    ..
};
let residual = ..
log::warn!("unhandled object: {:?}", residual);
```
The interesting part of this is that this could be used in a loop over a
`Vec<Box<dyn FnMut(Element) -> ControlFlow<SomeResult, Element>>>`, i.e.
in a parsing loop for a generic XML/XMPP stream.
The advantage is that the stanza.is() check runs only once (in
check_self!) and doesn't need to be duplicated outside, and it reduces
the use of magic strings.
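
Such a loop might look roughly like the following. `Element` is reduced
to a bare name here, and `handler_for` and `dispatch` are hypothetical
helpers; a real handler would call `try_from` and hand back the element
from the `TypeMismatch` error.

```rust
use std::ops::ControlFlow;

/// Simplified stand-in for an XML element.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Element {
    pub name: String,
}

/// A handler either consumes the element (Break) or gives it back
/// untouched (Continue) so that the next handler can try.
pub type Handler = Box<dyn FnMut(Element) -> ControlFlow<String, Element>>;

/// Build a handler which only accepts elements with the given name.
pub fn handler_for(expected: &'static str) -> Handler {
    Box::new(move |el: Element| {
        if el.name == expected {
            ControlFlow::Break(format!("handled <{}/>", el.name))
        } else {
            ControlFlow::Continue(el)
        }
    })
}

/// Offer the element to each handler in turn; return the residual
/// element if nobody wanted it.
pub fn dispatch(handlers: &mut [Handler], element: Element) -> Result<String, Element> {
    let mut residual = element;
    for handler in handlers.iter_mut() {
        match handler(residual) {
            ControlFlow::Break(result) => return Ok(result),
            ControlFlow::Continue(el) => residual = el,
        }
    }
    Err(residual)
}
```

The element is moved through the chain without cloning; only the
handler which matches ever takes it apart.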
Since Rust 1.76 (and in nightlies from well before that release), the
niche computation has improved, which leads to smaller types which can
still encode the same data and variants.
This fixes the tests on this compiler version.
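
For readers unfamiliar with niches, a small illustration using a case
the standard library guarantees. `niche_demo` is only for
demonstration; the types affected by this commit are larger enums whose
exact sizes depend on the compiler version.

```rust
use std::mem::size_of;
use std::num::NonZeroU32;

/// Returns the sizes of `NonZeroU32`, `Option<NonZeroU32>` and
/// `Option<u32>`, in that order.
///
/// `NonZeroU32` can never be zero, so `Option` stores `None` in that
/// unused bit pattern (the niche) and needs no extra space. `u32` has
/// no unused bit pattern, so `Option<u32>` must grow.
pub fn niche_demo() -> (usize, usize, usize) {
    (
        size_of::<NonZeroU32>(),
        size_of::<Option<NonZeroU32>>(),
        size_of::<Option<u32>>(),
    )
}
```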
That one accepts both uppercase and lowercase hexadecimal input, and
outputs in lowercase.
It requires no separator between bytes, unlike ColonSeparatedHex.
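
The described behaviour, sketched with plain functions. `decode_hex`
and `encode_hex` are hypothetical names; the actual codec type in the
crate is structured differently.

```rust
/// Decode separator-less hex; accepts upper- and lowercase digits.
/// Returns `None` on odd length or on any non-hex character.
pub fn decode_hex(input: &str) -> Option<Vec<u8>> {
    let bytes = input.as_bytes();
    if bytes.len() % 2 != 0 {
        return None;
    }
    bytes
        .chunks(2)
        .map(|pair| {
            let hi = (pair[0] as char).to_digit(16)?;
            let lo = (pair[1] as char).to_digit(16)?;
            Some((hi * 16 + lo) as u8)
        })
        .collect()
}

/// Encode bytes as lowercase hex without separators.
pub fn encode_hex(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{:02x}", b)).collect()
}
```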