The traits have undergone a couple iterations and this is what we end up
with. The core issue which makes this entire thing ugly is the
Orphan Rule, preventing some trait implementations relating to types
which haven't been defined in this crate.
In an ideal world, we would implement FromXmlText and IntoXmlText for
all types implementing FromStr and/or fmt::Display.
This comes with two severe issues:
1. Downstream crates cannot chose to have different
parsing/serialisation behaviour for "normal" text vs. xml.
2. We ourselves cannot define a behaviour for `Option<T>`. `Option<T>`
does not implement `FromStr` (nor `Display`), but the standard
library *could* do that at some point, and thus Rust doesn't let us
implement e.g. `FromXmlText for Option<T> where T: FromXmlText`,
if we also implement it on `T: FromStr`.
The second one hurts particularly once we get to optional attributes:
For these, we need to "detect" that the type is in fact `Option<T>`,
because we then need to invoke `FromXmlText` on `T` instead of
`Option<T>`. Unfortunately, we cannot do that: macros operate on token
streams and we have no type information available.
We can of course match on the name `Option`, but that breaks down when
users re-import `Option` under a different name. Even just enumerating
all the possible correct ways of using `Option` from the standard
library (there are more than three) would be a nuisance at best.
Hence, we need *another* trait or at least a specialized implementation
of `FromXmlText for Option<T>`, and we cannot do that if we blanket-impl
`FromXmlText` on `T: FromStr`.
That makes the traits what they are, and introduces the requirement that
we know about any upstream crate which anyone might want to parse from
or to XML. This sucks a lot, but that's the state of the world. We are
late to the party, and we cannot expect everyone to do the same they
have done for `serde` (many crates have a `feature = "serde"` which then
provides Serialize/Deserialize trait impls for their types).
This is more in line with how we handle closely coupled specifications
already. While there are subdirectories for "large" specifications (such
as MUC and PubSub), those only refer to a single XEP document. When
there are multiple separate XEP documents, we have separate modules for
that.
There is at least one branch of the FromStr implementation which passes
user input right into the error struct, so we cannot assume that `'` is
not part of that value.
The previous code didn't build with 1.78:
```
error[E0716]: temporary value dropped while borrowed
--> parsers/src/data_forms/validate.rs:394:46
|
380 | let value = match self {
| ----- borrow later stored here
...
394 | Datatype::UserDefined(value) => &format!("x:{value}"),
| ^^^^^^^^^^^^^^^^^^^-
| | |
| | temporary value is freed at the end of this statement
| creates a temporary value which is freed while still in use
|
= note: consider using a `let` binding to create a longer lived value
= note: this error originates in the macro `format` (in Nightly builds, run with -Z macro-backtrace for more info)
error[E0716]: temporary value dropped while borrowed
--> parsers/src/data_forms/validate.rs:395:51
|
380 | let value = match self {
| ----- borrow later stored here
...
395 | Datatype::Other { prefix, value } => &format!("{prefix}:{value}"),
| ^^^^^^^^^^^^^^^^^^^^^^^^^^-
| | |
| | temporary value is freed at the end of this statement
| creates a temporary value which is freed while still in use
|
= note: consider using a `let` binding to create a longer lived value
= note: this error originates in the macro `format` (in Nightly builds, run with -Z macro-backtrace for more info)
For more information about this error, try `rustc --explain E0716`.
```
This seems like a silly reason to pull up the compiler version
requirements, so I fixed it with a trivial modification.
This got introduced in 2e3004f89e, but
there should be no reason to run ignored tests; if they are #[ignore]
it’s probably for a good reason.
This is particularly annoying with doctests, where ignore is used to
explicitly highlight broken/untested code.
Thanks jonas’ for noticing!
This is bare-bones and is missing many features which we intend to add
in future commits, such as parsing from attributes whose names differ
from the field names and parsing into non-String types.
Well, not really, of course. All of this will make sense once we start
adding support for fields and non-struct types. Refactoring the code now
before we start to add actual member field parsing is much easier.
How do I know that this will work out? Well, my crystal ball knows it.
Don't believe me? Okay, ChatGPT told me ... Alright alright, I went
through the entire process of implementing this feature *twice* at this
point and have a pretty good idea of where to draw the abstraction lines
so that everything falls neatly into place. You'll have to trust me on
this one.
(Or, you know, check out old branches in my xmpp-rs repo. That might
work, too. `feature/derive-macro-streaming-full` might be a name to look
for if you dare.)
If we are going to support structs with fields, it would be good to have
that struct-related code organised a little and less splashed over the
main lib.rs file.
This is a large change and as such, it needs good motivation. Let me
remind you of the ultimate goal: we want a derive macro which allows us
to FromXml/IntoXml, and that derive macro should be usable from
`xmpp_parsers` and other crates.
For that, any code generated by the derive macro mustn't depend on any
code in the `xmpp_parsers` crate, because you cannot name the crate you
are in portably (`xmpp_parsers::..` wouldn't resolve within
`xmpp_parsers`, and `crate::..` would point at other crates if the macro
was used in other crates).
We also want to interoperate with code already implementing
`TryFrom<Element>` and `Into<Element>` on structs. This ultimately
requires that we have an error type which is shared by the two
implementations and that error type must be declared in the `xso` crate
to be usable by the macros.
Thus, we port the error type over to use the type declared in `xso`.
This changes the structure of the error type greatly; I do not think
that `xso` should have to know about all the different types we are
parsing there and they don't deserve special treatment. Wrapping them in
a `Box<dyn ..>` seems more appropriate.
The CI makes that an error anyway, and this allows developers to start
sketching stuff without running into errors all the time (and then
adding `/// TODO` to work around them).
Just like with the builder type, the concrete iterator type on IntoXml
is supposed to be an implementation detail. That allows switching freely
between various ways to generate such a type.
From [XEP-0004: Data Forms](https://xmpp.org/extensions/xep-0004.html#protocol-field):
> ...
> The <field/> element MAY contain any of the following child elements:
>
> <desc/>
> The XML character data of this element provides a natural-language
> description of the field, intended for presentation in a
> user-agent (e.g., as a "tool-tip", help button, or explanatory text
> provided near the field). The <desc/> element SHOULD NOT contain
> newlines (the \n and \r characters), since layout is the
> responsibility of a user agent, and any handling of
> newlines (e.g., presentation in a user interface) is unspecified
> herein. (Note: To provide a description of a field, it
> is RECOMMENDED to use a <desc/> element rather than
> a separate <field/> element of type "fixed".)
> ...
This adds shims which provide FromXml and IntoXml implementations to
*all* macro-generated types in `xmpp_parsers`. Mind that this does not
cover all types in `xmpp_parsers`, but a good share of them.
This is another first step toward real, fully streamed parsing.
This library provides the traits to parse structs from XML and
serialise them into XML without having to buffer the document object
model in memory.
The only implementations it provides are for minidom, basically
providing a lower-level interface to `minidom::Element::from_reader` and
`minidom::Element::to_writer`.
This is the first stepping stone into a world where `xmpp_parsers` can
parse the structs directly from XML.
[gone](https://datatracker.ietf.org/doc/html/rfc6120#section-8.3.3.5) and [redirect](https://datatracker.ietf.org/doc/html/rfc6120#section-8.3.3.14) errors may include an alternative address.
> gone
>
> The recipient or server can no longer be contacted at this address,
> typically on a permanent basis (as opposed to the <redirect/> error
> condition, which is used for temporary addressing failures); the
> associated error type SHOULD be "cancel" and the error stanza SHOULD
> include a new address (if available) as the XML character data of the
> <gone/> element (which MUST be a Uniform Resource Identifier [URI] or
> Internationalized Resource Identifier [IRI] at which the entity can
> be contacted, typically an XMPP IRI as specified in [XMPP-URI]).
—
> redirect
>
> The recipient or server is redirecting requests for this information
> to another entity, typically in a temporary fashion (as opposed to
> the <gone/> error condition, which is used for permanent addressing
> failures); the associated error type SHOULD be "modify" and the error
> stanza SHOULD contain the alternate address in the XML character data
> of the <redirect/> element (which MUST be a URI or IRI with which the
> sender can communicate, typically an XMPP IRI as specified in
> [XMPP-URI](https://datatracker.ietf.org/doc/html/rfc6120#ref-XMPP-URI)).