object::extract: get a bunch of keys in one go #2247
the-moisrex wants to merge 18 commits into simdjson:master
This is the bare minimum implementation of this idea.
|
@the-moisrex Don't worry about the fuzzer failing, it seems like an obsolete config. |
|
Exciting work. |
|
Here are some ideas, @lemire what do you think?
tweet t;
tweet_value.extract(
    to{"created_at", t.created_at},
    to{"id", t.id},
    to{"text", t.result},
    to{"user", sub{
        to{"id", t.user.id},
        to{"screen_name", t.user.screen_name}
    }},
    to{"aliases", range(1, 4, aliases.begin())},
    to{"array", range(arr.begin(), arr.end())},
    to{"array2", range(1, 3, back_inserter(vec))}
);
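For readers following along, here is a rough, self-contained sketch of how a `to{key, target}` endpoint could be dispatched in a single pass over the fields. It is not the code in this PR: `toy_object` is a made-up stand-in for a parsed JSON object, and this toy `to` and `extract` only handle integer values and plain variables (no `sub`, no `range`).

```cpp
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical stand-in for a parsed JSON object: an ordered list of fields.
using toy_object = std::vector<std::pair<std::string_view, long>>;

// Hypothetical endpoint: binds a key to the variable that should receive it.
template <typename T>
struct to {
  std::string_view key;
  T &target;
};
// Lets `to{"id", x}` deduce T from the variable.
template <typename T>
to(const char *, T &) -> to<T>;

// Visits every field once and returns how many fields were delivered.
template <typename... Ts>
int extract(const toy_object &obj, to<Ts>... endpoints) {
  int matched = 0;
  for (const auto &[key, value] : obj) {
    // Try each endpoint in turn; the || fold stops at the first matching key.
    (void)((key == endpoints.key
                ? (endpoints.target = static_cast<Ts>(value), ++matched, true)
                : false) ||
           ...);
  }
  return matched;
}
```

The point of the shape is that the object is walked once, in document order, and keys the caller did not ask for are skipped.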
|
@the-moisrex I generally like the |
|
Added the
Now this is working:
tweet_value.extract(
    to{"created_at", t.created_at},
    to{"id", t.id},
    to{"text", t.result},
    to{"in_reply_to_status_id", [&](auto value) {
        t.in_reply_to_status_id = nullable_int(value);
    }},
    to{"retweet_count", t.retweet_count},
    to{"favorite_count", t.favorite_count},
    to{"user", sub{
        to{"id", t.user.id},
        to{"screen_name", t.user.screen_name}
    }});
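One way an endpoint can accept either a variable or a lambda, as in the call above, is to pick the behaviour at compile time with `if constexpr`. The sketch below is an assumption about the general technique, not the PR's implementation; this toy `to` only receives `long` values.

```cpp
#include <string_view>
#include <type_traits>

// Hypothetical endpoint holding either a reference to a variable or a callable.
template <typename T>
struct to {
  std::string_view key;
  T target;  // `U&` for variables, a callable type for lambdas

  // Deliver one value to the endpoint.
  void operator()(long value) {
    if constexpr (std::is_invocable_v<T &, long>) {
      target(value);  // the user's callback decides what to do with the value
    } else {
      target = static_cast<std::remove_reference_t<T>>(value);  // plain assignment
    }
  }
};
// Lvalues deduce to references; temporary lambdas are stored by value.
template <typename T>
to(const char *, T &&) -> to<T>;
```

With this, `to{"id", t.id}` writes straight into `t.id`, while `to{"id", [&](auto v) { ... }}` hands the value to the lambda.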
namespace ondemand {

#if defined(__cpp_concepts) && defined(__cpp_consteval)
#define SIMDJSON_SUPPORTS_EXTRACT 1
Should this be enabled only when SIMDJSON_EXCEPTIONS is true? Systems such as Node.js use simdjson; they compile with C++20, but they disable exceptions (at the compiler level). All of the simdjson library is designed to work well both with and without exceptions. It would be OK to have features requiring exception support, although this should be explicit.
For reference, here is how it is used in Node.js (without exceptions):
https://github.com/nodejs/node/blob/b887942e6b430e5f28ea7cb6c43841730161641c/src/node_modules.cc#L137
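To make the concern concrete, here is a generic sketch of gating a throwing convenience API behind a macro while the error-code API stays available everywhere. The names `TOY_EXCEPTIONS`, `toy_error`, and `checked_half` are invented for illustration; only `__cpp_exceptions` is a standard feature-test macro.

```cpp
#include <stdexcept>

// __cpp_exceptions is defined by the compiler when exceptions are enabled.
#if defined(__cpp_exceptions)
#define TOY_EXCEPTIONS 1
#else
#define TOY_EXCEPTIONS 0
#endif

enum class toy_error { success, negative_input };

// Always available: reports failure through the return value and never throws.
inline toy_error checked_half(int input, int &out) noexcept {
  if (input < 0) return toy_error::negative_input;
  out = input / 2;
  return toy_error::success;
}

#if TOY_EXCEPTIONS
// Convenience overload, compiled only when exceptions are enabled.
inline int checked_half(int input) {
  int out = 0;
  if (checked_half(input, out) != toy_error::success) {
    throw std::invalid_argument("negative input");
  }
  return out;
}
#endif
```

A build with exceptions disabled would still see the first overload, which is the property that matters for users like Node.js.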
We wouldn't need to specify that. Other than the one that you mentioned above (which is an easy fix), in extract, any type of error should be silently ignored; all the noexcept business is for the user's own exceptions (inside lambdas for example) not invalid json inputs.
I am not too worried about such issues, we can always figure it out in the end. :-) Let us make sure that we have the right design before worrying about a specific compiler. |
|
@lemire Is it a good idea to rename |
|
For handling of errors, we could make @lemire Which one do you prefer? I think the first one looks better, but it does mean the user will lose the information on where the error occurred. |
We have to be mindful of quality of life. If it becomes too hard to debug, it might be annoying to users. I am neutral about how we go about it; we just need some way to debug the issue that is not insane. ❤️
|
@lemire I went with the first way. |
}});
if (error) {
    return error;
}
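On the debugging concern: a single error code says what went wrong but not where. Below is a hypothetical sketch (not what this PR does) of an `extract` that stops at the first failure and also reports the offending key. All names are toy stand-ins; `SUCCESS` and `INCORRECT_TYPE` only mimic the style of simdjson's error codes.

```cpp
#include <initializer_list>
#include <string_view>
#include <utility>
#include <vector>

enum toy_error { SUCCESS = 0, INCORRECT_TYPE };

// Toy value: either a number or "something else".
struct toy_value {
  bool is_number;
  long number;
};
using toy_object = std::vector<std::pair<std::string_view, toy_value>>;

// Toy endpoint that only accepts numbers.
struct to {
  std::string_view key;
  long &target;
};

// Hypothetical result type: the error plus the key that triggered it.
struct toy_result {
  toy_error error = SUCCESS;
  std::string_view failed_key{};  // empty when error == SUCCESS
};

inline toy_result extract(const toy_object &obj, std::initializer_list<to> endpoints) {
  for (const auto &field : obj) {
    for (const to &e : endpoints) {
      if (field.first != e.key) continue;
      // Stop at the first failure and remember where it happened.
      if (!field.second.is_number) return {INCORRECT_TYPE, field.first};
      e.target = field.second.number;
      break;
    }
  }
  return {};
}
```

A caller can still write `if (result.error) { return result.error; }` and only look at `failed_key` when debugging.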
|
@the-moisrex I will dig into it soon. I invite the entire community to help review. |
|
@the-moisrex I have added the extractor to a few of our standard benchmarks, please see... |
adding benchmark to extractor
|
@lemire
|
|
@the-moisrex Very nice. Visual Studio appears to complain... :-/ |
|
It is likely that this PR is as good as it is going to get in the short term. This is very good code and it is a very good design, in my opinion. I am still opening this up to the community... please do comment if you are reading this!!! |
std::apply([&](auto &...endpoints) {
  std::ignore = ((field_key.unsafe_is_equal(endpoints.key()) ? (error = endpoints(value(iter.child()))) == SUCCESS : true) && ...);
I have no idea why MSVC gives that error; I fixed it the way programmers fix everything, adding another layer of indirection.
I could not see any visible performance drop after this though.
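For context, the pattern in the snippet above is an `&&` fold that keeps going while each step yields `true` and stops as soon as an endpoint reports an error. A self-contained sketch of that pattern, with the per-endpoint work moved into a helper function, might look like this. The helper `dispatch_one` and the toy endpoint are assumptions for illustration; the actual indirection used in the PR may differ.

```cpp
#include <string_view>
#include <tuple>

enum toy_error { SUCCESS = 0, INCORRECT_TYPE };

// Toy endpoint that rejects negative numbers.
struct positive_only {
  std::string_view key;
  long &target;
  toy_error operator()(long value) const {
    if (value < 0) return INCORRECT_TYPE;
    target = value;
    return SUCCESS;
  }
};

// The extra layer: one small function per endpoint keeps the fold itself trivial.
template <typename Endpoint>
bool dispatch_one(std::string_view field_key, long value, Endpoint &endpoint,
                  toy_error &error) {
  if (field_key != endpoint.key) return true;  // no match: keep folding
  error = endpoint(value);
  return error == SUCCESS;  // false stops the && fold
}

template <typename... Endpoints>
toy_error dispatch(std::string_view field_key, long value,
                   std::tuple<Endpoints...> &endpoints) {
  toy_error error = SUCCESS;
  std::apply(
      [&](auto &...each) {
        std::ignore = (dispatch_one(field_key, value, each, error) && ...);
      },
      endpoints);
  return error;
}
```

Since `dispatch_one` is a small template, a compiler is free to inline it, which would be consistent with seeing no visible performance drop.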
extractor PR with clangcl tweaks
|
We need documentation before merging this, but I am still opening up this issue for comments. If you use simdjson, please consider reviewing. |
|
@the-moisrex I am not verifying your good results. See |

This is the bare minimum implementation of this idea.
This makes this syntax possible:
Car car{};
error = obj.extract(
    to{"wheels", sub{
        to{"front", car.wheels.front},
        to{"back", car.wheels.back},
    }},
    to{"make", car.make},
    to{"model", car.model},
    to{"year", [&car](auto val) { car.year = val; }});
if (error) {
    return error;
}
Its performance is better than #2235, and it is roughly on par with the traditional way of doing this, though there is still room to make it faster.
I tried to make to{...} unnecessary, but it doesn't seem to work. If you know a way to spare the user from typing to every time, that would be great.
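One likely reason the bare `{"key", value}` form is hard to support: a braced list has no type of its own, so a variadic function template cannot deduce its parameter pack from it. Naming the type lets class template argument deduction do the work. The sketch below uses a simplified, hypothetical `to` and a stand-in function `count_endpoints` that only counts its arguments.

```cpp
#include <string_view>
#include <type_traits>

// Simplified, hypothetical endpoint.
template <typename T>
struct to {
  std::string_view key;
  T &target;
};
// Deduction guide: `to{"year", year}` becomes `to<int>` when `year` is an int.
template <typename T>
to(const char *, T &) -> to<T>;

// Stand-in for extract: it only counts the endpoints it was given.
template <typename... Ts>
constexpr int count_endpoints(const to<Ts> &...) {
  return static_cast<int>(sizeof...(Ts));
}

// count_endpoints({"year", year}) would not compile: a bare braced list is a
// non-deduced context, so the compiler cannot work out the pack Ts.
```

So `count_endpoints(to{"year", year}, to{"price", price})` deduces `Ts` as `int, double`, while dropping `to` leaves the compiler with nothing to deduce from.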