simdjson alternatives and similar libraries
Based on the "JSON" category.
- JSMN: Jsmn is a world fastest JSON parser/tokenizer. This is the official repo replacing the old one at Bitbucket.
- json-c: https://github.com/json-c/json-c is the official code repository for json-c. See the wiki for release tarballs for download. API docs at http://json-c.github.io/json-c/
- frozen: JSON parser and generator for C/C++ with scanf/printf like interface. Targeting embedded systems.
- json_dto: A small header-only library for converting data between json representation and c++ structs.
- Jsonifier: A few classes for extremely fast json parsing/serializing in modern C++. Possibly the fastest json parser in C++. Possibly the fastest json serializer in C++.
README
simdjson : Parsing gigabytes of JSON per second
JSON is everywhere on the Internet. Servers spend a lot of time parsing it. We need a fresh approach. The simdjson library uses commonly available SIMD instructions and microparallel algorithms to parse JSON 4x faster than RapidJSON and 25x faster than JSON for Modern C++.
- Fast: Over 4x faster than commonly used production-grade JSON parsers.
- Record Breaking Features: Minify JSON at 6 GB/s, validate UTF-8 at 13 GB/s, NDJSON at 3.5 GB/s.
- Easy: First-class, easy to use and carefully documented APIs.
- Strict: Full JSON and UTF-8 validation, lossless parsing. Performance with no compromises.
- Automatic: Selects a CPU-tailored parser at runtime. No configuration needed.
- Reliable: From memory allocation to error handling, simdjson's design avoids surprises.
- Peer Reviewed: Our research appears in venues like VLDB Journal, Software: Practice and Experience.
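To make the "minify" feature above concrete, here is a minimal scalar sketch of what minifying JSON means: dropping whitespace that sits outside string literals. The function name `minify_scalar` is hypothetical and not part of simdjson. The code processes one byte at a time and assumes the input is already valid JSON; it does not attempt to reproduce the SIMD approach or the throughput quoted above.

```cpp
#include <string>
#include <string_view>

// Hypothetical helper (not a simdjson API): removes insignificant whitespace
// from JSON text. Assumes the input is valid JSON.
std::string minify_scalar(std::string_view json) {
    std::string out;
    out.reserve(json.size());
    bool in_string = false;  // currently inside a "..." literal?
    bool escaped = false;    // previous byte was an unconsumed backslash?
    for (char c : json) {
        if (in_string) {
            out.push_back(c);  // everything inside a string is kept verbatim
            if (escaped) {
                escaped = false;
            } else if (c == '\\') {
                escaped = true;
            } else if (c == '"') {
                in_string = false;
            }
        } else if (c == '"') {
            in_string = true;
            out.push_back(c);
        } else if (c != ' ' && c != '\t' && c != '\n' && c != '\r') {
            out.push_back(c);  // keep every non-whitespace byte
        }
    }
    return out;
}
```

Whitespace inside strings, including whitespace that follows an escaped quote, is preserved because the state machine only leaves the string on an unescaped `"`.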
This library is part of the Awesome Modern C++ list.
Table of Contents
- Quick Start
- Documentation
- Performance results
- Real-world usage
- Bindings and Ports of simdjson
- About simdjson
- Funding
- Contributing to simdjson
- License
Quick Start
The simdjson library is easily consumable with a single .h and .cpp file.
- Prerequisites: g++ (version 7 or better) or clang++ (version 6 or better), and a 64-bit system with a command-line shell (e.g., Linux, macOS, FreeBSD). We also support programming environments like Visual Studio and Xcode, but different steps are needed.
- Pull [simdjson.h](singleheader/simdjson.h) and [simdjson.cpp](singleheader/simdjson.cpp) into a directory, along with the sample file [twitter.json](jsonexamples/twitter.json).
wget https://raw.githubusercontent.com/simdjson/simdjson/master/singleheader/simdjson.h https://raw.githubusercontent.com/simdjson/simdjson/master/singleheader/simdjson.cpp https://raw.githubusercontent.com/simdjson/simdjson/master/jsonexamples/twitter.json
- Create quickstart.cpp:
#include <iostream>
#include "simdjson.h"
using namespace simdjson;
int main(void) {
    ondemand::parser parser;
    padded_string json = padded_string::load("twitter.json");
    ondemand::document tweets = parser.iterate(json);
    std::cout << uint64_t(tweets["search_metadata"]["count"]) << " results." << std::endl;
}
c++ -o quickstart quickstart.cpp simdjson.cpp
./quickstart
100 results.
Documentation
Usage documentation is available:
- [Basics](doc/basics.md) is an overview of how to use simdjson and its APIs.
- [Performance](doc/performance.md) shows some more advanced scenarios and how to tune for them.
- [Implementation Selection](doc/implementation-selection.md) describes runtime CPU detection and how you can work with it.
- API contains the automatically generated API documentation.
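As a rough illustration of what runtime implementation selection involves, the sketch below picks one of two kernels once, based on a CPU feature check, and then calls it through a function pointer. All names here (`implementation`, `active_implementation`, `count_quotes_*`) are hypothetical and are not simdjson's API; the "avx2" kernel is a plain C++ stand-in rather than real vector code. The feature check relies on the GCC/Clang builtin `__builtin_cpu_supports` and falls back to the portable kernel elsewhere.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical kernel signature: counts '"' bytes in a buffer.
using count_quotes_fn = std::size_t (*)(std::string_view);

// Portable kernel that works on any CPU.
std::size_t count_quotes_fallback(std::string_view s) {
    std::size_t n = 0;
    for (char c : s) n += (c == '"');
    return n;
}

// Stand-in for a CPU-specific kernel. A real library would compile this
// with vector instructions; here it is ordinary 4-way unrolled C++.
std::size_t count_quotes_unrolled(std::string_view s) {
    std::size_t n = 0, i = 0;
    for (; i + 4 <= s.size(); i += 4) {
        n += (s[i] == '"') + (s[i + 1] == '"') + (s[i + 2] == '"') + (s[i + 3] == '"');
    }
    for (; i < s.size(); ++i) n += (s[i] == '"');
    return n;
}

struct implementation {
    const char* name;
    count_quotes_fn count_quotes;
};

// Assumption: GCC or Clang on x86; other platforms report "no AVX2".
bool cpu_has_avx2() {
#if (defined(__x86_64__) || defined(__i386__)) && (defined(__GNUC__) || defined(__clang__))
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx2") != 0;
#else
    return false;
#endif
}

// The selection runs once, on first use; later calls reuse the result.
const implementation& active_implementation() {
    static const implementation chosen =
        cpu_has_avx2() ? implementation{"avx2-placeholder", count_quotes_unrolled}
                       : implementation{"fallback", count_quotes_fallback};
    return chosen;
}
```

Callers go through `active_implementation().count_quotes(...)` and never need to know which kernel was chosen, which is the property the "no configuration needed" claim depends on.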
Performance results
The simdjson library uses three-quarters fewer instructions than the state-of-the-art parser RapidJSON. To our knowledge, simdjson is the first fully-validating JSON parser to run at gigabytes per second (GB/s) on commodity processors. It can parse millions of JSON documents per second on a single core.
The following figure represents parsing speed in GB/s for parsing various files on an Intel Skylake processor (3.4 GHz) using the GNU GCC 10 compiler (with the -O3 flag). We compare against the best and fastest C++ libraries on benchmarks that load and process the data. The simdjson library offers full Unicode (UTF-8) validation and exact number parsing.
The simdjson library offers high speed whether it processes tiny files (e.g., 300 bytes) or larger files (e.g., 3 MB). The following plot presents parsing speed for synthetic files over various sizes generated with a script on a 3.4 GHz Skylake processor (GNU GCC 9, -O3).
All our experiments are reproducible.
For NDJSON files, we can exceed 3 GB/s with our multithreaded parsing functions.
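For readers unfamiliar with NDJSON (newline-delimited JSON), the sketch below shows only the splitting step: cutting one buffer into per-document views at raw newlines. The function `split_ndjson` is a hypothetical helper, not a simdjson API, and it does not parse or validate the documents. It assumes each document sits on its own line, which holds for NDJSON because newlines inside JSON strings must be escaped.

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical helper: returns one view per non-empty line of the buffer.
// The views point into `buffer`, so the buffer must outlive the result.
std::vector<std::string_view> split_ndjson(std::string_view buffer) {
    std::vector<std::string_view> docs;
    std::size_t start = 0;
    while (start < buffer.size()) {
        std::size_t end = buffer.find('\n', start);
        if (end == std::string_view::npos) end = buffer.size();
        std::string_view line = buffer.substr(start, end - start);
        // Tolerate Windows-style "\r\n" line endings.
        if (!line.empty() && line.back() == '\r') line.remove_suffix(1);
        if (!line.empty()) docs.push_back(line);  // skip blank lines
        start = end + 1;
    }
    return docs;
}
```

Because the resulting views are independent, each one could be handed to a parser on its own, which is what makes this format a natural fit for batch or multithreaded processing.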
Real-world usage
If you are planning to use simdjson in a product, please work from one of our releases.
Bindings and Ports of simdjson
We distinguish between "bindings" (which just wrap the C++ code) and a port to another programming language (which reimplements everything).
- ZippyJSON: Swift bindings for the simdjson project.
- libpy_simdjson: high-speed Python bindings for simdjson using libpy.
- pysimdjson: Python bindings for the simdjson project.
- cysimdjson: high-speed Python bindings for the simdjson project.
- simdjson-rs: Rust port.
- simdjson-rust: Rust wrapper (bindings).
- SimdJsonSharp: C# version for .NET Core (bindings and full port).
- simdjson_nodejs: Node.js bindings for the simdjson project.
- simdjson_php: PHP bindings for the simdjson project.
- simdjson_ruby: Ruby bindings for the simdjson project.
- fast_jsonparser: Ruby bindings for the simdjson project.
- simdjson-go: Go port using Golang assembly.
- rcppsimdjson: R bindings.
- simdjson_erlang: Erlang bindings.
- lua-simdjson: Lua bindings.
- hermes-json: Haskell bindings.
- simdjzon: Zig port.
About simdjson
The simdjson library takes advantage of modern microarchitectures, parallelizing with SIMD vector instructions, reducing branch misprediction, and reducing data dependency to take advantage of each CPU's multiple execution cores.
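To give a feel for the "fewer branches" idea, here is a scalar sketch that classifies up to 64 bytes and records the positions of JSON structural characters in a bitmask, without an if/else per byte. It is an illustration under simplifying assumptions, not simdjson's actual code: the names `structural_mask` and `count_bits` are hypothetical, string contents are not excluded (a `,` inside a string is flagged too), and a real SIMD implementation would classify many bytes per instruction rather than looping.

```cpp
#include <cstddef>
#include <cstdint>
#include <string_view>

// Returns a mask in which bit i is set when block[i] is one of { } [ ] : ,
// Only the first 64 bytes of `block` are examined.
std::uint64_t structural_mask(std::string_view block) {
    std::uint64_t mask = 0;
    for (std::size_t i = 0; i < block.size() && i < 64; ++i) {
        const char c = block[i];
        // Bitwise | (not ||) combines the 0/1 comparison results without
        // short-circuiting, so no data-dependent branch is needed per byte.
        const std::uint64_t hit = (c == '{') | (c == '}') | (c == '[') |
                                  (c == ']') | (c == ':') | (c == ',');
        mask |= hit << i;
    }
    return mask;
}

// Counts set bits by clearing the lowest set bit each iteration, so the
// loop runs once per structural character rather than once per byte.
std::size_t count_bits(std::uint64_t mask) {
    std::size_t n = 0;
    while (mask != 0) {
        mask &= mask - 1;
        ++n;
    }
    return n;
}
```

The design point is that later stages can iterate over the set bits of the mask instead of re-inspecting every input byte.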
Some people enjoy reading our paper. A description of the design and implementation of simdjson is in our research article:
- Geoff Langdale, Daniel Lemire, Parsing Gigabytes of JSON per Second, VLDB Journal 28 (6), 2019.
We have an in-depth paper focused on the UTF-8 validation:
- John Keiser, Daniel Lemire, Validating UTF-8 In Less Than One Instruction Per Byte, Software: Practice & Experience 51 (5), 2021.
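As background for what UTF-8 validation has to check, the following is a plain scalar reference sketch. It is not the SIMD algorithm described in the paper above, and `is_valid_utf8` is a hypothetical name rather than a simdjson function. It rejects stray or missing continuation bytes, truncated sequences, overlong encodings, surrogate code points, and values above U+10FFFF.

```cpp
#include <cstddef>
#include <cstdint>
#include <string_view>

// Hypothetical scalar validator: returns true when `s` is well-formed UTF-8.
bool is_valid_utf8(std::string_view s) {
    const auto* p = reinterpret_cast<const unsigned char*>(s.data());
    const std::size_t n = s.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char b = p[i];
        if (b < 0x80) {  // ASCII fast path
            ++i;
            continue;
        }
        std::size_t len = 0;
        std::uint32_t cp = 0;
        if ((b & 0xE0) == 0xC0) {
            len = 2; cp = b & 0x1F;
        } else if ((b & 0xF0) == 0xE0) {
            len = 3; cp = b & 0x0F;
        } else if ((b & 0xF8) == 0xF0) {
            len = 4; cp = b & 0x07;
        } else {
            return false;  // stray continuation byte or invalid lead byte
        }
        if (i + len > n) return false;  // sequence truncated at end of input
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char c = p[i + k];
            if ((c & 0xC0) != 0x80) return false;  // not a continuation byte
            cp = (cp << 6) | (c & 0x3F);
        }
        // Reject overlong encodings: each length has a minimum code point.
        if (len == 2 && cp < 0x80) return false;
        if (len == 3 && cp < 0x800) return false;
        if (len == 4 && cp < 0x10000) return false;
        if (cp > 0x10FFFF) return false;                 // beyond Unicode range
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // UTF-16 surrogates
        i += len;
    }
    return true;
}
```

A byte-at-a-time validator like this branches several times per character, which is the cost that a vectorized approach aims to avoid.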
We also have an informal blog post providing some background and context.
For the video inclined, there is also a recorded talk about simdjson (it was the best-voted talk, and we're kinda proud of it).
Funding
The work is supported by the Natural Sciences and Engineering Research Council of Canada under grant number RGPIN-2017-03910.
Contributing to simdjson
Head over to [CONTRIBUTING.md](CONTRIBUTING.md) for information on contributing to simdjson, and [HACKING.md](HACKING.md) for information on source, building, and architecture/design.
License
This code is made available under the Apache License 2.0.
Under Windows, we build some tools using the windows/dirent_portable.h file (which is outside our library code): it is under the liberal (business-friendly) MIT license.
For compilers that do not support C++17, we bundle the string-view library which is published under the Boost license. Like the Apache license, the Boost license is a permissive license allowing commercial redistribution.
For efficient number serialization, we bundle Florian Loitsch's implementation of the Grisu2 algorithm for binary to decimal floating-point numbers. The implementation was slightly modified by the JSON for Modern C++ library. Both Florian Loitsch's implementation and JSON for Modern C++ are provided under the MIT license.
For runtime dispatching, we use some code from the PyTorch project licensed under 3-clause BSD.