MeTA v1.3 Release Notes
🆕 New features
- ➕ additions to the graph library:
  - myopic search
  - BFS
  - preferential attachment graph generation model (supports node attractiveness from different distributions)
  - betweenness centrality
  - eigenvector centrality
- ➕ added a new natural language parsing library:
  - parse tree library (visitor-based)
  - shift-reduce constituency parser for generating phrase structure trees
  - reimplementation of evalb metrics for evaluating parsers
  - new filter for Penn Treebank-style normalization
- ➕ added a greedy averaged Perceptron-based tagger
- demo application for various basic text processing (profile)
- 👍 basic iostreams that support gzip compression (if compiled with ZLib support)
- ➕ added iteration method for `stats::multinomial` seen events
- ➕ added expected value and entropy functions to `stats` namespace
- ➕ added `linear_model`: a generic multiclass classifier storage class
- added `gz_corpus`: a compressed version of `line_corpus`
- ➕ added macros for generating type safe identifiers with user defined literal suffixes
- ➕ added a persistent stack data structure to `meta::util`
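
To illustrate what "persistent" means for the stack mentioned in the last item: pushing or popping returns a new stack and leaves the old one usable, with the two versions sharing their common nodes. The following is a minimal sketch of that idea only. The class name, member names, and the `std::shared_ptr`-based layout are assumptions made for illustration and are not taken from the `meta::util` implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical illustration of a persistent (immutable) stack.
// This is NOT the interface of the class shipped in meta::util.
template <class T>
class persistent_stack
{
  public:
    persistent_stack() = default;

    // Returns a new stack with `value` on top; *this is left unchanged.
    persistent_stack push(T value) const
    {
        return persistent_stack{
            std::make_shared<node>(std::move(value), head_), size_ + 1};
    }

    // Returns a new stack without the top element; *this is left unchanged.
    persistent_stack pop() const
    {
        assert(!empty());
        return persistent_stack{head_->next, size_ - 1};
    }

    // Read-only access to the top element.
    const T& top() const
    {
        assert(!empty());
        return head_->value;
    }

    bool empty() const { return head_ == nullptr; }
    std::size_t size() const { return size_; }

  private:
    // Nodes are immutable and shared between all versions of the stack.
    struct node
    {
        node(T v, std::shared_ptr<const node> n)
            : value(std::move(v)), next(std::move(n))
        {
        }
        T value;
        std::shared_ptr<const node> next;
    };

    persistent_stack(std::shared_ptr<const node> head, std::size_t size)
        : head_(std::move(head)), size_(size)
    {
    }

    std::shared_ptr<const node> head_;
    std::size_t size_ = 0;
};
```

Because every version only holds a pointer to its top node, keeping many versions alive costs one node per push, which is what makes this kind of structure attractive for search procedures that branch from a shared state.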
✨ Enhancements
- ➕ added `operator==` for `util::optional<T>`
- 👍 better CMake support for building the libsvm modules
- 👍 better CMake support for downloading unit-test data
- 👌 improved setup guide in README (for OS X, Ubuntu, Arch, and EWS/ENGRIT)
- 🔨 tree analyzers refactored to use the new parser library (removes dependency on outside toolkits for generating tree files)
- 🚚 analyzers that are not part of the "core" have been moved into their respective folders (so `ngram_pos_analyzer` is in `src/sequence`, `tree_analyzer` is in `src/parser`)
- `make_index` now checks if the files exist before loading an index, and if they are missing creates a new one (as opposed to just throwing an exception on a nonexistent file)
- ⬆️ cpptoml upgraded to support TOML v0.4.0
- ⚠ enable extra warnings (-Wextra) for clang++ and g++
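
The `make_index` change above amounts to a "load if present, otherwise create" pattern. The sketch below shows that pattern in isolation. The helper names `all_files_exist` and `load_or_create`, and the idea of passing the load and create steps as callables, are hypothetical and do not reflect the real `make_index` signature, which these notes do not show.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: true if every listed file can be opened for reading.
inline bool all_files_exist(const std::vector<std::string>& paths)
{
    for (const auto& path : paths)
    {
        std::ifstream in{path};
        if (!in)
            return false;
    }
    return true;
}

// Hypothetical wrapper: load an existing index when all of its files are
// present; otherwise build a fresh one instead of throwing.
template <class Index, class Loader, class Creator>
Index load_or_create(const std::vector<std::string>& required_files,
                     Loader&& load, Creator&& create)
{
    if (all_files_exist(required_files))
        return load();
    return create();
}
```

Checking up front keeps the "file is missing" case out of the exception path, so exceptions remain reserved for files that exist but cannot be read correctly.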
🐛 Bug fixes
- 🛠 fix `sequence_analyzer::analyze() const` when applied to untagged sequences (was throwing when it shouldn't)
- ensure that the inverted index object is destroyed first before uninverting occurs in the creation of a `forward_index`
- 🛠 fix bug where `icu_tokenizer` would output spaces as tokens
- 🛠 fix bugs where index objects were not destroyed before trying to delete their files in the unit tests
- 🛠 fix bug in `sparse_vector::find()` where it would return a non-end iterator when asked to find an element that does not exist
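
The `find()` fix in the last item is easier to picture with a small example. The sketch below assumes a sparse vector stored as index/value pairs sorted by index and searched with `std::lower_bound`. That layout is an assumption for illustration, and the notes do not state the actual cause of the bug. Under that assumption, a binary search returns the first element whose index is not less than the query, so omitting the equality check would hand back a valid iterator to a different element.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Hypothetical minimal sparse vector: (index, value) pairs kept sorted by
// index. This is NOT MeTA's sparse_vector; it only illustrates find().
template <class Index, class Value>
class sparse_vector
{
  public:
    using pair_type = std::pair<Index, Value>;
    using const_iterator = typename std::vector<pair_type>::const_iterator;

    // Inserts a new entry or overwrites the value stored at `idx`.
    void set(const Index& idx, Value val)
    {
        auto it = std::lower_bound(
            storage_.begin(), storage_.end(), idx,
            [](const pair_type& p, const Index& i) { return p.first < i; });
        if (it != storage_.end() && it->first == idx)
            it->second = std::move(val);
        else
            storage_.emplace(it, idx, std::move(val));
    }

    // Returns an iterator to the entry for `idx`, or end() if absent.
    const_iterator find(const Index& idx) const
    {
        auto it = std::lower_bound(
            storage_.begin(), storage_.end(), idx,
            [](const pair_type& p, const Index& i) { return p.first < i; });
        // lower_bound only guarantees it->first >= idx, so the equality
        // check is what prevents returning a neighbouring element.
        if (it != storage_.end() && it->first == idx)
            return it;
        return storage_.end();
    }

    const_iterator end() const { return storage_.end(); }

  private:
    std::vector<pair_type> storage_;
};
```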