MeTA v1.3 Release Notes

  • New features

    • additions to the graph library:
      • myopic search
      • BFS
      • preferential attachment graph generation model (supports node attractiveness from different distributions)
      • betweenness centrality
      • eigenvector centrality
    • added a new natural language parsing library:
      • parse tree library (visitor-based)
      • shift-reduce constituency parser for generating phrase structure trees
      • reimplementation of evalb metrics for evaluating parsers
      • new filter for Penn Treebank-style normalization
    • added a greedy averaged Perceptron-based tagger
    • added a demo application (profile) for various basic text processing tasks
    • basic iostreams that support gzip compression (if compiled with ZLib support)
    • added iteration method for stats::multinomial seen events
    • added expected value and entropy functions to stats namespace
    • added linear_model: a generic multiclass classifier storage class
    • added gz_corpus: a compressed version of line_corpus
    • added macros for generating type safe identifiers with user defined literal suffixes
    • added a persistent stack data structure to meta::util
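
For readers unfamiliar with the term, a persistent stack is one where push and pop return a new stack and leave the old one usable, with the two versions sharing their common tail. The following is a minimal sketch of that idea, assuming a singly linked list of shared nodes. The class name `persistent_stack` and its members are hypothetical and are not taken from the meta::util interface.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical illustration of a persistent (immutable) stack.
// This is NOT the interface of the data structure shipped in meta::util.
template <class T>
class persistent_stack
{
  public:
    persistent_stack() = default;

    // Returns a new stack with `value` on top; *this is left unchanged.
    persistent_stack push(T value) const
    {
        auto head = std::make_shared<const node>(node{std::move(value), head_});
        return persistent_stack{std::move(head), size_ + 1};
    }

    // Returns a new stack without the top element; *this is left unchanged.
    persistent_stack pop() const
    {
        assert(!empty());
        return persistent_stack{head_->next, size_ - 1};
    }

    // Reads the top element; the stack must not be empty.
    const T& top() const
    {
        assert(!empty());
        return head_->value;
    }

    bool empty() const { return head_ == nullptr; }
    std::size_t size() const { return size_; }

  private:
    // Nodes are immutable and reference counted, so several stack
    // versions can safely share the same tail.
    struct node
    {
        T value;
        std::shared_ptr<const node> next;
    };

    persistent_stack(std::shared_ptr<const node> head, std::size_t size)
        : head_{std::move(head)}, size_{size}
    {
    }

    std::shared_ptr<const node> head_;
    std::size_t size_ = 0;
};
```

Because every version stays valid, a search procedure can keep one stack per candidate state without copying the shared prefix. Whether the library's own structure is used this way is not stated in these notes.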

  • Enhancements

    • added operator== for util::optional<T>
    • better CMake support for building the libsvm modules
    • better CMake support for downloading unit-test data
    • improved setup guide in README (for OS X, Ubuntu, Arch, and EWS/ENGRIT)
    • tree analyzers refactored to use the new parser library (removes dependency on outside toolkits for generating tree files)
    • analyzers that are not part of the "core" have been moved into their respective folders (so ngram_pos_analyzer is in src/sequence, tree_analyzer is in src/parser)
    • make_index now checks if the files exist before loading an index, and if they are missing creates a new one (as opposed to just throwing an exception on a nonexistent file)
    • cpptoml upgraded to support TOML v0.4.0
    • enable extra warnings (-Wextra) for clang++ and g++
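
The make_index change listed above amounts to a load-or-create fallback: check that the index files are present, load if they are, and build a fresh index otherwise. A rough sketch of that control flow follows. The helper names `all_files_exist` and `load_or_create`, and the use of callables for loading and creating, are assumptions made for illustration and do not reflect the actual make_index signature.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: true only if every path can be opened for reading.
inline bool all_files_exist(const std::vector<std::string>& paths)
{
    for (const auto& path : paths)
    {
        std::ifstream in{path};
        if (!in)
            return false;
    }
    return true;
}

// Hypothetical sketch of the fallback: call `load` when all required files
// are present, otherwise call `create` instead of failing on a missing file.
// Both callables are assumed to return the same type.
template <class Load, class Create>
auto load_or_create(const std::vector<std::string>& required_files,
                    Load&& load, Create&& create) -> decltype(load())
{
    if (all_files_exist(required_files))
        return load();
    return create();
}
```

In this sketch an empty list of required files counts as "everything present", so `load` is called. A real implementation would also need to decide what to do with a partially written index.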

  • Bug fixes

    • fix sequence_analyzer::analyze() const when applied to untagged sequences (was throwing when it shouldn't)
    • ensure that the inverted index object is destroyed before uninverting occurs in the creation of a forward_index
    • fix bug where icu_tokenizer would output spaces as tokens
    • fix bugs where index objects were not destroyed before trying to delete their files in the unit tests
    • fix bug in sparse_vector::find() where it would return a non-end iterator when asked to find an element that does not exist
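
The sparse_vector::find() fix above describes a classic lookup pitfall. One way such a bug can arise (an assumption here, since the notes do not give the cause) is a binary search over sorted (index, value) pairs that returns the lower-bound position without checking that the key actually matches. The sketch below uses a hypothetical `toy_sparse_vector`, which is not MeTA's sparse_vector, to show a find() that does perform that check.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical toy container for illustration; not MeTA's sparse_vector.
class toy_sparse_vector
{
  public:
    using pair_type = std::pair<std::size_t, double>;
    using const_iterator = std::vector<pair_type>::const_iterator;

    // Inserts or overwrites the value at `index`, keeping entries sorted.
    void set(std::size_t index, double value)
    {
        auto it = std::lower_bound(entries_.begin(), entries_.end(), index,
                                   by_index);
        if (it != entries_.end() && it->first == index)
            it->second = value;
        else
            entries_.emplace(it, index, value);
    }

    // Returns an iterator to the entry for `index`, or end() if absent.
    const_iterator find(std::size_t index) const
    {
        auto it = std::lower_bound(entries_.begin(), entries_.end(), index,
                                   by_index);
        // lower_bound yields the first entry whose index is >= `index`,
        // which may belong to a different key. Returning it unchecked
        // would hand back a non-end iterator for a missing element.
        if (it != entries_.end() && it->first == index)
            return it;
        return entries_.end();
    }

    // Reads the value at `index`; the entry must exist.
    double at(std::size_t index) const
    {
        auto it = find(index);
        assert(it != end());
        return it->second;
    }

    const_iterator end() const { return entries_.end(); }

  private:
    static bool by_index(const pair_type& entry, std::size_t key)
    {
        return entry.first < key;
    }

    std::vector<pair_type> entries_;
};
```

With entries at indices 2 and 7, looking up index 5 lands on the entry for 7 inside lower_bound, and the equality check is what turns that into end().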