Skip to contents

Overview

There are lots of ways in R to modify categorical variables and to redistribution numeric values between categories. If you are performing a simple recoding of category labels, or collapsing multiple categories, you might use the forcats package. However, if you are doing more complex transformations you might find yourself writing custom functions or scripts including mutating joins, grouped summary operations, or case-wise transformation of numeric values. Verifying these data wrangling scripts and pipelines becomes more difficult as the number of categories and the complexity of the mappings increases. Current solutions to this issue mostly involve ad hoc data validation of the data before and after they are transformed (assertr, validate, pointblank).

The xmap package offers an alternative approach to ensuring your code performs the intended transformations. Instead of inspecting the data, the package provides tools for validating the mapping objects which are used to transform the data. Examples of mapping objects and available verification functions include:

  • Named vectors or lists
    • Commonly used as reference inputs for recoding or collapsing categories.
    • Use verify_named() for checking properties such as uniqueness or 1-to-1 relations,
    • and verify_named_matchset() for checking the set of names or values matches expectations.
  • Lookup tables:
    • Also known as crosswalks and concordance tables
    • Use verify_pairs() for checking uniqueness and 1-to-1 relations.
  • Crossmaps:
    • a new graph-based extension of Crosswalk tables that also store redistribution weights for ambiguous 1-to-many relations.
    • Use verify_links_as_xmap() to check aggregation or disaggregation weights and other desirable properties.

See vignette("xmap") to get started using verification functions in your existing workflows. The functions are based on results obtained by representing and analysing recoding or redistribution transformations as directed, weighted bipartite graphs (i.e. “Crossmaps”). For more information about this underlying graph structure, and the experimental xmap_df class, see vignette("making-xmaps") and vignette("vis-xmaps").

Installation

To install the latest release of xmap:

remotes::install_github("cynthiahqy/xmap")

To install the latest development version of xmap:

remotes::install_github("cynthiahqy/conformr-xmap-project", subdir = "xmap")