janitor 1.0.0 (2018-03-17)

Release summary

A stable version 1.0.0, with a new tabyl API and with breaking changes to the output of clean_names().

This preserves the original functionality of janitor, but significantly changes the implementation.

Breaking changes

A fully-overhauled tabyl

There is now a single function, tabyl(), that counts combinations of one, two, or three variables, à la base R’s table(). It replaces the crosstab() function. The resulting tabyl data.frames can be manipulated and formatted using a family of adorn_ functions. See the tabyls vignette for more.

The now-redundant legacy functions crosstab() and adorn_crosstab() have been deprecated, but remain in the package for now. Existing code that relies on tabyl will break if the sort argument is used, as that argument no longer exists in tabyl (use dplyr::arrange() instead).
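As a rough illustration of the new workflow, the sketch below counts one and two variables with tabyl(), orders the result with dplyr::arrange() in place of the removed sort argument, and chains a few adorn_ functions. It assumes the dplyr package is installed and uses the built-in mtcars data; the particular combination of adorn_ calls is only one plausible choice, not a prescribed recipe.

```r
library(janitor)
library(dplyr)  # assumed available; used here only for arrange() and desc()

# One variable: counts and percentages for each value of cyl.
one_way <- mtcars %>% tabyl(cyl)

# Stand-in for the removed sort argument: order by count explicitly.
sorted <- one_way %>% arrange(desc(n))

# Two variables: the job previously done by crosstab().
two_way <- mtcars %>% tabyl(cyl, gear)

# Chain adorn_ functions to add a totals row and format for display.
formatted <- two_way %>%
  adorn_totals("row") %>%
  adorn_percentages("row") %>%
  adorn_pct_formatting(digits = 1) %>%
  adorn_ns()

print(sorted)
print(formatted)
```

Each adorn_ step takes a data.frame and returns one, so steps can be added or dropped without changing the rest of the pipeline.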

Breaking improvements to clean_names

clean_names() now detects and preserves camelCase inputs, allows multiple options for case outputs of the cleaned data.frame, and preserves whether there’s space between letters and numbers. It also transliterates accented letters and turns # into "number". These changes may break old code. For example, the raw column name variableName is now converted to variable_name (or variableName, VariableName, etc., depending on your preference), where it would previously have been converted to variablename. To minimize the inconvenience, there’s a quick compatibility fix: find-and-replace to insert the argument case = "old_janitor", which preserves the behavior of clean_names() as of janitor version 0.3.1, so you don’t have to rework your scripts beyond that.
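A minimal sketch of the difference, using a hypothetical one-column data.frame named raw (the data are made up for illustration; the column name is the one from the example above):

```r
library(janitor)

# Hypothetical input with a camelCase column name.
raw <- data.frame(variableName = 1:2, check.names = FALSE)

# New default: camelCase is detected and split into snake_case.
new_names <- names(clean_names(raw))

# Compatibility option: reproduce the 0.3.1 behavior.
old_names <- names(clean_names(raw, case = "old_janitor"))

print(new_names)  # "variable_name"
print(old_names)  # "variablename"
```

Adding case = "old_janitor" to existing calls should keep downstream column references working while the rest of a script is migrated.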

Major Features

Minor Features

Bug fixes


janitor 0.3.1 (2018-01-04)

Release summary

This is a bug-fix release with no new functionality or changes. It fixes a bug where adorn_crosstab() failed if the tibble package was version > 1.4.

Major changes to janitor are currently in development on GitHub and will be released soon. This is not that next big release.


janitor 0.3.0 (2017-05-06)

Release summary

The primary purpose of this release is to maintain accuracy given breaking changes to the dplyr package, upon which janitor is built, in dplyr version >0.6.0. This update also contains a number of minor improvements.

Critical: if you update the package dplyr to version >0.6.0, you must update janitor to version 0.3.0 to ensure accurate results from janitor’s tabyl() function. This is due to a change in the behavior of dplyr’s _join functions (discussed in #111).

janitor 0.3.0 is compatible with this new version of dplyr as well as old versions of dplyr back to 0.5.0. That is, updating janitor to 0.3.0 does not necessitate an update to dplyr >0.6.0.

Breaking changes

  • The functions add_totals_row() and add_totals_col() were combined into a single function, adorn_totals() (#57). The add_totals_ functions are now deprecated and should not be used.
  • The first argument of adorn_crosstab() is now “dat” instead of “crosstab”, indicating that the function can be called on any data.frame, not just a result of crosstab().
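For orientation, a small sketch of adorn_totals() standing in for the two deprecated functions. The counts data.frame is hypothetical, and the direction is passed positionally because the name of that argument may differ between janitor versions.

```r
library(janitor)

# Hypothetical data.frame: a label column followed by numeric columns.
counts <- data.frame(
  group = c("a", "b"),
  x = c(1, 2),
  y = c(3, 4),
  stringsAsFactors = FALSE
)

# Formerly add_totals_row(): append a totals row.
with_row <- adorn_totals(counts, "row")

# Formerly add_totals_col(): append a totals column.
with_col <- adorn_totals(counts, "col")

print(with_row)
print(with_col)
```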

Major Features

  • Exported the %>% pipe from magrittr (#107).

Deprecated the following functions:

Minor Features

  • adorn_totals() and ns_to_percents() can now be called on data.frames that have non-numeric columns beyond the first one (those columns will be ignored) (#57)
  • adorn_totals("col") retains factor class in 1st column if 1st column in the input data.frame was a factor

Bug fixes


janitor 0.2.1 (2016-10-30)

Bug fixes


janitor 0.2.0 (2016-10-03)

Features

Major

Submitted to CRAN!

Minor

  • The count in tabyl() for factor levels that aren’t present is now 0 instead of NA (#48)

Bug fixes

  • Can call tabyl() on the result of a tabyl(), e.g., mtcars %>% tabyl(mpg) %>% tabyl(n) (#54)
  • get_dupes() now works on variables with spaces in column names (#62)
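The behaviors listed in this release can be checked with a short sketch like the following. The sizes factor and the people data.frame are made-up inputs, and the calls reflect how these functions behave in later janitor versions, which is assumed to match this release.

```r
library(janitor)

# Factor with a level ("medium") that never occurs in the data.
sizes <- factor(c("small", "small", "large"),
                levels = c("small", "medium", "large"))
size_counts <- tabyl(sizes)  # the unused level is counted as 0, not NA

# tabyl() called on the result of a tabyl().
count_of_counts <- mtcars %>% tabyl(mpg) %>% tabyl(n)

# get_dupes() on a column whose name contains a space.
people <- data.frame(`full name` = c("Ann", "Ann", "Bob"),
                     check.names = FALSE)
dupes <- get_dupes(people, `full name`)

print(size_counts)
print(dupes)
```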

Package management

  • Reached 100% unit test code coverage

janitor 0.1.2

Features

Major

  • Added a function adorn_crosstab() that formats the results of a crosstab() for pretty printing. Shows % and N in the same cell, with the % symbol, user-specified rounding (method and number of digits), and the option to include a totals row and/or column. E.g., mtcars %>% crosstab(cyl, gear) %>% adorn_crosstab().
  • crosstab() can be called in a %>% pipeline, e.g., mtcars %>% crosstab(cyl, gear). Thanks to @chrishaid (#34)
  • tabyl() can also be called in a %>% pipeline, e.g., mtcars %>% tabyl(cyl) (#35)
  • Added use_first_valid_of() function (#32)
  • Added minor functions for manipulating numeric data.frames for presentation: ns_to_percents(), add_totals_row(), and add_totals_col()

Minor

  • crosstab() returns 0 instead of NA when there are no instances of a variable combination.
  • A call like tabyl(df$vecname) retains the more-descriptive $ symbol in the column name of the result; if you want a legal R name in the result, call it as df %>% tabyl(vecname)
  • Single and double quotation marks are handled by clean_names()
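The two calling styles for tabyl() can be compared with a sketch like this one, which uses the built-in mtcars data in place of the generic df and vecname above. It only inspects the name of the first column of each result.

```r
library(janitor)

# Called on a vector extracted with $: the expression is kept as the column name.
from_vector <- tabyl(mtcars$cyl)

# Called in a pipeline: the column keeps the plain variable name.
from_pipe <- mtcars %>% tabyl(cyl)

print(names(from_vector)[1])
print(names(from_pipe)[1])
```

The piped form is usually easier to work with downstream, since the resulting name does not need backticks when referenced.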

Package management

  • Added codecov to measure test coverage
  • Added unit test coverage
  • Added Travis-CI for continuous integration

janitor 0.1 (2016-04-17)

  • Initial draft of skeleton package on GitHub