- List of packages I’ve found useful in my workflow during 2022 (in no particular order)
Plot
ggvoronoi: Voronoi Diagrams and Heatmaps with ‘ggplot2’
tags: #ggplot #tidyverse #voronoi
[cran package link] https://CRAN.R-project.org/package=ggvoronoi
description from the author/vignette
Easy creation and manipulation of Voronoi diagrams using ‘deldir’ with visualization in ‘ggplot2’. Convenient functions are provided to create nearest neighbor diagrams and heatmaps. Diagrams are computed with ‘deldir’ and processed to work with the ‘sp’ framework. Results are provided in a convenient spatial data structure and displayed with ‘ggplot2’. An outline can be provided by the user to specify the spatial domain of interest.
ggh4x: Hacks for ‘ggplot2’
tags: #ggplot #tidyverse
[cran package link] https://CRAN.R-project.org/package=ggh4x
description from the author/vignette
A ‘ggplot2’ extension that does a variety of little helpful things. The package extends ‘ggplot2’ facets through customisation, by setting individual scales per panel, resizing panels and providing nested facets. Also allows multiple colour and fill scales per plot. Also hosts a smaller collection of stats, geoms and axis guides.
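As a rough illustration of the nested facets mentioned above, here is a minimal sketch. It assumes 'ggplot2' and 'ggh4x' are installed; the choice of the built-in mtcars data and of cyl and am as facet variables is mine, not the package author's.

```r
library(ggplot2)
library(ggh4x)

# mtcars ships with R; cyl (outer) and am (inner) are arbitrary facet variables
p <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  # facet_nested() merges the strips of the outer variable across inner panels
  facet_nested(~ cyl + am, labeller = label_both)

print(p)
```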
gdiff: Graphical Difference Testing
tags: #plot #ggplot #data analysis #comparison
[cran package link] https://CRAN.R-project.org/package=gdiff
description from the author/vignette
Functions for performing graphical difference testing. Differences are generated between raster images. Comparisons can be performed between different package versions and between different R versions.
aplot: Decorate a ‘ggplot’ with Associated Information
tags: #plot #data analysis #ggplot
[cran package link] https://CRAN.R-project.org/package=aplot
description from the author/vignette
Often we need more than the plot alignment that ‘cowplot’ and ‘patchwork’ provide. Users would like to align associated information that requires axes to be exactly matched in subplots, e.g. hierarchical clustering with a heatmap. This package provides utilities to align associated subplots to a main plot at different sides (left, right, top and bottom) with axes exactly matched.
gghighlight: Highlight Lines and Points in ‘ggplot2’
tags: #plot #data analysis #ggplot
[cran package link] https://CRAN.R-project.org/package=gghighlight
description from the author/vignette
Make it easier to explore data with highlights.
pdp: Partial Dependence Plots
tags: #plot #data analysis #ggplot
[cran package link] https://CRAN.R-project.org/package=pdp
description from the author/vignette
A general framework for constructing partial dependence (i.e., marginal effect) plots from various types of machine learning models in R.
vcd: Visualizing Categorical Data
tags: #plot #data analysis #ggplot
[cran package link] https://CRAN.R-project.org/package=vcd
description from the author/vignette
Visualization techniques, data sets, summary and inference procedures aimed particularly at categorical data. Special emphasis is given to highly extensible grid graphics. The package was originally inspired by the book “Visualizing Categorical Data” by Michael Friendly and is now the main support package for a new book, “Discrete Data Analysis with R” by Michael Friendly and David Meyer (2015).
gTestsMulti: New Graph-Based Multi-Sample Tests
tags: #plot #data analysis #ggplot
[cran package link] https://CRAN.R-project.org/package=gTestsMulti
description from the author/vignette
New multi-sample tests for testing whether multiple samples are from the same distribution. They work well particularly for high-dimensional data. Song, H. and Chen, H. (2022) <arXiv:2205.13787>.
spiralize: Visualize Data on Spirals
tags: #statistic #visualization #plot
[cran package link] https://CRAN.R-project.org/package=spiralize
description from the author/vignette
It visualizes data along an Archimedean spiral https://en.wikipedia.org/wiki/Archimedean_spiral, makes so-called spiral graph or spiral chart. It has two major advantages for visualization: 1. It is able to visualize data with very long axis with high resolution. 2. It is efficient for time series data to reveal periodic patterns.
valuemap: Making Choropleth Map
tags: #plot #valuemap
[cran package link] https://CRAN.R-project.org/package=valuemap
description from the author/vignette
You can easily visualize your ‘sf’ polygons or data.frame with h3 address. While ‘leaflet’ package is too raw for data analysis, this package can save data analysts’ efforts & time with pre-set visualize options.
tessellation: Delaunay and Voronoï Tessellations
tags: #tessellation #voronoi #delaunay
[cran package link] https://cran.r-project.org/package=tessellation
description from the author/vignette
Delaunay and Voronoï tessellations, with emphasis on the two-dimensional and the three-dimensional cases (the package provides functions to plot the tessellations for these cases). Delaunay tessellations are computed in C with the help of the ‘Qhull’ library http://www.qhull.org/.
ggdist: Visualizations of Distributions and Uncertainty
tags: #ggplot #Distributions #Uncertainty
[cran package link] https://cran.r-project.org/package=ggdist
description from the author/vignette
Provides primitives for visualizing distributions using ‘ggplot2’ that are particularly tuned for visualizing uncertainty in either a frequentist or Bayesian mode. Both analytical distributions (such as frequentist confidence distributions or Bayesian priors) and distributions represented as samples (such as bootstrap distributions or Bayesian posterior samples) are easily visualized. Visualization primitives include but are not limited to: points with multiple uncertainty intervals, eye plots (Spiegelhalter D., 1999) https://ideas.repec.org/a/bla/jorssa/v162y1999i1p45-58.html, density plots, gradient plots, dot plots (Wilkinson L., 1999) doi:10.1080/00031305.1999.10474474, quantile dot plots (Kay M., Kola T., Hullman J., Munson S., 2016) doi:10.1145/2858036.2858558, complementary cumulative distribution function barplots (Fernandes M., Walls L., Munson S., Hullman J., Kay M., 2018) doi:10.1145/3173574.3173718, and fit curves with multiple uncertainty ribbons.
energy: E-Statistics: Multivariate Inference via the Energy of Data
tags: #multivariate #Inference #tests #statistics
[cran package link] https://cran.r-project.org/package=energy
description from the author/vignette
E-statistics (energy) tests and statistics for multivariate and univariate inference, including distance correlation, one-sample, two-sample, and multi-sample tests for comparing multivariate distributions, are implemented. Measuring and testing multivariate independence based on distance correlation, partial distance correlation, multivariate goodness-of-fit tests, k-groups and hierarchical clustering based on energy distance, testing for multivariate normality, distance components (disco) for non-parametric analysis of structured data, and other energy statistics/methods are implemented.
DiagrammeR: Graph/Network Visualization
tags: #graph #networks
[cran package link] https://cran.r-project.org/package=DiagrammeR
description from the author/vignette
Build graph/network structures using functions for stepwise addition and deletion of nodes and edges. Work with data available in tables for bulk addition of nodes, edges, and associated metadata. Use graph selections and traversals to apply changes to specific nodes or edges. A wide selection of graph algorithms allow for the analysis of graphs. Visualize the graphs and take advantage of any aesthetic properties assigned to nodes and edges.
netplot: Beautiful Graph Drawing
tags: #plot #graph
[cran package link] https://cran.r-project.org/package=netplot
description from the author/vignette
A graph visualization engine that puts an emphasis on aesthetics while providing default parameters that yield out-of-the-box-nice visualizations. The package is built on top of ‘The Grid Graphics Package’ and seamlessly works with ‘igraph’ and ‘network’ objects.
vivid: Variable Importance and Variable Interaction Displays
tags: #statistics #clinical data
[cran package link] https://cran.r-project.org/package=vivid
description from the author/vignette
Variable importance, interaction measures and partial dependence plots are important summaries in the interpretation of statistical and machine learning models. In our R package vivid (variable importance and variable interaction displays) we create new visualisation techniques for exploring these model summaries. We construct heatmap and graph-based displays showing variable importance and interaction jointly, which are carefully designed to highlight important aspects of the fit. We also construct a new matrix-type layout showing all single and bivariate partial dependence plots, and an alternative layout based on graph Eulerians focusing on key subsets. Our new visualisations are model-agnostic and are applicable to regression and classification supervised learning settings. They enhance interpretation even in situations where the number of variables is large and the interaction structure complex.
gridpattern: ‘grid’ Pattern Grobs
tags: #grid #patterns #grobs
[cran package link] https://cran.r-project.org/package=gridpattern
description from the author/vignette
Provides ‘grid’ grobs that fill in a user-defined area with various patterns. Includes enhanced versions of the geometric and image-based patterns originally contained in the ‘ggpattern’ package as well as original ‘pch’, ‘polygon_tiling’, ‘regular_polygon’, ‘rose’, ‘text’, ‘wave’, and ‘weave’ patterns plus support for custom user-defined patterns.
PairViz: Visualization using Graph Traversal
tags: #graphs #visualization
[cran package link] https://cran.r-project.org/package=PairViz
description from the author/vignette
Improving graphics by ameliorating order effects, using Eulerian tours and Hamiltonian decompositions of graphs. References for the methods presented here are C.B. Hurley and R.W. Oldford (2010) doi:10.1198/jcgs.2010.09136 and C.B. Hurley and R.W. Oldford (2011) doi:10.1007/s00180-011-0229-5.
ggimage: Use Image in ‘ggplot2’
tags: #ggplot
[cran package link] https://cran.r-project.org/package=ggimage
description from the author/vignette
Supports image files and graphic objects to be visualized in ‘ggplot2’ graphic system.
superb: Summary Plots with Adjusted Error Bars
tags: #ggplot #summary #plots
[cran package link] https://cran.r-project.org/package=superb
description from the author/vignette
Computes standard error and confidence interval of various descriptive statistics under various designs and sampling schemes. The main function, superbPlot(), can either return a plot or a dataframe with the statistic and its precision interval so that other plotting package can be used. See Cousineau and colleagues (2021) doi:10.1177/25152459211035109 or Cousineau (2017) doi:10.5709/acp-0214-z for a review as well as Cousineau (2005) doi:10.20982/tqmp.01.1.p042, Morey (2008) doi:10.20982/tqmp.04.2.p061, Baguley (2012) doi:10.3758/s13428-011-0123-7, Cousineau & Laurencelle (2016) doi:10.1037/met0000055, Cousineau & O’Brien (2014) doi:10.3758/s13428-013-0441-z, Calderini & Harding doi:10.20982/tqmp.15.1.p001 for specific references.
khroma: Colour Schemes for Scientific Data Visualization
tags: #plot #colors
[cran package link] https://cran.r-project.org/package=khroma
description from the author/vignette
Colour schemes ready for each type of data (qualitative, diverging or sequential), with colours that are distinct for all people, including colour-blind readers. This package provides an implementation of Paul Tol (2018) and Fabio Crameri (2018) doi:10.5194/gmd-11-2541-2018 colour schemes for use with ‘graphics’ or ‘ggplot2’. It provides tools to simulate colour-blindness and to test how well the colours of any palette are identifiable. Several scientific thematic schemes (geologic timescale, land cover, FAO soils, etc.) are also implemented.
ggside: Side Grammar Graphics
tags: #plot #ggplot
[cran package link] https://cran.r-project.org/package=ggside
description from the author/vignette
The grammar of graphics as shown in ‘ggplot2’ has provided an expressive API for users to build plots. ‘ggside’ extends ‘ggplot2’ by allowing users to add graphical information about one of the main panel’s axis using a familiar ‘ggplot2’ style API with tidy data. This package is particularly useful for visualizing metadata on a discrete axis, or summary graphics on a continuous axis such as a boxplot or a density distribution.
ggquiver: Quiver Plots for ‘ggplot2’
tags: #plot #ggplot
[cran package link] https://cran.r-project.org/package=ggquiver
description from the author/vignette
An extension of ‘ggplot2’ to provide quiver plots to visualise vector fields. This functionality is implemented using a geom to produce a new graphical layer, which allows aesthetic options. This layer can be overlaid on a map to improve visualisation of mapped data.
timevis: Create Interactive Timeline Visualizations in R
tags: #visualization #interactive
[cran package link] https://cran.r-project.org/package=timevis
description from the author/vignette
Create rich and fully interactive timeline visualizations. Timelines can be included in Shiny apps and R markdown documents, or viewed from the R console and ‘RStudio’ Viewer. ‘timevis’ includes an extensive API to manipulate a timeline after creation, and supports getting data out of the visualization into R. Based on the ‘vis.js’ Timeline module and the ‘htmlwidgets’ R package.
misc3d: Miscellaneous 3D Plots
tags: #plot #misc
[cran package link] https://cran.r-project.org/package=misc3d
description from the author/vignette
A collection of miscellaneous 3d plots, including isosurfaces.
Math
lmtest: Testing Linear Regression Models
tags: #linear regression #testing
[cran package link] https://cran.r-project.org/package=lmtest
description from the author/vignette
A collection of tests, data sets, and examples for diagnostic checking in linear regression models. Furthermore, some generic tools for inference in parametric models are provided.
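To give a feel for the diagnostic checks described above, here is a small sketch using two of the package's tests on an ordinary linear model. It assumes 'lmtest' is installed; the model on the built-in mtcars data is only an illustrative choice.

```r
library(lmtest)

# An arbitrary linear model on a dataset that ships with R
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Breusch-Pagan test: is the residual variance constant?
bp <- bptest(fit)

# Durbin-Watson test: are the residuals autocorrelated?
dw <- dwtest(fit)

print(bp)
print(dw)
```

Both calls return standard "htest" objects, so the p-value is available as bp$p.value or dw$p.value.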
optimx: Expanded Replacement and Extension of the ‘optim’ Function
tags: #optim
[cran package link] https://cran.r-project.org/package=optimx
description from the author/vignette
Provides a replacement and extension of the optim() function to call to several function minimization codes in R in a single statement. These methods handle smooth, possibly box constrained functions of several or many parameters. Note that function ‘optimr()’ was prepared to simplify the incorporation of minimization codes going forward. Also implements some utility codes and some extra solvers, including safeguarded Newton methods. Many methods previously separate are now included here.
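The "several minimization codes in a single statement" idea can be sketched as follows. This assumes 'optimx' is installed; the Rosenbrock test function, the starting point and the two methods are my own choices for illustration.

```r
library(optimx)

# Rosenbrock "banana" function, a common optimisation test problem
# with its minimum (value 0) at c(1, 1)
rosenbrock <- function(p) (1 - p[1])^2 + 100 * (p[2] - p[1]^2)^2

# One call runs both methods from the same starting point
res <- optimx(par = c(-1.2, 1), fn = rosenbrock,
              method = c("Nelder-Mead", "BFGS"))

# One row per method: parameter estimates, objective value, counts, convergence code
print(res)
```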
Statistics
corrr: Correlations in R
tags: #statistics #data #correlation #calculus
[cran package link] https://CRAN.R-project.org/package=corrr
description from the author/vignette
A tool for exploring correlations. It makes it possible to easily perform routine tasks when exploring correlation matrices such as ignoring the diagonal, focusing on the correlations of certain variables against others, or rearranging and visualizing the matrix in terms of the strength of the correlations.
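A short sketch of a typical corrr workflow, assuming the package is installed. The mtcars data and the choice of mpg as the variable to focus on are mine, purely for illustration.

```r
library(corrr)

# Correlation matrix returned as a data frame rather than a matrix
cors <- correlate(mtcars)

# Keep only the lower triangle, then format it for reading
lower <- shave(cors)
pretty <- fashion(lower)
print(pretty)

# Correlations of every other variable against mpg only
vs_mpg <- focus(cors, mpg)
print(vs_mpg)
```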
FactoMineR: Multivariate Exploratory Data Analysis and Data Mining
tags: #statistics #pca #clustering #multivariate #data
[cran package link] https://CRAN.R-project.org/package=FactoMineR
description from the author/vignette
Exploratory data analysis methods to summarize, visualize and describe datasets. The main principal component methods are available, those with the largest potential in terms of applications: principal component analysis (PCA) when variables are quantitative, correspondence analysis (CA) and multiple correspondence analysis (MCA) when variables are categorical, Multiple Factor Analysis when variables are structured in groups, etc. and hierarchical cluster analysis. F. Husson, S. Le and J. Pages (2017).
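For the PCA case mentioned above, a minimal sketch might look like this. It assumes 'FactoMineR' is installed; using the four numeric columns of the built-in iris data is my own choice.

```r
library(FactoMineR)

# PCA on the four quantitative iris variables, standardised to unit variance;
# graph = FALSE suppresses the default plots
res <- PCA(iris[, 1:4], scale.unit = TRUE, graph = FALSE)

# Eigenvalues with percentage and cumulative percentage of variance
print(res$eig)

# Coordinates of the first few individuals on the principal components
print(head(res$ind$coord))
```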
VIM: Visualization and Imputation of Missing Values
tags: #data #missing values
[cran package link] https://CRAN.R-project.org/package=VIM
description from the author/vignette
New tools for the visualization of missing and/or imputed values are introduced, which can be used for exploring the data and the structure of the missing and/or imputed values. Depending on this structure of the missing values, the corresponding methods may help to identify the mechanism generating the missing values and allows to explore the data including missing values. In addition, the quality of imputation can be visually explored using various univariate, bivariate, multiple and multivariate plot methods. A graphical user interface available in the separate package VIMGUI allows an easy handling of the implemented plot methods.
cfda: Categorical Functional Data Analysis
tags: #data #categorical
[cran package link] https://CRAN.R-project.org/package=cfda
description from the author/vignette
Package for the analysis of categorical functional data. The main purpose is to compute an encoding (real functional variable) for each state doi:10.3390/math9233074. It also provides functions to perform basic statistical analysis on categorical functional data.
SHT: Statistical Hypothesis Testing Toolbox
tags: #statistics #data analysis #comparison
[cran package link] https://CRAN.R-project.org/package=SHT
description from the author/vignette
We provide a collection of statistical hypothesis testing procedures ranging from classical to modern methods for non-trivial settings such as high-dimensional scenario. For the general treatment of statistical hypothesis testing, see the book by Lehmann and Romano (2005) doi:10.1007/0-387-27605-X.
contingencytables: Statistical Analysis of Contingency Tables
tags: #plot #data analysis #ggplot
[cran package link] https://contingencytables.com/
description from the author/vignette
Provides functions to perform statistical inference of data organized in contingency tables. This package is a companion to the “Statistical Analysis of Contingency Tables” book by Fagerland et al. <ISBN 9781466588172>.
MorphoTools2: Multivariate Morphometric Analysis
tags: #statistics #multivariate
[cran package link] https://CRAN.R-project.org/package=MorphoTools2
description from the author/vignette
Tools for multivariate analyses of morphological data, wrapped in one package, to make the workflow convenient and fast. Statistical and graphical tools provide a comprehensive framework for checking and manipulating input data, statistical analyses, and visualization of results. Several methods are provided for the analysis of raw data, to make the dataset ready for downstream analyses. Integrated statistical methods include hierarchical classification, principal component analysis, principal coordinates analysis, non-metric multidimensional scaling, and multiple discriminant analyses: canonical, stepwise, and classificatory (linear, quadratic, and the non-parametric k nearest neighbours). The philosophy of the package will be described in Šlenker et al. (in prep).
autostats: Auto Stats
tags: #statistic #reports #exploration
[cran package link] https://CRAN.R-project.org/package=autostats
description from the author/vignette
Automatically do statistical exploration. Create formulas using ‘tidyselect’ syntax, and then determine cross-validated model accuracy and variable contributions using ‘glm’ and ‘xgboost’. Contains additional helper functions to create and modify formulas. Has a flagship function to quickly determine relationships between categorical and continuous variables in the data set.
flexclust: Flexible Cluster Algorithms
tags: #classification #clusters #multivariate
[cran package link] https://cran.r-project.org/package=flexclust
description from the author/vignette
The main function kcca implements a general framework for k-centroids cluster analysis supporting arbitrary distance measures and centroid computation. Further cluster methods include hard competitive learning, neural gas, and QT clustering. There are numerous visualization methods for cluster results (neighborhood graphs, convex cluster hulls, barcharts of centroids, …), and bootstrap methods for the analysis of cluster stability.
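The point about arbitrary distance measures can be sketched by swapping the family passed to kcca(). This assumes 'flexclust' is installed; the iris data, k = 3 and the seed are illustrative choices of mine.

```r
library(flexclust)

set.seed(1)  # cluster initialisation is random
x <- as.matrix(iris[, 1:4])

# Same framework, two different distance/centroid definitions
km   <- kcca(x, k = 3, family = kccaFamily("kmeans"))    # Euclidean distance, mean centroids
kmed <- kcca(x, k = 3, family = kccaFamily("kmedians"))  # Manhattan distance, median centroids

# Compare cluster membership with the known species labels
print(table(clusters(km), iris$Species))
print(table(clusters(kmed), iris$Species))
```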
ffmanova: Fifty-Fifty MANOVA
tags: #MANOVA #MANCOVA
[cran package link] https://cran.r-project.org/package=ffmanova
description from the author/vignette
General linear modeling with multiple responses (MANCOVA). An overall p-value for each model term is calculated by the 50-50 MANOVA method by Langsrud (2002) doi:10.1111/1467-9884.00320, which handles collinear responses. Rotation testing, described by Langsrud (2005) doi:10.1007/s11222-005-4789-5, is used to compute adjusted single response p-values according to familywise error rates and false discovery rates (FDR). The approach to FDR is described in the appendix of Moen et al. (2005) doi:10.1128/AEM.71.4.2086-2094.2005. Unbalanced designs are handled by Type II sums of squares as argued in Langsrud (2003) doi:10.1023/A:1023260610025. Furthermore, the Type II philosophy is extended to continuous design variables as described in Langsrud et al. (2007) doi:10.1080/02664760701594246. This means that the method is invariant to scale changes and that common pitfalls are avoided.
compareGroups 4.0: Descriptives by groups
tags: #statistics #clinical data
[cran package link] https://cran.r-project.org/package=compareGroups
description from the author/vignette
compareGroups is an R package available on CRAN which builds descriptive tables displaying means, standard deviations, quantiles or frequencies of several variables. A p-value to test equality between groups is also computed using the appropriate test. With very simple code, nice, compact and ready-to-publish descriptive tables are displayed on the R console. They can also be exported to different formats, such as Word, Excel, PDF or inserted in a R-Sweave or R-markdown document.
DescTools: Tools for Descriptive Statistics
tags: #statistics
[cran package link] https://cran.r-project.org/package=DescTools
description from the author/vignette
A collection of miscellaneous basic statistic functions and convenience wrappers for efficiently describing data. The author’s intention was to create a toolbox, which facilitates the (notoriously time consuming) first descriptive tasks in data analysis, consisting of calculating descriptive statistics, drawing graphical summaries and reporting the results. The package contains furthermore functions to produce documents using MS Word (or PowerPoint) and functions to import data from Excel. Many of the included functions can be found scattered in other packages and other sources written partly by Titans of R. The reason for collecting them here, was primarily to have them consolidated in ONE instead of dozens of packages (which themselves might depend on other packages which are not needed at all), and to provide a common and consistent interface as far as function and arguments naming, NA handling, recycling rules etc. are concerned. Google style guides were used as naming rules (in absence of convincing alternatives). The ‘BigCamelCase’ style was consequently applied to functions borrowed from contributed R packages as well.
outForest: Multivariate Outlier Detection and Replacement
tags: #random forest #outliers
[cran package link] https://cran.r-project.org/package=outForest
description from the author/vignette
Provides a random forest based implementation of the method described in Chapter 7.1.2 (Regression model based anomaly detection) of Chandola et al. (2009) doi:10.1145/1541880.1541882. It works as follows: Each numeric variable is regressed onto all other variables by a random forest. If the scaled absolute difference between observed value and out-of-bag prediction of the corresponding random forest is suspiciously large, then a value is considered an outlier. The package offers different options to replace such outliers, e.g. by realistic values found via predictive mean matching. Once the method is trained on a reference data, it can be applied to new data.
multid: Multivariate Difference Between Two Groups
tags: #multivariate #test
[cran package link] https://cran.r-project.org/package=multid
description from the author/vignette
Estimation of multivariate differences between two groups (e.g., multivariate sex differences) with regularized regression methods and predictive approach. See Lönnqvist & Ilmarinen (2021) doi:10.1007/s11109-021-09681-2 and Ilmarinen et al. (2021) doi:10.31234/osf.io/j59bs.
simpr: Flexible ‘Tidyverse’-Friendly Simulations
tags: #simulation #tidyverse
[cran package link] https://cran.r-project.org/package=simpr
description from the author/vignette
A general, ‘tidyverse’-friendly framework for simulation studies, design analysis, and power analysis. Specify data generation, define varying parameters, generate data, fit models, and tidy model results in a single pipeline, without needing loops or custom functions.
ggfortify: Data Visualization Tools for Statistical Analysis Results
tags: #statistics #datavis
[cran package link] https://cran.r-project.org/web/packages/ggfortify/index.html
description from the author/vignette
Unified plotting tools for statistics commonly used, such as GLM, time series, PCA families, clustering and survival analysis. The package offers a single plotting interface for these analysis results and plots in a unified style using ‘ggplot2’.
MVTests: Multivariate Hypothesis Tests
tags: #statistics #test #multivariate
[cran package link] https://cran.r-project.org/package=MVTests
description from the author/vignette
Multivariate hypothesis tests and the confidence intervals. It can be used to test hypotheses about the mean vector or vectors (one-sample, two independent samples, paired samples), the covariance matrix (one or more matrices), and the correlation matrix. Moreover, it can be used for the robust Hotelling T^2 test in the one-sample case with high dimensional data. For this package, we have benefited from the studies Rencher (2003), Nel and Merwe (1986) doi:10.1080/03610928608829342, Tatlidil (1996), Tsagris (2014), Villasenor Alva and Estrada (2009) doi:10.1080/03610920802474465.
plsVarSel: Variable Selection in Partial Least Squares
tags: #pls #chemometrics
[cran package link] https://cran.r-project.org/package=plsVarSel
description from the author/vignette
Interfaces and methods for variable selection in Partial Least Squares. The methods include filter methods, wrapper methods and embedded methods. Both regression and classification are supported.
RcmdrPlugin.EZR: R Commander Plug-in for the EZR (Easy R) Package
tags: #statistics #ROC
[cran package link] https://cran.r-project.org/package=RcmdrPlugin.EZR
description from the author/vignette
EZR (Easy R) adds a variety of statistical functions, including survival analyses, ROC analyses, meta-analyses, sample size calculation, and so on, to the R commander. EZR enables point-and-click easy access to statistical functions, especially for medical statistics. EZR is platform-independent and runs on Windows, Mac OS X, and UNIX. Its complete manual is available only in Japanese (Chugai Igakusha, ISBN: 978-4-498-10918-6, Nankodo, ISBN: 978-4-524-26158-1, Ohmsha, ISBN: 978-4-274-22632-8), but a report that introduced the investigation of EZR was published in Bone Marrow Transplantation (Nature Publishing Group) as an Open article. This report can be used as a simple manual. It can be freely downloaded from the journal website as shown below. This report has been cited in more than 3,000 scientific articles.
RcmdrPlugin.NMBU: R Commander Plug-in for University Level Applied Statistics
tags: #PLS #LDA #QDA
[cran package link] https://cran.r-project.org/package=RcmdrPlugin.NMBU
description from the author/vignette
An R Commander “plug-in” extending functionality of linear models and providing an interface to Partial Least Squares Regression and Linear and Quadratic Discriminant analysis. Several statistical summaries are extended, predictions are offered for additional types of analyses, and extra plots, tests and mixed models are available.
DataEditR: An Interactive Editor for Viewing, Entering, Filtering & Editing Data
tags: #tables #editor
[cran package link] https://cran.r-project.org/package=DataEditR
description from the author/vignette
An interactive editor built on ‘rhandsontable’ to allow the interactive viewing, entering, filtering and editing of data in R https://dillonhammill.github.io/DataEditR/.
fICA: Classical, Reloaded and Adaptive FastICA Algorithms
tags: #ICA #FastICA
[cran package link] https://cran.r-project.org/package=fICA
description from the author/vignette
Algorithms for classical symmetric and deflation-based FastICA, reloaded deflation-based FastICA algorithm and an algorithm for adaptive deflation-based FastICA using multiple nonlinearities. For details, see Miettinen et al. (2014) doi:10.1109/TSP.2014.2356442 and Miettinen et al. (2017) doi:10.1016/j.sigpro.2016.08.028. The package is described in Miettinen, Nordhausen and Taskinen (2018) doi:10.32614/RJ-2018-046.
JFE: Tools and GUI for Analyzing Time Series Data of Just Finance and Econometrics
tags: #econometrics #finance
[cran package link] https://cran.r-project.org/web/packages/JFE/index.html
description from the author/vignette
Support the analysis of financial and econometric time series, including recursive forecasts for machine learning.
anscombiser: Create Datasets with Identical Summary Statistics
tags: #statistics #anscombe
[cran package link] https://cran.r-project.org/package=anscombiser
description from the author/vignette
The anscombiser package takes a simpler and quicker approach to the same problem (creating datasets that share identical summary statistics), using Anscombe’s statistics. It uses shifting, scaling and rotating to transform the observations in an input dataset to achieve a target set of Anscombe’s statistics.
randtoolbox: Toolbox for Pseudo and Quasi Random Number Generation and Random Generator Tests
tags: #distributions
[cran package link] https://cran.r-project.org/package=randtoolbox
description from the author/vignette
Provides (1) pseudo random generators - general linear congruential generators, multiple recursive generators and generalized feedback shift register (SF-Mersenne Twister algorithm and WELL generators); (2) quasi random generators - the Torus algorithm, the Sobol sequence, the Halton sequence (including the Van der Corput sequence) and (3) some generator tests - the gap test, the serial test, the poker test. See e.g. Gentle (2003) doi:10.1007/b97336. The package can be provided without the rngWELL dependency on demand. Take a look at the Distribution task view of types and tests of random number generators. Version in Memoriam of Diethelm and Barbara Wu
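To make the quasi-random part concrete, here is a small sketch that draws Halton and Sobol points and contrasts them with ordinary pseudo-random draws. It assumes 'randtoolbox' is installed; the sample size and dimension are arbitrary.

```r
library(randtoolbox)

n <- 100

# Low-discrepancy (quasi-random) points in the unit square
h <- halton(n, dim = 2)
s <- sobol(n, dim = 2)

# Ordinary pseudo-random points from base R, for comparison
set.seed(1)
u <- matrix(runif(2 * n), ncol = 2)

# Quasi-random points tend to cover the square more evenly than the uniform draws
op <- par(mfrow = c(1, 3))
plot(h, main = "Halton", xlab = "", ylab = "")
plot(s, main = "Sobol", xlab = "", ylab = "")
plot(u, main = "runif", xlab = "", ylab = "")
par(op)
```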
Medicine
visR: Clinical Graphs and Tables Adhering to Graphical Principles
tags: #data #clinical
[cran package link] https://CRAN.R-project.org/package=visR
description from the author/vignette
To enable fit-for-purpose, reusable clinical and medical research focused visualizations and tables with sensible defaults and based on graphical principles as described in: “Vandemeulebroecke et al. (2018)” doi:10.1002/pst.1912, “Vandemeulebroecke et al. (2019)” doi:10.1002/psp4.12455, and “Morris et al. (2019)” doi:10.1136/bmjopen-2019-030215.
bp: Blood Pressure Analysis in R
tags: #statistics #clinical data
[cran package link] https://cran.r-project.org/package=bp
description from the author/vignette
Cardiovascular disease (CVD) is the leading cause of death worldwide with Hypertension, specifically, affecting over 1.1 billion people annually. The goal of the package is to provide a comprehensive toolbox for analyzing blood pressure data using a variety of statistical metrics and visualizations to bring more clarity to CVD.
Teaching
testDriveR: Teaching Data for Statistics and Data Science
tags: #plot #data analysis #ggplot
[cran package link] https://CRAN.R-project.org/package=testDriveR
description from the author/vignette
Provides data sets for teaching statistics and data science courses. It includes a sample of data from John Edmund Kerrich’s famous coinflip experiment. These are data that I used for teaching SOC 4015 / SOC 5050 at Saint Louis University (SLU). The package also contains an R Markdown template with the required formatting for assignments in my courses SOC 4015, SOC 4650, SOC 5050, and SOC 5650 at SLU.
Chemistry
stoichcalc: R Functions for Solving Stoichiometric Equations
tags: #chemistry #stoichiometry
[cran package link] https://CRAN.R-project.org/package=stoichcalc
description from the author/vignette
Given a list of substance compositions, a list of substances involved in a process, and a list of constraints in addition to mass conservation of elementary constituents, the package contains functions to build the substance composition matrix, to analyze the uniqueness of process stoichiometry, and to calculate stoichiometric coefficients if process stoichiometry is unique. (See Reichert, P. and Schuwirth, N., A generic framework for deriving process stoichiometry in environmental models, Environmental Modelling and Software 25, 1241-1251, 2010 for more details.)
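The underlying linear algebra can be sketched in base R for the combustion of methane. This does not call stoichcalc; the matrix layout and variable names are assumptions made for the illustration. Mass conservation means the coefficient vector lies in the null space of the transposed composition matrix, and the stoichiometry is unique (up to scaling) when that null space has dimension one.

```r
# Composition matrix: one row per substance, one column per element.
composition <- rbind(
  CH4 = c(C = 1, H = 4, O = 0),
  O2  = c(C = 0, H = 0, O = 2),
  CO2 = c(C = 1, H = 0, O = 2),
  H2O = c(C = 0, H = 2, O = 1)
)

# Mass conservation: t(composition) %*% nu = 0, so nu lies in the null space
# of t(composition). The full set of right singular vectors gives a basis.
decomp <- svd(t(composition), nu = 0, nv = nrow(composition))
rank <- sum(decomp$d > 1e-10)
null_basis <- decomp$v[, seq_len(ncol(decomp$v)) > rank, drop = FALSE]

# The stoichiometry is unique up to scaling if the null space is one-dimensional.
stopifnot(ncol(null_basis) == 1)

# Scale so that CH4 is consumed with coefficient -1.
nu <- null_basis[, 1] / -null_basis[1, 1]
names(nu) <- rownames(composition)
round(nu, 10)
```

Negative coefficients mark substances that are consumed, positive ones substances that are produced.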
inters: Flexible Tools for Estimating Interactions
tags: #statistics #interactions
[cran package link] https://CRAN.R-project.org/package=inters
description from the author/vignette
A set of functions to estimate interactions flexibly in the face of possibly many controls. Implements the procedures described in Blackwell and Olson (2022) doi:10.1093/restud/rdt044.
waves: Vis-NIR Spectral Analysis Wrapper
tags: #spectroscopy #preprocessing #filtering #model training
[cran package link] https://cran.r-project.org/package=waves
description from the author/vignette
Originally designed for application in the context of resource-limited plant research and breeding programs, ‘waves’ provides an open-source solution to spectral data processing and model development by bringing useful packages together into a streamlined pipeline. This package is a wrapper for functions related to the analysis of point visible and near-infrared reflectance measurements. It includes visualization, filtering, aggregation, preprocessing, cross-validation set formation, model training, and prediction functions to enable open-source association of spectral and reference data. This package is documented in a peer-reviewed manuscript in the Plant Phenome Journal doi:10.1002/ppj2.20012. Specialized cross-validation schemes are described in detail in Jarquín et al. (2017) doi:10.3835/plantgenome2016.12.0130. Example data is from Ikeogu et al. (2017) doi:10.1371/journal.pone.0188918.
MSclassifR: Automated Classification of Mass Spectra
tags: #Classification #Mass-Spectra
[cran package link] https://cran.r-project.org/package=MSclassifR
description from the author/vignette
Functions to classify mass spectra in known categories, and to determine discriminant mass-over-charge values. It includes easy-to-use functions for pre-processing mass spectra, functions to determine discriminant mass-over-charge values (m/z) from a library of mass spectra corresponding to different categories, and functions to predict the category (species, phenotypes, etc.) associated to a mass spectrum from a list of selected mass-over-charge values. Two vignettes illustrating how to use the functions of this package from real data sets are also available online to help users: https://agodmer.github.io/MSclassifR_examples/Vignettes/Vignettemsclassifr_Ecrobia.html and https://agodmer.github.io/MSclassifR_examples/Vignettes/Vignettemsclassifr_Klebsiella.html.
NGLVieweR: load a PDB in R in order to view it
tags: #chemistry #visualization #molecular
[cran package link] https://cran.r-project.org/package=NGLVieweR
description from the author/vignette
Provides an ‘htmlwidgets’ https://www.htmlwidgets.org/ interface to ‘NGL.js’ http://nglviewer.org/ngl/api/. ‘NGLvieweR’ can be used to visualize and interact with protein databank (‘PDB’) and structural files in R and Shiny applications. It includes a set of API functions to manipulate the viewer after creation in Shiny.
Writing reports and articles
utile.tools: Summarize Data for Publication
tags: #statistic #reports #exploration
[cran package link] https://CRAN.R-project.org/package=utile.tools
description from the author/vignette
A set of tools for preparing and summarizing data for publication purposes. Includes functions for tabulating models, means to produce human-readable summary statistics from raw data, macros for calculating duration of time, and simplistic hypothesis testing tools.
rrtable: Reproducible Research with a Table of R Codes
tags: #tables #reproducible research
[cran package link] https://cran.r-project.org/package=rrtable
description from the author/vignette
Makes documents containing plots and tables from a table of R codes. Can make “HTML”, “pdf(‘LaTex’)”, “docx(‘MS Word’)” and “pptx(‘MS Powerpoint’)” documents with or without R code. In the package, modularized ‘shiny’ app codes are provided. These modules are intended for reuse across applications.
reporter: Creates Statistical Reports
tags: #statistics #report
[cran package link] https://CRAN.R-project.org/package=reporter
description from the author/vignette
Contains functions to create regulatory-style statistical reports. Originally designed to create tables, listings, and figures for the pharmaceutical, biotechnology, and medical device industries, these reports are generalized enough that they could be used in any industry. Generates text, rich-text, PDF, HTML, and Microsoft Word file formats. The package specializes in printing wide and long tables with automatic page wrapping and splitting. Reports can be produced with a minimum of function calls, and without relying on other table packages. The package supports titles, footnotes, page header, page footers, spanning headers, page by variables, and automatic page numbering.
PDE: Extract Tables and Sentences from PDFs with User Interface
tags: #pdf #scraping
[cran package link] https://cran.r-project.org/package=PDE
description from the author/vignette
PDE is an R package that easily extracts information and tables from PDF files. PDE_analyzer_i() performs the sentence and table extraction, while the included PDE_reader_i() allows user-friendly visualization and quick processing of the obtained results.
Tplyr: A Grammar of Clinical Data Summary
tags: #clinical #medical
[cran package link] https://cran.r-project.org/package=Tplyr
description from the author/vignette
A tool created to simplify the data manipulation necessary to create clinical reports.
Coding
mockr: Mocking in R
tags: #testing
[cran package link] https://cran.r-project.org/package=mockr
description from the author/vignette
Provides a means to mock a package function, i.e., temporarily substitute it for testing. Designed as a drop-in replacement for the now deprecated ‘testthat::with_mock()’ and ‘testthat::local_mock()’.
cOde: Automated C Code Generation for ‘deSolve’, ‘bvpSolve’
tags: #C #Jacobians
[cran package link] https://cran.r-project.org/package=cOde
description from the author/vignette
Generates all necessary C functions allowing the user to work with the compiled-code interface of ode() and bvptwp(). The implementation supports “forcings” and “events”. Also provides functions to symbolically compute Jacobians, sensitivity equations and adjoint sensitivities being the basis for sensitivity analysis.
matlab2r: Translation Layer from MATLAB to R
tags: #R #Matlab
[cran package link] https://cran.r-project.org/package=matlab2r
description from the author/vignette
Allows users familiar with MATLAB to use MATLAB-named functions in R. Several basic MATLAB functions are written in this package to mimic the behavior of their original counterparts, with more to come as this package grows.
lessR: Less Code, More Results
tags: #coding
[cran package link] https://cran.r-project.org/package=lessR
description from the author/vignette
Each function accomplishes the work of several or more standard R functions. For example, two function calls, Read() and CountAll(), read the data and generate summary statistics for all variables in the data frame, plus histograms and bar charts as appropriate. Other functions provide for descriptive statistics, a comprehensive regression analysis, analysis of variance and t-test, plotting including the introduced here Violin/Box/Scatter plot for a numerical variable, bar chart, histogram, box plot, density curves, calibrated power curve, reading multiple data formats with the same function call, variable labels, color themes, Trellis graphics and a built-in help system. Also includes a confirmatory factor analysis of multiple indicator measurement models, pedagogical routines for data simulation such as for the Central Limit Theorem, and generation and rendering of R markdown instructions for interpretative output.
Groundhog: Addressing The Threat That R Poses To Reproducible Research
tags: #reproducibility
[cran package link] https://cran.r-project.org/package=groundhog
description from the author/vignette
Make R scripts that rely on packages reproducible, by ensuring that every time a given script is run, the same versions of the packages used are loaded (instead of whichever versions the user running the script happens to have installed). This is achieved by using the new command groundhog.library() instead of the base command library(), and including a date in the call. The date is used to call on the same version of the package every time (the most recent version available on CRAN at that date).
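A minimal sketch of what such a call might look like. The package name and the date are arbitrary choices for illustration, and the call is guarded so the snippet does nothing when groundhog is not installed; on first use groundhog may also download and install the requested package version.

```r
# Load a package as it was on CRAN on a fixed date, instead of library(rio).
# The package name and the date below are arbitrary choices for illustration.
snapshot_date <- "2022-12-01"

if (requireNamespace("groundhog", quietly = TRUE)) {
  library(groundhog)
  groundhog.library("rio", snapshot_date)
}
```

Keeping the date in one variable at the top of a script makes it easy to move the whole script to a newer snapshot later.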
Graphics
raymolecule: Parse and Render Molecular Structures in 3D
tags: #chemistry #rendering
[cran package link] https://cran.r-project.org/package=raymolecule
description from the author/vignette
Downloads and parses ‘SDF’ (Structural Description Format) and ‘PDB’ (Protein Database) files for 3D rendering.
Regression
speedglm: Fitting Linear and Generalized Linear Models to Large Data Sets
tags: #fitting #GLM
[cran package link] https://cran.r-project.org/package=speedglm
description from the author/vignette
Fitting linear models and generalized linear models to large data sets by updating algorithms.
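The phrase "updating algorithms" can be illustrated with a base R sketch for ordinary least squares. This is not speedglm's code: it only shows the general idea of accumulating the cross-products chunk by chunk, under the assumption that the data arrive in ten pieces, and then checks the result against `lm()`.

```r
# Sketch of an updating algorithm for least squares (not speedglm's code):
# accumulate X'X and X'y chunk by chunk, so the full design matrix never
# has to be held in memory at once.
set.seed(42)
n <- 1000
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- 1 + 2 * dat$x1 - 0.5 * dat$x2 + rnorm(n)

chunk_id <- rep(1:10, length.out = n)   # pretend the data arrive in 10 chunks
xtx <- matrix(0, 3, 3)
xty <- matrix(0, 3, 1)

for (chunk in split(dat, chunk_id)) {
  X <- model.matrix(~ x1 + x2, data = chunk)
  xtx <- xtx + crossprod(X)             # update X'X
  xty <- xty + crossprod(X, chunk$y)    # update X'y
}

beta_chunked <- drop(solve(xtx, xty))
beta_full <- coef(lm(y ~ x1 + x2, data = dat))
all.equal(beta_chunked, beta_full, check.attributes = FALSE)
```

Only the small 3 x 3 and 3 x 1 accumulators persist between chunks, which is what makes the approach attractive for large data sets.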
modelsummary: Summary Tables and Plots for Statistical Models and Data: Beautiful, Customizable, and Publication-Ready
tags: #models
[cran package link] https://cran.r-project.org/package=modelsummary
description from the author/vignette
Create beautiful and customizable tables to summarize several statistical models side-by-side. Draw coefficient plots, multi-level cross-tabs, dataset summaries, balance tables (a.k.a. “Table 1s”), and correlation matrices. This package supports dozens of statistical models, and it can produce tables in HTML, LaTeX, Word, Markdown, PDF, PowerPoint, Excel, RTF, JPG, or PNG. Tables can easily be embedded in ‘Rmarkdown’ or ‘knitr’ dynamic documents.
Signal Processing
gsignal: Signal Processing in R
tags: #signal processing #filters #boxcar
[cran package link] https://cran.r-project.org/package=gsignal
description from the author/vignette
R implementation of the ‘Octave’ package ‘signal’, containing a variety of signal processing tools, such as signal generation and measurement, correlation and convolution, filtering, filter design, filter analysis and conversion, power spectrum analysis, system identification, decimation and sample rate change, and windowing.
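As a small illustration of the filtering side (and of the #boxcar tag), here is a moving-average filter written with base R's `stats::filter()`. It does not use gsignal; `boxcar_smooth()` is a hypothetical helper and the test signal is made up.

```r
# A boxcar (moving-average) FIR filter: every coefficient equals 1 / width.
boxcar_smooth <- function(x, width = 5) {
  coefficients <- rep(1 / width, width)
  # sides = 2 centres the window on each sample; the ends become NA.
  as.numeric(stats::filter(x, coefficients, sides = 2))
}

set.seed(7)
time <- seq(0, 1, length.out = 200)
noisy <- sin(2 * pi * 3 * time) + rnorm(200, sd = 0.3)
smoothed <- boxcar_smooth(noisy, width = 9)

boxcar_smooth(c(1, 2, 3, 4, 5), width = 3)   # NA 2 3 4 NA
```

A wider window removes more noise but also flattens the peaks of the underlying sine wave.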
Data analysis
splitTools: Tools for Data Splitting
tags: #data splitting
[cran package link] https://cran.r-project.org/package=splitTools
description from the author/vignette
Fast, lightweight toolkit for data splitting. Data sets can be partitioned into disjoint groups (e.g. into training, validation, and test) or into (repeated) k-folds for subsequent cross-validation. Besides basic splits, the package supports stratified, grouped as well as blocked splitting. Furthermore, cross-validation folds for time series data can be created. See e.g. Hastie et al. (2001) doi:10.1007/978-0-387-84858-7 for the basic background on data partitioning and cross-validation.
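What a stratified split does can be sketched in base R. This is not the splitTools API: `stratified_partition()` is a hypothetical helper that samples within each class so that every partition keeps roughly the class proportions of the full data.

```r
# Hypothetical helper (not the splitTools API): stratified split of row
# indices into named partitions, sampling within each class of `y`.
stratified_partition <- function(y, p = c(train = 0.6, valid = 0.2, test = 0.2)) {
  assignment <- character(length(y))
  for (rows in split(seq_along(y), y)) {
    # Spread the partition labels over this class in proportion to `p`,
    # then shuffle them.
    labels <- rep(names(p), times = ceiling(p * length(rows)))[seq_along(rows)]
    assignment[rows] <- sample(labels)
  }
  split(seq_along(y), factor(assignment, levels = names(p)))
}

set.seed(123)
parts <- stratified_partition(iris$Species)
lengths(parts)
table(iris$Species[parts$train])
```

The result is a named list of disjoint row indices, so `iris[parts$train, ]` gives the training rows.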
optedr: Calculating Optimal and D-Augmented Designs
tags: #DoE #Chemometrics #optimal-design
[cran package link] https://cran.r-project.org/package=optedr
description from the author/vignette
Calculates D-, Ds-, A- and I-optimal designs for non-linear models, via an implementation of the cocktail algorithm (Yu, 2011, doi:10.1007/s11222-010-9183-2). Compares designs via their efficiency, and D-augments any design with a controlled efficiency. An efficient rounding function has been provided to transform approximate designs to exact designs.
ReDaMoR: Relational Data Modeler
tags: #database #relational #data
[cran package link] https://cran.r-project.org/package=ReDaMoR
description from the author/vignette
The aim of this package is to manipulate relational data models in R. It provides functions to create, modify and export data models in json format. It also allows importing models created with ‘MySQL Workbench’ (https://www.mysql.com/products/workbench/). These functions are accessible through a graphical user interface made with ‘shiny’. Constraints such as types, keys, uniqueness and mandatory fields are automatically checked and corrected when editing a model. Finally, real data can be confronted to a model to check their compatibility.
explore: Simplifies Exploratory Data Analysis
tags: #graphs #visualization
[cran package link] https://cran.r-project.org/package=explore
description from the author/vignette
Interactive data exploration with one line of code or use an easy to remember set of tidy functions for exploratory data analysis. Introduces three main verbs. explore() to graphically explore a variable or table, describe() to describe a variable or table and report() to create an automated report.
esquisse: Explore and Visualize Your Data Interactively
tags: #visualization #interactive
[cran package link] https://cran.r-project.org/package=esquisse
description from the author/vignette
A ‘shiny’ gadget to create ‘ggplot2’ figures interactively with drag-and-drop to map your variables to different aesthetics. You can quickly visualize your data according to their type, export in various formats, and retrieve the code to reproduce the plot.
plfMA: A GUI to View, Design and Export Various Graphs of Data
tags: #visualization #GUI
[cran package link] http://cran.stat.unipd.it/package=plfMA
description from the author/vignette
Provides a graphical user interface for viewing and designing various types of graphs of the data. The graphs can be saved in different formats of an image.
datasets.load: Interfaces for Loading Datasets
tags: #visualization #interactive
[cran package link] https://cran.r-project.org/packages/datasets.load/index.html
description from the author/vignette
Visual interface for loading datasets in RStudio from all installed (including unloaded) packages, also includes command line interfaces.
loon.shiny: Automatically Create a ‘Shiny’ App Based on Interactive ‘Loon’ Widgets
tags: #data analysis
[cran package link] https://cran.r-project.org/package=loon.shiny
description from the author/vignette
Package ‘shiny’ provides interactive web applications in R. Package ‘loon’ is an interactive toolkit engaged in open-ended, creative and unscripted data exploration. The ‘loon.shiny’ package can take ‘loon’ widgets and display a selfsame ‘shiny’ app.
loon: Interactive Statistical Data Visualization
tags: #plot #data analysis
[cran package link] https://cran.r-project.org/package=loon
description from the author/vignette
An extendable toolkit for interactive data visualization and exploration.
rio: A Swiss-Army Knife for Data I/O
tags: #data input
[cran package link] https://cran.r-project.org/package=rio
description from the author/vignette
Streamlined data import and export by making assumptions that the user is probably willing to make: ‘import()’ and ‘export()’ determine the data structure from the file extension, reasonable defaults are used for data import and export (e.g., ‘stringsAsFactors=FALSE’), web-based import is natively supported (including from SSL/HTTPS), compressed files can be read directly without explicit decompression, and fast import packages are used where appropriate. An additional convenience function, ‘convert()’, provides a simple method for converting between file types.
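The idea of picking a reader from the file extension can be sketched in base R. This is not rio's implementation: `read_by_extension()` is a hypothetical helper that covers only three formats, whereas the package's `import()` handles many more.

```r
# Hypothetical helper (not rio's implementation): choose a base R reader
# from the file extension, which is the idea behind a single import() call.
read_by_extension <- function(path) {
  extension <- tolower(tools::file_ext(path))
  switch(extension,
    csv = utils::read.csv(path, stringsAsFactors = FALSE),
    tsv = utils::read.delim(path, stringsAsFactors = FALSE),
    rds = readRDS(path),
    stop("Unsupported file extension: ", extension)
  )
}

csv_path <- tempfile(fileext = ".csv")
write.csv(head(mtcars), csv_path, row.names = FALSE)
cars <- read_by_extension(csv_path)
str(cars)
```

The caller never names the format explicitly, so the same line of code keeps working when a data file is swapped for one in another supported format.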
tabxplor: User-Friendly Tables with Color Helpers for Data Exploration
tags: #plots #tables
[cran package link] https://cran.r-project.org/package=tabxplor
description from the author/vignette
Make it easy to deal with multiple cross-tables in data exploration, by creating them, manipulating them, and adding color helpers to highlight important information. All functions are “tidy”, pipe-friendly, and render data frames which can be easily manipulated. Tables can be exported to Excel and in html with formats and colors.
groupdata2: Creating Groups from Data
tags: #tables #data
[cran package link] https://cran.r-project.org/package=groupdata2
description from the author/vignette
Methods for dividing data into groups. Create balanced partitions and cross-validation folds. Perform time series windowing and general grouping and splitting of data. Balance existing groups with up- and downsampling or collapse them to fewer groups.
conjurer: A Parametric Method for Generating Synthetic Data
tags: #plot #colors
[cran package link] https://cran.r-project.org/package=conjurer
description from the author/vignette
Builds synthetic data applicable across multiple domains. This package also provides flexibility to control data distribution to make it relevant to many industry examples.
owidR: A Package for Importing Data from Our World in Data
tags: #data #statistics
[cran package link] https://cran.r-project.org/package=owidR
description from the author/vignette
Scrapes data from the Our World in Data website to offer easy to use functions for searching for datasets and downloading them into R.
tidycharts: Generate Tidy Charts Inspired by ‘IBCS’
tags: #plots
[cran package link] https://cran.r-project.org/package=tidycharts
description from the author/vignette
There is a wide range of R packages created for data visualization, but still, there was no simple and easily accessible way to create clean and transparent charts - up to now. The ‘tidycharts’ package enables the user to generate charts compliant with International Business Communication Standards (‘IBCS’). It means unified bar widths, colors, chart sizes, etc. Creating homogeneous reports has never been that easy! Additionally, users can apply semantic notation to indicate different data scenarios (plan, budget, forecast). What’s more, it is possible to customize the charts by creating a personal color palette with the possibility of switching to default options after the experiments. We wanted the package to be helpful in writing reports, so we also made joining charts in one clear image possible. All charts are generated in SVG format and can be shown in the ‘RStudio’ viewer pane or exported to HTML output of ‘knitr’/‘markdown’.
EXTRA
tidydice: simulates rolling a dice and flipping a coin
tags: #teaching #fun
[cran package link] https://cran.r-project.org/package=tidydice
description from the author/vignette
This package simulates rolling a dice and flipping a coin. Each experiment generates a tibble. Dice rolls and coin flips are simulated using sample(). The properties of the dice can be changed, like the number of sides. A coin flip is simulated using a two sided dice. Experiments can be combined with the pipe-operator.
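The `sample()`-based simulation described here can be sketched in base R. This is not tidydice's API: `simulate_rolls()` and `simulate_flips()` are hypothetical helpers, and they return plain data frames rather than tibbles.

```r
# Hypothetical helpers (not tidydice's API): each experiment returns a data
# frame with one row per roll, so results can be summarised or combined.
simulate_rolls <- function(times = 1, sides = 6) {
  data.frame(roll = seq_len(times),
             result = sample(seq_len(sides), size = times, replace = TRUE))
}

# A coin flip is a roll of a two-sided dice.
simulate_flips <- function(times = 1) {
  flips <- simulate_rolls(times, sides = 2)
  flips$result <- c("heads", "tails")[flips$result]
  flips
}

set.seed(2022)
rolls <- simulate_rolls(times = 600)
table(rolls$result)
mean(simulate_flips(times = 1000)$result == "heads")
```

With enough rolls the counts per face level out and the share of heads approaches one half, which is the kind of classroom demonstration the package is aimed at.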
tiling: Polygon Tiling Examples
tags: #arts #fun
[cran package link] https://CRAN.R-project.org/package=gridpattern
description from the author/vignette
Several uniform regular polygon tiling patterns can be achieved by use of grid.pattern_regular_polygon() plus occasionally grid.polygon() to set a background color. This vignette highlights several such tiling patterns plus a couple notable non-uniform tiling patterns.
lingtypology: Linguistic Typology and Mapping
tags: #linguistic mapping
[cran package link] https://cran.r-project.org/package=lingtypology
description from the author/vignette
Provides R with the Glottolog database https://glottolog.org/ and some more abilities for purposes of linguistic mapping. The Glottolog database contains the catalogue of languages of the world. This package helps researchers to make linguistic maps, following the philosophy of the Cross-Linguistic Linked Data project https://clld.org/, which facilitates uniform access to the data across publications. A tutorial for this package is available on GitHub pages https://docs.ropensci.org/lingtypology/ and in the package vignette. Maps created by this package can be used both for investigation and for linguistic teaching. In addition, the package provides an ability to download data from typological databases such as WALS, AUTOTYP and some others and to create your own database website.