yardstick - Tidy Characterizations of Model Performance
Tidy tools for quantifying how well a model fits a data set, such as confusion matrices, class probability curve summaries, and regression metrics (e.g., RMSE).
Last updated 7 hours ago
15.03 score 382 stars 59 dependents 2.2k scripts 25k downloads
paletteer - Comprehensive Collection of Color Palettes
The choice of color palettes in R can be quite overwhelming, with palettes spread over many packages with many different APIs. This package aims to collect all color palettes across the R ecosystem under the same package with a streamlined API.
Last updated 7 months ago
color-palette, palettes
14.01 score 952 stars 22 dependents 7.0k scripts 192k downloads
prismatic - Color Manipulation Tools
Manipulate and visualize colors in an intuitive, low-dependency, and functional way.
Last updated 2 months ago
color, color-manipulation, colour
11.86 score 138 stars 29 dependents 428 scripts 190k downloads
tidypredict - Run Predictions Inside the Database
It parses a fitted 'R' model object, and returns a formula in 'Tidy Eval' code that calculates the predictions. It works with several database back-ends because it leverages 'dplyr' and 'dbplyr' for the final 'SQL' translation of the algorithm. It currently supports lm(), glm(), randomForest(), ranger(), earth(), xgb.Booster.complete(), cubist(), and ctree() models.
Last updated 1 month ago
dbplyr, dplyr, purrr, rlang
11.19 score 259 stars 2 dependents 251 scripts 1.4k downloads
lime - Local Interpretable Model-Agnostic Explanations
When building complex models, it is often difficult to explain why the model should be trusted. While global measures such as accuracy are useful, they cannot be used for explaining why a model made a specific prediction. 'lime' (a port of the 'lime' 'Python' package) is a method for explaining the outcome of black box models by fitting a local model around the point in question and perturbations of this point. The approach is described in more detail in the article by Ribeiro et al. (2016) <arXiv:1602.04938>.
Last updated 2 years ago
caret, model-checking, model-evaluation, modeling, cpp
11.05 score 486 stars 1 dependent 732 scripts 1.7k downloads
textrecipes - Extra 'Recipes' for Text Processing
Converting text to numerical features requires specifically created procedures, which are implemented as steps according to the 'recipes' package. These steps allow for tokenization, filtering, counting (tf and tfidf), and feature hashing.
Last updated 2 months ago
cpp
10.70 score 160 stars 1 dependent 992 scripts 787 downloads
textdata - Download and Load Various Text Datasets
Provides a framework to download, parse, and store text datasets on the disk and load them when needed. Includes various sentiment lexicons and labeled text data sets for classification and analysis.
Last updated 8 months ago
text-datasets
9.76 score 75 stars 1 dependent 1.3k scripts 4.0k downloads
themis - Extra Recipes Steps for Dealing with Unbalanced Data
A dataset with an uneven number of cases in each class is said to be unbalanced. Many models produce subpar performance on unbalanced datasets. A dataset can be balanced by increasing the number of minority cases using SMOTE 2011 <doi:10.48550/arXiv.1106.1813>, BorderlineSMOTE 2005 <doi:10.1007/11538059_91> and ADASYN 2008 <https://ieeexplore.ieee.org/document/4633969>, or by decreasing the number of majority cases using NearMiss 2003 <https://www.site.uottawa.ca/~nat/Workshop2003/jzhang.pdf> or Tomek link removal 1976 <https://ieeexplore.ieee.org/document/4309452>.
Last updated 7 hours ago
9.73 score 143 stars 1 dependent 1.1k scripts 5.0k downloads
rules - Model Wrappers for Rule-Based Models
Bindings for additional models for use with the 'parsnip' package. Models include prediction rule ensembles (Friedman and Popescu, 2008) <doi:10.1214/07-AOAS148>, C5.0 rules (Quinlan, 1992 ISBN: 1558602380), and Cubist (Kuhn and Johnson, 2013) <doi:10.1007/978-1-4614-6849-3>.
Last updated 3 months ago
9.47 score 40 stars 1 dependent 20k scripts 830 downloads
embed - Extra Recipes for Encoding Predictors
Predictors can be converted to one or more numeric representations using a variety of methods. Effect encodings using simple generalized linear models <doi:10.48550/arXiv.1611.09477> or nonlinear models <doi:10.48550/arXiv.1604.06737> can be used. There are also functions for dimension reduction and other approaches.
Last updated 7 hours ago
9.17 score 142 stars 1.1k scripts 1.2k downloads
sparsevctrs - Sparse Vectors for Use in Data Frames
Provides sparse vectors powered by ALTREP (Alternative Representations for R Objects) that behave like regular vectors, and can thus be used in data frames. Also provides tools to convert between sparse matrices and data frames with sparse columns and functions to interact with sparse vectors.
Last updated 10 hours ago
8.27 score 13 stars 30 dependents 21 scripts 542 downloads
discrim - Model Wrappers for Discriminant Analysis
Bindings for additional classification models for use with the 'parsnip' package. Models include flavors of discriminant analysis, such as linear (Fisher (1936) <doi:10.1111/j.1469-1809.1936.tb02137.x>), regularized (Friedman (1989) <doi:10.1080/01621459.1989.10478752>), and flexible (Hastie, Tibshirani, and Buja (1994) <doi:10.1080/01621459.1994.10476866>), as well as naive Bayes classifiers (Hand and Yu (2007) <doi:10.1111/j.1751-5823.2001.tb00465.x>).
Last updated 3 months ago
8.02 score 28 stars 1 dependent 992 scripts 955 downloads
emoji - Data and Function to Work with Emojis
Contains data about emojis with relevant metadata, and functions to work with emojis when they are in strings.
Last updated 3 months ago
7.79 score 28 stars 2 dependents 306 scripts 1.2k downloads
ggpage - Creates Page Layout Visualizations
Facilitates the creation of page layout visualizations in which words are represented as rectangles with sizes relating to the length of the words. The words are then divided into lines and pages, giving an easy overview of even quite large texts.
Last updated 6 years ago
data-visualization, datavisualization, dataviz, ggplot2
7.53 score 340 stars 66 scripts 236 downloads
tidyclust - A Common API to Clustering
A common interface for specifying clustering models, in the same style as 'parsnip'. Creates a unified interface across different functions and computational engines.
Last updated 7 months ago
7.10 score 110 stars 139 scripts 1.0k downloads
modelenv - Provide Tools to Register Models for Use in 'tidymodels'
A developer-focused, low-dependency package in 'tidymodels' that provides functions to register how models are to be used. Functions to register models are complemented by accessor functions that retrieve registered model information to aid in model fitting and error handling.
Last updated 3 months ago
7.01 score 4 stars 43 dependents 1 script 20k downloads
orbital - Predict with 'tidymodels' Workflows in Databases
Turn 'tidymodels' workflows into objects containing the sequential equations sufficient to perform predictions. These smaller objects allow for low-dependency prediction locally or directly in databases.
Last updated 1 month ago
6.22 score 25 stars 11 scripts 353 downloads
fastTextR - An Interface to the 'fastText' Library
An interface to the 'fastText' library <https://github.com/facebookresearch/fastText>. The package can be used for text classification and to learn word vectors. An example of how to use 'fastTextR' can be found in the 'README' file.
Last updated 1 year ago
cpp
5.50 score 4 stars 2 dependents 44 scripts 376 downloads
friends - The Entire Transcript from Friends in Tidy Format
The complete scripts from the American sitcom Friends in tibble format. Use this package to practice data wrangling, text analysis and network analysis.
Last updated 3 years ago
5.03 score 63 stars 34 scripts 186 downloads
modeldatatoo - More Data Sets Useful for Modeling Examples
More data sets used for demonstrating or testing model-related packages are contained in this package. The data sets are downloaded and cached, allowing for more and bigger data sets.
Last updated 10 months ago
4.85 score 7 stars 34 scripts 161 downloads
hcandersenr - H.C. Andersens Fairy Tales
Texts of H.C. Andersen's fairy tales, ready for text analysis. Fairy tales in German, Danish, English, Spanish, and French.
Last updated 5 years ago
andersens-fairy-tales, text-mining
4.62 score 10 stars 83 scripts 138 downloads
walmartAPI - Walmart Open API Wrapper
Provides access to the Walmart Open API <https://developer.walmartlabs.com/>, which contains data about stores, the Value of the day, and products, including names, sale prices, shipping rates, and taxonomies.
Last updated 5 years ago
walmart-api
4.39 score 19 stars 13 scripts 139 downloads
extrasteps - More Miscellaneous Steps for the 'recipes' Package
Contains additional miscellaneous steps for the 'recipes' package. These steps are useful but don't have a good home in other 'recipes' packages or their extensions.
Last updated 4 months ago
4.32 score 10 stars 14 scripts 149 downloads
wordsalad - Provide Tools to Extract and Analyze Word Vectors
Provides access to various word embedding methods (GloVe, fasttext and word2vec) to extract word vectors using a unified framework to increase reproducibility and correctness.
Last updated 4 years ago
3.60 score 8 stars 9 scripts 155 downloads
methcon5 - Identify and Rank CpG DNA Methylation Conservation Along the Human Genome
Identify and rank CpG DNA methylation conservation along the human genome. Specifically, it includes bootstrapping methods that provide rankings adjusted for differences in region length; without this adjustment, short regions tend to get higher conservation scores.
Last updated 14 hours ago
2.70 score 6 scripts 178 downloads