Welcome
INCOMPLETE DRAFT
This textbook is an introduction to the fundamental concepts and practical programming skills from Data Science that are increasingly employed in a variety of language-centered fields and sub-fields applied to the task of quantitative text analysis. It is geared towards advanced undergraduates, graduate students, and researchers looking to expand their methodological toolbox.
The content is currently under development. Feedback is welcome and can be provided through the hypothes.is service. A toolbar interface to this service is located on the right sidebar. To register for a free account and join the “text_as_data” annotation group follow this link. Suggestions and changes that are incorporated will be acknowledged.
Author
Dr. Jerid Francom is Associate Professor of Spanish and Linguistics at Wake Forest University. His research focuses on the use of large-scale language archives (corpora) from a variety of sources (news, social media, and other internet sources) to better understand the linguistic and cultural similarities and differences between language varieties for both scholarly and pedagogical projects. He has published on topics including the development, annotation, and evaluation of linguistic corpora and analyzed corpora through corpus, psycholinguistic, and computational methodologies. He also has experience working with and teaching statistical programming with R.
License
This work by Jerid C. Francom is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 United States License.
Credits
Acknowledgements
TAD has been reviewed by and suggestions and changes incorporated based on the feedback through the TAD Hypothes.is group by the following people: Andrea Bowling, Caroline Brady, Declan Golsen, Asya Little, Claudia Valdez, …
Build information
This version of the textbook was built with R version 4.1.2 (2021-11-01) on macOS Big Sur 10.16 with the following packages:
package | version | source |
---|---|---|
assertthat | 0.2.1 | CRAN (R 4.1.0) |
backports | 1.4.1 | CRAN (R 4.1.2) |
bookdown | 0.26 | CRAN (R 4.1.2) |
broom | 0.8.0 | CRAN (R 4.1.2) |
bslib | 0.3.1 | CRAN (R 4.1.0) |
cachem | 1.0.6 | CRAN (R 4.1.0) |
cellranger | 1.1.0 | CRAN (R 4.1.0) |
cli | 3.3.0 | CRAN (R 4.1.2) |
colorspace | 2.0.3 | CRAN (R 4.1.2) |
crayon | 1.5.1 | CRAN (R 4.1.2) |
DBI | 1.1.2 | CRAN (R 4.1.0) |
dbplyr | 2.1.1 | CRAN (R 4.1.0) |
digest | 0.6.29 | CRAN (R 4.1.0) |
downlit | 0.4.0 | CRAN (R 4.1.0) |
dplyr | 1.0.9 | CRAN (R 4.1.2) |
DT | 0.22 | CRAN (R 4.1.2) |
ellipsis | 0.3.2 | CRAN (R 4.1.0) |
evaluate | 0.15 | CRAN (R 4.1.2) |
fansi | 1.0.3 | CRAN (R 4.1.2) |
fastmap | 1.1.0 | CRAN (R 4.1.0) |
forcats | 0.5.1 | CRAN (R 4.1.0) |
fs | 1.5.2 | CRAN (R 4.1.0) |
generics | 0.1.2 | CRAN (R 4.1.2) |
ggplot2 | 3.3.5 | CRAN (R 4.1.0) |
glue | 1.6.2 | CRAN (R 4.1.2) |
gtable | 0.3.0 | CRAN (R 4.1.0) |
haven | 2.5.0 | CRAN (R 4.1.2) |
here | 1.0.1 | CRAN (R 4.1.0) |
hms | 1.1.1 | CRAN (R 4.1.0) |
htmltools | 0.5.2 | CRAN (R 4.1.0) |
htmlwidgets | 1.5.4 | CRAN (R 4.1.0) |
httr | 1.4.2 | CRAN (R 4.1.0) |
janeaustenr | 0.1.5 | CRAN (R 4.0.2) |
jquerylib | 0.1.4 | CRAN (R 4.1.0) |
jsonlite | 1.8.0 | CRAN (R 4.1.2) |
knitr | 1.39 | CRAN (R 4.1.2) |
lattice | 0.20.45 | CRAN (R 4.1.2) |
lifecycle | 1.0.1 | CRAN (R 4.1.0) |
lubridate | 1.8.0 | CRAN (R 4.1.0) |
magrittr | 2.0.3 | CRAN (R 4.1.2) |
Matrix | 1.4.1 | CRAN (R 4.1.2) |
memoise | 2.0.1 | CRAN (R 4.1.0) |
modelr | 0.1.8 | CRAN (R 4.1.0) |
munsell | 0.5.0 | CRAN (R 4.1.0) |
pacman | 0.5.1 | CRAN (R 4.1.0) |
pillar | 1.7.0 | CRAN (R 4.1.2) |
pkgconfig | 2.0.3 | CRAN (R 4.1.0) |
purrr | 0.3.4 | CRAN (R 4.1.0) |
R6 | 2.5.1 | CRAN (R 4.1.0) |
Rcpp | 1.0.8.3 | CRAN (R 4.1.2) |
readr | 2.1.2 | CRAN (R 4.1.2) |
readxl | 1.4.0 | CRAN (R 4.1.2) |
reprex | 2.0.1 | CRAN (R 4.1.0) |
rlang | 1.0.2 | CRAN (R 4.1.2) |
rmarkdown | 2.14 | CRAN (R 4.1.2) |
rprojroot | 2.0.3 | CRAN (R 4.1.2) |
rstudioapi | 0.13 | CRAN (R 4.1.0) |
rvest | 1.0.2 | CRAN (R 4.1.0) |
sass | 0.4.1 | CRAN (R 4.1.2) |
scales | 1.2.0 | CRAN (R 4.1.2) |
sessioninfo | 1.2.2 | CRAN (R 4.1.0) |
SnowballC | 0.7.0 | CRAN (R 4.1.0) |
stringi | 1.7.6 | CRAN (R 4.1.0) |
stringr | 1.4.0 | CRAN (R 4.1.0) |
tibble | 3.1.6 | CRAN (R 4.1.0) |
tidyr | 1.2.0 | CRAN (R 4.1.2) |
tidyselect | 1.1.2 | CRAN (R 4.1.2) |
tidytext | 0.3.2 | CRAN (R 4.1.0) |
tidyverse | 1.3.1 | CRAN (R 4.1.0) |
tokenizers | 0.2.1 | CRAN (R 4.1.0) |
tzdb | 0.3.0 | CRAN (R 4.1.2) |
utf8 | 1.2.2 | CRAN (R 4.1.0) |
vctrs | 0.4.1 | CRAN (R 4.1.2) |
webshot | 0.5.3 | CRAN (R 4.1.2) |
withr | 2.5.0 | CRAN (R 4.1.2) |
xfun | 0.30 | CRAN (R 4.1.2) |
xml2 | 1.3.3 | CRAN (R 4.1.0) |
yaml | 2.3.5 | CRAN (R 4.1.2) |