    Posit AI Blog: Introducing the text package

    By Oscar Kjell, Salvatore Giorgi, and H Andrew Schwartz (October 4, 2022)

    AI-based language analysis has recently undergone a “paradigm shift” (Bommasani et al., 2021, p. 1), thanks in part to a new technique referred to as the transformer language model (Vaswani et al., 2017; Liu et al., 2019). Companies including Google, Meta, and OpenAI have released such models, including BERT, RoBERTa, and GPT, which have achieved unprecedented improvements across most language tasks such as web search and sentiment analysis. While these language models are accessible in Python through HuggingFace for typical AI tasks, the R package text makes HuggingFace and state-of-the-art transformer language models accessible as social scientific pipelines in R.

    Introduction

    We developed the text package (Kjell, Giorgi & Schwartz, 2022) with two objectives in mind:
    1. To serve as a modular solution for downloading and using transformer language models. This includes, for example, transforming text to word embeddings as well as accessing common language model tasks such as text classification, sentiment analysis, text generation, question answering, and translation.
    2. To provide an end-to-end solution designed for human-level analyses, including pipelines for state-of-the-art AI techniques tailored to predicting characteristics of the person who produced the language or to eliciting insights about linguistic correlates of psychological attributes.

    This blog post shows how to install the text package, transform text to state-of-the-art contextual word embeddings, use language analysis tasks, and visualize words in word embedding space.

    Installation and setting up a Python environment

    The text package sets up a Python environment to access the HuggingFace language models. The first time after installing the text package, you need to run two functions: textrpp_install() and textrpp_initialize().

    # Install text from CRAN
    install.packages("text")
    library(text)
    
    # Install text required python packages in a conda environment (with defaults)
    textrpp_install()
    
    # Initialize the installed conda environment
    # save_profile = TRUE saves the settings so that you do not have to run textrpp_initialize() again after restarting R
    textrpp_initialize(save_profile = TRUE)

    See the extended installation guide for more information.
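    Once initialization has succeeded, one quick way to verify the setup is to embed a short string with textEmbed() (introduced in the next section). This is a hedged sanity check suggested here, not an official installation step, and the first call downloads a model.

    # Optional sanity check (a suggestion, not part of the official
    # installation steps): embedding a short string confirms that the conda
    # environment and the HuggingFace download pipeline work end to end.
    # Note: the first call downloads the default model, which takes a while.
    check <- textEmbed("installation test")
    comment(check)  # shows which model and settings were used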

    Transform text to word embeddings

    The textEmbed() function is used to transform text to word embeddings (numeric representations of text). The model argument enables you to set which language model to use from HuggingFace; if you have not used the model before, it will automatically download the model and necessary files.

    # Transform the text data to BERT word embeddings
    # Note: To run faster, try something smaller: model = 'distilroberta-base'.
    word_embeddings <- textEmbed(texts = "Hello, how are you doing?",
                                 model = 'bert-base-uncased')
    word_embeddings
    comment(word_embeddings)

    The word embeddings can now be used for downstream tasks such as training models to predict related numeric variables (e.g., see the textTrain() and textPredict() functions).
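    Below is a minimal sketch of that workflow, using the Language_based_assessment_data_8 example data bundled with the text package; the exact path to the embeddings inside textEmbed()'s output (here $texts$harmonytexts) is an assumption that may differ across package versions.

    # A hedged sketch (not from the original post): predict a numeric rating
    # from text embeddings with textTrain().
    # Assumption: Language_based_assessment_data_8 contains the text column
    # harmonytexts and the numeric rating hilstotal.
    embeddings <- textEmbed(Language_based_assessment_data_8["harmonytexts"])

    # Train a cross-validated model predicting harmony in life scale scores
    hils_model <- textTrain(embeddings$texts$harmonytexts,
                            Language_based_assessment_data_8$hilstotal)

    # Inspect held-out prediction performance
    hils_model$results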

    (To get token-level output for individual layers, see the textEmbedRawLayers() function.)
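    For instance, a minimal sketch (the layers argument follows the function's documented pattern, but names and defaults may vary across versions):

    # Retrieve token-level embeddings from specific hidden layers; the layer
    # selection here is illustrative (bert-base-uncased has 12 layers).
    raw_layers <- textEmbedRawLayers("Hello, how are you doing?",
                                     layers = 11:12)
    raw_layers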

    Language analysis tasks

    There are many transformer language models at HuggingFace that can be used for various language analysis tasks such as text classification, sentiment analysis, text generation, question answering, and translation. The text package provides user-friendly functions to access these.

    # Classify the text (using a default classification model)
    classifications <- textClassify("Hello, how are you doing?")
    classifications
    comment(classifications)

    # Generate text that continues from the given prompt
    generated_text <- textGeneration("The meaning of life is")
    generated_text

    For more examples of available language model tasks, see textSum(), textQA(), textTranslate(), and textZeroShot() under Language Analysis Tasks.
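    As one more illustration, here is a hedged sketch of zero-shot classification; the sequences and candidate_labels arguments mirror the underlying HuggingFace zero-shot pipeline, but argument names may differ across text package versions.

    # Zero-shot classification: score a text against labels the model was
    # never explicitly trained on (argument names are assumptions based on
    # the HuggingFace pipeline that textZeroShot() wraps).
    zero_shot <- textZeroShot(
      sequences = "I just got a new job in another city.",
      candidate_labels = c("work", "travel", "family")
    )
    zero_shot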

    Visualize words in embedding space

    Visualizing words in the text package is achieved in two steps: first, a function pre-processes the data; second, a plotting function plots the words, with options for adjusting visual characteristics such as color and font size.

    To demonstrate these two functions we use example data included in the text package: Language_based_assessment_data_3_100. We show how to create a two-dimensional figure with words that individuals have used to describe their harmony in life, plotted according to two different well-being questionnaires: the harmony in life scale (HILS) and the satisfaction with life scale (SWLS). The x-axis shows words related to low versus high HILS scores, and the y-axis shows words related to low versus high SWLS scores.

    # Transform the text data to word embeddings, aggregating token
    # embeddings to word types (needed for plotting individual words)
    word_embeddings_bert <- textEmbed(Language_based_assessment_data_3_100,
                                      aggregation_from_tokens_to_word_types = "mean",
                                      keep_token_embeddings = FALSE)
    
    # Pre-process the data for plotting
    df_for_plotting <- textProjection(Language_based_assessment_data_3_100$harmonywords, 
                                      word_embeddings_bert$text$harmonywords,
                                      word_embeddings_bert$word_types,
                                      Language_based_assessment_data_3_100$hilstotal, 
                                      Language_based_assessment_data_3_100$swlstotal
    )
    
    # Plot the data
    plot_projection <- textProjectionPlot(
      word_data = df_for_plotting,
      y_axes = TRUE,
      p_alpha = 0.05,
      title_top = "Supervised Bicentroid Projection of Harmony in life words",
      x_axes_label = "Low vs. High HILS score",
      y_axes_label = "Low vs. High SWLS score",
      p_adjust_method = "bonferroni",
      points_without_words_size = 0.4,
      points_without_words_alpha = 0.4
    )
    plot_projection$final_plot
    Figure: Supervised Bicentroid Projection of Harmony in life words.

    This post demonstrates how to carry out state-of-the-art text analysis in R using the text package. The package intends to make it easy to access and use transformer language models from HuggingFace to analyze natural language. We look forward to your feedback and contributions toward making such models available for social scientific and other applications more typical of R users.

    • Bommasani et al. (2021). On the opportunities and risks of foundation models.
    • Kjell et al. (2022). The text package: An R-package for Analyzing and Visualizing Human Language Using Natural Language Processing and Deep Learning.
    • Liu et al. (2019). RoBERTa: A robustly optimized BERT pretraining approach.
    • Vaswani et al. (2017). Attention is all you need. Advances in Neural Information Processing Systems, 5998–6008.


    Posts also available at r-bloggers

    Corrections

    If you see mistakes or want to suggest changes, please create an issue on the source repository.

    Reuse

    Text and figures are licensed under Creative Commons Attribution CC BY 4.0. Source code is available at https://github.com/OscarKjell/ai-blog, unless otherwise noted. The figures that have been reused from other sources don’t fall under this license and can be recognized by a note in their caption: “Figure from …”.

    Citation

    For attribution, please cite this work as

    Kjell, et al. (2022, Oct. 4). Posit AI Blog: Introducing the text package. Retrieved from 

    BibTeX citation

    @misc{kjell2022introducing,
      author = {Kjell, Oscar and Giorgi, Salvatore and Schwartz, H Andrew},
      title = {Posit AI Blog: Introducing the text package},
      url = {},
      year = {2022}
    }


