    OpenAI starts creating new benchmarks that more accurately evaluate AI models across different languages and cultures

    By big tee tech hub | November 10, 2025 | 2 Mins Read

    Only about 20% of the world’s population speaks English, yet existing benchmarks for multilingual AI models are falling short. For example, MMMLU has become saturated: top models now cluster near the highest scores, which OpenAI says makes it a poor indicator of real progress.

    Additionally, existing multilingual benchmarks focus on translation and multiple-choice tasks, and they do not necessarily measure how well a model understands regional context, culture, and history, OpenAI explained.

    To remedy these issues, OpenAI is building new benchmarks for different languages and regions of the world, starting with India, its second-largest market. The new benchmark, IndQA, will “evaluate how well AI models understand and reason about questions that matter in Indian languages, across a wide range of cultural domains.”

    There are 22 official languages in India, seven of which are spoken by at least 50 million people. IndQA includes 2,278 questions across 12 different languages and 10 cultural domains, and was created with help from 261 domain experts from the country, including journalists, linguists, scholars, artists, and industry practitioners.

    The languages covered are Bengali, English, Hindi, Hinglish, Kannada, Marathi, Odia, Telugu, Gujarati, Malayalam, Punjabi, and Tamil. Hinglish is a mix of English and Hindi, which OpenAI decided to include to account for code-switching in conversations.

    The cultural domains covered include Architecture & Design, Arts & Culture, Everyday Life, Food & Cuisine, History, Law & Ethics, Literature & Linguistics, Media & Entertainment, Religion & Spirituality, and Sports & Recreation.

    According to OpenAI, each datapoint contains a culturally grounded prompt in one of the Indian languages, an English translation to make it auditable, rubric criteria for grading, and an expected answer from the domain experts.
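
    As a rough illustration of the four parts described above, a single record could be modeled as in the following sketch. The class name, field names, and the `missing_parts` helper are hypothetical and chosen only for this example. They are not OpenAI's actual schema, and the strings in the usage example are placeholders rather than real benchmark content.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class IndQADatapoint:
    """Hypothetical record layout; names are illustrative, not OpenAI's schema."""

    prompt: str                    # culturally grounded prompt in an Indian language
    english_translation: str       # English translation, included for auditability
    rubric_criteria: List[str] = field(default_factory=list)  # criteria used for grading
    expected_answer: str = ""      # expected answer written by domain experts


def missing_parts(dp: IndQADatapoint) -> List[str]:
    """Return the names of any of the four components that are empty."""
    return [f.name for f in fields(dp) if not getattr(dp, f.name)]


# Placeholder values only; no real benchmark content is reproduced here.
example = IndQADatapoint(
    prompt="<prompt text in the source language>",
    english_translation="<English translation of the prompt>",
    rubric_criteria=["<criterion 1>", "<criterion 2>"],
    expected_answer="<expected answer from a domain expert>",
)
print(missing_parts(example))
```

    Keeping the translation next to the original prompt is what would let a reviewer who does not read the source language audit the question and its rubric.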

    OpenAI says that it plans to create similar benchmarks for other regions of the world, using IndQA as inspiration.

    “IndQA style questions are especially valuable in languages or cultural domains that are poorly covered by existing AI benchmarks. Creating similar benchmarks to IndQA can help AI research labs learn more about languages and domains models struggle with today, and provide a north star for improvements in the future,” the company wrote in a blog post.


