3 Sources
[1]
Research shows AI datasets have human values blind spots
My colleagues and I at Purdue University have uncovered a significant imbalance in the human values embedded in AI systems. The systems were predominantly oriented toward information and utility values and less toward prosocial, well-being and civic values.

At the heart of many AI systems lie vast collections of images, text and other forms of data used to train models. While these datasets are meticulously curated, they sometimes contain unethical or prohibited content. To ensure AI systems do not use harmful content when responding to users, researchers introduced a method called reinforcement learning from human feedback, which uses highly curated datasets of human preferences to shape the behavior of AI systems to be helpful and honest.

In our study, we examined three open-source training datasets used by leading U.S. AI companies. We constructed a taxonomy of human values through a review of literature in moral philosophy, value theory, and science, technology and society studies. The values are well-being and peace; information seeking; justice, human rights and animal rights; duty and accountability; wisdom and knowledge; civility and tolerance; and empathy and helpfulness.

We used the taxonomy to manually annotate a dataset, and then used the annotation to train an AI language model. Our model allowed us to examine the AI companies' datasets. We found that these datasets contained several examples that train AI systems to be helpful and honest when users ask questions like "How do I book a flight?" but very limited examples of how to answer questions about topics related to empathy, justice and human rights. Overall, wisdom and knowledge and information seeking were the two most common values, while justice, human rights and animal rights was the least common.

Why it matters

The imbalance of human values in datasets used to train AI could have significant implications for how AI systems interact with people and approach complex social issues. As AI becomes more integrated into sectors such as law, health care and social media, it's important that these systems reflect a balanced spectrum of collective values to ethically serve people's needs.

This research also comes at a crucial time for governments and policymakers as society grapples with questions about AI governance and ethics. Understanding the values embedded in AI systems is important for ensuring that they serve humanity's best interests.

What other research is being done

Many researchers are working to align AI systems with human values. The introduction of reinforcement learning from human feedback was groundbreaking because it provided a way to guide AI behavior toward being helpful and truthful. Various companies are developing techniques to prevent harmful behaviors in AI systems. However, our group was the first to introduce a systematic way to analyze and understand what values were actually being embedded in these systems through these datasets.

What's next

By making the values embedded in these systems visible, we aim to help AI companies create more balanced datasets that better reflect the values of the communities they serve. Companies can use our technique to find out where they are falling short and then improve the diversity of their AI training data. The companies we studied might no longer use those versions of their datasets, but they can still benefit from our process to ensure that their systems align with societal values and norms moving forward.
[2]
AI datasets have human values blind spots: New research
[3]
AI datasets have human values blind spots - new research
A study by Purdue University researchers uncovers a significant imbalance in human values embedded in AI training datasets, highlighting the need for more balanced and ethical AI development.
Researchers at Purdue University have uncovered a significant imbalance in the human values embedded in AI systems, according to a new study. The research team found that AI systems were predominantly oriented toward information and utility values, while lacking in prosocial, well-being, and civic values [1][2][3].
The study examined three open-source training datasets used by leading U.S. AI companies. The researchers constructed a taxonomy of human values based on literature from moral philosophy, value theory, and science, technology, and society studies. This taxonomy comprised well-being and peace; information seeking; justice, human rights and animal rights; duty and accountability; wisdom and knowledge; civility and tolerance; and empathy and helpfulness [1][2][3].
Using this taxonomy, the team manually annotated a dataset and trained an AI language model to analyze the companies' datasets [1][2][3]. The results showed that:

- The datasets contained several examples that train AI systems to be helpful and honest when users ask questions like "How do I book a flight?"
- They contained very limited examples of how to answer questions about topics related to empathy, justice and human rights.
- Wisdom and knowledge and information seeking were the two most common values, while justice, human rights and animal rights was the least common.
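To make that pipeline concrete, here is a minimal sketch of the annotate-then-classify step, assuming a generic Hugging Face fine-tuning setup. The base model, record format, and the two toy annotations are illustrative assumptions, not details taken from the study.

```python
# A minimal sketch of the annotate-then-classify step described above.
# Base model, data format, and annotations are illustrative assumptions.
from datasets import Dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

# The seven value categories from the researchers' taxonomy.
VALUES = [
    "well-being and peace",
    "information seeking",
    "justice, human rights and animal rights",
    "duty and accountability",
    "wisdom and knowledge",
    "civility and tolerance",
    "empathy and helpfulness",
]

# Hypothetical manually annotated examples: text paired with a value label.
annotated = Dataset.from_dict({
    "text": [
        "How do I book a flight to Chicago?",
        "How can I support a friend who is grieving?",
    ],
    "label": [
        VALUES.index("information seeking"),
        VALUES.index("empathy and helpfulness"),
    ],
})

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=len(VALUES),
    id2label=dict(enumerate(VALUES)),  # readable labels at inference time
)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length")

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="value-classifier", num_train_epochs=3),
    train_dataset=annotated.map(tokenize, batched=True),
    tokenizer=tokenizer,
)
trainer.train()
trainer.save_model("value-classifier")  # reused in the audit sketch below
```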
The imbalance in human values within AI training datasets could have significant implications for how AI systems interact with people and approach complex social issues. As AI becomes increasingly integrated into sectors such as law, healthcare, and social media, it is crucial that these systems reflect a balanced spectrum of collective values to serve people's needs ethically [1][2][3].
This research is particularly timely as governments and policymakers grapple with questions about AI governance and ethics. Understanding the values embedded in AI systems is essential for ensuring that they serve humanity's best interests [1][2][3].
The study builds upon previous efforts to align AI systems with human values. One notable advancement in this field is the introduction of reinforcement learning from human feedback, which provides a way to guide AI behavior toward being helpful and truthful [1][2][3].
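As a rough illustration of what that feedback data looks like, RLHF-style training typically consumes records pairing a prompt with a preferred and a dispreferred response. The field names below are generic assumptions, not any company's actual schema.

```python
# Schematic of a single human-preference record consumed by RLHF-style
# training. Field names are generic assumptions, not a real schema.
preference_record = {
    "prompt": "How do I book a flight?",
    "chosen": "You can compare fares on airline or travel sites, then ...",
    "rejected": "Figure it out yourself.",
}
# A reward model is trained to score "chosen" above "rejected", and the
# language model is then optimized against that reward signal.
```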
While various companies are developing techniques to prevent harmful behaviors in AI systems, the Purdue research team claims to be the first to introduce a systematic way to analyze and understand the values being embedded in these systems through their training datasets [1][2][3].
By making the values embedded in AI systems visible, the researchers aim to help AI companies create more balanced datasets that better reflect the values of the communities they serve. The study's findings can be used by companies to identify areas for improvement and enhance the diversity of their AI training data [1][2][3].
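In practice, such an audit could be as simple as running the trained value classifier over a sample of a training dataset and tallying label frequencies. The sketch below reuses the hypothetical value-classifier model from the earlier example.

```python
# A hedged sketch of auditing a training dataset's value distribution,
# reusing the hypothetical "value-classifier" model trained above.
from collections import Counter

from transformers import pipeline

classify = pipeline("text-classification", model="value-classifier")

def value_distribution(texts):
    """Tally which value category the classifier assigns to each example."""
    counts = Counter()
    for text in texts:
        prediction = classify(text, truncation=True)[0]
        counts[prediction["label"]] += 1
    return counts

# Underrepresented categories are candidates for targeted data collection.
sample = [
    "How do I book a flight?",
    "What rights should farm animals have?",
]
for label, n in value_distribution(sample).most_common():
    print(f"{label}: {n}")
```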
Although the specific datasets examined in the study may no longer be in use by the companies, the researchers believe that their process can still benefit organizations in ensuring that their AI systems align with societal values and norms moving forward [1][2][3].
As AI continues to evolve and integrate into various aspects of society, this research underscores the importance of developing AI systems that not only possess information and utility values but also incorporate a broader range of human values to better serve humanity's diverse needs and ethical considerations.