Second in our Ethics and AI course interview series: Dr. Nakeema Stefflbauer interviews Irene Solaiman, Policy Director at Hugging Face, where she researches social impact and public policy. In addition to advising responsible AI initiatives at the OECD and IEEE, Ms. Solaiman leads many discussions on AI value alignment, responsible releases, and combating misuse and malicious use. Working in a number of high-impact AI policy fields, she has advised decision makers at corporations on autonomous decision-making and privacy.
Irene Solaiman (left) and Dr. Nakeema Stefflbauer (right)
About 110 students attended. They had already read Ms. Solaiman’s research and heard her interview on The Gradient podcast, so they were very enthusiastic to read her new Wired piece on the spectrum of choices between open- and closed-source models. Prompted by Dr. Stefflbauer’s specific focus on global impacts, students’ responses to the conversation ranged from issues affecting their communities at home to the future impact of generative models on the tech industry. After the interview, Dr. Stefflbauer commented:
“It is all too rare to speak with game-changing ethicists who’ve lived the startup life as well as done slower-moving government policy work. Irene has the unique perspective of trying to steer the new technology ship away from harm for marginalized communities and local governments, and she understands exactly what a tough balancing act that can be, especially for women of color.
Before we shared the interview, Irene told me: ‘I don’t want to shame students, regardless of their choices.’ I think that reflects a challenge that several students also raised.”
Here are some of the reflections students submitted after listening:
“Thank you, Irene Solaiman, for this great talk! We loved your nuanced view on open vs. closed source. Please allow us to tell you about ourselves. We’re a group of African American and international computer science students who understand the argument for open source as addressing the lack of transparency in corporate models, especially where there are inflated corporate claims about “capabilities” and supposedly “emergent properties,” which often turn out to have been in the TRAINING data. We’d like to think we wouldn’t make such claims in industry, since we, and everyone else, we hope, have been taught from day one that the data for training and testing has to be kept separate! Even more importantly, there’s the problem that these corporations scrape unconsented data for their models and no one can see it, because the models are closed. No one knows what exactly they stole, especially if they say something like “we use a combination of corpora like Common Crawl and LAION,” which like to claim they’re getting consent after the fact. That said about the importance of corporate transparency, we’re keen to keep our own datasets limited-access to protect our research and the populations we serve in our countries. Some of us are also working on Dr. Harriett Jernigan’s BlackEnglishes Stanford HAI project this summer, and this project is limited to academic researchers to protect the authors of the data and the populations they serve. Closed source can sometimes be the most just!”
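The students’ point about inflated capability claims comes down to train/test contamination: if benchmark examples leak into a model’s training data, “emergent” abilities may simply be memorization. As a minimal illustrative sketch (our addition, not anything discussed in the interview; real contamination audits typically use n-gram overlap rather than exact string matching), here is how one might flag test examples that appear verbatim in a training corpus:

```python
# Minimal sketch of a train/test contamination check.
# Real audits of large-model benchmarks usually compare n-gram
# overlap; exact matching is used here only for illustration.

def find_contamination(train_texts, test_texts):
    """Return the test examples that also appear verbatim in the training data."""
    train_set = {t.strip().lower() for t in train_texts}
    return [t for t in test_texts if t.strip().lower() in train_set]

if __name__ == "__main__":
    train = ["the cat sat on the mat", "dogs bark at night"]
    test = ["The cat sat on the mat", "fish swim upstream"]

    # "The cat sat on the mat" is flagged: any "capability" the model
    # shows on that example may just be memorized training data.
    print(find_contamination(train, test))
```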
“Dear Ms. Solaiman, thank you so much for your wide-ranging talk. As a group of Bangladeshi American and Bangladeshi national students, we were so excited to hear you had just been in Bangladesh, and we were interested to know your thoughts on the appropriateness of AI in Bangladeshi communities. We serve a variety of communities where AI has been somewhat helpful, but it is still often a case of the cart before the horse, since development must happen first. For example, AI in agriculture is helping more with planning than with actual growing. Likewise, in other fields like disability and education, it seems most appropriate to teach students the coding and statistical skills needed for building AI, rather than using it on students, especially not on disabled students. One of us serves disabled students who are learning to code and build AI this summer.”
“Thank you so much for your talk. We are a group of STEM and history students who wonder a lot about what it means “to make AI better.” For our communities, it means teaching us how to use these tools to build beneficial technologies. As for “cultural value alignment,” one of our research projects this quarter has been trying to mitigate religious bias in models, specifically anti-Muslim bias. Surely you’re right that there are systemic issues that must be addressed, but we’re less certain of how to do that than of how to build things. Our study of decolonial and anti-racist histories and theory seems to have the most immediate impact on our efforts at model design and correction. Our approach might sound incremental, but this is where we feel we can best contribute. As a group of students of color and first-gen, low-income students, safety is also a concern. We still talk about when a Stanford professor encouraged low-income students to protest by shutting down the San Mateo Bridge in 2015, and all those students wound up with felony charges and huge lawyer bills. We believe deeply in the need for systemic change, but most of us need to take safe routes to get there. Too often, even trying to change models is controversial enough to put our job security at risk.”
“Thanks for this awesome talk! Some of us are leaving the university soon and headed to government, so we were very excited to hear your optimism that “governments are now listening.” We definitely feel like we’re at the bottom of a huge mountain we have to climb to get governments to focus on “priorities” like having experts in technology, society, and policy help with decision-making. So far, in our experience, governments here in the US and around the globe are more willing to organize yet another committee to study AI than to dig into the already excellent research. AI policy is not as new as governments think: our mentors were talking about this 30 years ago, and since the rise of large models there has been a decade of sustained work. There’s also a disparity in who gets heard. The technical men among us at corporations get many more invitations to talk than the women, who also have technical training. And there’s a very LARGE obstacle to serving in government: entry positions are so poorly paid, or unpaid, that most low-income people of color cannot afford to take these jobs advising politicians. Stanford has a few summer grants to support us in D.C., but not nearly enough funding for the many people who want to be involved in government. There is SO much work to be done in the US alone to update our lagging policies in just about every field, because these large models change the kinds of policies we need in medicine, politics, economics, law, and everything else.”
“Dr. Stefflbauer and Ms. Solaiman, thanks so much for this exciting talk. We’re a group of students working on low-resource languages in the developing world, and we see clearly that our respective countries lack policies to help us protect our data. From our class readings on data sovereignty, we know that policies addressing our specific communities’ needs are still missing. Also, in answer to your question about how to reach communities to get consent: we need NLP researchers from our communities, who are native speakers, to be more involved in the work and in educating communities. Masakhane is one great example of university researchers working together with local populations to protect their data and rights and to support their languages. European policy isn’t really a model for the developing world.”
“We’re CS undergrads who are very concerned about all the different impacts of models, especially when they harm people. It’s only now that we’re beginning to understand how these models can harm us, too. One thing that struck us in your conversation is the very new danger of damage to our reputations if we’re accused of plagiarism. We think policing and surveillance of model use, especially for coding, is wrong. All of us use Copilot and other coding aids now, and in our writing we’ve always looked for hints on the internet to get started. Policing such use is impossible and unjust, and we want to know how our institutions will protect us from such accusations.”
“Dear Irene Solaiman, thank you for your talk. We are interested in supporting third-party regulation, and we want to know what can be done to make sure such regulation sticks and hurts offending corporations where it counts: in their profits. We think third-party regulation is moving far too slowly while harms are being done to marginalized groups, the technology is getting entrenched, and power is increasingly centralized in a few powerful corporations. We’d be happy to work on such regulation if we are empowered to make things happen, well paid, and protected from retaliation. We all started out in Big Tech corporations, so the problems with these orgs are already clear to us. But we see how hard it is to change things and how much we could lose if we raise even the smallest questions. All those fun times in tech are always clouded by not knowing whom we can trust.”