September 23, 2020
Emotional content is an important part of language, and natural language processing is becoming an increasingly important component of consumer products. Here we attempt to learn more about how machines can understand human emotions.
In his 2006 book The Emotion Machine, legendary computer scientist Marvin Minsky (co-founder of the field of Artificial Intelligence and one of the founding faculty members of the MIT Media Lab) wrote about the central role of emotions in reasoning—reminding us that AI will only be capable of true commonsense reasoning once it has understood emotions. To Minsky, emotions are not the opposite of rational reason, something to be weeded out before we can think clearly; rather, emotions are just a different way of thinking.
But this is hardly helpful to a computer scientist trying to construct an emotional machine by programming a concrete set of rules. If you ask two people to explain what makes a particular sentence happy, sad, serious, or sarcastic, you will likely get at least two different opinions. Much of what determines emotional content is context-specific, culturally constructed, and difficult to describe in an explicit set of rules.
September 22, 2020
TikTok Wednesday revealed some of the elusive workings of the prized algorithm that keeps hundreds of millions of users worldwide hooked on the viral video app.
Why it matters: The code TikTok uses to pick your next video is a large part of what has led the two-year-old company to achieve broad popularity along with a remarkable $20-$30 billion valuation. The key asset is in play as TikTok’s Chinese parent prepares to sell its U.S. operation amid fears about its relationship with China’s government.
Driving the news: On a call with reporters Wednesday, TikTok executives said they were revealing details of their algorithm and data practices to dispel myths and rumors about the company.
- “We’re a 2-year-old company operating with the expectations of a 10-year-old company,” said Michael Beckerman, TikTok’s vice president in charge of U.S. public policy. “We didn’t have the opportunity to grow up in the golden years of the internet, when tech companies could do no wrong. We grew up in the techlash age, where there’s a lot of skepticism of platforms, how they moderate content and how their algorithms work.”
- TikTok executives gave reporters a virtual tour of its new “transparency center” in Los Angeles. The center will have areas for people to demo computer modules that showcase how TikTok’s algorithms and data practices work.
That’s assuming TikTok survives in its current form.
- President Trump has set Sept. 15 as a deadline for the company’s Chinese owner, ByteDance, to find an American purchaser, or it will face a ban in the U.S.
- China recently instituted new export restrictions on software that could prevent TikTok’s algorithm from being included in any sale.
How it works: TikTok’s algorithm uses machine learning to determine what content a user is most likely to engage with, then serves more of it by finding videos that are similar or that are liked by users with similar preferences.
- When users open TikTok for the first time, they are shown 8 popular videos featuring different trends, music, and topics. After that, the algorithm continues to serve the user new batches of 8 videos based on which videos the user engages with and how the user interacts with the app.
- The algorithm identifies similar videos to those that have engaged a user based on video information, which could include details like captions, hashtags or sounds. Recommendations also take into account user device and account settings, which include data like language preference, country setting, and device type.
- Once TikTok collects enough data about the user, the app is able to map a user’s preferences in relation to similar users and group them into “clusters.” Simultaneously, it also groups videos into “clusters” based on similar themes, like “basketball” or “bunnies.”
- Using machine learning, the algorithm serves videos to users based on their proximity to other clusters of users and content that they like.
- TikTok’s logic aims to avoid redundancies that could bore the user, like seeing multiple videos with the same music or from the same creator.
Yes, but: TikTok concedes that its ability to nail users’ preferences so effectively means that its algorithm can produce “filter bubbles,” reinforcing users’ existing preferences rather than showing them more varied content, widening their horizons, or offering them opposing viewpoints.
- The company says that it’s studying filter bubbles, including how long they last and how a user encounters them, to get better at breaking them when necessary.
- Since filter bubbles can reinforce conspiracy theories, hoaxes and other misinformation, TikTok’s product and policy teams study which accounts and video information — themes, hashtags, captions, and so on — might be linked to misinformation.
- Videos or creators linked to misinformation are sent to the company’s global content reviewers so they can be managed before they are distributed to users on the main feed, which is called the “For You” page.
The briefing also featured updates about TikTok’s data, privacy and security practices.
- The company says it tries to triage and prevent incidents on its platform before they happen by working to detect patterns of problems before they spread.
- TikTok’s chief security officer, Roland Cloutier, said it plans to hire more than 100 data, security and privacy experts by year’s end in the U.S.
- He also said that the company will be building a monitoring, response, and investigation center in Washington, D.C. to actively detect and respond to critical incidents in real time.
The big picture: Beckerman says that TikTok’s transparency efforts are meant to position the company as a leader in Silicon Valley.
- “We want to take a leadership position and show more about how the app works,” he said. “For us, we’re new, and we want to do this because we don’t have anything to hide. The more we’re talking to and meeting with lawmakers, the more comfortable they are with the product. That’s the way it should be.”
September 22, 2020
International organizations and corporations are racing to develop global guidelines for the ethical use of artificial intelligence. Declarations, manifestos, and recommendations are flooding the internet. But these efforts will be futile if they fail to account for the cultural and regional contexts in which AI operates.
AI systems have repeatedly been shown to cause problems that disproportionately affect marginalized groups while benefiting a privileged few. The global AI ethics efforts under way today—of which there are dozens—aim to help everyone benefit from this technology, and to prevent it from causing harm. Generally speaking, they do this by creating guidelines and principles for developers, funders, and regulators to follow. They might, for example, recommend routine internal audits or require protections for users’ personally identifiable information.
We believe these groups are well-intentioned and are doing worthwhile work. The AI community should, indeed, agree on a set of international definitions and concepts for ethical AI. But without more geographic representation, they’ll produce a global vision for AI ethics that reflects the perspectives of people in only a few regions of the world, particularly North America and northwestern Europe.
This work is not easy or straightforward. “Fairness,” “privacy,” and “bias” mean different things (pdf) in different places. People also have disparate expectations of these concepts depending on their own political, social, and economic realities. The challenges and risks posed by AI also differ depending on one’s locale.
If organizations working on global AI ethics fail to acknowledge this, they risk developing standards that are, at best, meaningless and ineffective across all the world’s regions. At worst, these flawed standards will lead to more AI systems and tools that perpetuate existing biases and are insensitive to local cultures.
In 2018, for example, Facebook was slow to act on misinformation spreading in Myanmar that ultimately led to human rights abuses. An assessment (pdf) paid for by the company found that this oversight was due in part to Facebook’s community guidelines and content moderation policies, which failed to address the country’s political and social realities.
To prevent such abuses, companies working on ethical guidelines for AI-powered systems and tools need to engage users from around the world to help create appropriate standards to govern these systems. They must also be aware of how their policies apply in different contexts.
Despite the risks, there’s a clear lack of regional diversity in many AI advisory boards, expert panels, and councils appointed by leading international organizations. The expert advisory group for Unicef’s AI for Children project, for example, has no representatives from regions with the highest concentration of children and young adults, including the Middle East, Africa, and Asia.
Unfortunately, as it stands today, the entire field of AI ethics is at grave risk of limiting itself to languages, ideas, theories, and challenges from a handful of regions—primarily North America, Western Europe, and East Asia.
This lack of regional diversity reflects the current concentration of AI research (pdf): 86% of papers published at AI conferences in 2018 were attributed to authors in East Asia, North America, or Europe. And fewer than 10% of references listed in AI papers published in these regions are to papers from another region. Patents are also highly concentrated: 51% of AI patents published in 2018 were attributed to North America.
Those of us working in AI ethics will do more harm than good if we allow the field’s lack of geographic diversity to define our own efforts. If we’re not careful, we could wind up codifying AI’s historic biases into guidelines that warp the technology for generations to come. We must start to prioritize voices from low- and middle-income countries (especially those in the “Global South”) and those from historically marginalized communities.
Advances in technology have often benefited the West while exacerbating economic inequality, political oppression, and environmental destruction elsewhere. Including non-Western countries in AI ethics is the best way to avoid repeating this pattern.
September 22, 2020
A mesmerising, unaccountable kind of algorithm – machine learning – is blinding governments to the technology’s often disastrous flaws
September 22, 2020
We’ve applied reinforcement learning from human feedback to train language models that are better at summarization. Our models generate summaries that are better than summaries from 10x larger models trained only with supervised learning. Even though we train our models on the Reddit TL;DR dataset, the same models transfer to generate good summaries of CNN/DailyMail news articles without any further fine-tuning. Our techniques are not specific to summarization; in the long run, our goal is to make aligning AI systems with human preferences a central component of AI research and deployment in many domains.
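The core of learning from human feedback is a reward model trained on pairwise comparisons: humans pick which of two summaries is better, and the model learns to score the preferred one higher. Below is a toy sketch using a linear reward and the standard Bradley–Terry logistic loss for pairwise preferences; the feature vectors and data are entirely invented, and the real work uses large neural models, not a linear scorer:

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented training pairs: (features of preferred summary,
# features of rejected summary), made separable by construction.
pairs = [(rng.normal(1.0, 0.5, 4), rng.normal(-1.0, 0.5, 4))
         for _ in range(200)]

w = np.zeros(4)   # linear reward model: r(x) = w . x
lr = 0.1

for _ in range(100):
    for preferred, rejected in pairs:
        # Bradley–Terry: P(preferred beats rejected) = sigmoid(r_p - r_r)
        margin = w @ preferred - w @ rejected
        p = 1.0 / (1.0 + np.exp(-margin))
        # Gradient ascent on the log-likelihood of the human's choice
        w += lr * (1.0 - p) * (preferred - rejected)

# A trained reward model should rank preferred summaries above rejected ones.
correct = sum((w @ p_) > (w @ r_) for p_, r_ in pairs)
print(f"{correct}/{len(pairs)} pairs ranked correctly")
```

In the full pipeline, this learned reward then drives reinforcement learning (e.g., PPO) on the summarization policy, so the model is optimized to produce outputs humans would prefer rather than merely imitating reference summaries.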