When AI in healthcare goes wrong, who is responsible?

Artificial intelligence can be used to diagnose cancer, predict suicide, and assist in surgery. In all these cases, studies suggest AI outperforms human doctors in set tasks. But when something does go wrong, who is responsible?

There’s no easy answer, says Patrick Lin, director of the Ethics and Emerging Sciences Group at California Polytechnic State University. At any point in the process of implementing AI in healthcare, from design to data and deployment, errors are possible. “This is a big mess,” says Lin. “It’s not clear who would be responsible because the details of why an error or accident happens matters. That event could happen anywhere along the value chain.”

How AI is used in healthcare

Design includes creation of both hardware and software, plus testing the product. Data encompasses the mass of problems that can occur when machine learning is trained on biased data, while deployment involves how the product is used in practice. AI applications in healthcare often involve robots working with humans, which further blurs the line of responsibility.

Responsibility can be divided according to where and how the AI system failed, says Wendell Wallach, a lecturer at Yale University’s Interdisciplinary Center for Bioethics and the author of several books on robot ethics. “If the system fails to perform as designed or does something idiosyncratic, that probably goes back to the corporation that marketed the device,” he says. “If it hasn’t failed, if it’s being misused in the hospital context, liability would fall on who authorized that usage.”

Intuitive Surgical Inc., the company behind the da Vinci Surgical System, has settled thousands of lawsuits over the past decade. Da Vinci robots always work in conjunction with a human surgeon, but the company has faced allegations of clear error, including machines burning patients and broken parts of machines falling into patients.

Some cases, though, are less clear-cut. If diagnostic AI trained on data that over-represents white patients then misdiagnoses a Black patient, it’s unclear whether the culprit is the machine-learning company, those who collected the biased data, or the doctor who chose to listen to the recommendation. “If an AI program is a black box, it will make predictions and decisions as humans do, but without being able to communicate its reasons for doing so,” writes attorney Yavar Bathaee in a paper outlining why the legal principles that apply to humans don’t necessarily work for AI. “This also means that little can be inferred about the intent or conduct of the humans that created or deployed the AI, since even they may not be able to foresee what solutions the AI will reach or what decisions it will make.”

Inside the AI black box

The difficulty in pinning the blame on machines lies in the impenetrability of the AI decision-making process, according to a paper on tort liability and AI published in the AMA Journal of Ethics last year. “For example, if the designers of AI cannot foresee how it will act after it is released in the world, how can they be held tortiously liable?” write the authors. “And if the legal system absolves designers from liability because AI actions are unforeseeable, then injured patients may be left with fewer opportunities for redress.”

AI, like all technology, often works very differently in the lab than in a real-world setting. Earlier this year, researchers from Google Health found that a deep-learning system capable of identifying signs of diabetic retinopathy with 90% accuracy in the lab caused considerable delays and frustration when deployed in real clinics.

Despite the complexities, clear responsibility is essential for artificial intelligence in healthcare, both because individual patients deserve accountability, and because lack of responsibility allows mistakes to flourish. “If it’s unclear who’s responsible, that creates a gap, it could be no one is responsible,” says Lin. “If that’s the case, there’s no incentive to fix the problem.” One potential response, suggested by Georgetown legal scholar David Vladeck, is to hold everyone involved in the use and implementation of the AI system accountable.

AI and healthcare often work well together, with artificial intelligence augmenting the decisions made by human professionals. Even as AI develops, these systems aren’t expected to replace nurses or fully automate doctors’ work. But as AI improves, it gets harder for humans to go against a machine’s decision. If a robot is right 99% of the time, then a doctor who makes a different choice could face serious liability. “It’s a lot easier for doctors to go along with what that robot says,” says Lin.

Ultimately, this means humans are ceding some authority to robots. There are many instances where AI outperforms humans, and so doctors should defer to machine learning. But patient wariness of AI in healthcare is still justified when there’s no clear accountability for mistakes. “Medicine is still evolving. It’s part art and part science,” says Lin. “You need both technology and humans to respond effectively.”


Original article link: https://qz.com/1905712/when-ai-in-healthcare-goes-wrong-who-is-responsible-2/?utm_campaign=Artificial%2BIntelligence%2BWeekly&utm_medium=email&utm_source=Artificial_Intelligence_Weekly_180


Why Twitter’s image cropping algorithm appears to have white bias

Twitter’s algorithm for automatically cropping images attached to tweets often doesn’t focus on the important content in them. A bother, for sure, but it seems like a minor one on the surface. However, over the weekend, researchers found that the cropping algorithm might have a more serious problem: white bias.

Several users posted photos showing that when an image includes people with different skin tones, Twitter tends to show those with lighter skin after cropping the image to fit its display parameters on its site and in embeds. Some users even tried to reproduce the results with fictional characters and dogs.

If you tap on these images, you’ll see an uncropped version of the image which includes more details such as another person or character. What’s odd is that even if users flipped the order of where dark-skinned and light-skinned people appeared in the image, the results were the same.



Original article link: https://thenextweb.com/neural/2020/09/21/why-twitters-image-cropping-algorithm-appears-to-have-white-bias/


Research summary: Social Biases in NLP Models as Barriers for Persons with Disabilities

Mini summary (scroll down for full summary):
When studying biases in NLP models, there is not enough focus on the impact that disability-related phrases have on popular models, or on how this skews downstream tasks, especially when using models like BERT or tools like Jigsaw for toxicity analysis. This paper presents an analysis of how toxicity scores change depending on whether recommended or non-recommended phrases are used when talking about disabilities, and how results are affected in downstream contexts, for example when writers are nudged toward certain phraseology that keeps them from expressing themselves fully, reducing their dignity and autonomy. It also looks at the impact on online content moderation, where these communities are disproportionately affected because of the heavy bias toward censoring content containing such phrases, even when they are used constructively, such as communities discussing their conditions or engaging with hate speech to debunk myths. Given that more and more content moderation is being turned over to automated tools, this has the potential to suppress the representation of people with disabilities in online fora where such phrases come up, which also skews social attitudes and makes these conditions appear less prevalent than they actually are. The authors point to a World Bank study estimating that approximately 1 billion people around the world have some form of disability.
They also look at the biases captured in the BERT model, where a negative association between recommended disability phrases and terms like homelessness, gun violence, and other socially negative topics produces a slant that shapes the representations these models learn. Since such models are used widely in many downstream tasks, the impacts are amplified and present themselves in unexpected ways. The authors finally make some recommendations on how to counter these problems: involving communities more directly and learning how to be more representative and inclusive. Disclosing where the models are appropriate to use, where they shouldn’t be used, and which underlying datasets were used to train the system can also help people make more informed decisions about when to use these systems so that they don’t perpetuate harm on their users.

Full Summary

The underrepresentation of disability in datasets, and the way disability-related language is processed in NLP tasks, is an important area of discussion that is rarely studied empirically; the bias literature primarily focuses on other demographic groups. There are many consequences of this, especially for how text related to disabilities is classified, which in turn affects how people read, write, and seek information about disability.

Research from the World Bank indicates that about 1 billion people have disabilities of some kind, and these are often associated with strong negative social connotations. Using 56 linguistic expressions that refer to disabilities, classified into recommended and non-recommended uses (following guidelines from the Anti-Defamation League, ACM SIGACCESS, and the ADA National Network), the authors study how automated systems classify phrases that indicate disability and whether the recommended vs. non-recommended split makes a difference in how these snippets of text are perceived.

To quantify biases in text classification models, the study uses a perturbation method. The authors start by collecting sentences that contain the naturally occurring pronouns “he” and “she”. They then replace those pronouns with the disability phrases identified above and compare the classification scores of the original and perturbed sentences. The difference indicates how much impact the use of a disability phrase has on the classification outcome.
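To make the method concrete, here is a minimal sketch of this perturbation measurement. The toxicity_score() function is a placeholder standing in for the actual toxicity classifier (the paper uses Jigsaw’s model), and the sentences and phrases shown are purely illustrative, not the study’s data:

```python
# Sketch of perturbation-based bias measurement (illustrative only).
# `toxicity_score` stands in for whatever toxicity classifier is used,
# e.g. a call to a hosted toxicity API or a locally loaded model.
import re
from statistics import mean
from typing import Callable, Dict, List

def perturb(sentence: str, phrase: str) -> str:
    """Replace the first naturally occurring 'he' or 'she' with a disability phrase."""
    return re.sub(r"\b(he|she)\b", phrase, sentence, count=1, flags=re.IGNORECASE)

def score_shift(sentences: List[str],
                phrases: List[str],
                toxicity_score: Callable[[str], float]) -> Dict[str, float]:
    """Average change in toxicity score caused by each disability phrase."""
    shifts = {}
    for phrase in phrases:
        deltas = [toxicity_score(perturb(s, phrase)) - toxicity_score(s)
                  for s in sentences]
        shifts[phrase] = mean(deltas)
    return shifts

if __name__ == "__main__":
    # Made-up sentences and phrases; replace `dummy_scorer` with a real model.
    sentences = ["She is a great neighbor.", "He asked a question in class."]
    recommended = ["a person who is deaf"]        # illustrative recommended phrase
    non_recommended = ["a deaf-mute person"]      # illustrative non-recommended phrase
    dummy_scorer = lambda text: 0.0
    print(score_shift(sentences, recommended + non_recommended, dummy_scorer))
```

With a real toxicity model plugged in, the per-phrase shift is the quantity the study compares across recommended and non-recommended phrases.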

Using the Jigsaw tool that assigns toxicity scores to sentences, they test the original and perturbed sentences and observe that the change in toxicity is lower for recommended phrases than for non-recommended ones. But when disaggregated by category, they find that some categories elicit a stronger response than others. Given that the primary use of such a model might be online content moderation (especially now that more automated monitoring is happening as human moderation staff has thinned out because of pandemic-related closures), there is a high rate of false positives, where the system can suppress content that is non-toxic and is merely discussing disability or responding to hate speech about disability.

To examine sentiment associations for disability-related phrases, the study turns to the popular BERT model and adopts a template-based fill-in-the-blank analysis. Given a query sentence with a missing word, BERT produces a ranked list of words that can fill the blank. Using a simple template perturbed with recommended disability phrases, the study then looks at how BERT’s predictions change when disability phrases appear in the sentence. A large percentage of the words the model predicts carry negative sentiment scores. Since BERT is used widely across many NLP tasks, these negative associations can have hidden and unwanted effects on many downstream applications.
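As a rough illustration of such a fill-in-the-blank probe (not the paper’s exact setup), one could use the Hugging Face transformers fill-mask pipeline with BERT; the template and phrases below are assumptions chosen for demonstration:

```python
# Sketch of a template-based fill-in-the-blank probe with BERT (illustrative).
# Requires: pip install transformers torch
from transformers import pipeline

unmasker = pipeline("fill-mask", model="bert-base-uncased")

def top_fillers(prefix: str, k: int = 10):
    """Return BERT's top-k predictions for the blank in a simple template."""
    # Illustrative template; the paper uses its own templates.
    template = f"{prefix} person is [MASK]."
    return [(p["token_str"], round(p["score"], 4))
            for p in unmasker(template, top_k=k)]

# Compare predictions with and without a disability phrase in the template.
print(top_fillers("A"))        # baseline:  "A person is [MASK]."
print(top_fillers("A deaf"))   # perturbed: "A deaf person is [MASK]."
# In the study, each predicted word is then run through a sentiment scorer
# to measure how often the completions skew negative.
```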

Such models are trained on large corpora, which are analyzed to build “meaning” representations for words based on co-occurrence statistics, drawing on the idea that “you shall know a word by the company it keeps”. The study uses the Jigsaw Unintended Bias in Toxicity Classification challenge dataset, which contains many mentions of disability phrases. After balancing the categories and splitting comments into toxic and non-toxic, the authors manually inspected the top 100 associated terms in each category and found five key types: condition, infrastructure, social, linguistic, and treatment. In analyzing the strength of association, they found that condition phrases had the strongest association, followed by social phrases. The social type includes topics like homelessness, drug abuse, and gun violence, all of which have negative valences. Because these terms co-occur with discussions of disability, they negatively shape how disability phrases are represented in NLP models.
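For intuition, the sketch below shows one standard way to compute such co-occurrence associations, pointwise mutual information over a toy corpus; the study’s actual association statistic and corpus processing may differ, and the corpus, tokenizer, and window size here are illustrative assumptions:

```python
# Sketch: word association via co-occurrence counts and PMI (illustrative).
import math
from collections import Counter

def pmi_associations(docs, target, window=5):
    """PMI between `target` and every token co-occurring within +/- `window` words."""
    word_counts, pair_counts = Counter(), Counter()
    total_words, total_pairs = 0, 0
    for doc in docs:
        tokens = doc.lower().split()
        total_words += len(tokens)
        word_counts.update(tokens)
        for i, tok in enumerate(tokens):
            context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
            for other in context:
                pair_counts[(tok, other)] += 1
                total_pairs += 1
    scores = {}
    for (a, b), c in pair_counts.items():
        if a == target:
            p_pair = c / total_pairs
            p_a = word_counts[a] / total_words
            p_b = word_counts[b] / total_words
            scores[b] = math.log2(p_pair / (p_a * p_b))
    return sorted(scores.items(), key=lambda kv: -kv[1])

# Toy usage: which words keep company with "deaf" in this tiny corpus?
corpus = ["a deaf person attended the meeting",
          "the deaf community organized a fundraiser"]
print(pmi_associations(corpus, "deaf"))
```

The point of the exercise is that whatever words habitually keep company with disability terms in the training corpus, negative or not, end up baked into the learned representations.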

The authors recommend that those working on NLP tasks weigh the socio-technical considerations of deploying such systems and consider the intended, unintended, voluntary, and involuntary impacts on people, both direct and indirect, while accounting for long-term effects and feedback loops.

Such indiscriminate censoring of content containing disability phrases leads to an underrepresentation of people with disabilities in these corpora, since they are the ones who tend to use these phrases most often. It also harms people who search for such content and are led to believe that these issues are less prevalent than they actually are. And it erodes the autonomy and dignity of people with disabilities, which in turn shapes broader social attitudes.


Original article link: https://montrealethics.ai/research-summary-social-biases-in-nlp-models-as-barriers-for-persons-with-disabilities/


Got YouTube Regrets?

01: A Deadly Fail
I started searching for “fail videos” where people fall or get a little hurt. I was then presented with a channel that showed dash cam videos from cars. At first it was minor accidents, but later it transitioned into cars blowing up and falling off bridges — videos where people clearly didn’t survive the accident. I felt a little bit sick at that point, and haven’t really sought out that type of content after that.

02: “I Don’t Know How To Undo the Damage That Has Been Done”
My 10-year-old sweet daughter innocently searched for “tap dance videos” and now is in this spiral of horrible extreme “dance” and contortionist videos that give her horrible unsafe body-harming and body-image-damaging advice. I’ve tried to go in and manually delete all recommended videos, put in parental controls, everything (including blocking the app) — but she’s finding ways to log on using browsers and school computers. These terrible videos just keep being recommended to her. She is now restricting her eating and drinking. I heard her downstairs saying “work to eat! work to drink!” I don’t know how I can undo the damage that’s been done to her impressionable mind.

03: Stallions Doing Mares
I like watching horse sports videos, or horse care information. YouTube keeps pushing stallions doing mares at me — not my interest at all. No one has ever watched any porno on my personal machine, so having it pushed at me feels upsetting and degrading.

04: YouTube’s Drag Race
I used to occasionally watch a drag queen who did a lot of positive affirmation/confidence building videos and vlogs. Otherwise, I watched very little that wasn’t mainstream music. But my recommendations and the sidebar were full of anti-LGBT and similar hateful content. It got to the point where I stopped watching their content and still regretted it, as the recommendations followed me for ages after.

05: How I Lost My Father
My stepfather is an 80-year-old retired scientist from Ecuador. He was always curious about all sources of wisdom, theories and knowledge, is very literate and educated, enjoys discussing current events and scientific discoveries. But for a few years, he has become quite lonely and spends a large amount of time on the internet — and on YouTube in particular. His curious mind quickly brought him toward “alternative theories” on multiple topics: conspiracies, Illuminati and other alien-based stories, Bible-inspired as well as radically anti-religious obscure worldviews, “pyramidologists”, etc.

Despite the fact that we tried to erase his YouTube history and “clean” his browser, his recommendations are completely filled with these kinds of esoteric videos, with those recognizable synthetic (TTS) voice-over commentaries, endlessly proposing one similar video after another. He sometimes falls asleep while watching. His days are filled with such content, which has quite strongly affected his worldview, pushing it toward a much grimmer and more pessimistic turn. It seems impossible for us, his family, to fight against the recommendation algorithms that steer his digital consumer life. It is quite sad and frustrating to see a loved one bury himself more and more in this kind of obscure, negative and extremely confidence-depriving influence.

06: How I Lost My Father, Part II
My father, before his passing, thought aliens were already living among us in disguise, that he could eliminate his heating bill with a new “free energy” device, and that UFOs were all over the place. He showed me YouTube videos that “proved it.” I spent hours trying to explain to him that YouTube is full of “free energy” scams, that the best UFO is maybe a shitty little DARPA toy, and that aliens among us in plain clothes was simply delusion caused by YouTube videos. He could not come to grips with why people could be dishonest, or why the FCC wasn’t doing their job, and trusted what he saw on his TV way too much. He was consuming YouTube on his TV and maybe thought the same FCC kind of government regulation he was used to was still present there. How much better could our time together have been if not for YouTube-induced delusions?

07: Ex Files
My ex-wife, who has mental health problems, started watching conspiracy videos three years ago and believed every single one. YouTube just kept feeding her paranoia, fear and anxiety one video after another. I kept begging her to stop, but she didn’t — she couldn’t. At one point she believed a helicopter near the house was the government coming to take her and my daughter away (they were really checking the power lines) and called in a blind panic. Now she’s convinced the world is going to end any day now and is an extreme religious fundamentalist. She refuses to even consider professional help because she no longer trusts anyone — especially doctors, the police and any government-run organisation. And YouTube just keeps feeding her more and more of the fear videos. My marriage is now over. Her extraordinary fear has totally consumed her and our life together.

08: One Small Step for Conspiracies
I’m a teacher and I watched serious documentaries about Apollo 11. But YouTube’s recommendations are now full of videos about conspiracy theories: about 9/11, Hitler’s escape, alien seekers and anti-American propaganda.

09: It Gets Better
Any search for positive LGBT content results in a barrage of homophobic, right-wing recommendations. I can only imagine how harmful this would be to people still figuring out their identity.

10: Actually, It Doesn’t Get Better on YouTube
In coming out to myself and close friends as transgender, my biggest regret was turning to YouTube to hear the stories of other trans and queer people. Simply typing in the word “transgender” brought up countless videos that were essentially describing my struggle as a mental illness and as something that shouldn’t exist. YouTube reminded me why I hid in the closet for so many years.

Every now and then YouTube will continue to recommend me a video that tells me that my gender identity is wrong — and it reminds me of how much hate is squarely directed at me and people like me. I’m somewhat older, I’ve been dealing with these issues internally for a long time and I have therapy to work out these issues, but I can’t imagine what it’s like for those without access to help. Especially for younger trans and queer people who always risk having hate thrown at them on YouTube — a hate that Google time and time again has stood behind and has refused to take off of their platform. YouTube will always be a place that reminds LGBT individuals that they are hated and provides the means for bigots to make a living spouting hate speech. YouTube is a part of my pain in coming out and is a reminder of how terrible this world can be to those who are different. I have to be proud in spite of places like YouTube.


Original article link: https://foundation.mozilla.org/en/campaigns/youtube-regrets/