It could learn them all. But will it?
Subscribe and turn on notifications so you don't miss any videos: http://goo.gl/0bsAjO
Large language models are astonishingly good at understanding and producing language. But there's an often overlooked bias toward languages that are already well-represented on the internet. That means some languages might lose out in AI's big technical advances.
Some researchers are looking into how that works, and how to possibly shift the balance from these "high resource" languages to ones that haven't yet had a huge online footprint. These approaches range from original dataset creation, to studying the outputs of large language models, to training open source alternatives.
Watch the video above to learn more.
Further reading:
https://ruth-ann.notion.site/ruth-ann/JamPatoisNLI-A-Jamaican-Patois-Natural-Language-Inference-Dataset-91523ec89af24bfdbcb9c1ec7e28cc3c
This is the hub for Ruth-Ann Armstrong's JamPatois NLI. You can see the dataset and read the paper.
https://arxiv.org/search/cs?searchtype=author&query=Melero%2C+M
You can read Maite Melero's work on Catalan here.
https://huggingface.co/bigscience/bloom
This is the Hugging Face home for BLOOM, the open source large language model.
Make sure you never miss behind-the-scenes content in the Vox Video newsletter. Sign up here: http://vox.com/video-newsletter
Vox.com is a news website that helps you cut through the noise and understand what's really driving the events in the headlines. Check out http://www.vox.com
Support Vox's reporting with a one-time or recurring contribution: http://vox.com/contribute-now
Shop the Vox merch store: http://vox.com/store
Watch our full video catalog: http://goo.gl/IZONyE
Follow Vox on Facebook: http://facebook.com/vox
Follow Vox on Twitter: http://twitter.com/voxdotcom
Follow Vox on TikTok: http://tiktok.com/@voxdotcom
This is the second of 5 videos where Phil examines the ins, outs, and struggles of AI! Join us every Tuesday in April for more. And watch the first one, on the difficult task of drawing hands, here: https://youtu.be/24yjRbBah3w
Alternatively, everyone could just learn English…
It would be far more practical to teach every human on Earth English, than to make AI in every language.
Yes there are benefits to a diversity of language, but there are far MORE benefits the other way around. While it wouldn't come without sacrifice, the world would genuinely be a better place with only one language.
Gee, maybe because English is the most spoken language?
AI can learn all languages, unless it is programmed. NLP is used to learn language, I guess.
I literally was talking to My AI on SnapChat in Indonesian and then it replied in Indonesian that it prefers English so could I type in English instead.
Do the video editors at Vox use Adobe software? Keep up the great content
I think we need to split the two tasks more clearly: translation and answering questions. They are different.
An 8- to 10-year-old child can already speak and knows a lot of words. If he is bilingual, he can translate live speech quite well.
But due to a lack of information, he cannot answer many questions that ChatGPT can.
Hence, teaching ChatGPT to speak a rare language does not require a dataset of the same size. It is enough to give it the ability to translate correctly.
And for teaching translation, a very small dataset is technically enough, and one exists in every language (excluding those that are almost dead).
Yes, AI translation is not ideal for now, BUT even translation between languages with giant datasets is not perfect, so the cause of bad AI translation is not the dataset but the technology, which is just not developed enough.
good. keep it that way
Thanks for this. I'll just say that I was surprised when ChatGPT was able to translate English statements to Cree.
Within 8 days!
Within 467,000 views!
Explains why Chat GPT sounds like the BBC Pidgin website, which isn't reflective of how real Nigerians speak the language
English is the only language that matters
Go France ! Open source is the key
cause English is easier for the terminator to learn
Love VOX content. Here is a quick question: in how many languages is this video available? What is VOX doing to make it available for LRL folks and help them understand the lack of diversity in current language models?
machine translate everything. then run again. now we can relate all content across languages.
lol this video made me giggle a bit. Of course those languages wouldn't be included, or would be limited; it needs human interaction, just like how Google Translate always asks whether a translation is correct.
Hey Siri, how to say "taco rice" in Japanese.
– sorry I still (2023) can't translate to this language
very interesting seeing how quickly this video's title is becoming obsolete.