AV transcription languages
The audio-visual (AV) transcription application recognizes audio from over 100 languages and regional variants.
How transcription languages work
When you choose transcription languages, the AI model maps the sounds in the audio track to its list of known words for each language. You can select up to ten languages in total: one primary language and up to nine secondary languages. AV transcription does not censor profanity or offensive words in any language.
When selecting languages:
- Set the primary language to the most frequently spoken language in the audio track.
- Select only one locale per language. If the audio contains two locales of the same language, choose the one spoken most often. For example, if your audio contains mostly UK-based English speakers, plus one US-based speaker, select English (UK).
If you are not sure whether the audio track contains a specific language, try adding it as a secondary language. The processing time does not change significantly with extra languages. The main downside is that a mumbled word may be wrongly interpreted as a word in that language.
For a list of supported languages, see Supported languages and locales.
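The selection guidelines above can be sketched as a small pre-submission check. This is an illustrative example only: the `Locale` class and `validate_selection` function are hypothetical names and not part of the AV transcription application, and the source does not say whether the application itself enforces the one-locale-per-language guideline.

```python
from dataclasses import dataclass

# One primary plus up to nine secondary languages, as described above.
MAX_LANGUAGES = 10


@dataclass(frozen=True)
class Locale:
    """Hypothetical representation of a language/locale choice."""
    language: str  # for example "English"
    region: str    # for example "United Kingdom"


def validate_selection(primary: Locale, secondary: list[Locale]) -> list[str]:
    """Return a list of problems; an empty list means the guidelines are met."""
    problems = []
    selected = [primary, *secondary]

    # Guideline 1: no more than ten languages in total.
    if len(selected) > MAX_LANGUAGES:
        problems.append(
            f"{len(selected)} languages selected; the limit is {MAX_LANGUAGES}"
        )

    # Guideline 2: only one locale per language.
    seen = {}
    for locale in selected:
        if locale.language in seen:
            problems.append(
                f"{locale.language} has more than one locale: "
                f"{seen[locale.language]} and {locale.region}"
            )
        else:
            seen[locale.language] = locale.region
    return problems
```

For example, passing English (United Kingdom) as the primary language and English (United States) as a secondary language returns one problem, because both entries are locales of the same language.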
Handling multiple languages
When the model maps sounds to words, it tries to find the closest match in any of the selected languages. The primary language acts as a default: if the model cannot find a perfect match for a word, it tries to match it to a word in the primary language.
The model also pays attention to the language of surrounding words. If all the surrounding words were identified as one language, the model is more likely to assume that a word in the middle also belongs to that language. For example, if the sound "si" is surrounded by English, the model may assume it's the English word "see." If it's surrounded by Spanish, the model is more likely to assume that it's the Spanish word "sí."
In general, if you did not select a language when setting up the job, the model will not identify words spoken in that language. However, a few locales cover regions where two languages are frequently mixed. For example, French (Canada) and Spanish (United States) can both identify words in English.
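The behavior described above can be illustrated with a deliberately simplified toy example. It is not the actual transcription model: the `LEXICON` table, the sound spellings, and the voting rule are all assumptions made for illustration. It shows only the general idea of using neighboring words first and falling back to the primary language.

```python
from collections import Counter

# Hypothetical toy lexicon: sound -> {language: word}. Not the real model's data.
LEXICON = {
    "si": {"English": "see", "Spanish": "sí"},
    "ai": {"English": "I"},
    "yu": {"English": "you"},
    "klaro": {"Spanish": "claro"},
    "ke": {"Spanish": "que"},
}


def pick_language(sounds, index, primary):
    """Guess the language of sounds[index] from its neighbors, else use primary."""
    candidates = LEXICON.get(sounds[index], {})
    if len(candidates) == 1:
        # Only one language knows this sound, so there is nothing to decide.
        return next(iter(candidates))

    # Let the unambiguous words on either side vote for their language.
    votes = Counter()
    for j in (index - 1, index + 1):
        if 0 <= j < len(sounds):
            neighbor = LEXICON.get(sounds[j], {})
            if len(neighbor) == 1:
                votes[next(iter(neighbor))] += 1

    for language, _ in votes.most_common():
        if language in candidates:
            return language

    # No helpful context: fall back to the primary language.
    return primary


def transcribe(sounds, primary):
    """Map each sound to a word; unknown sounds are kept in square brackets."""
    words = []
    for i, sound in enumerate(sounds):
        language = pick_language(sounds, i, primary)
        words.append(LEXICON.get(sound, {}).get(language, f"[{sound}]"))
    return words
```

In this toy, `transcribe(["ai", "si", "yu"], primary="Spanish")` resolves "si" to "see" because both neighbors are English, while `transcribe(["si"], primary="Spanish")` has no context and falls back to the primary language, giving "sí".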
Choosing language locales
When you select a language, you also select the locale in which the language is spoken. For example, Portuguese has two options: Portuguese (Brazil) and Portuguese (Portugal).
How much the locale affects the transcription depends on how much the language itself varies from region to region. For languages with significant regional differences, choosing the right locale makes it easier for the model to match pronunciations and region-specific words. For languages with fewer regional differences, the locale choice may only affect small details such as spelling.
If the audio contains two locales of the same language, choose the one spoken most often. For example, if your audio contains mostly UK-based English speakers, plus one US-based speaker, select English (UK). If you're not sure which locale is more common in the audio, choose the one that the reviewers are most comfortable reading.
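If you already have a rough estimate of how long each locale is spoken, the guidance above can be written as a short helper. This is a sketch under assumptions: measuring "most frequently spoken" by speaking time is one possible interpretation, and `most_spoken_locale` and its parameters are hypothetical names, not part of the application.

```python
from collections import defaultdict


def most_spoken_locale(segments, reviewer_preference=None):
    """Pick the locale with the most speaking time.

    segments: iterable of (locale_name, seconds) pairs, estimated by the user.
    reviewer_preference: locale the reviewers are most comfortable reading,
        used only to break a tie between equally common locales.
    """
    totals = defaultdict(float)
    for locale, seconds in segments:
        totals[locale] += seconds
    if not totals:
        raise ValueError("no segments provided")

    longest = max(totals.values())
    leaders = [locale for locale, total in totals.items() if total == longest]

    # If the locales are equally common, defer to the reviewers' preference.
    if len(leaders) > 1 and reviewer_preference in leaders:
        return reviewer_preference
    return leaders[0]
```

For example, with 300 seconds of English (United Kingdom) and 40 seconds of English (United States), the helper returns English (United Kingdom).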
Supported languages and locales
AV transcription supports the following languages and locales:
- Afrikaans (South Africa)
- Albanian (Albania)
- Amharic (Ethiopia)
- Arabic (Algeria)
- Arabic (Bahrain)
- Arabic (Egypt)
- Arabic (Iraq)
- Arabic (Israel)
- Arabic (Jordan)
- Arabic (Kuwait)
- Arabic (Lebanon)
- Arabic (Libya)
- Arabic (Morocco)
- Arabic (Oman)
- Arabic (Palestinian Territories)
- Arabic (Qatar)
- Arabic (Saudi Arabia)
- Arabic (Syria)
- Arabic (Tunisia)
- Arabic (United Arab Emirates)
- Arabic (Yemen)
- Armenian (Armenia)
- Assamese (India)
- Azerbaijani (Azerbaijan)
- Bangla (India)
- Basque (Spain)
- Bosnian (Bosnia & Herzegovina)
- Bulgarian (Bulgaria)
- Burmese (Myanmar [Burma])
- Cantonese (China)
- Catalan (Spain)
- Chinese (China)
- Chinese (China, SHANDONG)
- Chinese (China, SICHUAN)
- Chinese (Hong Kong SAR China)
- Chinese (Taiwan)
- Croatian (Croatia)
- Czech (Czechia)
- Danish (Denmark)
- Dutch (Belgium)
- Dutch (Netherlands)
- English (Australia)
- English (Canada)
- English (Ghana)
- English (Hong Kong SAR China)
- English (India)
- English (Ireland)
- English (Kenya)
- English (New Zealand)
- English (Nigeria)
- English (Philippines)
- English (Singapore)
- English (South Africa)
- English (Tanzania)
- English (United Kingdom)
- English (United States)
- Estonian (Estonia)
- Filipino (Philippines)
- Finnish (Finland)
- French (Belgium)
- French (Canada)
- French (France)
- French (Switzerland)
- Galician (Spain)
- Georgian (Georgia)
- German (Austria)
- German (Germany)
- German (Switzerland)
- Greek (Greece)
- Gujarati (India)
- Hebrew (Israel)
- Hindi (India)
- Hungarian (Hungary)
- Icelandic (Iceland)
- Indonesian (Indonesia)
- Irish (Ireland)
- Italian (Italy)
- Italian (Switzerland)
- Japanese (Japan)
- Javanese (Indonesia)
- Kannada (India)
- Kazakh (Kazakhstan)
- Khmer (Cambodia)
- Korean (South Korea)
- Lao (Laos)
- Latvian (Latvia)
- Lithuanian (Lithuania)
- Macedonian (North Macedonia)
- Malay (Malaysia)
- Malayalam (India)
- Maltese (Malta)
- Marathi (India)
- Mongolian (Mongolia)
- Nepali (Nepal)
- Norwegian Bokmål (Norway)
- Odia (India)
- Pashto (Afghanistan)
- Persian (Iran)
- Polish (Poland)
- Portuguese (Brazil)
- Portuguese (Portugal)
- Punjabi (India)
- Romanian (Romania)
- Russian (Russia)
- Serbian (Serbia)
- Sinhala (Sri Lanka)
- Slovak (Slovakia)
- Slovenian (Slovenia)
- Somali (Somalia)
- Spanish (Argentina)
- Spanish (Bolivia)
- Spanish (Chile)
- Spanish (Colombia)
- Spanish (Costa Rica)
- Spanish (Cuba)
- Spanish (Dominican Republic)
- Spanish (Ecuador)
- Spanish (El Salvador)
- Spanish (Equatorial Guinea)
- Spanish (Guatemala)
- Spanish (Honduras)
- Spanish (Mexico)
- Spanish (Nicaragua)
- Spanish (Panama)
- Spanish (Paraguay)
- Spanish (Peru)
- Spanish (Puerto Rico)
- Spanish (Spain)
- Spanish (United States)
- Spanish (Uruguay)
- Spanish (Venezuela)
- Swahili (Kenya)
- Swahili (Tanzania)
- Swedish (Sweden)
- Tamil (India)
- Telugu (India)
- Thai (Thailand)
- Turkish (Türkiye)
- Ukrainian (Ukraine)
- Urdu (India)
- Uzbek (Uzbekistan)
- Vietnamese (Vietnam)
- Welsh (United Kingdom)
- Wu Chinese (China)
- Zulu (South Africa)