APPENDIX H - Supported Languages for Language Detection
  • 26 Nov 2024

Discovery Manager can identify up to three different languages within a given file. During processing, it reads through all of the extracted and/or OCR text available for the file and reports the top three languages found, along with the percentage and character count for each.
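As a rough illustration of what a "top three languages with percentages and character counts" result looks like, the sketch below ranks languages by character count. It is not Discovery Manager's implementation: the function name top_languages, the sample counts, and the choice to compute each percentage against all detected text are assumptions made for this example.

```python
from collections import Counter

def top_languages(char_counts, limit=3):
    """Return up to `limit` (language, percent, characters) tuples, largest first.

    `char_counts` is a hypothetical mapping of language name to the number of
    characters attributed to that language in a file's text.
    """
    total = sum(char_counts.values())
    if total == 0:
        return []
    # most_common() sorts by count, descending, and keeps the first `limit`.
    ranked = Counter(char_counts).most_common(limit)
    # Assumption: percentages are relative to all detected text in the file.
    return [(lang, round(100 * n / total, 1), n) for lang, n in ranked]

# Made-up counts for a single file; the fourth language is dropped.
sample = {"English": 7200, "French": 1800, "German": 600, "Spanish": 400}
print(top_languages(sample))
```

With these sample counts, the fourth language (Spanish) is left out of the result but still contributes to the total used for the percentages.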

Note

These are the languages supported for Language Detection. To see which languages are supported specifically for OCR, see Appendix I.

Supported Languages for Language Detection

Abkhazian    Dutch           Inuktitut      Marathi           Sesotho    Venda
Afar         Dzongkha        Inupiak        Mauritian_Creole  Shona      Vietnamese
Afrikaans    English         Irish          Mongolian         Sindhi     Volapuk
Akan         Esperanto       Italian        Nauru             Sinhalese  Waray_Philippines
Albanian     Estonian        Japanese       Nepali            Siswant    Welsh
Amharic      Faroese         Javanese       Norwegian         Slovak     Wolof
Arabic       Fijian          Kannada        Norwegian_N       Slovenian  Xhosa
Armenian     Finnish         Kashmiri       Nyanja            Somali     Yiddish
Assamese     French          Kazakh         Occitan           Spanish    Yoruba
Aymara       Frisian         Khasi          Oriya             Sundanese  Zhuang
Azerbaijani  Galician        Khmer          Oromo             Swahili    Zulu
Bashkir      Ganda           Kinyarwanda    Pashto            Swedish
Basque       Georgian        Klingon        Pedi              Syriac
Belarusian   German          Korean         Persian           Tagalog
Bengali      Greek           Kurdish        Pig_Latin         Tajik
Bihari       Greenlandic     Kyrgyz         Polish            Tamil
Bislama      Guarani         Laothian       Portuguese        Tatar
Breton       Gujarati        Latin          Punjabi           Telugu
Bulgarian    Haitian_Creole  Latvian        Quechua           Thai
Burmese      Hausa           Limbu          Rhaeto_Romance    Tibetan
Catalan      Hawaiian        Lingala        Romanian          Tigrinya
Cebuano      Hebrew          Lithuanian     Rundi             Tonga
Cherokee     Hindi           Luxembourgish  Russian           Tsonga
Chinese      Hmong           Macedonian     Samoan            Tswana
Chinese_T    Hungarian       Malagasy       Sango             Turkish
Corsican     Icelandic       Malay          Sanskrit          Turkmen
Croatian     Igbo            Malayalam      Scots             Uighur
Czech        Indonesian      Maltese        Scots_Gaelic      Ukrainian
Danish       Interlingua     Manx           Serbian           Urdu
Dhivehi      Interlingue     Maori          Seselwa           Uzbek

