
Optin - Languages Included

Posted: Mon Aug 31, 2020 4:18 pm
by Optilon
For the first evaluation, the 20 most widely spoken languages will be used. The data comes from Wikipedia: https://en.wikipedia.org/wiki/List_of_l ... f_speakers, which in turn uses data from the 2019 edition of Ethnologue: https://en.wikipedia.org/wiki/Ethnologue

Both mother-tongue speakers and second-language speakers should be taken into account. However, second-language speakers should count for less than mother-tongue speakers, so I decided to weight second-language speakers at 50%.
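The 50% weighting can be sketched as a small calculation. This is only an illustration: the function name `weighted_speakers` and the numbers are placeholders, not values from the corrected table.

```python
# Weight applied to second-language speakers, as chosen in this post (50%).
L2_WEIGHT = 0.5


def weighted_speakers(native, second, l2_weight=L2_WEIGHT):
    """Count mother-tongue speakers fully and second-language speakers at l2_weight."""
    return native + l2_weight * second


# Placeholder figures in millions (hypothetical, not real speaker counts).
example = weighted_speakers(native=100.0, second=40.0)
print(example)  # 100 + 0.5 * 40 = 120.0
```

With this weighting, a language with many second-language speakers still gains from them, but only half as much as from the same number of mother-tongue speakers.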

I corrected the data from Ethnologue 2019; the figures for many languages had to be adjusted.
[Image: Corrected Language Data 2020 compact.png]
The African language Hausa would be placed 18th. Unfortunately, I could not find any text corpora in Hausa. If anyone finds some, we could use them. Because I had to exclude Hausa, Cantonese now takes 20th place.
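The exclusion step amounts to ranking by weighted score, skipping any language without a usable corpus, and keeping the top 20. A minimal sketch follows, assuming a hypothetical helper `select_languages` and placeholder language names and scores rather than the real data:

```python
def select_languages(scores, has_corpus, top_n=20):
    """Rank languages by score (highest first), drop those without corpora, keep top_n."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    usable = [lang for lang in ranked if has_corpus.get(lang, False)]
    return usable[:top_n]


# Placeholder data (hypothetical): "B" ranks second but has no corpus.
scores = {"A": 50.0, "B": 40.0, "C": 30.0, "D": 20.0}
has_corpus = {"A": True, "B": False, "C": True, "D": True}

selected = select_languages(scores, has_corpus, top_n=3)
print(selected)  # ['A', 'C', 'D']
```

Dropping "B" lets "D" move up into the selection, in the same way that excluding Hausa brings Cantonese into the top 20.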

Re: Optin - Languages Included

Posted: Sat Sep 12, 2020 8:03 pm
by Optilon
Current Progress:
[Image: languages_progress.png]