English - text corpora

Post by Optilon »

For English, I used the following text corpora from the Leipzig Corpora Collection (uni-leipzig):
https://wortschatz.uni-leipzig.de/en/download/english
eng_news_2016_10K
eng_news-typical_2016_10K
eng_newscrawl-public_2018_10K
eng_wikipedia_2016_10K
eng-au_web_2002_10K
eng-ca_web_2002_10K
eng-com_web-public_2018_10K
eng-eu_web_2015_10K
eng-uk_web_2002_10K
eng-uk_web-public_2018_10K
---
total: 100K sentences (10 files of 10K sentences each)
---
English:
English 100k - 10 files
---
Character frequency (including symbols) after converting everything to lowercase:
english_characterfrequency_with_symbols.png
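For anyone who wants to reproduce a count like this, here is a minimal sketch in Python. It is not the tool I used for the chart, just one way to do it. It assumes each corpus file is UTF-8 and has one sentence per line in the form "id<TAB>sentence"; the helper names (sentence_text, char_frequencies, relative, read_corpora) are made up for this example, and skipping whitespace is my own choice here.

```python
from collections import Counter


def sentence_text(line):
    """Return the sentence part of a line.

    Assumption: lines look like 'id<TAB>sentence'. If there is no tab,
    the whole line is treated as the sentence.
    """
    line = line.rstrip("\n")
    _, sep, rest = line.partition("\t")
    return rest if sep else line


def char_frequencies(lines, keep_space=False):
    """Count characters after lowercasing; symbols are kept."""
    counts = Counter()
    for line in lines:
        for ch in sentence_text(line).lower():
            if ch.isspace() and not keep_space:
                continue  # whitespace is skipped unless requested
            counts[ch] += 1
    return counts


def relative(counts):
    """Turn absolute counts into shares of the total, most common first."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {ch: n / total for ch, n in counts.most_common()}


def read_corpora(paths):
    """Yield all lines from several corpus files (paths are up to you)."""
    for path in paths:
        with open(path, encoding="utf-8") as fh:
            yield from fh


if __name__ == "__main__":
    # Tiny inline sample instead of real corpus files.
    sample = ["1\tHello, World!", "2\tAbc"]
    for ch, share in relative(char_frequencies(sample)).items():
        print(repr(ch), round(share * 100, 2), "%")
```

To run it on the real data, pass the paths of the ten downloaded sentence files to read_corpora and feed the result to char_frequencies.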
---
Early optimization result:
OptENG.png
Optimization result for thumb shift:
(1s-OptENG is the layout with thumb shift; 2s-OptENG is the layout with the two normal shift keys)
OptENGthumbshift.png
Optimization result for normal shift:
OptENGnormalshift.png