SubtlexUS

SubtlexUS is a database of word frequencies based on the subtitles of American English movies and TV series.
The main strengths of this database are the following:

Based on spoken-like language
Based on 50 million words

Search online in SUBTL database

Follow this link, select the “SubtlexUS” database and have fun! (dynamic search, regular-expression search, sorting, etc.)

Documentation

Brysbaert, M. & New, B. (2009). Moving beyond Kucera and Francis: A Critical Evaluation of Current Word Frequency Norms and the Introduction of a New and Improved Word Frequency Measure for American English. Behavior Research Methods, 41(4), 977-990.

Download

The corpus is available below; the order of its sentences has been randomized for copyright reasons.

Corpus

Lexique

Boris New & Christophe Pallier
