
Showing posts with the label computational linguistics

The British National Corpus (BNC)

The British National Corpus (BNC) is a 100-million-word collection of samples of written and spoken language from a wide range of sources, designed to represent a broad cross-section of British English from the late 20th century. It was compiled between 1991 and 1994 by a consortium led by Oxford University Press (OUP), whose members included the Longman Group Ltd (now part of Pearson Education). The corpus covers a variety of genres, including spoken conversation, fiction, newspapers, and academic texts. It is an important resource for linguistic research and is widely used in corpus linguistics, computational linguistics, and language teaching. The corpus is fully searchable and is available both in raw form and in a tagged form that includes information about word class (e.g. noun, verb, adjective) and grammatical structure. It is divided into two parts: the written part (90%) and the spoken part (10%). The written part is divided into four sections: fiction, non-fiction, newspaper, and acad

Corpus

A corpus is a collection of written or spoken texts gathered and organized for linguistic research. The texts can come from a variety of sources, such as books, newspapers, websites, and transcripts of speech. The goal of building a corpus is to provide a representative sample of language use in a specific context, which can then be analyzed for patterns and trends. One of the main benefits of a corpus is that it enables large-scale analysis: rather than relying on a researcher's intuition or personal experience, it offers a quantitative, empirical basis for studying a language. This can lead to more accurate and reliable results, as well as a deeper understanding of language use. Another advantage is that corpus research can be applied to a variety of contexts. For example, a corpus can be created to study the language used in a particular field, such as medicine or law, or to study language use
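
As a rough illustration of the quantitative analysis described in this excerpt, here is a minimal Python sketch that counts word frequencies in a small collection of texts and normalizes them so that corpora of different sizes can be compared. The function names, the simple regex tokenizer, and the two tiny sample "corpora" are assumptions made for this example; real corpus tools use far more careful tokenization.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its word tokens (letters and apostrophes only)."""
    return re.findall(r"[a-z']+", text.lower())


def relative_frequencies(texts, per=1_000_000):
    """Return each word's frequency, normalized to occurrences per `per` tokens."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    total = sum(counts.values())
    # Normalizing by corpus size makes differently sized corpora comparable.
    return {word: n * per / total for word, n in counts.items()}


# Hypothetical miniature corpora, invented purely for illustration.
medical_texts = [
    "The patient was given a low dose.",
    "The dose was increased after the patient improved.",
]
legal_texts = [
    "The court dismissed the appeal.",
    "The appeal was heard by the court in March.",
]

if __name__ == "__main__":
    for name, texts in [("medical", medical_texts), ("legal", legal_texts)]:
        freqs = relative_frequencies(texts, per=100)
        # Show the five most frequent words and their rate per 100 tokens.
        top = sorted(freqs.items(), key=lambda item: item[1], reverse=True)[:5]
        print(name, top)
```

With corpora this small the numbers mean little, but the same counting and normalizing steps underlie frequency lists built from much larger collections.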