The British National Corpus (BNC)




The British National Corpus (BNC) is a 100-million-word collection of samples of written and spoken language from a wide range of sources, designed to represent a broad cross-section of British English from the late 20th century. It was compiled between 1991 and 1994 by a consortium led by Oxford University Press (OUP), whose members included the Longman Group Ltd (now Pearson Education), Chambers, Oxford University Computing Services, Lancaster University, and the British Library. The corpus contains text from a variety of genres, including spoken conversation, fiction, newspapers, and academic texts.

The BNC is an important resource for linguistic research and is widely used in corpus linguistics, computational linguistics, and language teaching. The corpus is fully searchable and is available both as raw text and in a tagged form, in which each word is annotated with its word class (e.g. noun, verb, adjective) and the text is marked up for structural features such as sentence boundaries.

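The corpus is divided into two parts: a written part (about 90%) and a spoken part (about 10%). Written texts are drawn from sources such as fiction, non-fiction books, newspapers and periodicals, and academic writing, and are classified by domain, medium, and date of publication. The spoken part has two components: a demographically sampled component of informal conversation, and a context-governed component of more formal speech such as meetings, lectures, and broadcasts.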

The BNC can also be accessed through web-based interfaces, which allow users to search the corpus and view concordances (listings of every occurrence of a search term together with its surrounding context). Because it offers a large, carefully sampled body of late 20th-century British English, it remains a standard reference point for researchers and language professionals.
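To make the idea of a concordance concrete, here is a minimal keyword-in-context (KWIC) sketch in Python. It is only an illustration: the function name `kwic`, the `width` parameter, and the sample sentence are invented for this example, and it works on a plain string rather than on the BNC's own files or any BNC search interface.

```python
import re


def kwic(text, keyword, width=3):
    """Return one line per hit, showing `width` tokens on either side."""
    # Very rough tokenisation: lowercase the text and keep runs of word characters.
    tokens = re.findall(r"\w+", text.lower())
    target = keyword.lower()
    lines = []
    for i, tok in enumerate(tokens):
        if tok == target:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            # Mark the search term with brackets so it stands out in context.
            lines.append(f"{left} [{tok}] {right}".strip())
    return lines


# Hypothetical sample text, not taken from the BNC.
sample = "The cat sat on the mat. The dog chased the cat across the garden."
for line in kwic(sample, "cat"):
    print(line)
```

A real concordancer would typically also align the search term in a fixed column, respect sentence boundaries, and allow searching by word class, but the core operation is the same: find each hit and show the words around it.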

 
